Efficient way to Sync large amount of data over the network?

  • Question

  • I am writing an application in C# which will do the following:

    I have approx. 3 TB of data (files) which I need to sync from my server onto an external drive (approx. 3 TB in size) attached to my workstation. Is there a way I can estimate how much data can be copied onto my external drive before the actual sync (using SyncFx)? Also, how can I estimate the time required for the sync and the size of the metadata?

    I am thinking of syncing a few files at a time (say 1000 files), repeatedly, instead of all files together (but for that I have to find an accurate way of knowing how much data can be copied to the external drive), so that the process is more efficient and won't hog the network and my machine.

    Any input on this approach?

    • Edited by arm007 Thursday, February 9, 2012 2:26 PM
    Thursday, February 9, 2012 2:25 PM

All replies

  • If you're after estimating the size of the changed files detected before applying them, you can run the sync in PreviewMode, listen to one of the events, grab the list of files detected, then get their sizes and total them. It's an expensive operation though.

    Also, I don't think the file sync provider has batching capabilities, so to batch the files you might have to use Filters instead. For example, you can sync all files whose filenames start with the letter A, and so on...
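
    On the original question of how much will fit on the external drive and roughly how long a copy might take, here is a minimal sketch that uses only System.IO and no Sync Framework calls. The class and method names (SyncEstimate, TotalBytes, FreeBytes, EstimateDuration) are made up for illustration, and the throughput figure is something you would have to measure yourself. It does not account for Sync Framework metadata or file system overhead.

```csharp
using System;
using System.IO;
using System.Linq;

static class SyncEstimate
{
    // Total size in bytes of all files under root (recursive).
    // Note: throws UnauthorizedAccessException if a subfolder is not readable.
    public static long TotalBytes(string root)
    {
        return new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Sum(f => f.Length);
    }

    // Free bytes available to the current user on the drive that holds path.
    // Assumes a local or mapped drive; DriveInfo does not accept UNC paths.
    public static long FreeBytes(string path)
    {
        string driveRoot = Path.GetPathRoot(Path.GetFullPath(path));
        return new DriveInfo(driveRoot).AvailableFreeSpace;
    }

    // Rough transfer time for an assumed sustained throughput.
    public static TimeSpan EstimateDuration(long bytes, double bytesPerSecond)
    {
        if (bytesPerSecond <= 0)
            throw new ArgumentOutOfRangeException("bytesPerSecond");
        return TimeSpan.FromSeconds(bytes / bytesPerSecond);
    }
}
```

    A check before syncing could then be SyncEstimate.TotalBytes(source) <= SyncEstimate.FreeBytes(target). As a rough worked figure, 3 TB (3 x 10^12 bytes) at an assumed sustained 100 MB/s comes to about 30,000 seconds, a little over 8 hours.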

    Thursday, February 9, 2012 2:42 PM
  • Yeah, I thought of the PreviewMode option, but it's too expensive in this case.

    Well, I didn't think of using Filters. Instead, I was thinking of creating a temporary folder on the server side, copying a few files into it based on date/time, and then syncing this temporary folder with my external drive (repeating as needed). Is there a way I can filter files based on timestamp?

    Having a batching option in file sync would have been really nice!
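
    For what it's worth, picking the files for such a timestamp-based batch can be sketched with plain System.IO, independent of Sync Framework. BatchPlanner and NextBatch are hypothetical names; the sketch only selects files and leaves the copy into the temporary folder and the sync itself to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class BatchPlanner
{
    // Returns files under root modified after modifiedAfterUtc, oldest first,
    // stopping once maxFiles is reached or the next file would push the
    // running total past maxBytes.
    public static List<FileInfo> NextBatch(
        string root, DateTime modifiedAfterUtc, int maxFiles, long maxBytes)
    {
        var batch = new List<FileInfo>();
        long total = 0;

        var candidates = new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Where(f => f.LastWriteTimeUtc > modifiedAfterUtc)
            .OrderBy(f => f.LastWriteTimeUtc);

        foreach (FileInfo file in candidates)
        {
            if (batch.Count >= maxFiles || total + file.Length > maxBytes)
                break;
            batch.Add(file);
            total += file.Length;
        }
        return batch;
    }
}
```

    The LastWriteTimeUtc of the last file in a batch can serve as the cutoff for the next call. Two caveats of this simple version: files sharing exactly that timestamp but left out of the batch would be skipped by the next call, and a single file larger than maxBytes yields an empty batch, so a real implementation would need to handle both.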

    Thursday, February 9, 2012 3:00 PM
  • I don't think the filters work with timestamps... On second thought, the filter approach might not work.

    Isn't copying to a temporary folder more expensive? It's a write operation.

    Thursday, February 9, 2012 3:09 PM
  • I just found something in the docs that might work for separating files by timestamp: the "AttributeExcludeMask" property of FileSyncScopeFilter.

    The copying is on the same machine, so I don't think it will matter much.

    Thursday, February 9, 2012 3:18 PM
  • Well, I spoke too quickly... AttributeExcludeMask won't give me the timestamp of a file... damn
    Thursday, February 9, 2012 3:21 PM
  • I reckon the read-only operation of PreviewMode will be faster than the read/write/enumerate file copy approach.

    Friday, February 10, 2012 5:36 AM
  • How can that be done?
    Friday, February 10, 2012 2:10 PM
  • You mean the PreviewMode? It's a property you set on the provider.
    Friday, February 10, 2012 3:09 PM
  • Oops, got your point... I have to test it first to see which one is less expensive!
    Friday, February 10, 2012 9:26 PM
  • Hey JuneT,

    I am doing an Upload. After setting PreviewMode = true on both the source and the destination, I synchronize. How do I know how many files I need to upload (i.e. copy from source to destination)?
    Do you have any sample code?

    Monday, February 13, 2012 8:59 PM
  • PreviewMode simulates a sync without actually applying any changes, so you can subscribe to the provider's events. For example, you can subscribe to the DetectedChanges event to check what files were detected.
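
    A rough sketch of what that could look like for an upload, untested and written from memory of the Microsoft.Synchronization.Files API, so please check the member names against the docs. It assumes the Sync Framework assemblies are referenced, that change-application events are still raised in preview mode, and that FileData.Size is in bytes. PreviewUpload and Run are made-up names. It counts on the destination's ApplyingChange event rather than DetectedChanges, since, as far as I recall, DetectedChanges reports totals for everything found during the scan rather than only what would be copied.

```csharp
using System;
using Microsoft.Synchronization;
using Microsoft.Synchronization.Files;
using FileChangeType = Microsoft.Synchronization.Files.ChangeType;

static class PreviewUpload
{
    // Simulates an upload from sourceDir to destDir and reports how many
    // files (and bytes) would be created or updated on the destination.
    public static void Run(string sourceDir, string destDir)
    {
        int files = 0;
        long bytes = 0;

        using (var source = new FileSyncProvider(sourceDir))
        using (var destination = new FileSyncProvider(destDir))
        {
            // Preview on both sides: nothing should be committed.
            source.PreviewMode = true;
            destination.PreviewMode = true;

            // Raised on the destination for each change it would apply.
            destination.ApplyingChange += (sender, e) =>
            {
                bool copies = e.ChangeType == FileChangeType.Create
                           || e.ChangeType == FileChangeType.Update;
                if (copies && e.NewFileData != null && !e.NewFileData.IsDirectory)
                {
                    files++;
                    bytes += (long)e.NewFileData.Size;
                }
            };

            var agent = new SyncOrchestrator
            {
                LocalProvider = source,        // upload goes local -> remote
                RemoteProvider = destination,
                Direction = SyncDirectionOrder.Upload
            };
            agent.Synchronize();
        }

        Console.WriteLine("Would copy {0} files, {1} bytes", files, bytes);
    }
}
```

    Keep in mind this still runs full change detection on both folders, which is the expensive part mentioned earlier in the thread.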

    Tuesday, February 14, 2012 1:10 AM