Questions About Managed NTFS Sample

  • Question

  • I've been working on my own version of the NTFS sync sample that adds subfolders and creates only a single sync file. It seems pretty straightforward so far, but I ran into one issue that I wanted to ask about.

     

    Right now, the DetectChanges method calls FindLocalFileChanges every time it runs. While this doesn't break anything, if you have more changes than the batch size, the method is invoked once per batch (even though in theory there should be no new local changes to detect). Does it really need to be called each time?

     

    The same thing applies on the destination when you call ProcessChangeBatch.

     

    Another question concerns the first call to FindLocalFileChanges: with around 1,000 files it takes about 30 seconds and pegs the CPU at 100%. Is it recommended to run this on a background thread (and could we get a simple example)?
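    For reference, the general shape of moving an expensive scan off the calling thread looks something like the sketch below. This is a language-agnostic illustration in Python, not the sample's actual code; `find_local_file_changes` here is a hypothetical stand-in for the sample's directory scan.

```python
import threading

# Hypothetical stand-in for the expensive scan (the real sample walks the
# directory tree and hashes file contents, which is what takes ~30 seconds).
def find_local_file_changes(root):
    return ["changed-a.txt", "changed-b.txt"]

def detect_changes_async(root, on_done):
    """Run the expensive scan on a background thread and deliver the
    result through a callback, keeping the calling (UI) thread responsive."""
    def worker():
        on_done(find_local_file_changes(root))
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

results = []
t = detect_changes_async(".", results.extend)
t.join()  # a UI app would not block here; the callback fires on completion
```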

     

    Last question: the sample doesn't implement tombstone cleanup. Are there any examples of how this should work, or is it specific to each implementation?

     

    Thanks!

     

     

    • Moved by Max Wang_1983 Thursday, April 21, 2011 10:20 PM forum consolidation (From:SyncFx - Technical Discussion [ReadOnly])
    Thursday, December 13, 2007 12:11 AM

Answers

  • Hi Bryant,

     

    To give you some background, the NTFS sample is intended to illustrate the basic mechanics of the various operations required of a sync provider and the general method for invoking synchronization. As such, in constructing the sample we favor concerns such as simplicity and clarity over performance.

     

    In the particular case you ask about, you could avoid calling FindLocalFileChanges() on each call. However, you will need to be careful if you do so. It is quite possible for concurrent changes to happen to the file system during synchronization. If these changes are not reflected by updating the local versions on the destination, conflicts may go undetected. Invoking FindLocalFileChanges() before applying changes mitigates this to some degree, although there is still a window in which changes can be missed. To be very robust, one would need to do optimistic concurrency control on the destination: verify, under a held lock, that the local file has not changed immediately prior to applying a change on top of it (and skip applying the change if a destination change is detected). Such a method is used in the file sync provider.
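    The check described above can be sketched like this (a simplified, runnable Python illustration rather than the provider's actual implementation; the file's modification time stands in for the item version, and `apply_change_if_unchanged` is a hypothetical helper):

```python
import os
import tempfile
import threading

_apply_lock = threading.Lock()

def apply_change_if_unchanged(path, expected_mtime, new_contents):
    """Optimistic concurrency: re-verify, under a lock, that the destination
    file still has the version we detected before writing over it.
    Returns True if the change was applied, False if a concurrent local
    change was found (the caller should then raise a conflict instead)."""
    with _apply_lock:  # make the check-and-write atomic w.r.t. other appliers
        current = os.path.getmtime(path) if os.path.exists(path) else None
        if current != expected_mtime:
            return False  # destination changed out from under us: skip
        with open(path, "w") as f:
            f.write(new_contents)
        return True

# Tiny demonstration on a temp file
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("v1")
known_mtime = os.path.getmtime(path)
applied = apply_change_if_unchanged(path, known_mtime, "v2")  # versions match
stale = apply_change_if_unchanged(path, -1.0, "v3")           # stale version
final = open(path).read()
os.remove(path)
```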

     

    Also, the file sync provider uses the SQL CE metadata store rather than the simple store in the sample. The sample metadata store is optimized for clarity rather than efficiency; as the number of items scales, the SQL CE metadata store is much more efficient.

     

    In general, writing a full-featured file sync provider is non-trivial because of hierarchy and the lack of built-in change detection. The file system is also not transacted, which makes the provider's work more difficult.

     

    Some aspects not addressed in the sample that would need to be considered include:

    - Handling of moves
    - Handling scenarios where parents arrive at the destination after children
    - Handling of various hierarchy-related conflicts (update-delete, etc.)
    - Handling of concurrent changes (as mentioned above)

     

    Please note this is not a comprehensive list.

     

    In general, if you have file sync scenarios we recommend using the in-the-box file sync provider which handles these cases. If you have scenarios that the built-in provider cannot handle, we're certainly interested in learning about those.

     

    Regarding tombstone clean-up, the general mechanics are the same from provider to provider. Via some policy (wall-clock based, e.g. older than x days, or something else), choose which tombstones you'd like to remove. Then remove all tombstones with versions below your threshold and update the forgotten knowledge accordingly.
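    That policy can be sketched as follows (a simplified illustration in Python, not the Sync Framework metadata API; item versions are modeled as plain deletion timestamps and the forgotten knowledge as a single watermark, both of which are assumptions of this sketch):

```python
import time

def cleanup_tombstones(tombstones, forgotten_knowledge, max_age_days, now=None):
    """tombstones: dict of item id -> deletion timestamp (seconds).
    Drop tombstones older than the cutoff, and advance the forgotten
    knowledge to the cutoff so a peer presenting knowledge older than it
    knows it must do a full re-enumeration."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    survivors = {item: ts for item, ts in tombstones.items() if ts >= cutoff}
    new_fk = max(forgotten_knowledge, cutoff)  # only ever moves forward
    return survivors, new_fk

survivors, fk = cleanup_tombstones(
    {"a": 100.0, "b": 999_000.0},   # "a" deleted long ago, "b" recently
    forgotten_knowledge=0.0,
    max_age_days=7,
    now=1_000_000.0)
```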

     

    Hope this helps,

    Thanks,

    Neil

    Thursday, December 13, 2007 10:32 PM

All replies

  • Thanks for the response!

     

    So then, would this sample not be a good starting point for writing sync applications? Should I be using the file sync provider you mention?

     

    I'm working on building a local-to-remote file sync (one-way only) and had been using the sample as my starting point. I just ran into an issue where I can't pass the SyncKnowledge across the WCF boundary and was trying to figure out what to do next. Should I be looking somewhere else?

     

    Thanks!

    Saturday, December 15, 2007 2:29 AM