Need help understanding how framework handles batching

  • Question

  • I have another question/problem with batch processing.  My prototype is running Microsoft Sync Framework (MSF) 2.0 using code extracted from the WebSharingAppDemo-SqlProviderEndToEnd application, with WCF as the transport mechanism and SQL Server 2005 for the database.  I have two client databases and one server database, all running in the same instance of SQL Server 2005.  The entire prototype is running on my PC.

    What I am seeing is that when the number of records involved is small, everything works just fine.  However, if I create something like 10,000 records and then attempt to sync them, I encounter the same basic error:

    Sync thread for New_ProtoClient1 is starting...

    Exception: The process cannot access the file 'f25001cd-188b-4650-8926-6b980660740f.batch' because it is being used by another process.

    Sync thread for New_ProtoClient1 is completed.

    If I look at my batching directory, I can easily see that this file exists, but it also has a length of zero bytes.  This strikes me as odd and quite possibly incorrect.  I am currently using "default" batch settings from the demo application:

    Dim binding As New WSHttpBinding()
    binding.ReaderQuotas.MaxArrayLength = 100000
    binding.MaxReceivedMessageSize = 10485760
    Dim factory As New ChannelFactory(Of Contracts.WebSync.ISync)(binding, New EndpointAddress("http://localhost:8080/SyncService"))
    Me.proxy = CType(factory.CreateChannel(), Contracts.WebSync.ISync)

    Typically, what happens is that we process about 530 batches (which seems excessive for only 10,000 fairly small records) and then the error occurs and the sync terminates.

    Since both the client and the sync service applications are running on the same PC, is there some possibility that these two processes are colliding? So far, I don't see any evidence in my service code that the service is attempting to write a batch file of any kind (the 10,000 records are being synced from the server database to the client). However, if this is a problem I can move my server database and my sync service to another system.

    Is there something I should be reading that would explain this process better? In particular, how to tune for performance? What would happen if we disabled batching? I'm not sure that is a good idea for us. We have identified a worst-case sync scenario of about 500,000 total records to be synced, so I am definitely going to need to understand batching better.

    Thanks for any thoughts, suggestions, or ideas you can offer.

     

    Monday, April 12, 2010 7:46 PM

All replies

  • PuzzledBuckeye,

    I came across this problem a while back, but I don't remember exactly what I did to fix it. So I am going to guess: what follows is the change I made that I think most likely addressed it.

    First off, the sample code for writing to and reading from files is not "optimal". The .NET Framework cleans up objects non-deterministically (garbage collection), and the samples neither close their files and streams explicitly nor wrap them in "using" statements to close them implicitly. As a result, you can end up with files that are still open even though the variable has gone out of scope.

    I would suggest handling all of your file and stream I/O within "using" statements. That way the file or stream is closed exactly when you want it to be.

    It's my guess that this is how I fixed this problem. Sorry I cannot be more precise.

    Even if that is not the source of this, it's the right thing to do anyway.
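
    To illustrate the "Using" advice above, here is a minimal VB.NET sketch. It is not taken from the Sync Framework samples: the module and helper names (UsingSketch, WriteBatch, ReadBatch) and the file name are hypothetical, and only standard System.IO calls are used. The point is that End Using releases the file handle deterministically instead of leaving it to the garbage collector.

```vbnet
Imports System
Imports System.IO

Module UsingSketch
    ' Hypothetical helper: writes a payload to disk and closes the file deterministically.
    Sub WriteBatch(fileName As String, payload As Byte())
        ' The stream is disposed (and the handle released) at End Using,
        ' even if Write throws an exception.
        Using stream As New FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None)
            stream.Write(payload, 0, payload.Length)
        End Using
    End Sub

    ' Hypothetical helper: reads the whole file back into memory.
    Function ReadBatch(fileName As String) As Byte()
        Using stream As New FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read)
            Dim buffer(CInt(stream.Length) - 1) As Byte
            Dim offset As Integer = 0
            ' Read may return fewer bytes than requested, so loop until done.
            While offset < buffer.Length
                Dim count As Integer = stream.Read(buffer, offset, buffer.Length - offset)
                If count = 0 Then Exit While
                offset += count
            End While
            Return buffer
        End Using
    End Function

    Sub Main()
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "using-sketch.batch")
        WriteBatch(fileName, New Byte() {1, 2, 3})
        Dim roundTrip As Byte() = ReadBatch(fileName)
        Console.WriteLine("Read {0} bytes", roundTrip.Length)
        ' No handle is left open at this point, so the delete is not blocked by this process.
        File.Delete(fileName)
    End Sub
End Module
```

    If either helper omitted the Using block, the handle could stay open until finalization, which is one way to get a "being used by another process" error from your own process.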

    While on this sort of subject, you should also check all of your DirectoryInfo variables. In many of the samples you can get into situations where DirectoryInfo.Exists still returns true for something that no longer exists. This is because the value is cached when it is first read and is only updated when the Refresh method is called. So if you keep a global DirectoryInfo (as some examples do) and then delete what it points to, it will still claim to exist even though it doesn't!
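
    The stale Exists behaviour described above can be reproduced with a short VB.NET sketch. The directory name is hypothetical and nothing here comes from the Sync Framework samples; it only uses standard System.IO calls and assumes the temp folder is writable.

```vbnet
Imports System
Imports System.IO

Module StaleExistsSketch
    Sub Main()
        ' Hypothetical scratch directory standing in for a batching directory.
        Dim dirPath As String = Path.Combine(Path.GetTempPath(), "sync_batching_sketch")
        Directory.CreateDirectory(dirPath)

        Dim info As New DirectoryInfo(dirPath)
        ' The first read of Exists queries the file system and caches the result.
        Console.WriteLine("Before delete: {0}", info.Exists)

        Directory.Delete(dirPath)

        ' This reads the cached state, so it can still report True
        ' although the directory is gone.
        Console.WriteLine("After delete (cached): {0}", info.Exists)

        ' Refresh re-queries the file system and updates the cached state.
        info.Refresh()
        Console.WriteLine("After Refresh: {0}", info.Exists)

        ' The static method checks the file system on every call.
        Console.WriteLine("Directory.Exists: {0}", Directory.Exists(dirPath))
    End Sub
End Module
```

    For a long-lived DirectoryInfo, either call Refresh before testing Exists or use the static Directory.Exists check.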

    HTH

    Steve

    Monday, April 12, 2010 10:11 PM
  • Have you checked whether this is the same scenario? http://social.microsoft.com/Forums/en-US/syncdevdiscussions/thread/1dac63f5-e952-4bb7-aa70-45de61a605d4

    You may want to look up Steve's post on batching with blobs as well. If any single blob exceeds the batch size, the sync terminates.

    Also, with batching, it's only the transmission that is batched; change application is not batched.

    So if your sync has 10 batches in an upload, for example, all 10 batches have to be uploaded before anything is applied.

    Internally, changes are selected and stored in a DataSet, the DataSet is converted to a DataSetSurrogate, and the DataSetSurrogate is serialized to a file, which then gets sent. On the receiving side, the file is deserialized back to a DataSetSurrogate, the DataSetSurrogate is converted to a DataSet, and the changes are then applied.

    You can check out more details here: http://blogs.msdn.com/mahjayar/archive/2009/09/16/msf-v2-ctp2-deep-dive-memory-based-batching.aspx

    Monday, April 12, 2010 10:11 PM
    Moderator
  • Do you have any antivirus software installed? We have seen this error happen when the antivirus scanner is scanning the files and the sync runtime tries to delete them. If you have any AV installed, then please exclude the sync batching folders from its scan locations or configure it to ignore .batch files. For security's sake, I recommend the first approach of excluding the sync directory.

    If that's not the case, can you please post the complete stack trace of the exception? You might also enable verbose batching to see the error logs from the batching component.


    Maheshwar Jayaraman - http://blogs.msdn.com/mahjayar
    Tuesday, April 13, 2010 2:25 AM
    Moderator
  • We have a winner!!  I followed one of the links that JuneT posted and it mentioned the exact same problem.  In that posting, it was pointed out that there is a coding error in the example application (you might want to fix that sometime soon).  I applied the fix to my code and everything is now working.

    While I was poking around in there, I double-checked that I had all of my file accesses wrapped up inside of "Using" blocks as recommended by Steve.  I do recall seeing that earlier and fixing a couple of them, but I wanted to make sure I had them all fixed.

    Also, thanks for giving me a better description of how batch syncing works.  Somehow I missed that bit of information about all of the changes being downloaded first and then applied, but that could be really important in our application because of the way we deploy a new client.  Right now, a client installs the software which in turn creates an empty database, and then they load their entire database from the initial sync.  This will not be a problem for a new client site, but a client who is reloading their application on a new piece of hardware could get over 500,000 records.  Obviously, I now know I need to test this specific situation to see what happens.

    Again, thanks for the assistance; this whole process has been a real challenge but I am continuing to make progress.

     

    Tuesday, April 13, 2010 2:12 PM
  • While I was poking around...

    Somehow I missed that bit of information about all of the changes being downloaded first and then applied, but that could be really important in our application because of the way we deploy a new client...... a client who is reloading their application on a new piece of hardware could get over 500,000 records.  Obviously, I now know I need to test this specific situation to see what happens.


    Puzzled:

    Well, you will certainly find some interesting things happen with a 500,000 record sync if you are using WCF!

    You see, the server, having served up the 500,000 updates in a bunch of batches, will be sitting around doing nothing. Consequently, after the configured idle time (I forget the name of the value), the WCF session will cease to exist. Meanwhile, your client is busy applying changes.

    Some time later, your client gets done and issues an EndSession() call, only to find that an exception occurs since there is no longer a WCF session. The SyncOrchestrator doesn't serve up any stats in this case (even though all of your applies worked). So you must deal with this in the proxy's EndSession() handling. Simply expect a fault (try/catch), but otherwise ignore it (log it if you like). Then SyncOrchestrator is happy and you get your stats with the successful sync.
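
    The EndSession() handling described above might look roughly like the following VB.NET sketch. Everything named here (ISessionContract, ProxyWrapperSketch, ExpiredSessionFake) is a hypothetical stand-in, not the demo's actual proxy or service contract, and the code assumes a reference to the System.ServiceModel assembly. As an assumption beyond what the thread states, the idle time in question is probably the binding's ReceiveTimeout, or the InactivityTimeout when reliable sessions are enabled.

```vbnet
Imports System
Imports System.ServiceModel

' Hypothetical stand-in for the part of the service contract that ends a session.
Public Interface ISessionContract
    Sub EndSession()
End Interface

' Hypothetical wrapper playing the role of the client-side proxy.
Public Class ProxyWrapperSketch
    Private ReadOnly channel As ISessionContract

    Public Sub New(sessionChannel As ISessionContract)
        Me.channel = sessionChannel
    End Sub

    Public Sub EndSession()
        Try
            channel.EndSession()
        Catch ex As CommunicationException
            ' Expected when the server-side session expired while the client
            ' was still applying changes. Log it and carry on.
            Console.WriteLine("Ignoring EndSession fault: " & ex.Message)
            ' A real WCF channel in this state should be aborted, not closed.
            Dim comm As ICommunicationObject = TryCast(channel, ICommunicationObject)
            If comm IsNot Nothing Then comm.Abort()
        Catch ex As TimeoutException
            Console.WriteLine("Ignoring EndSession timeout: " & ex.Message)
        End Try
    End Sub
End Class

' Hypothetical fake that simulates a session the server has already dropped.
Public Class ExpiredSessionFake
    Implements ISessionContract

    Public Sub EndSession() Implements ISessionContract.EndSession
        Throw New CommunicationException("The session no longer exists.")
    End Sub
End Class

Module EndSessionSketch
    Sub Main()
        Dim wrapper As New ProxyWrapperSketch(New ExpiredSessionFake())
        ' The fault is logged inside the wrapper and does not propagate.
        wrapper.EndSession()
        Console.WriteLine("EndSession returned normally")
    End Sub
End Module
```

    Only the EndSession call is wrapped this way; faults from the other proxy calls should still propagate, since they indicate a sync that really failed.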

    And another gotcha. Let's say you set your batch size to 100k. If there is a single record in that 500,000 that is bigger than your batch size, then your session will fail, maybe after you have already downloaded 499,000 changes. The only current remedy is to increase the batch size to be bigger than your largest record. But that defeats the concept of batching. Ha, but I digress since I already complained, er, I mean posted about that one.

    HTH

    Steve

    Tuesday, April 13, 2010 6:42 PM
  • One clarification around the need to wait until we get the last batch. This is the way the runtime guarantees the integrity of the data for the scope across the two ends. We are actively investigating the need to commit batches on the destination as they arrive, but you have to realize that committing data in chunks may leave your database in an inconsistent state (for example, all orders have been committed but their corresponding order line items have not yet been committed) for a short period of time. For some apps this may be OK, but for others data integrity is more important than incremental/visible progress.

    The batching runtime also supports resuming an interrupted sync session. If your runtime crashes and sync is restarted, the runtime will reuse the batch files it spawned in the earlier failed session (caveat: certain conditions have to be satisfied for reuse). Please refer to the blog post http://blogs.msdn.com/mahjayar/archive/2009/11/16/msf-v2-deepdive-batching-directory-and-batch-files.aspx for more info on batch file reuse.

    Maheshwar


    Maheshwar Jayaraman - http://blogs.msdn.com/mahjayar
    Tuesday, April 13, 2010 6:49 PM
    Moderator