Another Batch Processing Issue

  • Question

  • I have a Sync Framework 2.0 solution running over WCF and IIS 6.0. It is essentially a hub-and-spoke topology. Sync is always controlled from the client. On the client is a SqlSyncProvider (to SQL Server 2005 Express) with a RelationalSyncProvider-based proxy. On the server side is a service with a SqlSyncProvider (to SQL Server 2008). A SyncOrchestrator sits in the middle, running on the client.

    Batching was all working fine, with a batch size of 500 at both client and server. Then we had some exceptions because a 2MB record would not fit in a 500KB batch. So the batch size was raised to 3000 on both ends and the problem went away.

    Well, what actually happened is that when we went to the 3000 batch size, batching stopped being used when uploading from client to server. We did not notice this at first because data in that direction is mostly light and apparently fit within the maximum WCF message length (with the myriad of settings that entails; they were all set to 6MB, including the HttpRuntime parameter).
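    For reference, the "myriad of settings" in question are typically along these lines. This is only a sketch of the standard WCF/ASP.NET configuration; the binding name is hypothetical, and the values shown are the 6MB limits described above:

    ```xml
    <!-- Sketch: the usual places a 6MB WCF message limit has to be set.
         Binding name "SyncHttpBinding" is hypothetical. -->
    <system.serviceModel>
      <bindings>
        <basicHttpBinding>
          <binding name="SyncHttpBinding" messageEncoding="Mtom"
                   maxReceivedMessageSize="6000000" maxBufferSize="6000000">
            <readerQuotas maxArrayLength="6000000" maxStringContentLength="6000000" />
          </binding>
        </basicHttpBinding>
      </bindings>
    </system.serviceModel>
    <system.web>
      <!-- ASP.NET request limit is specified in KB, so ~6MB -->
      <httpRuntime maxRequestLength="6000" />
    </system.web>
    ```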

    That was until we got an 18,000-record sync to upload. The changeset for this produced a message larger than 6MB, so we got WCF exceptions. Analyzing the problem, we discovered that upload batching was not working.

    By reducing the size of the batch, we found that we could get the batching to work again.

    Here are the results (C=client,S=server, yes=batching was used):

    C=4000,S=2103 -> no
    C=4000,S=2102 -> yes
    C=3000,S=2103 -> no
    C=3000,S=2102 -> yes
    C=2500,S=2103 -> no
    C=2103,S=2103 -> no
    C=2102,S=2103 -> yes
    C=2000,S=2103 -> yes
    C=1000,S=2103 -> yes
    

    So basically, given that the SyncOrchestrator picks the smaller of the two values, the numbers make sense, so long as you accept that there is a special significance to 2102. But why would I do that?

    So, after all of that preamble: what is special about 2102, and why is batching disabled when the smaller of the client and server values is larger than 2102?

    Steve

     

    Thursday, April 29, 2010 5:33 AM

Answers

  • Hi,

    If you change:

        C=4000,S=2103 -> no

    To

      C=4000,S=2104

    will Context.IsDataBatched still be false? If so, my guess is that your total data change is within the 2102-2103 KB range. When all of your changes can be contained in one batch, batching is not enabled, because the provider doesn't need to write your data to a batch file to control memory usage. Instead, the whole set of changes is sent in memory directly, just the same as in the non-batching case.
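    That decision can be sketched as a hypothetical helper (illustration only, not the actual provider internals; the ~110% allowance is the one mentioned elsewhere in this thread):

    ```csharp
    // Illustration only: the provider spools changes to batch files only when
    // the total changeset cannot be held within the batch size (in KB) plus
    // the ~10% allowance discussed in this thread. Otherwise the whole
    // changeset is sent in memory and IsDataBatched remains false.
    static bool WillBatch(double totalChangesKB, double batchSizeKB)
    {
        return totalChangesKB > batchSizeKB * 1.1;
    }
    ```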

     

    Thanks,
    Dong

     


    This posting is provided AS IS with no warranties, and confers no rights.
    • Marked as answer by Speedware Monday, May 10, 2010 8:23 PM
    Monday, May 10, 2010 8:00 PM
    Moderator

All replies

  • Hi Steve,

    I am assuming that in your tests above you used the same set of data, meaning the same number of rows (and the same sizes).

    In the tests where it didn't do batching (batch > 2102, which in your test is 2103), have you tried adding up the row sizes for all the rows in the change set to see if they are less than or equal to the batch size * 110% (i.e., 2103 * 1.1)?

     

     

    Thursday, April 29, 2010 10:48 AM
    Moderator
  • June,

    Same data. In fact, I have simply been breakpointing and stopping the test at the point in ProcessChangeBatch() on the client where you get an indication of the batching status for the changes to be uploaded to the server:

    Code in client:

    public override void ProcessChangeBatch(ConflictResolutionPolicy resolutionPolicy, ChangeBatch sourceChanges, object changeDataRetriever, SyncCallbacks syncCallbacks, SyncSessionStatistics sessionStatistics)
    {
        this.Log("ProcessChangeBatch called...");

        DbSyncContext context = changeDataRetriever as DbSyncContext;

        if (context != null && context.IsDataBatched)
        ...

    This is different from the situation you are thinking about with respect to the 110% of batch size. In that situation, the server side would throw an exception because a record would not fit in its batch size. In this situation, when batching does not occur, a changeset message containing the 18,382 inserts is actually sent over the WCF wire as a single non-batched transaction (and the remote end, i.e. the client end, throws an exception because it exceeds my 6MB WCF message limit).

    In other words, if I give the receiving side a maximum WCF message size of 10MB, this sync works. But that's beside the point, because I want batching. And if I set the value to 2102, I get it!

    I guess the paragraph above indirectly answers your question about the sum of the row sizes for all rows: more than 6MB and less than 10MB.

    So, there are two ways for me to get this particular change set to work:

    a) Have a maximum WCF message size of 10MB. I fully understand why this works, but it is not a solution because another changeset could be bigger than 10MB, and besides, I want batching.

    b) Have the selected batch size be less than 2103. This gets me my batches. But I don't understand why, so this is not a solution either.

    Steve

     p.s. There is reference code for this at http://msdn.microsoft.com/en-us/library/dd918908(SQL.105).aspx

    • Edited by Speedware Thursday, April 29, 2010 2:16 PM Put correct code snippet in post
    Thursday, April 29, 2010 1:49 PM
  • Hi,

    When you send changes to the other provider, the batches are generated on the client side. It's not related to the other provider's architecture, i.e., it should not matter whether the other provider is behind a WCF service or is just a local 2-tier provider. That means that if the data changes are exactly the same and the minimum batch size is the same, then if batching is not happening in n-tier, it will not happen in 2-tier either.

    How do you set the server batch size in the above scenarios? Have you checked that the proxy provider's GetSyncBatchParameters is really returning 2102 or 2103 for the batch size?

    After you set up batching (batch directory, batch size, and whether to clean up the batch directory when sync is done) on the sending client provider, you can check the batch directory to see whether batch files are generated when the breakpoint is hit at the proxy provider's ProcessChangeBatch, or use the batching events to verify.

    You could check the same for a 2-tier scenario as well; no breakpoint can be set there, but you can set cleanupBatchingDir to false and check whether the batch directory has any files, or use the batching events to verify.

    The data changes in all experiments should be enough for a change batch to be generated. You could generate 2-3 * batchSize of changes in all cases.
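    The event-based verification suggested above can be sketched as follows (assuming the Sync Framework 2.0 RelationalSyncProvider batching members; the scope name, connection, and directory are hypothetical):

    ```csharp
    // Sketch: configure batching on the sending provider and observe batch
    // files being spooled via the BatchSpooled event.
    SqlSyncProvider localProvider = new SqlSyncProvider("MyScope", sqlConnection);
    localProvider.MemoryDataCacheSize = 3000;              // batch size, in KB
    localProvider.BatchingDirectory = @"C:\SyncBatches";   // where batch files are written
    localProvider.CleanupBatchingDirectory = false;        // keep batch files for inspection

    localProvider.BatchSpooled += (sender, e) =>
        Console.WriteLine("Spooled batch file: " + e.BatchFileName);
    ```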

    Thanks. 

    Thursday, April 29, 2010 5:31 PM
    Answerer
  • Hello Jin,

    First, let me just say that what I reported is what actually happened, not what "should be enough for a change batch to be generated". So when I said no batch was generated, I meant that context.IsDataBatched was false (see the code snippet above). The results I reported were not theoretical ("should"); they were actual.

    How do I set the server batch size in the above scenarios? Provider.MemoryDataCacheSize = someValue;. As you correctly identified in another of my postings on batch processing (http://social.microsoft.com/Forums/en/syncdevdiscussions/thread/9aa6eae9-5ac5-404f-9e16-f7aca840b05a), the GetSyncBatchParameters() method, in its BatchSize parameter, returns the value set for Provider.MemoryDataCacheSize.

    So when I say that the client had value X, I meant that there is a statement Provider.MemoryDataCacheSize = X; ... and that debug reinforced this.

    Have I checked that the proxy provider's GetSyncBatchParameters is really returning 2102 or 2103? Simple answer: yes.

    When I set Provider.MemoryDataCacheSize to X, Provider.GetSyncBatchParameters() returns X. So when X is 2102, 2102 is returned. When X is 2103, then 2103 is returned. And so forth.
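    That round trip can be sketched like this (assuming the Sync Framework 2.0 KnowledgeSyncProvider.GetSyncBatchParameters signature; the scope name and connection are hypothetical):

    ```csharp
    // Sketch: MemoryDataCacheSize is what GetSyncBatchParameters returns
    // in its batchSize out-parameter.
    SqlSyncProvider provider = new SqlSyncProvider("MyScope", sqlConnection);
    provider.MemoryDataCacheSize = 2102;   // in KB

    uint batchSize;
    SyncKnowledge knowledge;
    provider.GetSyncBatchParameters(out batchSize, out knowledge);
    // batchSize is now expected to be 2102, mirroring MemoryDataCacheSize
    ```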

    Regarding your last paragraph, I agree with what should happen. The point of the post is that it does not.

    The original post closed with these questions:

    What is special about 2102 and why is batching being disabled when the smallest value of client and server is larger than 2102?

    Thanks

    Steve 

    Friday, April 30, 2010 4:50 AM
  • I have forwarded the reported issue to the relevant people for investigation.

    From my experience, I have created batches with sizes from a few KBs to dozens of MBs. With enough data changes to sync, I never saw batching disabled.

    We will contact you if we cannot reproduce this issue on our side.

    Thanks.

     

    Monday, May 3, 2010 4:41 PM
    Answerer
  • Hello Jin,

    Thank you for the update.

    I just want to stress that this is an uploading batching issue only (that is, the source provider is the one that decides that there should be no batching).

    I have 200,000+ record downloads containing 50MB of data that are successfully broken up into 2MB+ batch files when the batch size/MemoryDataCacheSize is set to 3000 at both ends.

    Here is an overview of the architecture.

    Sql2005 - SqlSyncProvider - SyncOrchestration - CustomRelationalSyncProvider(Proxy) - WCF - CustomServiceClass - SqlSyncProvider - Sql 2008

    Here is the setup that reproduces the problem for me:

    • Local Sql2005 has 3 tables with a sum of 18,000 updates.
    • Local SqlSyncProvider has MemoryDataCacheSize set to 3000.
    • Local SyncOrchestrator initiates an upload sync.
    • Remote Sql2008 has no records in the 3 tables.
    • Remote SqlSyncProvider has MemoryDataCacheSize set to 3000.
    • Scope has never been sync'ed before.
    • Local database was provisioned with SqlSyncScopeProvisioning.
    • Remote database was provisioned with SqlSyncScopeProvisioning.
    • During provisioning SetPopulateTrackingTableDefault was set to create (the default, but I thought I'd identify it for you).
    • SyncOrchestrator calls GetKnowledge() on proxy provider.
    • ProxyProvider, via WCF, calls GetKnowledge on remote service.
    • Remote service calls GetKnowledge() on the remote provider, which returns the MemoryDataCacheSize in the BatchSize parameter, namely 3000.
    • Remote service returns BatchSize, again 3000.
    • ProxyProvider receives BatchSize (3000) and returns it to the SyncOrchestrator.
    • SyncOrchestrator does whatever it does with the source provider to create the change set and ends up calling ProcessChangeBatch() on the proxy provider.
    • Proxy provider examines context.IsDataBatched (see the code snippet previously posted above). It is not batched, and by inspection the change batch is seen to contain the 18,000 inserts.

    Now, in the above, if the MemoryDataCacheSize at the remote provider is 2103 or above, then no batching occurs, as per the above. If it is 2102 or less, but not 0, batching does occur, and the first batch consists of about 6,000 records (IIRC). If it is 0, batching does not occur and there are 18,000 records in the change batch.

    The same results (of batching or no batching) can be achieved by modifying the value from 3000 to X (either in debug or simply by hard coded assignment) in the 3rd from last step, namely

  • ProxyProvider receives BatchSize (3000) and the value is modified to X and then the proxy returns it to the SyncOrchestrator.

    That is, it does not appear to be anything to do with the remote side of things.

    HTH

    Steve

Monday, May 3, 2010 5:40 PM
  • Speedware, if batching did not occur, then it means that all changes fit within 110% of the specified batch size. It seems that you have reason to believe that your 18,000 rows should not fit within 2103 (which in reality is 2213 KB based on the 110% rule). You can simply verify this by either adding more rows to your upload payload or enabling verbose logging to see the logs from the batch producer. It will tell you the size of the data it has and whether or not it batches it.
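    Enabling the Sync Framework's verbose tracing is done via a standard .NET trace switch in app.config; a sketch (the listener file path is hypothetical):

    ```xml
    <!-- Sketch: app.config fragment enabling verbose Sync Framework tracing.
         "SyncTracer" is the Sync Framework trace switch; 4 = Verbose, which
         includes the batch producer's data-size and batching-decision logs. -->
    <system.diagnostics>
      <switches>
        <add name="SyncTracer" value="4" />
      </switches>
      <trace autoflush="true">
        <listeners>
          <add name="SyncListener"
               type="System.Diagnostics.TextWriterTraceListener"
               initializeData="c:\SyncTrace.txt" />
        </listeners>
      </trace>
    </system.diagnostics>
    ```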
    Maheshwar Jayaraman - http://blogs.msdn.com/mahjayar
    Thursday, May 6, 2010 8:39 PM
    Moderator
  • Speedware, if batching did not occur, then it means that all changes fit within 110% of the specified batch size. It seems that you have reason to believe that your 18,000 rows should not fit within 2103 (which in reality is 2213 KB based on the 110% rule). You can simply verify this by either adding more rows to your upload payload or enabling verbose logging to see the logs from the batch producer. It will tell you the size of the data it has and whether or not it batches it.
    Maheshwar Jayaraman - http://blogs.msdn.com/mahjayar


    Mahjayar,

    Well, your comment here made me think to try some things, and I have got it to work. First, a recap of the settings I originally had in place:

    Client's MemoryDataCacheSize: 3000 (meaning approximately 3,400,000 bytes with the 10% allowance). Server's MemoryDataCacheSize: 3000. WCF maximums for message-size-related settings: 6,000,000.

    I had come up with these settings based on a single record that in one table is 2.2MB, and as I have previously posted, your batch size (aka MemoryDataCacheSize) needs to be bigger than that or else you will get a fault. So, as a consequence, I had set MemoryDataCacheSize to 3000, i.e. a bit bigger. Originally I had wanted 500KB batch sizes, but this is forced on me.

    Having the 3.4MB maximum batch, I decided that 6MB would be enough for serializing it. An arbitrary guess, really. This had worked on syncs of 200,000 records from server to client, including the 2.2MB record; typically about 50MB of data broken into 2-3MB batch files.

    So, now to the problem again. As previously reported, setting the MemoryDataCacheSize to 2102 got me batching and messages that all fit within the 6MB, so it worked. Setting it to 2103 or larger got me no batching. What can be concluded from your comments is that my changeset size is approximately 2213KB, or 2,266,112 bytes.

    The problem actually turns out to be one of maximum WCF message size. You see, the 6MB was not enough. Previously, I had increased it to 8MB and then 10MB unsuccessfully. I concluded at the time that this must have been adequate and looked for a problem within the sync batching. Not so. I then increased the maximum WCF message size to 60MB, and the sync was successful.

    This means that the approximately 2213KB batch was creating a WCF message north of 10MB and south of 60MB. It turns out it is between 12MB and 16MB.

    In summary, the batch increased roughly six-fold in size over WCF.

    Regarding your comments about enabling diagnostic tracing and message logging (at least, that's what I assume you mean), there are several problems with using it to determine what message size is being produced.

    First, there does not appear to be any explicit identification of the length of the message in any of the trace records. Perhaps you can post an image of a record in a trace that shows this, so I know where to look.

    Second, long messages are not logged because they exceed the maximum length to log (as controlled by maxSizeOfMessageToLog, which defaults to 256KB). If maxSizeOfMessageToLog is set to, say, 20MB, the diagnostic system fails with a System.OutOfMemoryException. The same happens if you set it to -1, which means no length limit for logging. So, even when enabled to do so, the trace facility will not log the 12-16MB message. Perhaps you can help me understand how to log a message of this size.

    So... the short of all of this is that it's not a batching issue; it's a WCF issue, in that WCF generates a ridiculously large message. Which is another matter :-)

    Steve

    Sunday, May 9, 2010 2:48 AM
  • Using MTOM encoding could reduce the size of large messages sent on the wire. You can try it and see whether it helps in your scenario.

    Monday, May 10, 2010 5:15 PM
    Answerer
  • Using MTOM encoding could reduce the size of large messages sent on the wire. You can try it and see whether it helps in your scenario.


    Jin,

    We're already using MTOM.

    I think this is simply the nature of the beast, meaning using serialized XML output to represent the various objects. In this particular change set there are 18,000 changes, so there must be a lot of overhead to represent the changed dataset, even with MTOM specified. For our biggest changeset going in the other direction (server to client), we have a single 2.2MB record containing a few small columns and one large one, so the overhead on that is relatively small compared with the actual data.

    Steve

    Monday, May 10, 2010 5:50 PM
  • Try the binary message encoder if interoperability is not an issue.
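    For HTTP transports, switching to the binary encoder means using a customBinding, since basicHttpBinding only offers Text/Mtom. A sketch (the binding name is hypothetical; the 60MB limit mirrors the value discussed above):

    ```xml
    <!-- Sketch: binary message encoding over HTTP via a customBinding. -->
    <bindings>
      <customBinding>
        <binding name="BinarySyncBinding">
          <binaryMessageEncoding />
          <httpTransport maxReceivedMessageSize="62914560" maxBufferSize="62914560" />
        </binding>
      </customBinding>
    </bindings>
    ```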
    jandeepc
    Monday, May 10, 2010 6:00 PM
  • Try the binary message encoder if interoperability is not an issue.
    jandeepc


    Jandeep,

    That's something I will look at because interoperability is not an issue (and if it were, we could have a separate endpoint/binding for that situation).

    Thanks

    Steve

    Monday, May 10, 2010 7:26 PM
  • Hi,

    If you change:

        C=4000,S=2103 -> no

    To

      C=4000,S=2104

    will Context.IsDataBatched still be false? If so, my guess is that your total data change is within the 2102-2103 KB range. When all of your changes can be contained in one batch, batching is not enabled, because the provider doesn't need to write your data to a batch file to control memory usage. Instead, the whole set of changes is sent in memory directly, just the same as in the non-batching case.

     

    Thanks,
    Dong

     


    This posting is provided AS IS with no warranties, and confers no rights.


    Dong,

    Yes, any value above 2102 results in no batching (i.e. Context.IsDataBatched == false). Really, the problem here all along was that the resulting WCF message ended up being huge. In my testing, the first thing I did was change the maximum size of an incoming WCF message from 6MB to 8MB and then 10MB, without success. At that point I concluded (incorrectly, it turns out) that there must be something wrong with batching, as it seemed wrong that setting a batch size of about 3MB resulted in no batching while a message larger than 10MB was being generated. At that point I created this thread.

    In retrospect, this aspect of batching works as advertised; the big kicker was the expansion in size of the WCF message. As reported above, in this particular case it is somewhere above 12MB and below 16MB for a changeset that is a little bigger than 2MB.

    I will be looking into the BinaryMessageEncoder to reduce the size of the messages. Also, there is a CodeProject article at http://www.codeproject.com/KB/WCF/CompactMessageEncoder.aspx which talks about how to compress the WCF data stream. This looks like it can fit into SyncFx over WCF, since it is block-oriented.

    Thanks

    Steve

    Monday, May 10, 2010 8:22 PM