Batching with SQL 2008 Change Tracking and SQL CE

  • General discussion

  • Hi All,

    I'm hoping some of you sync gurus out there can assist me with what I'm viewing as an architectural deficiency in the Sync Framework.

    I'm syncing a SQL Server 2008 database to a SQL CE database running on a mobile client. The server DB uses change tracking to track modifications. The sync is working properly (albeit slowly, as others have already pointed out in previous posts), but I've run into a serious issue regarding batching.

    Let me explain the situation. We are syncing approximately 30 tables between client and server. For the most part, the row count for each table is quite small and requires no batching (< 1,000 rows per table). However, there is one table on the server that contains many rows (1.2 million or so). Each client requires a filtered subset of this table (the current maximum is somewhere around 30,000 rows per client). On an initial sync, somewhere around the 10,000-record mark the client throws an OutOfMemory exception, so we have to batch the inserts to the client.

    I've been studying the "sample" implementation on MSDN (the usp_GetNewBatchAnchor proc), which outlines a method for batching. It doesn't work with SQL 2008 Change Tracking as written, but that's OK; I've modified it to work fine for my purposes. That said, I'm having a very hard time figuring out how this can work without producing hundreds of empty batches in every initial sync for this one very large table. Here's the scenario:
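    A minimal sketch of such an adaptation (parameter names follow the MSDN sample's convention; this is illustrative, not my exact proc, and the clamping/rounding details may differ):

    ```sql
    CREATE PROCEDURE usp_GetNewBatchAnchor
        @sync_last_received_anchor bigint,
        @sync_batch_size           bigint,
        @sync_max_received_anchor  bigint OUTPUT,
        @sync_new_received_anchor  bigint OUTPUT,
        @sync_batch_count          int    OUTPUT
    AS
    BEGIN
        -- Upper bound for the whole sync session: the database-wide
        -- change tracking version (instead of the sample's rowversion).
        IF @sync_max_received_anchor IS NULL
            SET @sync_max_received_anchor = CHANGE_TRACKING_CURRENT_VERSION();

        -- Initial sync: start from version 0.
        IF @sync_last_received_anchor IS NULL
            SET @sync_last_received_anchor = 0;

        -- Advance the anchor by one batch, clamped to the max.
        SET @sync_new_received_anchor = @sync_last_received_anchor + @sync_batch_size;
        IF @sync_new_received_anchor > @sync_max_received_anchor
            SET @sync_new_received_anchor = @sync_max_received_anchor;

        -- Batch count = remaining version span / batch size, rounded up.
        IF @sync_batch_count IS NULL OR @sync_batch_count <= 0
            SET @sync_batch_count =
                CEILING((@sync_max_received_anchor - @sync_last_received_anchor)
                        / CAST(@sync_batch_size AS float));
    END
    ```

    Note that the batch count here is driven purely by the span of database-wide version numbers, which is the root of the problem described below.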

    Assuming a starting Server database Change Tracking (CT) version of 0:

    1. Server inserts 1.2 million rows (assume 1 version used per row) via an external process. CT version is now 1.2 million.

    2. Client begins its initial sync. It requires only 30,000 of the 1.2 million rows. However, the rows it requires were not inserted into the server table in sequential order, so their version numbers are not contiguous; they are spread all over the version range.

    3. Assume a batch size of 5,000. The GetNewBatchAnchor proc simply determines the starting version for the client and increments it by BatchSize. The BatchCount is basically the difference between the min and max version numbers divided by BatchSize.

    Because there are 1.2 million distinct versions, there will be 240 batches (!): (1,200,000 (current version) - 0 (starting version)) / 5,000 = 240.

    4. Because the subset required by the client is spread throughout the server table, each batch can contain anywhere between 0 and 5,000 rows, but a likely count is somewhere between 20 and 100. This, of course, is because the SelectIncrementalInserts for the SyncTable is basically saying "get all data that was inserted between @Last_Received_Anchor and @New_Received_Anchor AND (@Additional Filters)". There might well be 5,000 rows that meet the anchor requirements, but these are pared way down by (@Additional Filters).
    I can't arbitrarily raise the BatchSize, because it's entirely possible that there ARE 5,000 legitimate changes for a particular client in a given batch.
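    To make the empty-batch problem concrete, the batched SelectIncrementalInserts has roughly this shape (table, key, and filter names here are placeholders, not our actual schema):

    ```sql
    -- Illustrative only: dbo.BigTable, Id, and ClientId are placeholder names.
    SELECT t.*
    FROM dbo.BigTable AS t
    JOIN CHANGETABLE(CHANGES dbo.BigTable, @sync_last_received_anchor) AS ct
        ON ct.Id = t.Id                      -- join on the primary key
    WHERE ct.SYS_CHANGE_VERSION <= @sync_new_received_anchor
      AND ct.SYS_CHANGE_OPERATION = 'I'      -- inserts only
      AND t.ClientId = @ClientId;            -- the additional filter predicate
    ```

    The anchor range can contain up to 5,000 changes database-wide, but the filter predicate discards most of them, which is why a typical batch comes back with only a handful of rows.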

    5. Result: synchronization still works, but takes hours. This is mostly because joining the change-tracking output of a table (CHANGETABLE) back to the table itself is a very expensive query when the server table has lots of rows. Each batch fetch can take anywhere from 10 to 30 seconds.

    So I'm left wracking my brain trying to work out another solution within the provided Sync Framework. Is there any way to get batching to work using a scheme separate from database-wide version stamps?
    If not, is there any way for the SelectNewAnchorCommand to take the current client into account and determine BatchCounts more appropriately for that particular client, rather than having the batch system rely on database-wide version stamps?

    Surely I can't be the only one out there struggling with this particular issue?

    Thanks in advance for any assistance.


    • Moved by Hengzhe Li Friday, April 22, 2011 3:14 AM (From:SyncFx - Microsoft Sync Framework Database Providers [ReadOnly])
    Tuesday, May 19, 2009 6:48 PM

All replies