none
Problem with batching - always misses last batch if more than one RRS feed

  • Question

  • I have an existing solution that enables PDAs (WM 6.x) to sync with a server (WCF/SQL 200x) using the Sync Framework. Currently the database has change tracking columns based on DateTimes (SyncCreatedDate & SyncUpdatedDate) and companion Tombstone tables to track deletes (SyncDeletedDate); the tables are hooked up to the Tombstone tables with triggers to ensure the change tracking is updated. This solution works fine but hits OutOfMemory issues on the devices when the volume of rows to sync is too large.

    I am trying to implement batching to solve the OOM issues. I have removed the tracking columns from the main tables and replaced the tombstone tables with change tracking tables, which contain all three tracking columns (SyncCreatedTS, SyncUpdatedTS, SyncDeletedTS) now as BigInts (for Timestamps) rather than DateTimes; again, a set of triggers ensures that the change tracking is updated accordingly.

    My sync client connects up and syncs no problem, so long as the number of changes is less than or equal to the BatchSize. However, if the number of changes exceeds the batch size, then all changes except those in the last batch get synced across. A subsequent sync is then necessary to get the last batch of changes across.

    So, for example, if there are 128 changes to sync across, and BatchSize is 50, then batches 1 & 2 will get synced then the session will end. Batch 3 doesn't get requested, and no errors are generated at either end.

    I suspect the issue is within the stored proc used for SelectNewAnchorCommand. I took it from the following article:

    http://msdn.microsoft.com/en-us/library/bb902828.aspx

    I've trawled through the threads on Batching in this forum and can't find anyone with a similar problem.

    BTW, I don't want to use the built-in Change Tracking in SQL 2008, as some of our customers are using SQL 2005 and I may even have to implement the database in Oracle in the future.

    Can anyone offer any help?

    Many thanks in advance, PaulM.

     

    Thursday, September 16, 2010 1:05 PM

Answers

  • As is often the way, I got this working within 5 minutes of posting this thread!

    I reworked the GetNewBatchAnchor stored procedure to this:

    CREATE PROCEDURE [dbo].[spSync_GetNewBatchAnchor]
    
    	@sync_last_received_anchor	bigint,
    	@sync_batch_size			int,
    	@sync_max_received_anchor	bigint out,
    	@sync_new_received_anchor	bigint out,
    	@sync_batch_count			int output
    
    AS
    
    BEGIN
    	SET NOCOUNT ON
    	
    	--NOTES: Based upon the example at: http://msdn.microsoft.com/en-us/library/bb902828.aspx
    	
    	--Set a default batch count if a valid one is not passed in
    	If (@sync_batch_count IS NULL)
    		SET @sync_batch_count = 0
    	
    	--Set a default batch size if a valid one is not passed in
    	IF (@sync_batch_size IS NULL OR @sync_batch_size <= 0)
    		SET @sync_batch_size = 100
    	
    	--Before selecting the first batch of changes,
    	--set the maximum anchor value for this synchronization session.
    	--After the first time that this procedure is called, 
    	--Synchronization Services passes a value for @sync_max_received_anchor
    	--to the procedure. Batches of changes are synchronized until this 
    	--value is reached.
    	IF (ISNULL(@sync_max_received_anchor,0) <= 0)
    		SET @sync_max_received_anchor = CAST(MIN_ACTIVE_ROWVERSION() AS bigint) - 1
    	
    	--If this is the first synchronization session for a database,
    	--get the lowest timestamp value from the tables. By default,
    	--Synchronization Services uses a value of 0 for @sync_last_received_anchor
    	--on the first synchronization. If you do not set @sync_last_received_anchor,
    	--this can cause empty batches to be downloaded until the lowest
    	--timestamp value is reached.
    	IF (ISNULL(@sync_last_received_anchor,0) <= 0)
    		SELECT @sync_last_received_anchor = MIN(MinTimestamp) FROM dbo.fSync_GetMinTimestampAllSyncTables()
    	
    	--Check there are any records to sync
    	IF (@sync_last_received_anchor IS NULL)
    		RETURN 0
    	
    	--Determine @sync_new_received_anchor
    	SET @sync_new_received_anchor = @sync_last_received_anchor + @sync_batch_size
    	IF (@sync_new_received_anchor > @sync_max_received_anchor)
    		SET @sync_new_received_anchor = @sync_max_received_anchor
    	
    	--Determine how many batches are required to get the sync client up to date
    	If (@sync_batch_count = 0)
    		SET @sync_batch_count = CAST(CEILING(CAST((@sync_max_received_anchor - @sync_last_received_anchor) AS float) / @sync_batch_size) AS int)
    	
    	SET NOCOUNT OFF
    END
    

     ...which uses a separate function fSync_GetMinTimestampAllSyncTables to get the min row versions from all Change Tracking tables:

    CREATE FUNCTION [dbo].[fSync_GetMinTimestampAllSyncTables]()
    	
    	RETURNS @tblTimestamps TABLE (Rec_ID int IDENTITY(1,1) PRIMARY KEY, TableName nvarchar(100), MinTimestamp bigint)
    AS
    
    --THIS FUNCTION WILL CREATE A TABLE OF MIN TIMESTAMPS TAKEN FROM ALL SYNC TABLES
    --(INSERT OR UPDATE TIMESTAMPS ONLY - NOT DELETE TIMESTAMPS)
    
    BEGIN
    	--Account_Sync
    	INSERT INTO @tblTimestamps (TableName, MinTimestamp)
    		SELECT N'Account', MIN(TimestampCol) FROM (
    			SELECT MIN(SyncCreatedTS) AS TimestampCol FROM Account_Sync WHERE SyncDeletedTS IS NULL
    			UNION
    			SELECT MIN(SyncUpdatedTS) AS TimestampCol FROM Account_Sync WHERE SyncDeletedTS IS NULL
    		) MinTimestamp
    	
    	--PLUS OTHERS HERE
    	--...
    	
    	RETURN
    END
    

    ...and now all batches are synced.

    I've yet to expand this to include more than one synced table, but will post here if there are any further issues.

    PaulM

     

     

    • Edited by P.Mc Thursday, September 16, 2010 3:23 PM Typo
    • Marked as answer by JuneTModerator Thursday, September 16, 2010 3:25 PM
    Thursday, September 16, 2010 2:30 PM

All replies

  • As is often the way, I got this working within 5 minutes of posting this thread!

    I reworked the GetNewBatchAnchor stored procedure to this:

    CREATE PROCEDURE [dbo].[spSync_GetNewBatchAnchor]
    
    	@sync_last_received_anchor	bigint,
    	@sync_batch_size			int,
    	@sync_max_received_anchor	bigint out,
    	@sync_new_received_anchor	bigint out,
    	@sync_batch_count			int output
    
    AS
    
    BEGIN
    	SET NOCOUNT ON
    	
    	--NOTES: Based upon the example at: http://msdn.microsoft.com/en-us/library/bb902828.aspx
    	
    	--Set a default batch count if a valid one is not passed in
    	If (@sync_batch_count IS NULL)
    		SET @sync_batch_count = 0
    	
    	--Set a default batch size if a valid one is not passed in
    	IF (@sync_batch_size IS NULL OR @sync_batch_size <= 0)
    		SET @sync_batch_size = 100
    	
    	--Before selecting the first batch of changes,
    	--set the maximum anchor value for this synchronization session.
    	--After the first time that this procedure is called, 
    	--Synchronization Services passes a value for @sync_max_received_anchor
    	--to the procedure. Batches of changes are synchronized until this 
    	--value is reached.
    	IF (ISNULL(@sync_max_received_anchor,0) <= 0)
    		SET @sync_max_received_anchor = CAST(MIN_ACTIVE_ROWVERSION() AS bigint) - 1
    	
    	--If this is the first synchronization session for a database,
    	--get the lowest timestamp value from the tables. By default,
    	--Synchronization Services uses a value of 0 for @sync_last_received_anchor
    	--on the first synchronization. If you do not set @sync_last_received_anchor,
    	--this can cause empty batches to be downloaded until the lowest
    	--timestamp value is reached.
    	IF (ISNULL(@sync_last_received_anchor,0) <= 0)
    		SELECT @sync_last_received_anchor = MIN(MinTimestamp) FROM dbo.fSync_GetMinTimestampAllSyncTables()
    	
    	--Check there are any records to sync
    	IF (@sync_last_received_anchor IS NULL)
    		RETURN 0
    	
    	--Determine @sync_new_received_anchor
    	SET @sync_new_received_anchor = @sync_last_received_anchor + @sync_batch_size
    	IF (@sync_new_received_anchor > @sync_max_received_anchor)
    		SET @sync_new_received_anchor = @sync_max_received_anchor
    	
    	--Determine how many batches are required to get the sync client up to date
    	If (@sync_batch_count = 0)
    		SET @sync_batch_count = CAST(CEILING(CAST((@sync_max_received_anchor - @sync_last_received_anchor) AS float) / @sync_batch_size) AS int)
    	
    	SET NOCOUNT OFF
    END
    

     ...which uses a separate function fSync_GetMinTimestampAllSyncTables to get the min row versions from all Change Tracking tables:

    CREATE FUNCTION [dbo].[fSync_GetMinTimestampAllSyncTables]()
    	
    	RETURNS @tblTimestamps TABLE (Rec_ID int IDENTITY(1,1) PRIMARY KEY, TableName nvarchar(100), MinTimestamp bigint)
    AS
    
    --THIS FUNCTION WILL CREATE A TABLE OF MIN TIMESTAMPS TAKEN FROM ALL SYNC TABLES
    --(INSERT OR UPDATE TIMESTAMPS ONLY - NOT DELETE TIMESTAMPS)
    
    BEGIN
    	--Account_Sync
    	INSERT INTO @tblTimestamps (TableName, MinTimestamp)
    		SELECT N'Account', MIN(TimestampCol) FROM (
    			SELECT MIN(SyncCreatedTS) AS TimestampCol FROM Account_Sync WHERE SyncDeletedTS IS NULL
    			UNION
    			SELECT MIN(SyncUpdatedTS) AS TimestampCol FROM Account_Sync WHERE SyncDeletedTS IS NULL
    		) MinTimestamp
    	
    	--PLUS OTHERS HERE
    	--...
    	
    	RETURN
    END
    

    ...and now all batches are synced.

    I've yet to expand this to include more than one synced table, but will post here if there are any further issues.

    PaulM

     

     

    • Edited by P.Mc Thursday, September 16, 2010 3:23 PM Typo
    • Marked as answer by JuneTModerator Thursday, September 16, 2010 3:25 PM
    Thursday, September 16, 2010 2:30 PM
  • care to post your SelectNewAnchorCommand?
    Thursday, September 16, 2010 2:31 PM
    Moderator
  • The SelectNewAnchorCommand code is pretty much the standard from the above article:

    Private Function GetSelectNewAnchorCommand() As SqlCommand
    
    	Dim oCmd As SqlCommand = Nothing
    
    	Try
    		'Set up SQL Command
    		oCmd = New SqlCommand()
    		With oCmd
    			.CommandText = "spSync_GetNewBatchAnchor"
    			.CommandType = CommandType.StoredProcedure
    			.Parameters.Add(New SqlParameter("@" & SyncSession.SyncLastReceivedAnchor, SqlDbType.BigInt)).Direction = ParameterDirection.Input
    			.Parameters.Add(New SqlParameter("@" & SyncSession.SyncBatchSize, SqlDbType.Int)).Direction = ParameterDirection.Input
    			.Parameters.Add(New SqlParameter("@" & SyncSession.SyncMaxReceivedAnchor, SqlDbType.BigInt)).Direction = ParameterDirection.Output
    			.Parameters.Add(New SqlParameter("@" & SyncSession.SyncNewReceivedAnchor, SqlDbType.BigInt)).Direction = ParameterDirection.Output
    			.Parameters.Add(New SqlParameter("@" & SyncSession.SyncBatchCount, SqlDbType.Int)).Direction = ParameterDirection.InputOutput
    		End With
    
    		Return oCmd
    
    	Catch ex As Exception
    		Throw ex
    
    	Finally
    		oCmd = Nothing
    
    	End Try
    
    End Function
    

    ...but with BigInt rather than Timestamp (was getting a CASTing exception until I changed the SqlDbTypes to BigInt).

    Thursday, September 16, 2010 2:47 PM
  • Actually, if it's any further help to anyone, here's an example of the tables and triggers used:

    Tables:

    CREATE TABLE [dbo].[Account](
    	[Account_Code] [nvarchar](5) NOT NULL,
    	[Account_Desc] [nvarchar](30) NULL,
    	[Account_Last_Update] [datetime] NULL,
     CONSTRAINT [Account_PKey] PRIMARY KEY CLUSTERED 
    (
    	[Account_Code] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    
    GO
    
    CREATE TABLE [dbo].[Account_Sync](
    	[Account_Code] [nvarchar](5) NOT NULL,
    	[SyncCreatedTS] [bigint] NOT NULL,
    	[SyncUpdatedTS] [bigint] NULL,
    	[SyncDeletedTS] [bigint] NULL,
    	[RowVersion] [timestamp] NOT NULL,
     CONSTRAINT [PK_Account_Sync] PRIMARY KEY CLUSTERED 
    (
    	[Account_Code] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    

    Insert Trigger:

    CREATE TRIGGER [dbo].[trigAccount_Insert_Sync] ON [dbo].[Account]
    AFTER INSERT
    AS
    BEGIN
    	--<object id="trigAccount_Insert_Sync" version="2.0.0" date="2010-09-13" />
    	
    	SET NOCOUNT ON
    	
    	--CLEAR ANY EXISTING RECORDS FROM SYNC TABLE FOR THOSE INSERTED
    	DELETE
    		Account_Sync
    	
    	FROM
    		Account_Sync
    		INNER JOIN INSERTED ON Account_Sync.Account_Code = INSERTED.Account_Code
    	
    	--INSERT RECORD INTO SYNC TABLE FOR EACH INSERTED ROW
    	DECLARE @Rec_Count int
    	SELECT @Rec_Count = COUNT(*) FROM INSERTED
    	
    	IF (@Rec_Count = 1)
    	BEGIN
    		--SINGLE INSERT
    		INSERT INTO
    			Account_Sync (
    				Account_Code,
    				SyncCreatedTS,
    				SyncUpdatedTS,
    				SyncDeletedTS)
    			
    		SELECT
    			Account_Code,
    			CAST(@@DBTS + 1 AS bigint),
    			NULL,
    			NULL
    		
    		FROM
    			INSERTED
    	END
    	ELSE
    	BEGIN
    		--MULTIPLE INSERTS
    		DECLARE @tbl TABLE (Rec_ID int IDENTITY(1,1) PRIMARY KEY, Account_Code nvarchar(5))
    		INSERT INTO @tbl (Account_Code) SELECT Account_Code FROM INSERTED
    		
    		DECLARE @Rec_ID int
    		DECLARE @Account_Code nvarchar(5)
    		
    		SET @Rec_ID = 1
    		
    		WHILE @Rec_ID <= @Rec_Count
    		BEGIN
    			SELECT @Account_Code = T.Account_Code
    			FROM @tbl AS T
    			WHERE T.Rec_ID = @Rec_ID
    			
    			INSERT INTO
    				Account_Sync (
    					Account_Code,
    					SyncCreatedTS,
    					SyncUpdatedTS,
    					SyncDeletedTS)
    					
    			VALUES
    				(@Account_Code,
    				CAST(@@DBTS + 1 AS bigint),
    				NULL,
    				NULL)
    			
    			SET @Rec_ID = @Rec_ID + 1
    		END
    	END
    	
    	SET NOCOUNT OFF
    END
    

    Update Trigger:

    CREATE TRIGGER [dbo].[trigAccount_Update_Sync] ON [dbo].[Account]
    AFTER UPDATE
    AS
    BEGIN
    	--<object id="trigAccount_Update_Sync" version="2.0.0" date="2010-09-16" />
    	
    	SET NOCOUNT ON
    	
    	--UPDATE CORRESPONDING RECORD IN SYNC TABLE FOR EACH UPDATED ROW
    	DECLARE @Rec_Count int
    	SELECT @Rec_Count = COUNT(*) FROM INSERTED
    	
    	IF (@Rec_Count = 1)
    	BEGIN
    		--SINGLE UPDATE
    		UPDATE
    			Account_Sync
    		
    		SET
    			SyncUpdatedTS = CAST(@@DBTS + 1 AS bigint)
    			
    		FROM
    			Account_Sync
    			INNER JOIN INSERTED ON Account_Sync.Account_Code = INSERTED.Account_Code
    	END
    	ELSE
    	BEGIN
    		--MULTIPLE UPDATES
    		DECLARE @tbl TABLE (Rec_ID int IDENTITY(1,1) PRIMARY KEY, Account_Code nvarchar(5))
    		INSERT INTO @tbl (Account_Code) SELECT Account_Code FROM INSERTED
    		
    		DECLARE @Rec_ID int
    		DECLARE @Account_Code nvarchar(5)
    		
    		SET @Rec_ID = 1
    		
    		WHILE @Rec_ID <= @Rec_Count
    		BEGIN
    			SELECT @Account_Code = T.Account_Code
    			FROM @tbl AS T
    			WHERE T.Rec_ID = @Rec_ID
    			
    			UPDATE
    				Account_Sync
    			
    			SET
    				SyncUpdatedTS = CAST(@@DBTS + 1 AS bigint)
    					
    			WHERE
    				Account_Code = @Account_Code
    			
    			SET @Rec_ID = @Rec_ID + 1
    		END
    	END
    	
    	SET NOCOUNT OFF
    END
    

    Delete Trigger:

    CREATE TRIGGER [dbo].[trigAccount_Delete_Sync] ON [dbo].[Account]
    AFTER DELETE
    AS
    BEGIN
    	--<object id="trigAccount_Delete_Sync" version="2.0." date="2010-09-16" />
    	
    	SET NOCOUNT ON
    	
    	--UPDATE CORRESPONDING RECORD IN SYNC TABLE FOR EACH DELETED ROW
    	DECLARE @Rec_Count int
    	SELECT @Rec_Count = COUNT(*) FROM DELETED
    	
    	IF (@Rec_Count = 1)
    	BEGIN
    		--SINGLE DELETE
    		UPDATE
    			Account_Sync
    		
    		SET
    			SyncDeletedTS = CAST(@@DBTS + 1 AS bigint)
    			
    		FROM
    			Account_Sync
    			INNER JOIN DELETED ON Account_Sync.Account_Code = DELETED.Account_Code
    	END
    	ELSE
    	BEGIN
    		--UPDATE CORRESPONDING RECORD IN SYNC TABLE FOR EACH UPDATED ROW
    		DECLARE @tbl TABLE (Rec_ID int IDENTITY(1,1) PRIMARY KEY, Account_Code nvarchar(5))
    		INSERT INTO @tbl (Account_Code) SELECT Account_Code FROM DELETED
    		
    		DECLARE @Rec_ID int
    		DECLARE @Account_Code nvarchar(5)
    		
    		SET @Rec_ID = 1
    		
    		WHILE @Rec_ID <= @Rec_Count
    		BEGIN
    			SELECT @Account_Code = T.Account_Code
    			FROM @tbl AS T
    			WHERE T.Rec_ID = @Rec_ID
    			
    			UPDATE
    				Account_Sync
    			
    			SET
    				SyncDeletedTS = CAST(@@DBTS + 1 AS bigint)
    					
    			WHERE
    				Account_Code = @Account_Code
    			
    			SET @Rec_ID = @Rec_ID + 1
    		END
    	END
    		
    	SET NOCOUNT OFF
    END
    
    

    Note: At first glance, it may seem a long-winded approach to loop through the Inserts/Updates/Deleted to update each row in the corresponding Sync table, but this way each Insert/Update/Delete gets a new Timestamp, rather than batches of them all getting the same value of @@DBTS+1.

    Hope that makes sense!

    PaulM

     

    Thursday, September 16, 2010 3:22 PM
  • Since I posted this, I've made many improvements to the above code, and additional SQL objects to support full batch syncing.

    If anyone finds this post and wants some assistance, feel free to contact me and I'll see if I can help (workload permitting).

     

    Friday, October 22, 2010 3:22 PM