Batching updates with SQL2005 & stored procedures

  • Question

  • I have an existing solution that includes Sync Services between an SQL2005 server and SQL/CE on a mobile device.

    The current implementation adds two columns to each of the server tables for date created and date updated (both datetime), supported by tombstone tables containing the primary key and a datetime to manage the deletion of records, similar to the following example...

    CREATE TABLE [dbo].[users](
         [id] [int] IDENTITY(1,1) NOT NULL,
         [code] [varchar](30) NOT NULL,
         [password] [varchar](30) NOT NULL,
         [name] [varchar](60) NOT NULL,
         [sync_date_created] [datetime] NOT NULL,
         [sync_date_updated] [datetime] NULL,
         CONSTRAINT [pk_user_profiles] PRIMARY KEY CLUSTERED ([code] ASC) ON [PRIMARY]) ON [PRIMARY]
        
    ALTER TABLE [dbo].[users] ADD  CONSTRAINT [DF_users_sync_date_created]  DEFAULT (getutcdate()) FOR [sync_date_created]
        
    CREATE TABLE [dbo].[users_tombstone](
         [id] [int] NOT NULL,
         [sync_date_deleted] [datetime] NOT NULL,
         CONSTRAINT [pk_users_tombstone] PRIMARY KEY CLUSTERED ([sync_date_deleted] ASC, [id] ASC) ON [PRIMARY]) ON [PRIMARY]
        
    ALTER TABLE [dbo].[users_tombstone] ADD  CONSTRAINT [df_users_tombstone_sync_date_deleted]  DEFAULT (getutcdate()) FOR [sync_date_deleted]

    The updates and deletes are then managed using triggers...

    ALTER TRIGGER [dbo].[users_update]
        on [dbo].[users]
        for update
        as
        
         -- stamp sync_date_updated only when a tracked column actually changed
         update s
         set sync_date_updated = getutcdate()
         from dbo.users s
          inner join inserted i on s.id = i.id
          inner join deleted d on i.id = d.id
         where i.code != d.code
            or i.password != d.password
            or i.name != d.name
        
        
    ALTER TRIGGER [dbo].[users_delete]
        on [dbo].[users]
        for delete
        as
        
         insert into dbo.users_tombstone (id)
         select id
         from deleted
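
    To make the flow concrete, here is a quick illustration of what the defaults and triggers do (the values are made up):

     insert into dbo.users (code, password, name)
     values ('jdoe', 'secret', 'John Doe')
     -- the default stamps sync_date_created with getutcdate()
     
     update dbo.users set name = 'Jane Doe' where code = 'jdoe'
     -- users_update fires and stamps sync_date_updated because [name] changed
     
     delete from dbo.users where code = 'jdoe'
     -- users_delete copies the row's id into users_tombstone, where the
     -- default stamps sync_date_deleted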

    Inserts, updates and deletes are then selected using stored procedures similar to the following...

    ALTER PROCEDURE [dbo].[UserSyncAdapter_SelectIncrementalInserts]
         @sync_last_received_anchor datetime,
         @sync_new_received_anchor datetime,
         @sync_initialized bit
         
        AS
        
         set nocount on
         
         select us.id, us.code as un, us.password as pw, us.name as nm
         from dbo.users us
         where us.sync_date_created > @sync_last_received_anchor
           and us.sync_date_created <= @sync_new_received_anchor 
         order by us.id
         
         return @@error
        
        
    ALTER PROCEDURE [dbo].[UserSyncAdapter_SelectIncrementalUpdates]
         @sync_last_received_anchor datetime,
         @sync_new_received_anchor datetime,
         @sync_initialized bit
        
        AS
         
         set nocount on
         
         select us.id, us.code as un, us.password as pw, us.name as nm
         from dbo.users us
         where @sync_initialized = 1
           and us.sync_date_created <= @sync_last_received_anchor 
           and us.sync_date_updated > @sync_last_received_anchor 
           and us.sync_date_updated <= @sync_new_received_anchor 
         order by us.id
         
         return @@error
        
        
    ALTER PROCEDURE [dbo].[UserSyncAdapter_SelectIncrementalDeletes]
         @sync_last_received_anchor datetime,
         @sync_new_received_anchor datetime,
         @sync_initialized bit
         
        AS
        
         set nocount on
         
         select t.id
         from dbo.users_tombstone t
         where @sync_initialized = 1
           and t.sync_date_deleted > @sync_last_received_anchor 
           and t.sync_date_deleted <= @sync_new_received_anchor 
         order by t.id
         
         return @@error

    I recently (read: today) had a situation where 91k+ updates were applied to the server database (all in the same table, as it happens) and had to be pushed out to the devices. Although I have grouped the tables, those 91k updates all fell into the same group, so the device ran out of memory while deserialising the change set.

    My question is this: I would like to implement batching of updates to the devices whilst maintaining the groups that are currently used. I have looked at various documentation for batching, but none of the examples cover the use of SyncGroups or of datetime columns to retrieve the necessary updates.

    Can anybody please point me to any better examples of how to use batching with SQL2005, stored procedures and SyncGroups or provide some examples of their own as to how I might introduce this without having to completely replace the current implementation?

    I should also point out that there are actually about 25-30 tables being synchronised, in groups containing between 5 and 10 tables at a time. Some of the tables can contain in excess of 100k rows. The update of 91k rows was an unexpected occurrence - usually only a few hundred to a couple of thousand rows change at a time - but if it has happened once...

    Thanks in advance for all or any help.

    Monday, March 22, 2010 9:59 PM

All replies

  • I think you can still follow the steps here: http://msdn.microsoft.com/en-us/library/bb902828(SQL.105).aspx

    Just replace all references to timestamps with your datetime columns.

    Since you have multiple sync groups, you can pass the sync_group_name session variable to your "get new batch anchor" stored procedure, check which group is being synced, and from that decide which tables' datetime columns to use when working out the batch, along the lines of the sketch below.
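
    For illustration, a minimal sketch of such an anchor procedure, adapted from the MSDN sample's parameter shape; the group name, the minute-based window and the simplified batch-count maths are all my own assumptions:

     CREATE PROCEDURE [dbo].[usp_GetNewBatchAnchor]
      @sync_group_name nvarchar(100),
      @sync_last_received_anchor datetime,
      @sync_batch_size int,                        -- here: minutes per batch
      @sync_max_received_anchor datetime output,
      @sync_new_received_anchor datetime output,
      @sync_batch_count int output
     AS
      set nocount on
      
      -- high-water mark across the tables of the group being synchronised
      -- (the users group is shown as an example branch)
      if @sync_group_name = N'UsersGroup'
       select @sync_max_received_anchor = max(d)
       from (select max(sync_date_created) as d from dbo.users
             union all
             select max(sync_date_updated) from dbo.users
             union all
             select max(sync_date_deleted) from dbo.users_tombstone) t
      -- else if @sync_group_name = N'...'         -- one branch per group
      
      if @sync_last_received_anchor is null
       set @sync_last_received_anchor = '19000101'
      
      -- advance the anchor by one fixed window per batch
      set @sync_new_received_anchor = dateadd(minute, @sync_batch_size, @sync_last_received_anchor)
      if @sync_new_received_anchor > @sync_max_received_anchor
       set @sync_new_received_anchor = @sync_max_received_anchor
      
      if @sync_batch_count <= 0
       set @sync_batch_count =
        ceiling(datediff(minute, @sync_last_received_anchor, @sync_max_received_anchor)
                / cast(@sync_batch_size as float))

    Note that a window like this bounds elapsed time, not row count, which matters later in this thread.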

     

    Monday, March 22, 2010 10:35 PM
  • Unfortunately, the SP you refer to will not work with the datetime columns due to the way that the timestamp columns are being manipulated in the following lines...

    -- Determine how many batches are required during the initial synchronization.
    IF @sync_batch_count <= 0
    BEGIN
      SET @sync_batch_count = ((@sync_max_received_anchor / @sync_batch_size) - (@sync_last_received_anchor / @sync_batch_size))
    END

    and

    -- Determine how many batches are required during subsequent synchronizations.
    IF @sync_batch_count <= 0
    BEGIN
      SET @sync_batch_count = ((@sync_max_received_anchor / @sync_batch_size) - (@sync_new_received_anchor / @sync_batch_size)) + 1
    END
    Thanks.
    Tuesday, March 23, 2010 10:43 AM
  • You'll have to convert the datetime values to a timestamp-style number so that the division works...

    See Yunwen's sample here: http://social.microsoft.com/Forums/en/uklaunch2007ado.net/thread/138a6931-08cc-4c9f-8cfb-cbf544567146
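
    The core trick can be sketched like this (SQL 2005, so no declare-with-initialiser; the epoch, the one-hour window and the variable names are illustrative, not taken from Yunwen's post):

     declare @epoch datetime
     declare @batch_size bigint
     declare @last_anchor datetime
     declare @last_ticks bigint
     declare @max_ticks bigint
     declare @batch_count int
     
     set @epoch = '20000101'             -- arbitrary fixed base date
     set @batch_size = 3600              -- one batch per hour of change history
     set @last_anchor = '20100322 21:59:00'
     
     -- map the datetimes onto a linear integer scale (seconds since the epoch)...
     set @last_ticks = datediff(second, @epoch, @last_anchor)
     set @max_ticks = datediff(second, @epoch, getutcdate())
     
     -- ...so that the sample's integer batch arithmetic becomes valid
     set @batch_count = (@max_ticks / @batch_size) - (@last_ticks / @batch_size)
     select @batch_count as sync_batch_count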

    Tuesday, March 23, 2010 10:49 AM
  • JuneT,

    Thanks for the link. Just so I can be sure: reading through the code, it appears that batches are effectively generated by grouping changes into one-second intervals. Is this correct?

    If that is the case, the example cannot help me: the 91k updates that were put through were all on the same table and all part of a single batch update, so the datetime values recording when the updates took place will be very close together across all rows, and I am therefore still likely to end up with extreme batch sizes.

    If I am reading things right, it looks like I may have to redesign the server layer to use timestamp and bigint columns instead of datetime - especially since the article you pointed to basically says that my way of doing it sucks and the right way is timestamp/bigint/min_active_rowversion etc.
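
    For reference, the timestamp/bigint pattern that article describes would look roughly like this against my users table (a sketch only - the column names are invented, and min_active_rowversion needs a recent SQL 2005 service pack if I remember rightly):

     -- bumped automatically by SQL Server on every insert and update
     alter table dbo.users add sync_update_timestamp timestamp
     
     -- captures the database rowversion at insert time, so inserts and
     -- updates can be told apart without the datetime columns
     alter table dbo.users add sync_insert_timestamp bigint
      default (cast(@@dbts as bigint) + 1)
     
     -- new anchor for a sync session, taken as the lowest version not
     -- still in use by an active transaction:
     -- select @sync_new_received_anchor = min_active_rowversion() - 1

    Batch anchors would then divide cleanly, because they are plain integers rather than datetimes.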

    Martin.

    Tuesday, March 23, 2010 11:28 AM
  • JuneT,

    Another question: in the original article you refer to on MSDN (the one using timestamp columns), there seems to be nothing taken into account for the deletes (only the insert and update columns are referenced). Is this an oversight?

    Additionally, will I need to modify the stored procedures used to retrieve the inserts/updates/deletes, or will this then be handled automatically by the min/max anchors? (The example code uses sync adapter builders.)

    Thanks again.

    Martin.

    Tuesday, March 23, 2010 11:38 AM
    Hi Martin,

    Yes, the sample doesn't include the deletes. You can cover them by including the tombstone tables when computing the anchor.

    If you're retaining datetime for change tracking, then you don't have to change the SelectIncremental SPs, since they already filter rows against the anchor range.

    If you're moving to timestamps, then you'll have to modify the SelectIncremental SPs to compare using the timestamp columns instead of the datetime ones, for example:
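
    A rough sketch, assuming the anchors become bigints and using the invented timestamp column names from earlier in the thread:

     ALTER PROCEDURE [dbo].[UserSyncAdapter_SelectIncrementalUpdates]
      @sync_last_received_anchor bigint,
      @sync_new_received_anchor bigint,
      @sync_initialized bit
     AS
      set nocount on
      
      select us.id, us.code as un, us.password as pw, us.name as nm
      from dbo.users us
      where @sync_initialized = 1
        -- inserted before the last sync...
        and us.sync_insert_timestamp <= @sync_last_received_anchor
        -- ...but modified inside the current anchor window
        and us.sync_update_timestamp > @sync_last_received_anchor
        and us.sync_update_timestamp <= @sync_new_received_anchor
      order by us.id
      
      return @@error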

    Have you considered moving to Sync Framework 2.0 and using the collaboration providers instead? V2.0 includes support for batching based on memory size.

    Tuesday, March 23, 2010 9:59 PM
    Have you considered moving to Sync Framework 2.0 and using the collaboration providers instead? V2.0 includes support for batching based on memory size.

    I was under the impression that SyncFX 2.0 didn't support devices?! That is of course until the seemingly-elusive CTP supporting devices becomes available? ;-)

    Regards,

    Charles R. Tomlinson | Senior Consultant/Developer | Alocet Inc.

     

    Tuesday, April 6, 2010 8:45 PM
  • I was under the impression that SyncFX 2.0 didn't support devices?! That is of course until the seemingly-elusive CTP supporting devices becomes available? ;-)

    I stand corrected :)

    The device support still uses v1.

    Thursday, April 8, 2010 11:14 AM