locked
One of my databases has gone out of sync - how to repair RRS feed

  • Question

  • Hey, 

     

    I am using the sync framework to sync three SQL Azure database. A master database, and two slave databases which contain a subset of the data in the master. It's a one way sync from master to slaves. No data is ever inserted into the slaves. 

     

    At some point in the last few days, one of my slaves has gone out of sync. There are rows in tableA in slave1 that were deleted from the master, and successfully deleted by the sync framework from slave2 ... but not slave1. 

     

    I'd like to understand how and why this occurred, and find a way to ensure my databases are back in sync with each other. I could manually delete the problem rows I am aware of, but it's a massive database so there's now way I'll be able to track down all other consistencies with certainty!

     

    Thanks, 

     

    Steven

    Sunday, September 11, 2011 10:37 PM

All replies

  • I've investigated this further. The problem is ongoing ... the vast majority of rows that are deleted from tableA in the master are deleted from slave1, but around 4 each minute are *not* deleted. I can't see any pattern.

     

    TableA does not have any foreign keys. It's a very simple table with no constraints. My application that does the syncing does not report any errors, the statistics show 100% success every time it syncs. 

     

    Any ideas how I can even begin to debug this? I'm using SQL Azure so I don't have access to profiler etc.

     

    Thanks, 

     

    Steven

     

    Sunday, September 11, 2011 11:52 PM
  • you can enable sync framework tracing on your sync application.
    Monday, September 12, 2011 12:00 AM
  • Thanks. I didn't know you could enable tracing for the sync framework. I'll do that now and see if I can find any clues. 
    Monday, September 12, 2011 2:49 AM
  • Ok, so I've been working on this for a couple of days now. Tracing did not show any problems, but I have narrowed down the issue. I will explain with an example:

     

    I have a table called Locations which stores the coordinates of a user. The user ID is not the key, but a user should only ever have 1 location, so on inserting a location, I check for an existing location, if it exists I delete it then insert the new one. 

     

    I have since modified the stored procedure that does this so that instead of deleting, it updates if exists and inserts if it doesn't. 

     

    I cleared all duplicate keys from my slave (the duplicates should never have existed, this is the crux of the problem. I did not delete any rows that existed in the master). I figured this should be safe since the deleted rows were already gone from the master and would never ever return, or be updated etc.

     

    Much to my surprise, even though no Locations are now being deleted, I'm still receiving duplicates in the slave.

     

    I recorded the IDs of one of the duplicates, let's say 1111 and 1112 in the slave. I checked the master database, only 1112 exists. 

     

    I deleted 1111 and 1112 from the slave, and *shock* they both re-appeared *even though 1111 does not exist in the master*!

     

    So the sync is re-inserting rows that have long vanished from the master, even if I manually delete them from the slave. 

     

    I figure something needs to be cleaned up in the tracking, but I have no idea what. Any suggestions?

     

    Steven 

    Tuesday, September 13, 2011 4:24 PM
  • sync framework does not keep a copy of the rows it's tracking. the only information in the _tracking table from the original row in the base table is the primary key and the filter columns. so if a row from the base table is deleted, there is no way for it to recover that row.

    my guess is that you may have done some operations on the base table that has kept it out of sync with the tracking tables. the selectchanges sp does a right join between the base table and the tracking table (side) and there may be "phantom" records in your tracking table.

     

    Wednesday, September 14, 2011 1:04 AM
  • Hi June, 

     

    Can you give me some examples of the kind of operations I could have done to the base table that would cause it be out of sync with the tracking table? I thought I was able to insert, update, delete rows in my base tables just like normal, with no restrictions, and those changes would sync. I have never, ever touched any of the tracking tables only my base tables. 

     

    I ended up giving up on the problem, deleting my slave database, deprovisioning the master and doing a cleanup, then starting from scratch. Everything is working fine now but I still don't feel safe i.e. I think it could kick in again at any point. 

     

    The *only* thing I've noticed that could be responsible is that when I tried to backup my database using the Sql Azure Migration Wizard, it alerted me that all the sync batch stored procedures were trying to use "CHANGE TABLE" which is not supported in Azure. Could this be the cause? With my new setup, I have told it explicitly not to use batch procedures. 

     

    Thanks,

     

    Steven

    Thursday, September 15, 2011 5:19 PM
  • bulk operations would cause the tracking table to be out of sync with base tables. e.g., BULK INSERT, BCP, TRUNCATE

    Tuesday, September 20, 2011 2:56 AM