How does the SQL Server 2008 sync framework handle schema changes?

  • Question

  • I have been looking at converting an existing, "hand-crafted" sync framework for our application to MS Sync Framework 2.0, in the hope that this would solve some problems we are having with our current implementation and perhaps reduce maintenance effort.  Our central server is currently SQL Server 2005 and our client systems are running SQL Server 2005 Express.  I finally got my sync prototype working: after I upgraded SQL Server 2005 Express to SP3 and turned on the replication services capability, things started working as expected.

    As part of my prototyping effort, I decided to see what would happen if I altered my table to add a new column and then populated that column with some data.  Our system is constantly adding new capabilities and this is something that we do nearly every release.  So this question is pretty important to us.

    I would not expect the sync framework to propagate schema changes from our central database to our client databases.  In fact, as part of a typical application upgrade, we download and apply a database patch file and then the user syncs his or her database to the central database.  The sync operation would then be expected to populate any new columns with the changed data.

    So, I set up my prototype to mimic this situation.  I added a new column to my table in my central database and populated some records with data (and left others NULL).  On my client database, I added the new column and then ran the sync.

    I can't say I was surprised when nothing happened.  So I decided to start digging around into the framework components that were added to my database to support the synch operation.

    When the sync framework is applied to a database, a lot of stuff happens.  Many of you already know this, but it is helpful to summarize it.  Two new 'global' tables are added.  In addition, for each table added to the sync configuration, a new "tracking" table, three new triggers, and eight stored procedures are added.

    If one looks at the stored procedures generated by the framework, it becomes pretty obvious what happened.  Four of the stored procedures generated by the framework are essentially "hand coded" to work with the version of the table that existed when the stored procedures were originally generated:  <tablename>_insert, <tablename>_update, <tablename>_selectchanges, and <tablename>_selectrow.  The other stored procedures, triggers, and tracking table do not appear to be affected by the schema change.

    As far as I can tell, the only way to get my new column recognized by the sync framework would be to add the column to these four stored procedures (or remove it from them if I were deleting the column).  This is doable, but tedious.  It means that we would have to patch and deliver four more stored procedures for each table.  The "good" news is that we are already doing something similar in our current application.  However, one of the things we want to do is reduce the amount of hand-coding we have to do for each release, and this solution is actually worse than our current solution.

    In fact, if I think about it a bit, this solution could be much worse, because if I add a new table to the schema, I now have to add the tracking table, the three triggers, and the eight stored procedures for that table to the database patch file.  I have no idea how to update the various sync descriptors and filters that might be impacted.  Unless there is something I'm not seeing (I am new to the sync framework), it produces an initial framework that does not easily evolve along with the database.  Maybe there is an easier way that I just haven't seen yet.

    So I decided to dig around a bit more and discovered that SQL Server 2008 allegedly has a much better and less "hand-crafted" way of keeping track of database changes.  I very much like the idea of having the database do the "grunt work" of maintaining and tracking changes.  In fact, I could probably "sell" the idea of upgrading to SQL Server 2008 (the initial upgrade would probably be painful) if such an upgrade would greatly improve our synchronization process.

    Thus my question:  how does the SQL Server 2008 sync framework handle this scenario?  If I add (or remove) a column in a table, or add or remove a table in the schema of our central database, what would happen when the client databases are synced?  Even better, are there any actual examples I can look at?

    I apologize for the length of the post, but data synchronization is becoming a top priority for our development efforts.  We need a solution that is reliable, easy to implement, and easy to maintain.  Right now, I believe this framework satisfies the first two requirements, but it does not appear to produce an easily maintainable system.
    Friday, February 12, 2010 5:22 PM

Answers

  • Hi,

    As you realized, Sync Framework does not automatically propagate schema changes.  Please take a look at these articles: Working with Table Schema and Deploying and Versioning Applications, which specifically talk about how to deal with new columns and new tables.  Please note that some specifics in these articles apply to the offline providers, and from your description you are using the collaboration providers, so you would need to tweak your steps a bit, but the gist is the same.
    This posting is provided AS IS with no warranties, and confers no rights
    Sunday, February 14, 2010 7:42 AM

All replies

  • Hi,

    Currently we have the same situation you had.  I came across this answer, but I think the change proposed for the offline scenario cannot simply be applied to the collaboration scenario as well (the description in MSDN is also really vague in my opinion).

    Did you manage to update the synchronization scope without reinitialising all clients on a schema change (add/remove column, add/remove table)? If yes - how?

    Friday, July 2, 2010 8:50 AM
  • I can give you a short answer, and then refer you to another posting I made on another thread.

    I have actually implemented a sync system in our application that completely avoids using the static adapters, etc., generated by the sync provisioner classes.  My basic technique is to use a sync provider factory class to custom-build sync providers, and a table adapter factory to build the table adapters used by those providers.  These are all built dynamically based on the table's current schema: when building a table adapter, the code queries the database for the current layout of the table and uses that to build the adapter.  Then one or more table adapters and a scope name are used to build the sync provider.

    My response at the end of this thread provides some code examples of what I did:

    http://social.msdn.microsoft.com/Forums/en-US/syncdevdiscussions/thread/4af321b6-4678-4620-af46-98c560cc2bc6
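
    As a rough illustration only (this is not the code from the linked thread), the table adapter factory idea boils down to reading the table's current columns from the catalog and generating the adapter's commands from that list, so a newly added column is picked up automatically.  The sketch below assumes the DbSyncProvider/DbSyncAdapter style of provider, invents the TableAdapterFactory and BuildAdapter names, shows only the insert command, and keeps the type mapping deliberately crude; the real wiring is in the linked post.

    // Sketch only -- not the code from the linked thread.  Builds a DbSyncAdapter
    // whose insert command is generated from the table's current columns, so the
    // adapter always matches the live schema.
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using Microsoft.Synchronization.Data;

    public static class TableAdapterFactory
    {
        public static DbSyncAdapter BuildAdapter(string tableName, string keyColumn, SqlConnection conn)
        {
            // Read the table's current layout from the catalog.
            List<string> columns = new List<string>();
            using (SqlCommand cmd = new SqlCommand(
                "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS " +
                "WHERE TABLE_NAME = @t ORDER BY ORDINAL_POSITION", conn))
            {
                cmd.Parameters.AddWithValue("@t", tableName);
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                        columns.Add(reader.GetString(0));
                }
                conn.Close();
            }

            DbSyncAdapter adapter = new DbSyncAdapter(tableName);
            adapter.RowIdColumns.Add(keyColumn);

            // Generate the insert command text from the live column list.
            string columnList = string.Join(", ", columns.ToArray());
            string paramList = string.Join(", ", columns.ConvertAll(c => "@" + c).ToArray());
            SqlCommand insert = new SqlCommand(
                string.Format("INSERT INTO [{0}] ({1}) VALUES ({2})", tableName, columnList, paramList),
                conn);
            foreach (string c in columns)
            {
                // Real code would map INFORMATION_SCHEMA DATA_TYPE values to the proper
                // SqlDbType; sql_variant is used here only to keep the sketch short.
                insert.Parameters.Add("@" + c, SqlDbType.Variant);
            }
            adapter.InsertCommand = insert;

            // UpdateCommand, DeleteCommand and the metadata commands are built the
            // same way from the same column list (see the linked thread).
            return adapter;
        }
    }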

    The reason this works is because every time a user runs the client application, it automatically checks to see if the client application needs to update.  This automatic update also includes changes to the database (table modifications, stored procedures, etc).  This automatic update keeps the two database schemas in sync.  Without it, this technique would fail.

    The other thing I did that turned out to be really useful is that I built a "sync configuration tool" that builds all of the sync framework objects in the database.  This includes the tracking table, the tombstone table (we're currently on SQL Server 2005), the triggers, and the eight stored procedures required by the sync framework.  I also gave it a "wizard" that lets me easily build a database script containing as many sync components as I need for as many tables as I need.  This way, when we modify a table or add a new table, I can easily generate the sync framework database code.  Without this tool, "hand coding" or attempting to reuse the code generated by the sync provisioners would have been painfully tedious.

    Another technique I use is to dynamically create scope_info entries.  Our system organises data by "Site" and each site has its own unique "Site ID" which is a Guid or uniqueidentifier.  The majority of our row level filtering is by "Site ID" and we frequently add new Sites (our application has become pretty successful and visible in our company so we are constantly adding new sites and users).  Rather than "patch" the scope_info table every time we add a new site, we simply add a new record to the scope_info table in our client and server databases.  To do this, I just insert a new scope_info record with a name like:

    WorkOrders_bySiteID_xxxxxxx-xxxx-xxxx-xxxxxxx

    Where all of those x's are the Guid of the site.  In fact, whenever we start a sync, the client sends the scope name over to the server, and the server will automatically insert a new scope_info record for that name if it doesn't already have one.  When the record is inserted, the default field values seem to be enough to get the sync framework started and it seems to work just fine.
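
    As a sketch of that idea (the scope_info column names used here, scope_id and scope_name, are only assumptions and must match whatever the scope_info table actually looks like in your database), the server-side check amounts to:

    // Sketch only: makes sure a per-site scope row exists before the sync starts.
    // The scope_info column names below are assumptions, not part of any Sync Framework API.
    public static void EnsureScopeExists(string scopeName, SqlConnection conn)
    {
        conn.Open();
        try
        {
            using (SqlCommand check = new SqlCommand(
                "SELECT COUNT(*) FROM scope_info WHERE scope_name = @name", conn))
            {
                check.Parameters.AddWithValue("@name", scopeName);
                if ((int)check.ExecuteScalar() > 0)
                    return;   // the scope is already known on this replica
            }

            // Insert the new row and let the remaining columns take their defaults,
            // which (per the description above) is enough to bootstrap the sync.
            using (SqlCommand insert = new SqlCommand(
                "INSERT INTO scope_info (scope_id, scope_name) VALUES (@id, @name)", conn))
            {
                insert.Parameters.AddWithValue("@id", Guid.NewGuid());
                insert.Parameters.AddWithValue("@name", scopeName);
                insert.ExecuteNonQuery();
            }
        }
        finally
        {
            conn.Close();
        }
    }

    A caller builds the name first, for example "WorkOrders_bySiteID_" + siteId.ToString(), and the client passes it over at the start of the sync.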

    Another technique I use is that I have added my own scope_configuration table to replace the scope_config table provided by the sync framework.  This table contains a scope_name, the list of tables to be synced under that scope name, and the order those tables must be synced in.  It doesn't worry about row level filtering.  Thus, it has entries for the "WorkOrders" scope name and--in our current configuration--lists 15 tables that must be included in the sync.

    One of the major reasons for having this scope_configuration table is to allow us to do some sync configuration without necessarily having to "patch" the database.  If we deploy the sync and discover we made an error, perhaps we missed a table or got the order wrong, we can correct it by editing this table.  The scope_configuration table is always the first table we sync whenever the client database is synced.
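
    For illustration (the scope_configuration table and its columns scope_name, table_name, and sync_order are this design's own invention, not part of Sync Framework), reading the configuration that drives a sync can be as simple as:

    // Sketch only: returns the tables for a scope in the order they must be synced.
    // Table and column names are assumptions based on the description above.
    public static List<string> GetTablesForScope(string scopeName, SqlConnection conn)
    {
        List<string> tables = new List<string>();
        using (SqlCommand cmd = new SqlCommand(
            "SELECT table_name FROM scope_configuration " +
            "WHERE scope_name = @scope ORDER BY sync_order", conn))
        {
            cmd.Parameters.AddWithValue("@scope", scopeName);
            conn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    tables.Add(reader.GetString(0));
            }
            conn.Close();
        }
        return tables;
    }

    The returned list is then handed to the provider factory described earlier, which keeps the "which tables, and in what order" decision in data rather than in code.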

    If we add new tables to the database or change an existing table, then the procedure to include the changes is pretty straightforward.  First, run the sync configuration wizard to generate the patch to the database.  Second, check the scope_configuration table to make sure the table is listed (add it if it is a new table) and make sure it is in the right order.  If the change is the addition of a new table and the new table needs to support row level filtering, i.e., we need to filter by "Site ID", then I have to write the corresponding "selectchanges_bySiteID" stored procedure, but that is the only thing I have to do by hand.  Also, since we are managing our own tombstone table, any alterations to a table have to be duplicated there (unless we are willing to lose tombstone data).

    As I said in the other posting I linked, there is nothing terribly difficult about any of this.  However, it is tedious and would be very difficult to do if I had to write everything by hand (we sync 80 tables, so that would mean creating two tables, three triggers, and 8 or 9 stored procedures for every table to be synced).  So the right tool really makes all the difference.

    Once I got everything in place (and got my WCF interface working, but that is another very painful tale), the sync has been worth the effort.  Our new sync framework runs about 40 times faster than our original solution (we knew it was bad; we didn't know it was that bad) and we should be deploying it to our users in a few weeks as soon as we finish our deployment and support planning.

    Good luck to you and I hope this helps you with your development efforts.  If you have other questions, be sure to post them here.  The sync developers and sync users are pretty good about answering them (unlike the WCF forums).

     

    Wednesday, July 7, 2010 5:00 PM
  • Hi,

    I have a problem with db schema changes.  When a column in a table on the server db is updated or deleted, the change is reflected on the client without any problem.  However, if a new column is added I get the following error:

    SQL exception "Invalid Column Name", Procedure tableName_bulkinsert.

    The steps I am following to handle schema changes are:

    On the server db, whenever the schema is changed, I delete the sync framework related objects (stored procedures, triggers, and scope information) and provision it again using the following code:

    SqlSyncProvider provider = new SqlSyncProvider();
    provider.ScopeName = tableName;
    provider.Connection = new SqlConnection(connectionString);
    provider.CommandTimeout = 600;

    //create a new scope description and add the appropriate tables to this scope
    DbSyncScopeDescription scopeDesc = new DbSyncScopeDescription(tableName);

    //add the appropriate tables to this scope
    scopeDesc.Tables.Add(SqlSyncDescriptionBuilder.GetDescriptionForTable(tableName, (System.Data.SqlClient.SqlConnection)provider.Connection));

    //class to be used to provision the scope defined above
    SqlSyncScopeProvisioning serverConfig = new SqlSyncScopeProvisioning((System.Data.SqlClient.SqlConnection)provider.Connection);

    //determine if this scope already exists on the server and, if not, go ahead and provision it.
    //Note that provisioning of the server is oftentimes a design-time scenario and not something
    //that would be exposed in a client-side app, as it requires DDL permissions on the server.
    //However, it is demonstrated here for purposes of completeness.
    if (!serverConfig.ScopeExists(tableName))
    {
        serverConfig.CommandTimeout = 600;

        //note that it is important to call this after the tables have been added to the scope
        serverConfig.PopulateFromScopeDescription(scopeDesc);

        //indicate that the base table already exists and does not need to be created
        serverConfig.SetCreateTableDefault(DbSyncCreationOption.Skip);

        SqlSyncStoreMetadataUpgrade syncStoreMetadataUpgrade = new SqlSyncStoreMetadataUpgrade((System.Data.SqlClient.SqlConnection)provider.Connection);
        syncStoreMetadataUpgrade.UpgradeSyncSchema();

        //provision the server
        serverConfig.Apply();
    }

    On the client side, whenever I find there is a schema mismatch between the client and the server db, I delete that table and the related objects, remove the entries from the scope tables, and execute the code below:

    SqlSyncScopeProvisioning sqlConfig = new SqlSyncScopeProvisioning(clientSQLConnection);

    //if the scope does not exist in this store
    if (!sqlConfig.ScopeExists(scopeName))
    {
        //retrieve the scope description from the server
        DbSyncScopeDescription scopeDesc = remoteProxy.GetScopeDescription();

        //use the scope description from the server to initialize the client
        sqlConfig.PopulateFromScopeDescription(scopeDesc);

        SqlSyncStoreMetadataUpgrade syncStoreMetadataUpgrade = new SqlSyncStoreMetadataUpgrade(clientSQLConnection);
        syncStoreMetadataUpgrade.UpgradeSyncSchema();

        sqlConfig.Apply();
    }

    

    At that point I get the SQL exception: "Invalid Column Name", Procedure tableName_bulkinsert.

    I do not have a problem when I update or delete a column; only when a new column is added do I get this error.

    Any help would be really appreciated.

    Wednesday, April 6, 2011 12:40 PM
  • Use the DeprovisionScope API rather than manually deleting the sync objects.
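
    As a minimal sketch (the serverConnectionString and scopeName variables are placeholders), deprovisioning a single scope with the 2.1 API looks roughly like this:

    // Sketch: removes the sync objects for one scope instead of dropping them by hand.
    // SqlSyncScopeDeprovisioning also exposes DeprovisionStore() if every scope
    // in the database should be removed.
    using (SqlConnection conn = new SqlConnection(serverConnectionString))
    {
        SqlSyncScopeDeprovisioning deprovisioning = new SqlSyncScopeDeprovisioning(conn);
        deprovisioning.DeprovisionScope(scopeName);   // e.g. the table-named scope used above
    }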

    You may also want to check out the following blog entries to get an idea of what objects get provisioned:

    http://jtabadero.wordpress.com/2010/09/02/sync-framework-provisioning/

    http://jtabadero.wordpress.com/2011/03/21/modifying-sync-framework-scope-definition-part-1-introduction/

    http://jtabadero.wordpress.com/2011/03/24/modifying-sync-framework-scope-definition-part-2-workarounds/

    (shameless plug :) )

    Wednesday, April 6, 2011 3:12 PM
    Moderator
  • Hi,

    We have implemented an application to sync Azure databases with Sync Framework 2.1 and it is working fine.  If there are any schema changes on the source database, we need to deprovision the entire db (remove all objects related to the previous sync).

    Below is the code to deprovision the db.

    SqlSyncScopeDeprovisioning deprovisioningvar = new SqlSyncScopeDeprovisioning(sqlServerConn);
    deprovisioningvar.DeprovisionStore();

    For more detailed code, please refer to http://tuvian.wordpress.com/2011/05/04/how-to-sync-schema-changed-database-using-sync-framework/





    Wednesday, May 11, 2011 12:42 PM