none
Using Sync Framework for heterogenous, one-way "synchronisation"? RRS feed

  • Question

  • Hi,

     

    I have downloaded and played around with Sync Framework 2.0, particularly the SyncOrchestrator and the SQLSyncProviders, and am very impressed. However, I am unsure if the product will meet my needs without heavy modification and need the input of someone far more experienced than myself to decide whether to use the Sync Framework or go fully bespoke.

    My problem is that I have ten or so identical SQL2005 databases (identical in schema, hardware and architecture, but NOT data), and every table has a "created", "modified" and "stale" attribute per row. At the end of each day, I want to upload inserts and updates (NOT deletes) from these 10+ to a "master" SQL Server 2005 DB, with a different schema.

    The main issues I see with Sync framework, and that I don't know if I can easily solve, is:

    1)I need to assign client IDs for each write so that every row in the new Master can be tracked back to an individual client DB for reporting purposes. I see that each syncsession has a "clientID", but is this unique only per session, or can I have it fixed per client, no matter what the session?

    2)I don't want the master to be a "mirror" of any client. Every insert and update must be copied to the master, regardless if the data is the same....it seems sync framework would attempt to make the master a mirror of each client, so if I had 10 clients, and synced each one in turn, then the master would just end up being a mirror of the last sync.

    3) I dont understand the concept of anchors, and ideally should not modify the client Databases. Can I make sync framework WORK with only a datecreated, datamodified and stale field? If I cant, can I make sync framework work with an existing database, with existing data, without modifying existing tables (perhaps by just creating new tables in the schema)?

    4)As stated, the master will NOT be a mirror schema of the clients, it will be much more general so that, regardless of the table in the client the data came from, it will be written to a "changed" table or something similar, with the table name, client id, date etc.

    Thankyou very much for taking the time to read the above, and thanks for any help!

    Thursday, July 8, 2010 9:53 AM

All replies

  • let me try to answer your queries:

    1. the ClientId is unique for each client and not session-based.

    2. For as long as the rows from the client can be uniquely identified, they will be uploaded/inserted to the master table. If you have two clients having the same unique identifier, Sync Fx will raise a conflict when you do the sync since it will think that the row was updated on two different places. e.g Client A uploads row 123, Client B loads row 456, Server has 123 and 456.  if client A uploads row 123 and client B also uploads a row 123, it will overwrite client A's row 123. If you have 3 clients having row 123, and the row was updated on more than one client, you get a conflict. You can handle the conflict and specify which row version wins.

    3. The anchors are for the offline providers. SqlSyncProvider and SyncOrchestrator are for the so called collaboration/peer-to-peer providers. In the anchor-based change tracking, the change tracking columns are in the base table being synched (either as created/updated date or timestamp) and the anchor table keeps track of what was uploaded/downloaded. In the peer-to-peer providers,  each table being synched will have a tracking table to keep track of what was changed and which peer did the change. Triggers will be created for insert/update/delete in the base table to insert/update metadata on the tracking table on what was changed.

    4. tables/columns must match for both client and server in the scope configuration (there are workarounds). Sync Fx is for synchronizing data (keeping copies in sync) and not a central data repository for dumping data from various tables in a central location (although you can use it as a mechanism to upload the data)

    hth

    Thursday, July 8, 2010 12:19 PM
    Moderator
  • Thankyou very much for the quick and informative reply! Can I ask then, how the ClientID is assigned? Is it only "the same" as long as the application is active, ie in memory?

     

    I am struggling to see how syncframework knows that SQL server A is the same server if i do a sync, server A gets assigned GUID 1, and then i go and sync the other servers, then i shut the app down, 2 days later i start it up again and sync server A once more....surely it gets a new random GUID?

    I am very confused by how the client ID works.

     

    Could you point me to some code/tutorial that allows me to have different tables and columns in the scope config?

     

    Finally, and this is subjective, do you believe that the sync framework is the right tool for what im trying to do, or am i trying to shoehorn it to do something it wasn't designed for.

     

    Thanks again.

    Thursday, July 8, 2010 1:01 PM
  • the id that get's assigned to a client is persisted to the database as part of the metadata so it doesnt change in between syncs. Actually, ClientID is from the offline providers, not sure about the exact property on the collaboration providers though.

    if you can elaborate more on items 2 & 4 and give some samples how you want to load your data from one client to the server, etc..., we'll have a better chance of determining if Sync Fx is right for the solution. Is it just upload only, download only, bidirectional? will the data be updated in one server? will client data overlap with another client? etc...

    Thursday, July 8, 2010 1:18 PM
    Moderator
  • Hi Again

    I can load the data anyway I want really, I just can't modify the existing 10 SQL schemas (I can add new schemas though). I was just thinking of making a simple winforms app that a user could run once per day, or add a console app to the windows scheduler. The Sync will happen once per day for each of the 10 SQL schemas, and any inserts and deletes for each of the 10 will be recorded to a "master database" (the schema that I can define how I see fit). Data will be upload only, and the master will never be modified except via this syncronisation process.

    2 Rows could have exactly the same information and UID from 2 seperate databases, but I want to somehow add the clientID to each row so that it guarantees uniqueness. I dont really understand how to get hold of the client ID and use it because the syncorhcestrators and providers do not allow me to add a "clientID" parameter....they are "closed" in that they only allow me to say what columns/tables from the client database i want to sync, and do not allow me to add extra columns/data in programmatically.

    I am also unsure if what i describe is an "offline" or "collaboration" (for the lifetime the app is running) scenario.

    Thursday, July 8, 2010 1:40 PM
  • the offline is more of a hub-spoke topology, you have multiple clients syncronizing to a central server.

    in collaboration or peer-to-peer providers, each client can sync with one another.

    e.g

    Client A has row 1,

    Client B has row 2,

    Client C syncs with Client A, so it gets row 1 as well

    Client C can also sync with Client B so it now has both rows 1 and 2

    the problem with the offline provider is that there is no SQL Express/Server client provider, only a server provider (its a client to server sync).

    So you're better off with the collaboration providers even if you're not doing peer-to-peer.

    when you provision a scope (tables+columns+filter for the data you want to sync), it will create a stored procedure for selecting changes and there is also a tables that will be created for tracking changes: tracking table for each table being synched, scope_info table containing the data on what was sent/received and another table called scope_config which stores the definition of the scope.

    your challenge is to inject that "clientId" in the changes selected as an extra column. you can do it programmatically during by injecting the field in the change dataset in the SelectingChanges event (if am not mistaken, you can actually grab the replica id from the eventargs. You can also embed a client id in the select changes sql as  a hardcoded literal/constant at each client.

    you can check out the sample sync apps here : http://code.msdn.microsoft.com/sync. i suggest you go thru them first and when you get to know how to sync the basic stuff, you can figure out the extension points where you need to do customizations.

     

    Thursday, July 8, 2010 2:13 PM
    Moderator
  • Ok thanks again...

    But don't i want to do the offline provider stuff simply because im only synching once per day? I dont mind not having a sql express client provider....and i see my 10 servers as clients that upload their changes to the master server.

     

    Why couldnt i use the offline scenario, and if there isnt a client provider, how do you do this "client to server" sync? I see this is being exactly what i want to do, just for 10 clients.

    Thursday, July 8, 2010 2:37 PM
  • the providers has nothing to do with how many time you sync or how much data you need to sync.

    in offline scenario, its the client that keeps track of what was sent/received (the anchors). the only client supported out of the box is SQLCE. there is no client provider for SQLServer or SQL Express except for a sample provider code. when you sync, you need two providers, one for the client and one for the server. in offline providers, theres only support for SQLCE out of the box.

    and you mentioned you cant change the client schema as well, as mentioned, in offline scenario the change tracking mechanism is on the tables being synched. you need to have a column there to identify when a row was inserted or updated (date or timestamp). in the collaboration providers, you dont have to add the columns as the change tracking is done on a separate table.

    the collaboration provider can do very much what the offline providers can do and has support for out of the box synchronizing SQLCE/Express/Server in a hub spoke or peer-to-peer scenario.

    Thursday, July 8, 2010 3:04 PM
    Moderator