How do client changes get recognized for bidirectional sync?

  • Question

  • I wish to keep a subset of my repository data on a client.  The client machine will be adding and deleting data. I would like to tell the sync system that I only want to sync a specific subset of data, but am getting very lost in the details.


    I used VS 2008 to create a local data cache.  I can force both sync-down and bidirectional sync.  However, when I start using real data the sync takes much too long.  I would like to continue to use the VS 2008 code generator, but I can't see how to add filter criteria without modifying designer code, which it appears would be overwritten if I removed and re-added the synced table.


    So I thought I could sacrifice the designer.  I duplicated the code into a file I won't accidentally confuse with the designer code and deleted the designer files.  The sync still worked.  But when I went to add a filter I realized I don't understand how the sync system is going to distinguish between filtered data and deleted data.  Or between filtered data and changed data, for that matter.


    I looked through the generated code and don't understand how the framework can figure out whether the data on the client has changed or just exists or not.  So what is the sync really doing?  What data is moving back and forth, from whom to whom, to decide if one side or the other needs to change?  I found syncServicesADOv2_CSharp_adapterBuilder and used it to generate some sync SQL.  It appears that the data in the primary key is somehow used by the sync mechanism.  So how much data is moving back and forth?  How does the server change table recognize that a record was deleted from the client?  It appears that all the SQL generated was for the server side.  Since I am primarily changing the client side, I am interested in how slick the sync framework is about figuring out what happened there.


    I am also confused about how I can shrink the data on the client without the server thinking I really, truly want the data deleted from it.  My impression is the sync will keep bringing new stuff to the client and leave the local new stuff around.  My client isn't interested in any data older than a specific number of days.  I would like to delete the data outside that window, but don't want the sync logic to think this data is to be deleted from the server.


    Is there some set of documentation which I have missed which lays out these sorts of workaday issues without requiring me to become a sync development expert?



    Jeff Landry


    Monday, November 3, 2008 10:01 PM

All replies

  • In the Server-client model, you only need to program the Server side and write SQL queries for them. The client side is closed and is automatically done for you. You do not need to generate any commands for them.

    There are a few things I am interested in:


    1. You mention that the sync takes much too long. How much data are you synching? Is the slowdown only for the first sync, or for all subsequent syncs too?

    2. I did not understand your comment on filtered data vs. deleted data - filtered data is something that the client will either always see or never know about. Deletes from the Server will make it to the client only if the client has seen those rows (because they satisfy the filter; if not, they are not sent to the client).

    3. If you have touched any data on the client, it will be tracked and sent up to the server using the built-in change tracking.

    4. The queries that you wrote on the Server - SelectIncrementalInsert, SelectIncrementalUpdate and SelectIncrementalDelete - are the queries that govern what data goes from Server to client.
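
    As a sketch of what those three enumeration commands look like when wired by hand (the "Orders" table, its columns, and the tombstone naming here are invented for illustration; the @sync_* parameters are supplied by the sync runtime at execution time):

```csharp
using System.Data.SqlClient;
using Microsoft.Synchronization.Data.Server;

class EnumerationCommandsSketch
{
    static SyncAdapter BuildOrdersAdapter()
    {
        var adapter = new SyncAdapter("Orders");

        // Rows inserted since the last sync (the anchors bracket the window).
        adapter.SelectIncrementalInsertsCommand = new SqlCommand(
            "SELECT OrderId, OrderDate FROM Orders " +
            "WHERE CreateTimestamp > @sync_last_received_anchor " +
            "AND CreateTimestamp <= @sync_new_received_anchor");

        // Rows updated since the last sync (and not already sent as inserts).
        adapter.SelectIncrementalUpdatesCommand = new SqlCommand(
            "SELECT OrderId, OrderDate FROM Orders " +
            "WHERE UpdateTimestamp > @sync_last_received_anchor " +
            "AND UpdateTimestamp <= @sync_new_received_anchor " +
            "AND CreateTimestamp <= @sync_last_received_anchor");

        // Rows deleted since the last sync, read from the tombstone table.
        adapter.SelectIncrementalDeletesCommand = new SqlCommand(
            "SELECT OrderId FROM Orders_Tombstone " +
            "WHERE DeleteTimestamp > @sync_last_received_anchor " +
            "AND DeleteTimestamp <= @sync_new_received_anchor");

        return adapter;
    }
}
```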

    5. As I read it, you want a sliding window for the data - To achieve something like that, you can try to craft your SelectIncrementalInsert, SelectIncrementalUpdate and SelectIncrementalDelete commands appropriately. Additionally, you will have to fake out deletes in the SelectIncrementalDelete command for all rows that no longer belong at the client.
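
    One way to read "fake out deletes" is to UNION the real tombstone rows with rows that have simply aged out of the window - the client removes them, while the server keeps them. A sketch, with invented table and column names:

```csharp
using System.Data.SqlClient;

class SlidingWindowSketch
{
    static SqlCommand BuildSelectIncrementalDeletes()
    {
        return new SqlCommand(
            @"SELECT OrderId FROM Orders_Tombstone
              WHERE DeleteTimestamp > @sync_last_received_anchor
                AND DeleteTimestamp <= @sync_new_received_anchor
              UNION
              -- 'faked' deletes: rows that fell out of the 30-day window.
              -- They are removed from the client but stay on the server.
              SELECT OrderId FROM Orders
              WHERE OrderDate < DATEADD(day, -30, GETUTCDATE())");
    }
}
```

    In practice the second SELECT should also be restricted to rows that crossed the 30-day boundary since the last sync (e.g. by comparing against the anchors as well), or the same faked deletes will be re-sent on every sync.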


    Refer to this example to see how you can achieve filtering: http://msdn.microsoft.com/en-us/library/bb726003.aspx. It can be a good start even though it may not achieve your sliding window data.


    Monday, November 3, 2008 10:55 PM
  • I had the same question:




    If you do open up the client-side table, you will see some columns with names beginning with __. These are system-generated columns that are used for change tracking and change enumeration.
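
    A quick way to peek at those tracking columns (a sketch; the connection string is a placeholder, and the table name and column names are the ones mentioned later in this thread - they typically do not appear in designer column lists but can be selected by name):

```csharp
using System;
using System.Data.SqlServerCe;

class InspectClientTracking
{
    static void Main()
    {
        using (var conn = new SqlCeConnection(@"Data Source=Local.sdf"))
        {
            conn.Open();
            // Name the hidden tracking columns explicitly to read them.
            var cmd = new SqlCeCommand(
                "SELECT __sysChangeTxBsn, __sysInsertTxBsn FROM BAL_RECPT", conn);
            using (SqlCeDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("{0}\t{1}", reader[0], reader[1]);
            }
        }
    }
}
```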

    Tuesday, November 4, 2008 6:25 PM
  • Thanks for the interest.  My biggest concern is whether I am using the right tool, or applying it correctly.  [Oops... looks like this is turning into a major paper; I hope you are willing to slog through it, I could really use some ideas.]


    Here is a more complete description of the situation.  I have a 12,000 line VB6 application which is very poorly written.  We are in the process of slowly migrating it to .NET, mostly by picking off subsets of functionality and packaging them as components.  We have reached the stage where we wish to eliminate the DAO functionality and re-package all of that on the .NET side in SQL Server Compact.


    The application is a desktop client which almost certainly has server connectivity, but must survive disconnection.  We currently have quite complex logic to simultaneously commit data locally and at the server, and then sync both after a network disconnect/reconnect.  The proposed rewrite would commit data only to the local sdf, and then request a sync with the server.  This clearly would decrease the client complexity by completely separating the client and server data management.  There is a possibility that a user would change client machines while the network connection to the server, and even to the other clients, is down.  We technically tell our customers this is unsupported, but that doesn't stop it happening.  It's more likely the server would be unavailable than the other clients, so p2p sync would aid us here.  But not only do I not understand how to do this with the framework, it is more of a big giant plus than a necessity.  A more likely scenario is that a machine would be shut down offline at the end of day and not get turned on (or the app would not be run) on the next business day.


    Please remember that I started with "existing" and "poorly written".  I have a number of distinct data management scenarios.

    1) there are 70 tables which must participate in some sync.

    2) 40 tables are essentially static enterprise-wide data.  A new client install must copy these tables down.  On client startup, if any data in these tables has changed, the changed data must be moved to the client.  Sync for these tables must happen independently of how old the changes are.  A machine could be turned off for a year and then brought back into service and must sync.  I had been planning on using the SQL Server 2008 change tracking, but I see it has a time associated with it.  I have yet to see if it can be set to save changes forever.

    3) 2 of the tables are like the previous 40, but may be changed at the client.  If they are changed on the client, they must be propagated up to the server immediately and out to the other clients no later than the next time a user starts the client app on that machine.

    4) The remaining 28 tables are associated with the particular user. 

    1. I need to keep 30 days of this information on whatever station the user is currently on
    2. I need to get the data up to the server as soon as possible after it is committed to the local store
    3. I need to keep at least 1 year of this data (probably more) at the server
    4. The tables do not all change at the same time and do change in groups (surprise, surprise)
    5. One of these tables is an audit log (very poorly designed, so dozens to hundreds of records may be generated for a single transaction) and must be synced for every transaction (i.e. change to any of the groups of tables)

    I used the VS 2008 localDataCache to create two caches against the same sdf file.  I issued a sync command for both sets at the beginning of the application and, lo and behold, the sync happened and it was good.  Now this was basically just a test to see if I could start getting a handle on how to get started with the framework.  But it was very encouraging and satisfying to see it come together.  I already knew that this wasn't going to be a viable mode of operation, but figured better to see something work with a downside than implement for days and find out there is a fundamental problem.


    The 42 table sync took 1 minute.  No problem, I figured, I can always make this faster somehow.  But I forgot I am making a huge number of changes to VB6, and that means running or building an exe for virtually every successful code change to find the next thing that no longer works.  Well, might as well check the 28 table sync: 2 minutes.  Unfortunately, subsequent syncs take as long or longer (hey, what can I say?).  OK, now I need to look at getting these down to something that won't force me to spend most of my time in compiles.


    --- SIDE NOTE

    re-reading the previous paragraph, I see that I should point out I may not properly understand the create option.  I have created/re-created the data cache a bunch of times and just realized that I stopped paying attention to this parameter.  Looking at the created code I now see that in the tested version I have this set to dropExistingOrCreateNewTable.  I thought I should be (and was) using uploadExistingOrCreateNewTable.  Playing with this just now I see that dropOrCreate appears to be a non-negotiable default when adding a new table to the cache with the designer.  It appears that I should have revisited this for all the tables after generating the initial code.  I will batch edit this and re-run the test.


    Just recreated the 42 table cache.  Changed the tables to uploadOrCreate.  The designer went to sync the db and told me: "synctable object for ... has an invalid combination of creationOption and syncDirection properties...".  To wit: uploadOrCreate is not compatible with downloadOnly.  I didn't try every combination, but the only combination I tried that worked is the default.  For the life of me, I can't figure out how to switch the sync to bidirectional in the designer. 


    I did find the .sync file in the project directory.  Can't seem to get to it from the solution explorer.  In it I found:

    <syncTable Id="3dc49eb6-f88b-412a-9fc4-0b385d3b692c" name="BAL_RECPTSyncTable" syncDirection="DownloadOnly" creationOptions="UploadExistingOrCreateNewTable" tableName="BAL_RECPT" />

    I changed it to syncDirection="Bidirectional".  Had to do this with the project open to get it to take.  Then the sync failed with a divide-by-zero error.  So apparently I can't run an uploadOrCreate test.



    So I started looking into filtering. I just looked at the site you provided above, and amazingly enough, I have read it.  I also used google to find a number of posts by various folks discussing adding filtering to the two versions of sync for ado.net and the framework.  I found two themes outside of the msdn link.  First, add a filter to the syncAgent and add the filter clause to the "hard" sql (ex: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2893169&SiteID=1).  Second, use sqlSyncAdapterBuilder.  The second also appears to be the scheme proposed in the msdn article, but I have to admit I am not 100% certain of this.


    Now back to "70 tables".  You can imagine how delighted I was that the localDataCache generator built the tables locally and generated all the necessary code to manage them.  I hate to admit it, but the thought of having to generate 70 adapterBuilders may be one of my major mental roadblocks in this situation.  And that's before I remember some crazy person here may need to add one or more columns to one or more of these tables in the future.  I am left with the impression that I am not going to be able to use the designer, and would like (if possible) to confirm this before starting a major code production effort.


    And further to your first reply line: "In the Server-client model, you only need to program the Server side and write SQL queries for them. The client side is closed and is automatically done for you".  It is entirely likely that I am missing a major point here.  The most comprehensible introduction I have managed to find to what I am trying to do is the "Introduction to Sync Services and the Visual Studio Designer" video on the sync site.  I didn't see a delete in this video, but assumed it would sync as well.  However, look as I might at the sdf schema, I don't see any columns or tables which would tell Sync Services that a local table row was deleted.  So I am immediately confused about how the client side automatically takes care of this for me.  Further, I don't understand "program the server side".  It seems like the video runs against the cache, which is clearly stated to be the local copy of the data.  I see that the generated tableNameSyncAdapter(s) are where all the SQL is, and it appears to me to be targeted at the server, since it references things like "@sync_last_received_anchor" which I don't see set anywhere on the client side.  I am immediately lost about how the client does more complicated operations and keeps track of them for Sync Services to manage.  This is where I added the side note above, because I realized I should check and make sure I was asking for incremental changes and not just table recreation.  To tell you the truth, I can't figure out how to set up Server-client vs p2p.


    At this point I am no longer sure if I am just rambling and generating useless information or if providing you some insight into one fruitcake developer's mind is of any use as you continue to develop new versions.


    Answers to your questions:

    1) as previously noted, in the current configuration the sync times do not change for successive runs against the same db.  BUT, apparently I am always recreating the tables.


    2 & 3)  These kind of tie together.  Where are the client changes being recorded?  I don't see anything in the db like last-change fields.  Although this brings up another question.  It appeared to me that the data set generated by the designer was just a bonus and its use was not required to make the whole sync process work, i.e. looking at the code I didn't think I saw it adding any row "versioning" tracking.  However, your comment about the client being managed makes me think I was wrong about this.


    Not using the generated data sets appealed to me because I have fairly simple needs.  This application almost exclusively reads and writes.  There is virtually no updating or deleting on the client.  So if the designer generated data set isn't intimately tied into the sync process, it's big time overkill for me.  As a result, all of my current thinking is predicated on not using the designer provided data set.  Was this a big mistake?


    4) Since this is a (potentially) disconnected, and all-but standalone, desktop client, I can't get my head around "queries on the server".  Am I again missing a key piece of the puzzle?


    5) Hopefully the excess of information above gives you a solid idea of what I need to achieve.


    Sorry to be so wordy (hopefully this can be read in less than the three hours it took to compose), but I am on my own here and haven't got anyone available to bounce this stuff off of.  I would love to use the framework, but if I am trying to drive a screw with a hammer it would be nice to know that as soon as possible.






    Tuesday, November 4, 2008 6:36 PM
  • Thanks.  I used the VS 2008 designer to generate my sync code and it is not adding these columns.  I am using the SQL Server 2008 change tracking checkbox, and that doesn't allow me to add created and edited columns.


    Does this ring any bells?






    Wednesday, November 5, 2008 4:17 PM
  • There are 2 pieces here, as I describe - the Server side and the client side. If you are doing bi-directional sync, you will upload data from client to server and then download data from server to client. When sync starts, it will enumerate the changes from the client (using the built-in change tracking and enumeration logic) and send these changes to the Server. On the Server, this data is applied using the Insert/Update/Delete commands that you write (for starters, you used the Designer to generate these for you, or you could use the adapter builder). Once this is finished, the Server now enumerates its changes - using the SelectIncrementalInsert, SelectIncrementalUpdate, SelectIncrementalDelete commands (which, again, you write, or used the Designer or adapter builder for). Then these changes are sent down to the client. On the client, you don't need to worry about what commands to write or use; the Compact provider does this for you.
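
    The shape described above can be sketched roughly as follows (connection strings and the table name are placeholders; the Compact client provider needs no commands, while the server provider's SyncAdapters carry the hand-written or generated ones):

```csharp
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;
using Microsoft.Synchronization.Data.SqlServerCe;

class BidirectionalSyncSketch
{
    static SyncStatistics Run()
    {
        var agent = new SyncAgent();

        // Client side: change tracking and enumeration are built in.
        agent.LocalProvider = new SqlCeClientSyncProvider(@"Data Source=Local.sdf");

        // Server side: this provider's SyncAdapters hold the
        // SelectIncremental* and Insert/Update/Delete commands.
        var server = new DbServerSyncProvider();
        server.Connection = new SqlConnection(
            "Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=True");
        agent.RemoteProvider = server;

        var table = new SyncTable("BAL_RECPT");
        table.SyncDirection = SyncDirection.Bidirectional;
        table.CreationOption = TableCreationOption.DropExistingOrCreateNewTable;
        agent.Configuration.SyncTables.Add(table);

        return agent.Synchronize();
    }
}
```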


    As I understand it, you are working on Server-to-client sync for now and are interested in client-to-client sync in a peer-to-peer fashion. Here, the clients are SQL Compact databases. Peer-to-peer for SQL Compact is something we are thinking about. For now, you have Server-to-client sync and peer-to-peer for server databases.


    As I read through some of the comments in your thread, I have some comments below:


    You can set the SQL change tracking to save changes forever by setting the retention to 0. Of course, this comes at a price: your metadata will keep growing and you may hit disk and/or perf issues.
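
    A sketch, assuming SQL Server 2008's built-in change tracking is what is being tuned: there, keeping change metadata indefinitely is done by disabling auto-cleanup ("MyDb" is a placeholder database name):

```csharp
using System.Data.SqlClient;

class KeepChangesForeverSketch
{
    static void Run(SqlConnection conn)
    {
        // With AUTO_CLEANUP off, change-tracking metadata is never purged -
        // the "retention forever" behavior, with the growth cost noted above.
        var cmd = new SqlCommand(
            @"ALTER DATABASE MyDb
              SET CHANGE_TRACKING (AUTO_CLEANUP = OFF)", conn);
        cmd.ExecuteNonQuery();
    }
}
```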


    Yes, uploadOrCreate does not work with DownloadOnly; that is by design. If you choose dropOrCreate, it should work fine with DownloadOnly. As you might also have guessed, you don't need to re-create the tables every time (do not use snapshot as the SyncDirection option - that will recreate the tables). If you set it to Bi-directional, the schema is created and data is downloaded the first time. Thereafter any subsequent sync will only get you incremental changes from client to Server and from Server to client.


    The designer does not support bidirectional sync out of the box; you need to take some additional steps. Please refer to this article, http://msdn.microsoft.com/en-us/library/cc761494.aspx, to extend the designer code to support bidirectional.
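
    The extension pattern the article describes can be sketched as a partial class in its own file, which the designer will not regenerate (the agent class name is taken from earlier in this thread; this is an illustration of the pattern, not a copy of the article):

```csharp
using Microsoft.Synchronization.Data;

public partial class startupSyncAgent
{
    // OnInitialized is the extension point the generated constructor calls;
    // flip every table from the designer's DownloadOnly default here.
    partial void OnInitialized()
    {
        foreach (SyncTable table in this.Configuration.SyncTables)
            table.SyncDirection = SyncDirection.Bidirectional;
    }
}
```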


    Also, using the designer is just to get a start. Thereafter you would use the adapter builder to generate the commands, preferably wrap them into stored procs, and use those at runtime. Using the designer or adapter builder at runtime is not recommended because it causes perf overhead. And there is no need to keep re-generating the commands anyway, because they will not change, nor will re-generating them produce any different results. Of course, you need to take into account when and if there will be any schema changes to your tables.
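
    A sketch of that workflow: run the builder once at design time and capture the generated SQL for pasting into stored procedures, so the builder never runs in production (the table name is from this thread; the tombstone name is an assumption):

```csharp
using System;
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;

class CaptureBuilderOutput
{
    static void Dump(SqlConnection serverConn)
    {
        var builder = new SqlSyncAdapterBuilder(serverConn)
        {
            TableName = "BAL_RECPT",
            TombstoneTableName = "BAL_RECPT_Tombstone",
            SyncDirection = SyncDirection.Bidirectional
        };

        SyncAdapter adapter = builder.ToSyncAdapter();

        // Print the generated commands so they can be moved into procs.
        Console.WriteLine(adapter.SelectIncrementalInsertsCommand.CommandText);
        Console.WriteLine(adapter.SelectIncrementalUpdatesCommand.CommandText);
        Console.WriteLine(adapter.SelectIncrementalDeletesCommand.CommandText);
    }
}
```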


    For your question on deletes: on the server, a tombstone table is created (the designer does this for you, or you could do it yourself). On the client side, again as I said, the closed Compact provider does this for you. You should not worry about change enumeration, be it insert, update or delete. Compact tracks these changes and sends them across to the server on the next sync. When I said "program the server side" I meant creating the SelectIncrementalInsert, Update, Delete commands to be executed on the Server for change enumeration, and the Insert, Update, Delete commands for change application. If you are interested in looking at client change tracking, you could open the client database and look at the columns (names beginning with __) while you make DML changes to the table.
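
    A sketch of the server-side tombstone plumbing: a shadow table plus a delete trigger that records every delete so SelectIncrementalDelete can report it (names and columns are illustrative, not what the designer emits):

```csharp
using System.Data.SqlClient;

class TombstoneSetupSketch
{
    const string TableDdl = @"
        CREATE TABLE BAL_RECPT_Tombstone (
            ReceiptId INT PRIMARY KEY,
            DeleteTimestamp TIMESTAMP)";  // rowversion used as the delete anchor

    const string TriggerDdl = @"
        CREATE TRIGGER BAL_RECPT_DeleteTrigger ON BAL_RECPT FOR DELETE AS
        INSERT INTO BAL_RECPT_Tombstone (ReceiptId)
        SELECT ReceiptId FROM deleted";

    static void Run(SqlConnection conn)
    {
        // CREATE TRIGGER must be the only statement in its batch,
        // so the two pieces of DDL run as separate commands.
        new SqlCommand(TableDdl, conn).ExecuteNonQuery();
        new SqlCommand(TriggerDdl, conn).ExecuteNonQuery();
    }
}
```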


    Please take a look at the examples (code samples) in http://msdn.microsoft.com/en-us/library/bb726010.aspx for the Server-client configuration.

    http://msdn.microsoft.com/en-us/library/cc807291.aspx has examples of peer-to-peer setup.

    Thursday, November 6, 2008 7:12 AM
  • Mahesh:


    I really appreciate your taking the time to work with me on this.


    I do realize that I shouldn't be running the designer over and over once I know what I am doing.  However, I have to assume that in the future tables will be added and/or columns in existing tables added or changed.  So I would like to pick the scheme which means the least confusion and work for the maintenance engineers.


    The only reason I am not using the adapterBuilder now is that I would prefer not to write 70 of them.  However, that's starting to look like the most robust solution to my situation.  In my case, generating the adapters at runtime isn't a hit I am concerned about; I am pretty sure the user will never notice it in all the other junk going on.  The fact that I can understand adding a filter to the adapterBuilder, as opposed to documenting the multiple SQL changes someone would have to make for an update, is probably the deciding factor.
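
    Building the adapters in a loop at startup keeps the filter in one obvious place, which is the maintenance win described above. A sketch - the table list, filter column, and parameter are invented for illustration:

```csharp
using System.Data;
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;

class BulkAdapterSetupSketch
{
    static void AddAdapters(DbServerSyncProvider provider, SqlConnection conn)
    {
        string[] perUserTables = { "BAL_RECPT" /* ...the other 27... */ };

        foreach (string tableName in perUserTables)
        {
            var builder = new SqlSyncAdapterBuilder(conn)
            {
                TableName = tableName,
                SyncDirection = SyncDirection.Bidirectional,
                // The single place a maintenance engineer edits the filter.
                FilterClause = "UserId = @UserId"
            };
            builder.FilterParameters.Add(new SqlParameter("@UserId", SqlDbType.Int));

            provider.SyncAdapters.Add(builder.ToSyncAdapter());
        }
    }
}
```

    The value for @UserId would then be supplied once on the agent, e.g. agent.Configuration.SyncParameters.Add(new SyncParameter("@UserId", currentUserId)).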


    Thanks for the article link about modifying for bidirectional.  I had not noticed it, but I did try this [with the results noted in the previous post as "side note"].  Nothing changed.  In addition to setting all of the tables to bidirectional I also set them to updateOrCreate, although the article implies that this is done automatically.


    I'm sorry to harp on this, but I don't see any __ columns or tombstone tables on the client or server tables.  When I clicked "use SQL Server 2008 change tracking", the ability to add columns for created and changed is greyed out.  Whoops, I take it back.  I went back and took out the updateOrCreate statements and just left the bidirectional statements.  Then I deleted the sdf and ran the test again.  I looked at the sdf with SQL Server Management Studio 2008.  Looking at properties and table columns I don't see any __ columns.  However, I did a select * from the table, and the __sysChangeTxBsn and __sysInsertTxBsn columns show up.  So things must be working.


    Just ran a test.  It took 6 minutes to download to a brand new sdf file.  The final result was a 28mb db.  The following code:

    var startupAgent = new startupSyncAgent();
    var syncStats = startupAgent.Synchronize();

    // force the sdf to sync the working data tables
    var workingAgent = new workingSyncAgent();
    syncStats = workingAgent.Synchronize();


    takes over 6 minutes to sync.  I didn't change any data, just started the program up twice in a row.


    startupSyncAgent covers the 48 all-but-static tables.  workingSyncAgent covers the 22 tables which contain working data and are required to sync in groups.


    Interestingly, the sync performed at the end of the designer run took less than 30 seconds for both sets of tables (I assume this is a one-time download of all the data).


    Clearly filtering isn't an option for the startup sync, since changes from forever must be accumulated.  DropOrCreate is an option, but an initial download time of over a minute isn't acceptable either.  Oddly, the current scheme has VB6 drop the mdb, recreate it from scratch programmatically, and download all the data, and this only takes about 20 secs.  It's a shame I can't figure out how to get that performance from the framework with DropOrCreate, because that would really be a big win: just have the designer regenerate this whenever things changed and not worry about maintaining the code.


    I don't know why I am getting such lousy performance, but it looks like we will just have to rewrite our existing sync to use the sdf file and use that.


    A final note about the sample files.  Samples are great if one has a basic understanding of what to do and is looking for confirmation or improvements.  However, there are so many elements involved in a framework, so many options, and ultimately so much code, that working one's way through even a simple sample can be frustrating.  A UML sequence diagram of how the code is designed would be really, really useful.  One immediately gets the overall picture and sees how the classes interact.  One can also ignore the form and textbox, etc. classes which are really there just because the program won't run without them.  Having gotten a feel for how things work from a design-level view, it is then truly useful to dig into the sample code to see some additional detail.


    Thanks again for all the help,




    Thursday, November 6, 2008 8:50 PM
  • Just remembered to tell you: I am running all 32-bit on a Vista 64-bit machine.  Any chance WOW64 is affecting my sync performance dramatically?





    Thursday, November 6, 2008 9:08 PM
  • So are you saying that when you run through the designer for the same set of data, it takes less time than without the designer? And the difference is that the designer had DownloadOnly and you had bi-di in your code? Are both using filtering?

    Also is your current VB6 solution using any filtering?

    Do you get better performance if you did not do filtering?


    Hmm, interesting about the WOW. Did you try the native bits? And is your comparison with the designer and the current VB6 also on WOW?

    Monday, November 10, 2008 7:13 AM
  • I am running on Vista 64.  When I initially installed the 64-bit sync framework I got errors.  So I looked around and found some posts indicating that since I am developing a 32-bit application and using SQL Server Compact (for which, I am under the impression, there is only a 32-bit version at the moment), I should be using the 32-bit framework.  So I removed the 64-bit framework and installed the 32-bit version.  At this point things started working.  So my impression is that while I am on a 64-bit machine, everything else in this case is 32-bit (VS 2008 Team, SQL Server Compact, VB6, and my .NET DLLs).


    I delete the existing localDataCache file from the project and delete the sdf file.  Then I use the designer to add a localDataCache.  I select 42 tables for inclusion in the app and tell the designer I am finished.  The designer runs a sync.  This operation takes just a few seconds for all 42 tables (very few of the tables have more than 5 rows in them, and the total size of the new cache will be 778K).  I just did this very thing and the designer sync took approximately 15 seconds.  Which is probably less time than our current VB6 logic takes, so as a worst case this would be acceptable during program initialization.


    After the sync, the designer asks me if I want to create local data sets for the cached tables.  I decline this offer.


    Now I add the two lines to the beginning of my program which request the sync.  I get the two lines of code as is done in the video: I open the sync object and copy the code displayed by the "show code" link.  Now I run the program (I just did) and it takes 1 minute 15 seconds.  Remember, this has just taken all the defaults.


    Used the designer to change the table options to truncateOrDelete, and the sync dropped to about 45 seconds.  I wondered if the metadata compile would make a difference, so I performed the sync twice (the second immediately following the first).  Both times it took about 45 seconds.  The sync logic is in a separate dll, so I compiled it for release and ran again.  Still about 45 seconds.


    I did not try any bi-directional sync and did not apply any filtering.  Everything is running in the context of VB6, so I assume that everything is running through some aspect of WOW64.  I have no idea, of course, what the designer is doing.





    Monday, November 10, 2008 3:45 PM