How to sync one table filtered by another

  • Question

  • Hello

     

    I have created several simple synchronization projects that filter a table based on a field in that table, but how does one go about syncing one table based on its foreign key being present in another table?

     

    Here is a basic view of what I need to accomplish:

     

    OrderDetail (client)

    ID | SalesManID | ProductID | CreateTime | ModifyTime
    ---|------------|-----------|------------|-----------
    1  | 101        | P01       | T          | T
    2  | 101        | P02       | T          | T

     

    ProductMaster (client)

    ID  | Data Fields | CreateTime | ModifyTime
    ----|-------------|------------|-----------
    P01 | blahblah    | T          | T
    P02 | blahblah    | T          | T

     

    ProductMaster (server)

    ID  | Data Fields | CreateTime | ModifyTime
    ----|-------------|------------|-----------
    P01 | blahblah    | <T         | <T
    P02 | blahblah    | <T         | <T
    P03 | blahblah    | <T         | <T

     

    The OrderDetail table should sync to the client with salesman 101's records. The ProductMaster table should sync only the records that are needed; it would be filtered to include only products that appear on some order for salesman 101. This is the join I would use to accomplish this for incremental inserts on the client's ProductMaster.

     

    Code Snippet

    SELECT pm.[ID], pm.[Data Fields], pm.[CreateTime], pm.[ModifyTime]
    FROM ProductMaster AS pm
    INNER JOIN OrderDetail AS od
        ON pm.ID = od.ProductID
    WHERE pm.[CreateTime] > @sync_last_received_anchor
      AND pm.[CreateTime] <= @sync_new_received_anchor

     

     

    The first sync at time T correctly produces the tables above. My problem occurs at time T+1, when I sync and receive a new order. The new order arrives correctly, but the client's ProductMaster table does NOT contain the P03 product.

     

    OrderDetail (client)

    ID | SalesManID | ProductID | CreateTime | ModifyTime
    ---|------------|-----------|------------|-----------
    1  | 101        | P01       | T          | T
    2  | 101        | P02       | T          | T
    3  | 101        | P03       | T+1        | T+1

     

    This problem occurs because, after syncing at time T, the client's last received anchor for the ProductMaster table is set to T. When the second sync runs at time T+1, the P03 record in the server's ProductMaster table meets the join condition but fails the CreateTime > T clause.
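    To make the failure concrete, here is a small, hypothetical reproduction using SQLite, with integer "times" standing in for the anchors (T = 1, T+1 = 2); table and column names follow the post, and the WHERE clause mimics the incremental-insert query above:

```python
import sqlite3

# Hypothetical in-memory reproduction of the scenario. All three products
# were created before the first sync (time 0); the P03 order arrives at T+1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductMaster (ID TEXT PRIMARY KEY, Data TEXT,
                            CreateTime INT, ModifyTime INT);
CREATE TABLE OrderDetail (ID INT PRIMARY KEY, SalesManID INT,
                          ProductID TEXT, CreateTime INT, ModifyTime INT);
INSERT INTO ProductMaster VALUES
    ('P01','blahblah',0,0), ('P02','blahblah',0,0), ('P03','blahblah',0,0);
INSERT INTO OrderDetail VALUES
    (1,101,'P01',1,1), (2,101,'P02',1,1), (3,101,'P03',2,2);
""")

# Incremental inserts for ProductMaster in the T+1 window (anchors 1..2],
# filtered on the *product's* CreateTime: P03 satisfies the join, but its
# CreateTime (0) fails the "> last anchor" test, so nothing is enumerated.
missed = conn.execute("""
    SELECT DISTINCT pm.ID
    FROM ProductMaster AS pm
    JOIN OrderDetail AS od ON pm.ID = od.ProductID
    WHERE pm.CreateTime > ? AND pm.CreateTime <= ?
""", (1, 2)).fetchall()
print(missed)  # -> [] : P03 never reaches the client
```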

     

    I would like to hear any solution ideas that are out there.

    • Moved by Max Wang_1983 Friday, April 22, 2011 5:49 PM forum consolidation (From:SyncFx - Microsoft Sync Framework Database Providers [ReadOnly])
    Wednesday, September 3, 2008 8:01 PM

Answers

  • Yes,

    You have identified the right cons with this approach. I was just typing a follow-up post explaining all the assumptions that went with the approach and the possible cons; I should have added that to the original post in the first place. Anyway, here goes.

     

    Let's take the insert case first. You want to select all products (those that have an OrderDetail entry, via the join) whose creation time falls after the last received anchor. You also need to select, as inserts, existing product IDs for which a new OrderDetail row has been added since the last received anchor for the OrderDetail table. But when you are enumerating ProductMaster you don't have the anchor for OrderDetail, so the second part is not so straightforward.

     

    The sent/received anchors are maintained per table on the client side, so technically you can have different received anchors for different tables. For your second criterion to work, the biggest assumption you have to make is that OrderDetail and ProductMaster are *ALWAYS* sync'd in one sync group, so that the received anchor for them is always the same.

     

    Tables in one sync group are always sent down in one transaction, so you can assume the received anchor for all tables in that group is the same. With that, your insert query would look something like this.

     

    Code Snippet

    SELECT pm.[ID], pm.[Data Fields], pm.[CreateTime], pm.[ModifyTime]
    FROM ProductMaster AS pm
    INNER JOIN OrderDetail AS od
        ON pm.ID = od.ProductID
    WHERE (pm.[CreateTime] > @sync_last_received_anchor
           AND pm.[CreateTime] <= @sync_new_received_anchor)
       OR (od.[CreateTime] > @sync_last_received_anchor
           AND od.[CreateTime] <= @sync_new_received_anchor)

     

     

     

    This introduces the problem you mentioned: existing product IDs being re-enumerated for each new OrderDetail row. You need to handle that case on your client by hooking the conflict-resolution action and resolving the resulting insert-insert conflict.
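    The net effect of resolving such an insert-insert conflict in favor of the existing row is just an idempotent insert. A sketch of that effect with SQLite's INSERT OR IGNORE (table contents are illustrative; a real client would do this inside Sync Services' conflict-handling callbacks, not raw SQL):

```python
import sqlite3

# Illustrative client state: P01 already arrived in an earlier sync.
# The OR-based enumeration query sends it again alongside the genuinely
# new P03, so applying the change set must tolerate duplicate inserts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductMaster (ID TEXT PRIMARY KEY, Data TEXT)")
conn.execute("INSERT INTO ProductMaster VALUES ('P01','blahblah')")

incoming = [('P01', 'blahblah'), ('P03', 'blahblah')]  # re-enumerated + new
# INSERT OR IGNORE keeps the existing row and silently drops the duplicate.
conn.executemany("INSERT OR IGNORE INTO ProductMaster VALUES (?, ?)", incoming)

ids = [r[0] for r in conn.execute("SELECT ID FROM ProductMaster ORDER BY ID")]
print(ids)  # -> ['P01', 'P03']
```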

     

     

    You can apply the same logic to the update case. Updates won't cause a conflict, but you do over-enumerate updates and apply dummy updates on clients. Also, if the filter column on OrderDetail changes, say from P01 to P03, you once again have the insert-insert conflict. So you might have to adopt the rule that an OrderDetail row never changes its product ID; if one needs that change, delete the existing row and add a new one.

     

    The delete case is more complex. What do you do when you send the client deletes for all OrderDetail rows belonging to one product ID? In that case, don't you want to remove the product from the client side now that it has no order details? But the delete for the product is not identified unless you track extra information in the OrderDetail tombstone table, and then your delete enumeration for ProductMaster gets complicated. Basically, it gets trickier.

     

    It's doable, with the above-mentioned issues of over-enumeration, complex delete-enumeration queries, and a possible performance hit and increased sync durations.

    Thursday, September 4, 2008 11:02 PM
    Moderator

All replies

  •  

    You basically want to pick product rows for all new orders added after the last sync time, irrespective of the product's create/update time. In that case, the WHERE clause in your query needs to change.

     

    Changing the query to the following, which filters on OrderDetail's CreateTime instead, works against my test database.

     

     

    Code Snippet

    SELECT pm.[ID], pm.[Data Fields], pm.[CreateTime], pm.[ModifyTime]
    FROM ProductMaster AS pm
    INNER JOIN OrderDetail AS od
        ON pm.ID = od.ProductID
    WHERE od.[CreateTime] > @sync_last_received_anchor
      AND od.[CreateTime] <= @sync_new_received_anchor
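    As a sanity check, filtering on the order's creation time does pick up P03 in the T+1 window. A small hypothetical SQLite reproduction, with integer anchors (T = 1, T+1 = 2):

```python
import sqlite3

# Hypothetical reproduction: the P03 product predates the first sync
# (CreateTime 0), but its order was created inside the T+1 window.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductMaster (ID TEXT PRIMARY KEY, Data TEXT, CreateTime INT);
CREATE TABLE OrderDetail (ID INT PRIMARY KEY, SalesManID INT,
                          ProductID TEXT, CreateTime INT);
INSERT INTO ProductMaster VALUES
    ('P01','blahblah',0), ('P02','blahblah',0), ('P03','blahblah',0);
INSERT INTO OrderDetail VALUES
    (1,101,'P01',1), (2,101,'P02',1), (3,101,'P03',2);
""")

# Filtering on the ORDER's CreateTime instead of the product's: order 3
# (CreateTime 2) lands in the (1, 2] window, so P03 is enumerated.
rows = conn.execute("""
    SELECT DISTINCT pm.ID
    FROM ProductMaster AS pm
    JOIN OrderDetail AS od ON pm.ID = od.ProductID
    WHERE od.CreateTime > ? AND od.CreateTime <= ?
""", (1, 2)).fetchall()
print(rows)  # -> [('P03',)]
```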

     

     

    Hopefully this works for you.
    Thursday, September 4, 2008 8:55 PM
    Moderator
  • Thanks for the reply.

     

    I like your idea, but I'm afraid it will result in additional records synchronizing that don't really need to. I agree that your change will allow P03 to be inserted into my client's ProductMaster table, but consider what happens on a sync at time T+2 that contains a new order with an old product.

     

    OrderDetail (client)

    ID | SalesManID | ProductID | CreateTime | ModifyTime
    ---|------------|-----------|------------|-----------
    1  | 101        | P01       | T          | T
    2  | 101        | P02       | T          | T
    3  | 101        | P03       | T+1        | T+1
    4  | 101        | P03       | T+2        | T+2

     

    The P03 record from ProductMaster will try to sync again as an insert. I'm not sure whether that is handled gracefully, but when this example extends to several hundred orders there could be a lot of unnecessary product records traveling to the client.
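    The re-enumeration is easy to confirm with a hypothetical SQLite sketch (integer anchors, T = 1, so the T+2 window is (2, 3]):

```python
import sqlite3

# Illustrative sketch: order 4 references the OLD product P03 at T+2 = 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductMaster (ID TEXT PRIMARY KEY, Data TEXT, CreateTime INT);
CREATE TABLE OrderDetail (ID INT PRIMARY KEY, SalesManID INT,
                          ProductID TEXT, CreateTime INT);
INSERT INTO ProductMaster VALUES ('P03','blahblah',0);
INSERT INTO OrderDetail VALUES (3,101,'P03',2), (4,101,'P03',3);
""")

# Enumerating the T+2 window (2, 3] with the order-time filter: P03 comes
# back again as an "insert" even though the client received it at T+1.
again = conn.execute("""
    SELECT DISTINCT pm.ID
    FROM ProductMaster AS pm
    JOIN OrderDetail AS od ON pm.ID = od.ProductID
    WHERE od.CreateTime > ? AND od.CreateTime <= ?
""", (2, 3)).fetchall()
print(again)  # -> [('P03',)] : a redundant insert for the client
```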

     

    I'm interested to hear what you think.

    Thursday, September 4, 2008 9:59 PM
  • Hello todwag,

     

    Good to see that others have similar problems or questions with Sync Services / Sync Framework.

     

    I too had the "T+1 sync problem": referenced data in one table was missing on the client because that table had already been synced at time T, filtered via a join. At T+1, new detail data was added (and so was fetched by SelectIncrementalInserts), but the master data was unchanged, even though the filter join would now touch additional rows.

     

    My solution was a kind of "fake update". In my project this scenario can only occur with three tables, so I decided to handle it this way: we have a server-side trigger on the n-to-m table that performs a simple update on the corresponding master table whenever a new row is inserted. On the client, we hooked the sync event handler to listen for incoming updates on rows that did not exist locally.

    What I like about this solution is that not much obsolete data is transferred between client and server (apart from the updates, which of course reach all clients); the downside is that every incoming update first requires a local check of whether the entry exists. The fact that all clients receive the updates could surely be overcome with an extra column stating which client an update is intended for.
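    The trigger idea can be sketched in SQLite (the post's project presumably uses a SQL Server trigger; table and trigger names here are illustrative): inserting a detail row touches the referenced master row's ModifyTime, so the ordinary update-based incremental query re-enumerates it.

```python
import sqlite3

# Illustrative sketch of the "fake update": an AFTER INSERT trigger on the
# detail table bumps the referenced master row's ModifyTime.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductMaster (ID TEXT PRIMARY KEY, Data TEXT, ModifyTime INT);
CREATE TABLE OrderDetail (ID INTEGER PRIMARY KEY, ProductID TEXT, CreateTime INT);
INSERT INTO ProductMaster VALUES ('P03','blahblah',0);

CREATE TRIGGER touch_master AFTER INSERT ON OrderDetail
BEGIN
    UPDATE ProductMaster SET ModifyTime = NEW.CreateTime
    WHERE ID = NEW.ProductID;
END;
""")

# A new order at time 2 "touches" P03, pulling it into the update window.
conn.execute("INSERT INTO OrderDetail (ProductID, CreateTime) VALUES ('P03', 2)")
touched = conn.execute(
    "SELECT ModifyTime FROM ProductMaster WHERE ID = 'P03'").fetchone()[0]
print(touched)  # -> 2
```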

     

    What do others think of this solution? Any better approaches?

    Monday, September 8, 2008 7:52 AM