N-Tier Performance Guidance requested

  • Question

  • I have an existing desktop application that uses Sync Services (v2) to sync data (download only, no upload) from Oracle. It is currently a 2-tier solution and has some performance issues over the customer's WAN. We are updating it to use the new Sync Services bits from Sync Framework 2.0, which, in our testing, has helped slightly. Additionally, we are evaluating an n-tier architecture using Sync Services v3 (from Sync Framework 2.0), and have a new set of performance issues to deal with (primarily on initial sync). From hooking the various events, the most time seems to be spent in serialization/transfer over the wire/deserialization, and in inserting data into the database.

    We download data (a mix of DownloadOnly and Snapshot) for twenty tables; many of these tables have thousands or tens of thousands of rows. Four to five tables pull hundreds of thousands of rows (100k to 300k). This results in an .sdf of approximately forty to seventy megabytes. I have spent quite some time searching this forum, the team blog, and the Internet for suggestions and samples to help improve performance, and have found several suggestions and a few samples.

    Regarding optimizations, I have done the following:

    • Split the tables out to five SyncGroups and am evaluating adding more SyncGroups. 
    • I tried using GZip compression with wsHttpBinding, but found I got better performance by creating a custom WCF binding that does binary message encoding over HTTP, which looks like this:
                <customBinding>
                  <binding name="BinaryHttpBinding"
                        closeTimeout="00:01:00"
                        openTimeout="00:01:00"
                        receiveTimeout="00:05:00"
                        sendTimeout="00:35:00">
                    <binaryMessageEncoding />
                    <httpTransport authenticationScheme="Anonymous"
                                   maxReceivedMessageSize="2147483647"
                                   maxBufferPoolSize="2359296000"
                                  />
                  </binding>
                </customBinding>
    • Batching initially seems slower, but it does help with memory issues on IIS 6.0. I have removed it for now, but may revisit it, as I just found some updated docs on this for Sync Framework v2.
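    For completeness, the client-side config that consumes a custom binding like the one above looks roughly like the following; the endpoint address and contract name here are placeholder examples, not the actual service names from this project:

    ```xml
    <system.serviceModel>
      <client>
        <!-- Address and contract are hypothetical placeholders. -->
        <endpoint address="http://server/SyncService.svc"
                  binding="customBinding"
                  bindingConfiguration="BinaryHttpBinding"
                  contract="ISyncService" />
      </client>
    </system.serviceModel>
    ```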

    I am interested in any guidance regarding the use of the DataSetSurrogate (to improve DataSet serialization), as I see the transfer of the DataSet as our largest issue. I think I'm missing something (likely obvious), as I don't see where I can implement an override and get the DSS passed back from GetChanges within the SyncContext object. While I don't need a sample, I would like some guidance on where to make modifications so that I can use the DSS.
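    For anyone landing here later: the DataSetSurrogate class referred to above comes from a Microsoft support sample (KB 829740), not from the Sync Framework assemblies themselves. A minimal sketch of how it is normally used, assuming that sample assembly is referenced (the helper method here is hypothetical):

    ```csharp
    using System.Data;

    // Serializing side: wrap the DataSet (e.g. the one produced by GetChanges)
    // before sending it over the wire. DataSetSurrogate provides a more
    // binary-serialization-friendly representation than DataSet's default XML.
    DataSet changes = GetChangesDataSet();              // hypothetical helper
    DataSetSurrogate surrogate = new DataSetSurrogate(changes);

    // Receiving side: rebuild the original DataSet from the surrogate.
    DataSet rebuilt = surrogate.ConvertToDataSet();
    ```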

    I am also going to look at creating the .sdf on the server (still not 100% sure on this), and to see whether I can make the inserts into the local SQL Server Compact db faster (not sure I can affect this).

    Any guidance or suggestions for additional performance improvements with an n-tier solution would be appreciated.  

    Thanks,
    Nino


    Please remember to mark replies as answers if they help you.
    • Moved by Max Wang_1983 Tuesday, April 19, 2011 11:05 PM Forum consolidation (From:SyncFx - Microsoft Sync Framework Database Providers [ReadOnly])
    Tuesday, December 8, 2009 7:19 PM


All replies

  • If you are using Sync Framework V2 with batching, the DataSetSurrogate will be used automatically.
    Please check whether that helps with your scenario.
    Tuesday, December 8, 2009 11:45 PM
    Answerer
  • I am in the exact same situation; however, I will have some clients with approximately 400-500 MB .sdf files. Which scenario will the DataSetSurrogate be applied to automatically? I have to use the offline scenario due to dynamic filtering; however, v2 seemed no different in this scenario than v1. What am I missing?

    Thanks,

    Bernie
    Thursday, December 10, 2009 2:18 PM
  • Sorry for the confusion; here is the clarification:
    For Sync Framework V2, I meant the collaboration providers such as SqlSyncProvider, DbSyncProvider, and SqlCeSyncProvider.
    If you are using the offline scenario with DbServerSyncProvider & SqlCeClientSyncProvider (even with Sync Framework V2), batching with DataSetSurrogate does not apply.

    Thanks.

    Friday, December 11, 2009 5:59 PM
    Answerer
  • I see. Is there a way to implement the DataSetSurrogate for the offline scenario? Or possibly use dynamic filtering within the collaboration scenario? I tried overriding SyncContext, but without any luck.

    Thanks,

    Bernie
    Friday, December 11, 2009 6:03 PM
  • Better filtering support in the collaboration scenario is definitely on our radar.
    But I am not sure about DataSetSurrogate support in the offline scenario. At this time, no decision has been made on that.

    Thanks.
    Friday, December 11, 2009 6:07 PM
    Answerer
  • My main issue is that both batching and dynamic filtering are equally critical in my application. However, I am torn between the two because of the advantages/disadvantages both have in regard to my situation. We will have approx. 2000 clients, all needing a filtered subset based on their account. However, some clients may have hundreds of thousands of rows, hence the necessity for batching. I would love to be able to utilize SQL change tracking if possible. Any thoughts or recommendations?

    thanks again,

    Bernie
    Friday, December 11, 2009 6:13 PM
  • Our PM had a post on this comparison recently; I am pasting it below from
    http://social.msdn.microsoft.com/Forums/en-US/uklaunch2007ado.net/thread/3a26c015-30a9-4f24-aead-a5376100a9f5

    Great question and we are internally working on our guidance regarding the two sets of providers.  I would break the strengths/weaknesses of DbServerSyncProvider and SqlCeClientSyncProvider as follows:

    Strengths:

    ·     Supports SQL Server integrated change tracking

    ·     Great for scenarios that require dynamic parameters (i.e. I want a single set of commands to be used across all clients)

    ·     Reduces the amount of metadata required on the server

    Weaknesses:

    ·     Tied to a single topology shape

    ·     No formal support for SQL Express on the client

    ·     Batching capabilities are extremely limited

    On the other hand, I would break strengths/weaknesses of  SqlSyncProvider, SqlCeSyncProvider, and DbSyncProvider  as follows:

    Strengths:

    ·     Not tied to any specific topology shape

    ·     Enterprise class batching capabilities

    ·     Support for SQL Express on the client

    Weaknesses:

    ·     No support for integrated change tracking

    ·     More metadata required

    ·     Does not address the needs of dynamic filtering

    Expect to hear more on this topic on our blog.

    Regards.

    Friday, December 11, 2009 6:47 PM
    Answerer
  • Jin,

    As I understand your responses on this thread, in order to use the built-in DataSetSurrogate in my scenario (Oracle -> SQL CE), I would need to:

    -Change from using DbServerSyncProvider & SqlCeClientSyncProvider (offline scenario) to using DbSyncProvider & SqlCeSyncProvider (collaboration providers)
    -Create a DbSyncTableDescription for each table I wish to download (since I'm synchronizing against Oracle). This would be analogous to creating SyncTables?
    -Create DbSyncScopeDescription instances (analogous to SyncGroups?)
    -Indicate what to sync (where do I assign my DbSyncAdapters?)
    -Write custom code to use DateTime instead of TimeStamp
    -Create the service and proxy
    -Create a SyncOrchestrator, pass in the source (server) and destination (client) providers, and synchronize.

    Furthermore, if I did want to use batching with the DbServerSyncProvider, I would need to do it the 'old' way as detailed here: http://msdn.microsoft.com/en-us/library/bb902828.aspx   (but that does not use the DataSetSurrogate).

    Are those assertions correct? And what about the DbSyncAdapters? I've looked through the SharingAppDemo (both of them), but I do not see where I can set the SQL to execute for downloading each table's data (again, I'm likely missing something obvious).

    Thanks,
    Nino
    Tuesday, December 15, 2009 4:40 PM
  • Yes, that's mostly correct, though for collaboration scenarios (DbSyncProvider, SqlSyncProvider) only timestamp can be used.

    I think some clarification may be helpful too.
    There are two types of providers you can use for collaboration scenarios (both of them work with SqlCeSyncProvider):
    - One is DbSyncProvider, where you have to do all the provisioning yourself.
      You can use it with SQL Server, and it is the only way to work with Oracle.
    - The other is SqlSyncProvider, where the provisioning can be done by Sync Framework code automatically for you.
      You can only use it with SQL Server, not Oracle.
    The SharingAppDemos both use SqlSyncProvider, which you cannot use for Oracle.
    In these samples, Sync Framework code automatically creates all the sync adapters etc. for you, so you will not see them in the code.
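    To make the manual provisioning concrete, a sketch of a hand-built DbSyncProvider adapter might look like the following; the table, column, SQL, connection string, and scope names are illustrative only, not from this project:

    ```csharp
    using System.Data.SqlClient;
    using Microsoft.Synchronization.Data;

    DbSyncProvider provider = new DbSyncProvider();
    provider.Connection = new SqlConnection("...server connection string..."); // placeholder
    provider.ScopeName = "DownloadScope";                                      // illustrative

    // One DbSyncAdapter per table; with DbSyncProvider you supply the SQL yourself.
    DbSyncAdapter ordersAdapter = new DbSyncAdapter("Orders");
    ordersAdapter.RowIdColumns.Add("OrderId");

    // The command that selects changed rows for this table (illustrative SQL).
    SqlCommand selectChanges = new SqlCommand(
        "SELECT OrderId, CustomerId, OrderDate FROM Orders_Tracking WHERE ...");
    ordersAdapter.SelectIncrementalChangesCommand = selectChanges;

    provider.SyncAdapters.Add(ordersAdapter);
    ```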

    You can try out a full example for DbSyncProvider with SQL Server first with the complete code at
       http://msdn.microsoft.com/en-us/library/dd918709(SQL.105).aspx
    Batching can be enabled on DbSyncProvider with properties such as BatchingDirectory and MemoryDataCacheSize.
    Check this link: http://msdn.microsoft.com/en-us/library/dd918908%28SQL.105%29.aspx
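    Per the batching doc linked above, turning batching on for DbSyncProvider is essentially two property assignments; the directory path and cache size here are illustrative values only:

    ```csharp
    using Microsoft.Synchronization.Data;

    DbSyncProvider provider = new DbSyncProvider();
    // Directory where change batches are spooled to disk on the server
    // instead of holding the whole change set in memory.
    provider.BatchingDirectory = @"C:\SyncBatches";  // illustrative path
    // Upper bound (in KB) on change data held in memory before spooling.
    provider.MemoryDataCacheSize = 5000;             // illustrative size
    ```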

    We will also soon post an Oracle-with-SQL-CE sync sample that uses the batching feature of DbSyncProvider.
    You can take a look at that sample when it's ready.

    Thanks.

    • Marked as answer by NinoBenvenuti Monday, December 21, 2009 2:23 AM
    Tuesday, December 15, 2009 10:30 PM
    Answerer
  • Jin,

    Thanks for the clarification. Using timestamps is not an option, as I am unable to modify the schema of the Oracle db I am downloading data from. Each row does have 'Created' and 'LastUpdated' columns, but those are Oracle DATE types (not a timestamp type). Given the timestamp limitation, configuring a collaboration scenario between Oracle and SQL Server Compact is not possible in my situation.

    That said, what are my alternatives to enable a high number of concurrent synchronizations in an n-tier configuration while minimizing server-side memory utilization?
       -Perform batching the 'old' (i.e. Sync Services 2.0) way as described here: http://msdn.microsoft.com/en-us/library/bb902828.aspx ?
       -Implement a streaming solution?
       -???
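    For reference, the 'old' batching in that first link works by capping how far the sync anchor advances per batch: you set DbServerSyncProvider.BatchSize and supply a SelectNewAnchorCommand whose stored procedure steps the anchor forward. A rough sketch, with the procedure name illustrative:

    ```csharp
    using System.Data;
    using System.Data.SqlClient;
    using Microsoft.Synchronization.Data;
    using Microsoft.Synchronization.Data.Server;

    DbServerSyncProvider serverProvider = new DbServerSyncProvider();
    serverProvider.BatchSize = 500; // rows per batch, illustrative

    // The proc computes a new anchor no more than @sync_batch_size rows ahead
    // and reports how many batches remain via @sync_batch_count.
    SqlCommand anchorCmd = new SqlCommand("usp_GetNewBatchAnchor"); // illustrative name
    anchorCmd.CommandType = CommandType.StoredProcedure;
    anchorCmd.Parameters.Add("@" + SyncSession.SyncLastReceivedAnchor, SqlDbType.Timestamp);
    anchorCmd.Parameters.Add("@" + SyncSession.SyncMaxReceivedAnchor, SqlDbType.Timestamp)
             .Direction = ParameterDirection.Output;
    anchorCmd.Parameters.Add("@" + SyncSession.SyncNewReceivedAnchor, SqlDbType.Timestamp)
             .Direction = ParameterDirection.Output;
    anchorCmd.Parameters.Add("@" + SyncSession.SyncBatchSize, SqlDbType.Int);
    anchorCmd.Parameters.Add("@" + SyncSession.SyncBatchCount, SqlDbType.Int)
             .Direction = ParameterDirection.InputOutput;
    serverProvider.SelectNewAnchorCommand = anchorCmd;
    ```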

    I specifically mention memory utilization on the server, as that is the impediment to concurrency. The first client syncs (initial sync) and w3wp.exe is using > 400 MB of memory (peak working set somewhere > 630 MB), so every additional client only compounds the issue. Therefore I can only get a handful of clients syncing before we run out of memory.
     
    Thanks,
    Nino


    Wednesday, December 16, 2009 6:48 AM
  • Yes, batching probably is the right way. However, there is one caveat:
    enabling batching with DATETIME-type tracking columns for the offline providers is much harder.

    The sole purpose of the 'Created' and 'LastUpdated' columns is to serve as sync change-tracking metadata.
    If the memory pressure problem is really a big issue, it's recommended to convince the schema administrator to make some modifications to the sync metadata.
    In that case, the tracking columns originally of DATETIME type need to be changed to timestamp or bigint types.
    This will not cause data loss; it is just a sync metadata change.

    If that can be done, either offline batching or collaboration batching can be enabled to solve the memory issue.

    Thanks.

    • Marked as answer by NinoBenvenuti Monday, December 21, 2009 2:23 AM
    Friday, December 18, 2009 5:58 PM
    Answerer