Nightly Data Synchronization b/w CRM Online & Enterprise Reporting / Data Warehouse / Backend Systems


  • We have a scenario in which we need daily synchronization of data b/w MS Dynamics CRM and some of our enterprise back end systems (for reporting, analytics etc.).

    As of now we have on-premise deployment of MS Dynamics CRM (older version) and we are using SSIS packages to directly access the relevant data from MS CRM database (completely bypassing the MS Dynamics CRM application layer/web services) and make that data available for the other systems. The volume of this nightly data exchange is quite high (although we are only getting the modified data overnight i.e. the delta), that's the very reason we want to avoid the overhead of accessing that heavy amount of data via Web Services at the MS Dynamics Application Layer using FetchXML / other MS Dynamics application layer level programming methodologies.

    Now, we are planning to upgrade our MS Dynamics CRM deployment to the latest version. We are inclined to switch to the SaaS based offering of MS Dynamics CRM i.e. MS CRM Online. Once, we switch to that, we still need the daily/nightly data synchronization for our MS CRM Online database with our enterprise back end systems.

    What options do we have here from Microsoft to support other applications that require such nightly data refresh from MS Dynamics CRM Online? Is it recommended to just upgrade to MS CRM 2011 on-premise only, and avoid moving to MS CRM Online if such data synchronization use case will not be supported in MS CRM Online?

    Friday, June 28, 2013 11:24 AM

All replies

  • If you use CrmOnline, then you will only be able to retrieve the data via the CRM web services (using FetchXml or a QueryExpression). This could be done, but would incur significant performance overhead compared to local SQL queries. The two main overheads are:

    • Filtering data to only get changes. However, you should be able to filter based on the modifiedon field in CRM, so thisshould not be a problem
    • The network latency for retrieving data, which you can't do much about

    In addition to this, CrmOnline limits the number of records that can be returned from one query (I think the maximum you can get is 50 000 records).

    Overall, if the data volumes are large, the network latency is a major issue and I'd suggest to stick with Crm OnPremise. You could try setting up a trial Crm Online instance and populating it with a realistic volume of data, then testing the performance before making a decision

    Microsoft CRM MVP - http://mscrmuk.blogspot.com/ http://www.excitation.co.uk

    Friday, June 28, 2013 1:09 PM
  • I absolutely agree with David. There's definitely a point at which CRM Online becomes a bad fit.

    Lucas Alexander

    Friday, June 28, 2013 5:08 PM
  • Agree /w David here!

    If you are set on using CRM Online however there are tools out there that will let you replicate the db(although it isn't a re-deployable copy just really a pull of whichever entities you tell it to). One tool in particular is the Scribe Replicator tool ( http://www.scribesoft.com/onlineRS ). Could be helpful for doing data mining.

    Also, the maximum records you can get back via FetchXML by default is 50k. In the prevoius release of the sdk they have opened up this so you can create a larger limit but this will be applied system wide(have performance impacts). Could also page through the FetchXML to get all the records. The Scribe Online Replicator tool doesn't have this limitation for 50k(pulls everything without changing this 50k setting). Note that there is also a query limit results at 5k(little different)

    If this post answers your question, please click "Mark As Answer" on the post and/or "Mark as Helpful"

    • Edited by Brett Rojas Friday, June 28, 2013 9:04 PM missed a zero
    Friday, June 28, 2013 8:57 PM
  • Thanks for the prompt responses. I now understand that direct access to database layer for such data synchronization isn't an option at all for MS CRM Online.

    What about keeping an on-premise MS SQL DB (or on-cloud SQL Azure DB) and publishing any changes that happen in relevant entities of MS CRM Online to that on-premise MS SQL DB (or an on-cloud SQL Azure DB) in real time fashion using Plugins/Workflows? In that way, we will be able to have a snapshot of MS CRM Online DB in a totally independent SQL DB that can be accessed by any Enterprise Reporting / Data Warehouse / Backend Systems for such nightly data synchronization. Does that sound like a good way forward to tackle our situation (please note that the number of entities that we are interested in for data synchronization aren't huge, although the number of records that get added / updated in each entity is significantly high)?

    Yes, we did come across the said product i.e. Scribe Replicator tool as stated by Brett Rojas, that can be used to make the MS CRM Online Data available to other applications. At least the existence of that product indicates that our use case isn't that unique and there are other businesses who are in same situation. Has anyone here experienced using this product? What we are wondering is at the end of the day, this product must be using exactly the same techniques i.e. going via the application layer / web services instead of database layer to solve this use case and may not be that beneficial in terms of performance. I do totally agree that this will definitely reduce the custom development and eventually turn around time. Please share your thoughts around this understanding too.

    Monday, July 01, 2013 9:01 AM