Reading the CRM database directly vs. using the SDK

  • Question

  • I am curious whether anyone else is using the CRM database views for read-only access to CRM data.

    I started using the views for read-only access to avoid having to go through the SDK. I felt it would perform better than the SDK for simple things like determining whether a contact record already exists, or getting an entity's GUID by searching other fields on the entity. Say I want to pull a contact's GUID by a custom SSN field on the contact record. Hitting the CRM database directly to get the GUID sounds like it would be faster than going through the CRM SDK.
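    For example, here is a minimal sketch of the kind of lookup I mean. The field name new_ssn, the server and org names, and the connection handling are just illustrative; note that the filtered views enforce CRM security based on the caller's identity, hence the integrated-security connection.

    ```csharp
    using System;
    using System.Data.SqlClient;

    static class ContactReads
    {
        // Illustrative only: "_MSCRM" is the standard org database suffix,
        // but the server and org names here are made up.
        const string ConnectionString =
            "Data Source=crmsqlserver;Initial Catalog=MyOrg_MSCRM;Integrated Security=SSPI";

        // Look up a contact's GUID by a custom SSN field through the
        // FilteredContact view (read-only, no SDK involved).
        public static Guid? FindContactBySsn(string ssn)
        {
            const string sql =
                "SELECT contactid FROM FilteredContact WHERE new_ssn = @ssn";

            using (var conn = new SqlConnection(ConnectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@ssn", ssn);
                conn.Open();
                object id = cmd.ExecuteScalar();
                return id == null ? (Guid?)null : (Guid)id;
            }
        }
    }
    ```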

    Similarly, if I want to check whether a record already exists for a given contact with a given SSN before I create/update it in CRM, it seems like hitting the database directly through the views would offer a performance gain over the CRM SDK. Once I know whether it exists, I do use the SDK to execute the create/update (sketched below). I suspect that doing so is faster than using only the SDK, but I have not sat down to performance test both approaches, so I am wondering if anyone has any insight into this or has done any performance testing to know for sure.
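    And the write itself stays on the SDK, along these lines (attribute names are again just examples; FindContactBySsn is the view lookup sketched above):

    ```csharp
    using System;
    using Microsoft.Xrm.Sdk;

    static class ContactWrites
    {
        // The filtered-view lookup only decides between Create and Update;
        // the actual write always goes through the SDK.
        public static void UpsertContact(IOrganizationService service,
                                         string ssn, string lastName)
        {
            Guid? existingId = ContactReads.FindContactBySsn(ssn);

            var contact = new Entity("contact");
            contact["lastname"] = lastName;
            contact["new_ssn"] = ssn;

            if (existingId.HasValue)
            {
                contact.Id = existingId.Value;
                service.Update(contact);   // record found via the view
            }
            else
            {
                service.Create(contact);   // no match: create a new record
            }
        }
    }
    ```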

    Is anyone else taking this approach, or does anyone have any thoughts about it? Ultimately, my reasoning comes down to performance.

    Best regards,

    Jon


    Jon Gregory Rothlander





    Wednesday, June 18, 2014 6:02 PM


All replies

  • Well, your approach (using the filtered views for read operations and the web services for create/update/delete) is not wrong; it is supported. But I don't know if you really gain much this way (it depends on how complex your SQL queries against the filtered views are).

    However, I personally don't agree with your approach, for the following reasons:

    1. It works only for OnPremise instances (so you can't reuse the code for a future customer that uses CRM Online).
    2. The SQL Server must be reachable as well (if only the CRM web services are used, only the CRM server needs to be reachable, which is easier in IFD scenarios).


    My blog: www.crmanswers.net - Rockstar 365 Profile

    Wednesday, June 18, 2014 6:18 PM
  • Thanks for your reply and comments. Your points are valid for most people, but they do not fit my scenario. I left out the background to keep the post short, but I think I need to explain why I have things set up the way I do.

    Regarding your point #1: we are a Fortune 500 company with many hundreds of CRM users, and growing. We'll never go with CRM Online, so our solution is 100% OnPremise and we have no customers; this is an in-house solution.

    Regarding your point #2: the solution uses a web service as an integration layer that syncs legacy data into CRM. So if the database is down, there is nothing to do; the legacy data will not and cannot be synced into CRM. But even if I didn't use a web service, the database would still be down and nothing could be synced. So a web service seems like the logical and best solution for what I am doing. The web service provides the connection between CRM and the legacy databases. If the web service is down, CRM is fine; it is just not synced with the legacy system.

    The reason I took this approach is that we hired a consulting company, which I cannot name because you'd immediately recognize them, as they are recommended by Microsoft and all over this site and others. They wrote the integration applications (not a web service). We have to keep CRM as up to date as possible, and we run around 500 to maybe 2,500 updates to CRM per hour. However, the application they wrote can only handle about 1,200 record updates/creates into CRM per hour. Now, at 1,200, there's really much more than just a simple create/update going on in CRM. Every record has dozens of other records that get verified, created, updated, etc. So there's a lot going on; it is not just creating 1,200 records. But if we are running 500 to 3,000 per hour and the application can only handle a max of 1,200, we have an issue. We also occasionally need to load or refresh about 100K records, which takes so long with their solution that we really cannot use it without breaking the 100K down into sets of 2,500 to 5,000 records.

    So their CRM integration app was just way too slow. I went back through their design and code, and it was so poorly designed and written that I decided to toss the whole thing out and rewrite it from scratch. In doing so, I created my own data access layer for all of the read-only processes, and I use the SDK for update, create, etc.

    My version runs 12,000 records per hour compared to their 1,200 per hour; what used to take an hour now runs in less than 3 minutes. If they used the recommended and approved coding methods and techniques, that's fine, but it is just WAY too slow to be used in a production environment the size of ours.

    Now, considering my scenario, I don't see any real issues with my approach. But I am very open to ideas, suggestions, concerns, etc., especially regarding what I might run into down the road or problems that others see.

    Jon

     


    Jon Gregory Rothlander

    Wednesday, June 18, 2014 9:38 PM
  • Jon, first of all, thank you for sharing more details about your CRM implementation. I will now comment on some parts of your reply.

    • Regarding your point #1: we are a Fortune 500 company with many hundreds of CRM users, and growing. We'll never go with CRM Online, so our solution is 100% OnPremise and we have no customers; this is an in-house solution.

    Your company is the end user, so it's fine that you don't care about code reusability outside the company (you can still reuse the code inside it). My observation was more from the point of view of a developer who serves many different customers.

    • Regarding your point #2: the solution uses a web service as an integration layer that syncs legacy data into CRM. So if the database is down, there is nothing to do; the legacy data will not and cannot be synced into CRM. But even if I didn't use a web service, the database would still be down and nothing could be synced. So a web service seems like the logical and best solution for what I am doing. The web service provides the connection between CRM and the legacy databases. If the web service is down, CRM is fine; it is just not synced with the legacy system.

    When I write "web services" I mean the CRM web services (the same sense in which you write "use the SDK"), not that the application itself is a web service or uses a proxy web service between your application and CRM. What you wrote in the part I quoted is basically "if I can't read fresh data, I can't update it," which is absolutely right, especially since (as I understand it) the data comes from one or more external sources.

    • The reason I took this approach is that we hired a consulting company, which I cannot name because you'd immediately recognize them, as they are recommended by Microsoft and all over this site and others. They wrote the integration applications (not a web service). [...] So their CRM integration app was just way too slow. I went back through their design and code, and it was so poorly designed and written that I decided to toss the whole thing out and rewrite it from scratch. [...] But it is just WAY too slow to be used in a production environment the size of ours.

    I'm not waving a flag for anyone here. It's true that you hired a "we are the best because Microsoft recommends us" consulting company, but the problem, as I see it, is that you got a poor team instead (going only on your side of the story, since we can't hear theirs). Maybe this consulting company has 1,000 employees; some are excellent, some very good, and so on. Probably the team assigned to your case didn't have the skills to fulfill your requirements. Now, we can point the finger at the consulting company that sent you unprepared consultants, or at the poorly trained consultants themselves, but when a project fails there are normally multiple reasons behind it. (Note 1: I'm not saying that one of the reasons lies with your company. Note 2: I never worked for a big consulting company, so my thoughts on this come from my personal views and people I know, not from "I worked for big company X and here is what happens.")

    • Now, considering my scenario, I don't see any real issues with my approach.

    As I wrote before, your approach is supported, so there are no issues at all, first because you are using the filtered views and not accessing the SQL tables directly. (The classic example of why direct table access must be avoided is an upgrade from 2011 to 2013: in CRM 2011 each record is split between two tables, the base and the extension table, while in CRM 2013 records are stored in a single table. If I'm right, this is not completely true for some upgrade scenarios, but as I wrote before, it's an example.)
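    To make the upgrade example concrete, here is roughly the difference (the table names are the CRM 2011 defaults for the contact entity; new_ssn is an illustrative custom field):

    ```csharp
    static class SchemaExample
    {
        // CRM 2011 direct table read: tied to the physical base/extension
        // split, so it breaks when CRM 2013 merges the two tables.
        public const string Crm2011DirectRead =
            "SELECT b.ContactId, e.new_ssn " +
            "FROM ContactBase b " +
            "JOIN ContactExtensionBase e ON e.ContactId = b.ContactId";

        // Filtered view read: the view hides the physical layout, so the
        // same statement works before and after the upgrade.
        public const string FilteredViewRead =
            "SELECT contactid, new_ssn FROM FilteredContact";
    }
    ```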
    One of my mantras is "an application is always an input that becomes an output" (I stole it from my elder brother), or the black-box approach if you prefer, and your optimization fits exactly this way of proceeding. You saw a faster way to get your input to produce your output; why not use it?

    Just an addition about my second point (SQL Server reachability): this matters if the application is used by end users (meaning the PC where the application runs must be able to reach the SQL Server as well), but not if it is an integration application (meaning it is hosted somewhere behind the scenes, so a future IFD setup is not affected).

    Regarding your numbers, I think what you wrote is a perfect case showing that using the filtered views as the read component of a CRUD application is efficient in some scenarios. Maybe other companies or developers are using the same approach but can't, or don't want to, share the results.

    I don't have suggestions for you (you have probably already optimized your code with ExecuteMultipleRequest, caching of option sets, limiting the SQL columns returned, and so on; see the batching sketch below), but thanks again for sharing this. It's always a pleasure to learn the details of a real (and, in this case, big) CRM implementation.
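    For readers who haven't used it, here is a minimal sketch of the ExecuteMultipleRequest batching I mentioned; the batch size and the error-handling settings are illustrative choices, not recommendations from this thread:

    ```csharp
    using System.Collections.Generic;
    using Microsoft.Xrm.Sdk;
    using Microsoft.Xrm.Sdk.Messages;

    static class BatchWriter
    {
        // Send many Create/Update requests in batched round trips.
        // 1000 is the platform's default per-batch limit.
        public static void ExecuteInBatches(IOrganizationService service,
                                            IEnumerable<OrganizationRequest> requests)
        {
            var batch = new ExecuteMultipleRequest
            {
                Settings = new ExecuteMultipleSettings
                {
                    ContinueOnError = true,   // keep going past single failures
                    ReturnResponses = false   // smaller response payload
                },
                Requests = new OrganizationRequestCollection()
            };

            foreach (var request in requests)
            {
                batch.Requests.Add(request);
                if (batch.Requests.Count == 1000)
                {
                    service.Execute(batch);
                    batch.Requests.Clear();
                }
            }

            if (batch.Requests.Count > 0)
                service.Execute(batch);
        }
    }
    ```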


    My blog: www.crmanswers.net - Rockstar 365 Profile



    Wednesday, June 18, 2014 11:17 PM
  • A few points on the relative performance:

    • When retrieving data via the CRM web services there is a small overhead from CRM building the query and parsing the results, but it is negligible
    • There is one noticeable difference in the SQL processing: with the filtered views, security is implemented within SQL, whereas with the CRM web services, security is implemented within CRM code. In general I'd expect the performance to be comparable, though it's worth noting that if it were more efficient via SQL, then CRM itself would use the filtered views. I do find that queries with several joins can suffer poor performance against the filtered views, as there are too many joins for the SQL query optimiser to consider all the options
    • The one time I'd use SQL for performance reasons is when there is SQL logic that goes beyond the capability of CRM queries. Examples include the SQL HAVING clause or sub-queries (see the sketch below)
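    For instance, a duplicate check like this one needs GROUP BY ... HAVING, which CRM queries can't express (the custom new_ssn field is hypothetical):

    ```csharp
    using System;
    using System.Data.SqlClient;

    static class DuplicateCheck
    {
        // Find SSN values that appear on more than one contact: easy in SQL
        // against the filtered view, not expressible as a CRM query.
        public static void PrintDuplicateSsns(string connectionString)
        {
            const string sql =
                "SELECT new_ssn, COUNT(*) AS matches " +
                "FROM FilteredContact " +
                "WHERE new_ssn IS NOT NULL " +
                "GROUP BY new_ssn " +
                "HAVING COUNT(*) > 1";

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                        Console.WriteLine("{0}: {1} contacts",
                                          reader.GetString(0), reader.GetInt32(1));
                }
            }
        }
    }
    ```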


    Microsoft CRM MVP - http://mscrmuk.blogspot.com/ http://www.excitation.co.uk

    • Marked as answer by jonrothlander Friday, June 20, 2014 7:48 PM
    Thursday, June 19, 2014 9:06 AM
    Moderator
  • Thanks to both of you for your suggestions and responses. I've been trying to decide whether I should remove the direct SQL access, and I wanted to know if there was anything actually wrong with it in my scenario. I don't see anything necessarily wrong, but I suspect the performance gains are minimal. Still, with 100K records to push and hours of run time, even a minimal gain is helpful.

    I think I will leave it in for now, knowing that I can remove it and use the SDK directly if need be without worrying too much about performance. Maybe I will go so far as to actually set up a performance test at some point and see just how much of a difference it really makes.

    Best regards,

    Jon


    Jon Gregory Rothlander

    Friday, June 20, 2014 7:50 PM
  • Hi Jon. While trying to ask about direct DB access vs. the SDK, I stumbled upon your post. I am currently implementing a similar scenario, whereby I query the filtered views in the database to read data and use the web services for updates. Have you had any issues with your implementation so far?

     

    Christo Vermeulen

    Thursday, September 4, 2014 3:04 PM
  • Hi Christo

    Using the filtered views to read from the database is supported. We created a Windows service that reads from an external system, compares against our CRM data, and updates the changes.

    The system has been running for the last 3 years without any problems.

    Regards,

    Jithesh

    Friday, September 5, 2014 10:51 PM
  • Christo,

    It's been a long time since you replied, but I was not notified that you had submitted a reply! Sorry about that; I'm not sure why I didn't get a notification. But I still want to answer your question.

    We have been running my new version of the integration service since June, and we have not experienced any issues with accessing CRM data via the views. We process probably 2,000 to 5,000 records each cycle, and we cycle every 30 minutes. We run 6 background processes, each multi-threaded, all syncing data from legacy systems into CRM. When I say 2K to 5K records, these are very deep and wide records; some have as many as a dozen additional entities that are created and linked to the primary entity. So there's a lot going on here, and the total number of records can be as high as 10,000 to 50,000 when you take all of the processes into account.

    So we are pushing CRM pretty hard, and reading the views has not caused us any problems at all. I suspect we are picking up a bit of performance, but probably only negligible gains, as I have not performance tested the two approaches. It is probably worth setting up a little test just to see. I'd be curious exactly what sort of gains can be made when processing as many as, say, 1K, 5K, 10K, 50K, and 100K records.

    Best regards,

    Jon


    Jon Gregory Rothlander

    Tuesday, November 25, 2014 9:58 PM