Sunday 15 September 2019

My Experience with Sitecore processing/aggregation

In a recent project I worked on a site with more than 2 million visits per month, which caused issues with the processing of xDB data into the reporting database: aggregation was too slow compared to data collection. In this post I will share the things we tried and how we overcame the issue in the end.



When this issue was reported, the first thing we did was review the website architecture/setup. The production environment was configured as follows:
  1. CM server that also handles the reporting and processing roles. [xConnect lives here as well]
  2. CD server (2 instances)
  3. SQL Server
  4. Solr server

From a first look at the above setup you can tell that this is not the scaled setup that Sitecore recommends, but because of license and resource limitations we had to work with it.

Keep an eye on your server health status!

One of the first things to do is check how your server is coping with the three roles and xConnect running on it. If you see high CPU usage, with the CPU almost maxed out, take that as a sign that your server can't handle all of these roles in one place.



Configure the processing agents

One of the first things we thought of was to increase the number of agents used for aggregation. In the Sitecore.Analytics.Processing.Aggregation.Services.config configuration file you can increase the number of threads responsible for picking up interactions and converting them into a form suitable for the reporting database. Follow this article to correctly configure your max agents setting.

Once we applied the above we noticed an improvement in the speed of aggregating data into the reporting database. By monitoring tables like Fact_PageViews and Fact_Visits in the reporting database we could see how fast the records were increasing.

But when we compared the rate at which records were added to the interactions table in the collection database with the rate at which records were added to the reporting database, the former was still much faster!
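To make that comparison less of an eyeball exercise, you can take two row-count readings from each table a few minutes apart and work out the rates. The following is only a rough sketch: the numbers are made up, the names Sample, rows_per_second and backlog_growth_per_hour are my own and not part of Sitecore, and it treats rows in the two databases as roughly comparable, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a table's row count at a point in time."""
    seconds: float  # time of the reading, in seconds
    rows: int       # row count observed at that time


def rows_per_second(first: Sample, last: Sample) -> float:
    """Average growth rate of a table between two readings."""
    elapsed = last.seconds - first.seconds
    if elapsed <= 0:
        raise ValueError("readings must be taken at increasing times")
    return (last.rows - first.rows) / elapsed


def backlog_growth_per_hour(collection_rate: float, aggregation_rate: float) -> float:
    """Rows per hour by which collection outpaces aggregation.

    A positive value means the unprocessed backlog keeps growing.
    """
    return (collection_rate - aggregation_rate) * 3600


# Made-up readings taken 10 minutes (600 seconds) apart, for illustration only.
collected = rows_per_second(Sample(0, 100_000), Sample(600, 103_000))
aggregated = rows_per_second(Sample(0, 40_000), Sample(600, 41_200))

print(collected, aggregated)                           # 5.0 2.0
print(backlog_growth_per_hour(collected, aggregated))  # 10800.0
```

If the last number stays positive after a configuration change, aggregation is still falling behind collection, which is what we kept seeing.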


Tune interactions batch size

Batch aggregation is part of the standard Sitecore installation: interactions are grouped together into batches to improve processing performance and throughput. The MaximumBatchSize setting in the Sitecore.Analytics.Processing.Aggregation.Services.config configuration file specifies the number of interactions that can be processed in a single batch. This number can be tuned to suit your application; in our case we increased it to 256, which improved the speed of aggregating data. Follow this article for more details on how to tune your interactions batch size.
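As a back-of-the-envelope illustration of why the batch size matters, here is a small sketch that counts how many batches a month of traffic would need at different sizes. It assumes one interaction per visit and uses 64 and 128 purely as comparison values (they are not taken from the Sitecore defaults); batches_needed is a hypothetical helper name.

```python
import math


def batches_needed(interactions: int, batch_size: int) -> int:
    """Number of batches required to process the given interactions."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(interactions / batch_size)


# Roughly 2 million interactions per month, as on our site.
for size in (64, 128, 256):
    print(size, batches_needed(2_000_000, size))
# 64 31250
# 128 15625
# 256 7813
```

Fewer batches means fewer round trips for the same amount of data, at the cost of more work per batch, so the right value depends on what your server can handle.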


But again, when we compared the rate at which records were added to the interactions table in the collection database with the rate at which records were added to the reporting database, the former was still much faster!


Other Attempts?


We also looked in the knowledge base and found this one, which provides a couple of configuration changes and SQL scripts that should improve analytics performance.

We also gave it a shot by stopping the processing role and rebuilding the reporting database, but this didn't fix our issue either: data collection was still much faster than processing and aggregation.


FINAL SOLUTION

After going through all of the above we decided to extend our environment by adding a new server for the processing role, following these steps:

  1. Copy the Sitecore instance from the CM server to the new processing server.
  2. We are using a self-signed certificate for our xConnect application, so we copied and installed that certificate on the new processing server, adding it to the personal, trusted and intermediate certificate stores. We also added the binding to the hosts file of the processing server. Make sure you are able to browse the xConnect link over a secure channel.
  3. Make sure to update the web.config settings to specify the processing role.
  4. Make sure to add the reporting key to your connection strings file.
  5. You can find all the needed information by checking this link.

After you follow the above steps to configure the processing server, you need to tell the CM server where to find it. You also need to add the same reporting API key that you added to the processing instance's connection strings. These are the two main steps to do on the CM server:

  1. Add a patch configuration file to specify the processing service URL, which is the processing instance URL.
  2. Add the reporting API key to the connection strings file.
All the information needed to configure a CM server can be found at this link.

Hope this will help! 
