Friday 9 October 2015

Sitecore Solr Search Result Items Relevance Percentage



In a recent project I worked on, there was a site search requirement to display a percentage for each result item describing how closely it matches the current search criteria (filters). In other words, we wanted to sort the returned result items so that the item closest to the search filters appears at the top.


First of all, if you haven't heard of Sitecore search boosting before, please check the Sitecore Solr Search Boosting blog post, which explains boosting and provides a code sample, because we will use Sitecore search boosting in this post.

So I started looking for built-in functionality in Sitecore or Solr that could be used for search relevancy, but I didn't find a direct answer, until I thought of the "score", a value Solr returns that represents how well an item matches the passed query.

I thought I had found the solution, but it was just the beginning: the score values returned for the items were exactly equal, evidently because all the returned items matched all the conditions passed in the query. So what should I do?

I thought of using the Sitecore search boosting feature on the search predicates. For example, if I am searching for a specific keyword and it appears in the title field of item X and in the content field of item Y, then item X should appear in the list before item Y.

That is, if I have two search predicates, a title predicate with a boost value of 20 and a content predicate with a boost value of 10, then the score of an item that satisfies the title predicate will be larger than the score of an item that satisfies only the content predicate.

See the following code for more clarification:


In the above code you can see that the boost value for title is larger than the one for content, and content is larger than media content, and this affects the final score value.
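To make the boost ordering concrete, here is a minimal sketch in Python (not the Sitecore code from this post) that builds the kind of boosted query string Solr understands, using the standard `field:term^boost` syntax. The field names and the media content boost of 5 are assumptions for illustration; only the title boost of 20 and the content boost of 10 come from the example above.

```python
# Hypothetical field names and boosts. Only title=20 and content=10 come
# from the post; media_content=5 is an assumed value for illustration.
FIELD_BOOSTS = {"title": 20, "content": 10, "media_content": 5}


def build_boosted_query(keyword, boosts=FIELD_BOOSTS):
    """Return a Lucene-syntax query that ORs one boosted clause per field."""
    # Quote the keyword and escape backslashes and quotes inside it.
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    term = f'"{escaped}"'
    # A clause such as title:"foo"^20 weights a title match more heavily.
    clauses = [f"{field}:{term}^{boost}" for field, boost in boosts.items()]
    return " OR ".join(clauses)


print(build_boosted_query("sitecore"))
```

Because the clauses are combined with OR, an item can match on any field, and the clause with the higher boost contributes more to its score, which is what pushes title matches above content-only matches.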

Is that it? Not quite. At this point I thought I had solved it, but I didn't know how to calculate a percentage from the score value. I ran many tests trying to infer a pattern connecting the satisfied predicates, the score value, and the number of retrieved items, but I couldn't, and some friends advised me not to try to understand how Solr calculates the score. Anyway, I am sure that the score value is affected by two factors:
· Boost values of the satisfied predicates.
· Number of returned items.

In my case, the maximum score for an item that satisfied all the predicates was 0.59, so I assumed that in the best case my search would have a top score of 0.59. I did the following to create a relevance percentage, and it works fine. See the following code for more clarification:


The above code shows that if the item's score is 0.59 or larger, the relevance percentage is 100%; otherwise it is calculated relative to the assumed top score of 0.59. For example, a score of 0.295 gives 0.295 / 0.59 = 50%.
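The calculation described above can be sketched in a few lines of Python. This is an illustration of the idea rather than the original Sitecore code; the function and constant names are hypothetical, and 0.59 is the top score observed in my case, not a general Solr constant.

```python
# Assumed best-case score, taken from the observation in this post.
ASSUMED_TOP_SCORE = 0.59


def relevance_percentage(score, top_score=ASSUMED_TOP_SCORE):
    """Map a Solr score to a whole-number percentage, capped at 100."""
    if top_score <= 0:
        raise ValueError("top_score must be positive")
    # Scores at or above the assumed top score count as a full match.
    ratio = min(max(score, 0.0) / top_score, 1.0)
    return round(ratio * 100)


for score in (0.7, 0.59, 0.295, 0.0):
    print(score, relevance_percentage(score))
```

Keeping the top score as a parameter makes it easy to adjust if the boost values or the indexed content change, since the highest observed score will likely move with them.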

Finally, I hope the above helps you in one way or another, and if anyone has a better way of doing this, I will be happy to learn.
