Rethinking Apache Solr For Magento

May 8, 2014

Magento is one of the most powerful eCommerce platforms currently available. Apache Solr is the popular, blazing-fast open source enterprise search platform from the Apache Lucene project. We're bringing the two together as MagSolr.

Background

There are a couple of things to consider with this project. You may be wondering why we are creating our own Solr integration for Magento. At some point last year we ran into performance issues with one of our clients. In response, we looked at over 35 existing Magento modules that offer search or search-related features, using Solr, Elasticsearch, plain Lucene, and proprietary search engines. Even Magento itself comes with its own Solr integration if you get the Enterprise Edition (EE). After weighing these options, we decided to go with our own solution.

The result of our research was a nice list of modules with their pros and cons. All of them had one or more disadvantages, ranging from price (EE) to missing features or stalled development.

Having built a popular Solr integration for TYPO3 CMS before, and having Magento developers on our team as well, it was clear to us how to approach the task and which best practices to follow.

Current Status

At the moment we have a solid foundation to build on. We try to follow the same principles we used when building the TYPO3 Solr extension: we strive for the best experience for the user and follow best practices on the development side. So far we have covered indexing of products and product search, including faceted search/layered navigation. On the backend side, there are already a couple of convenience features that allow you, for example, to test the Solr connection and to empty the index. Since we realize that setting up a Solr server might not be the easiest thing for inexperienced users, we also provide an easy-to-use install script that does all the work for you.
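
To give a rough idea of what these features translate to on the Solr side, here is a minimal sketch of the three kinds of requests involved: a connection test, emptying the index, and a faceted search. It is an illustration only and is not taken from the MagSolr code base. The base URL, the core name "magento", and the field names "color" and "manufacturer" are assumptions; the ping handler, the delete-by-query update, and the facet parameters are standard Solr features. The sketch only builds the requests and does not send them.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL and core name; a real installation will differ.
SOLR_BASE = "http://localhost:8983/solr/magento"


def ping_url(base=SOLR_BASE):
    """URL of Solr's ping handler, usable as a simple connection test."""
    return f"{base}/admin/ping?wt=json"


def empty_index_request(base=SOLR_BASE):
    """URL and JSON body of a delete-by-query that matches every document."""
    url = f"{base}/update?commit=true"
    body = json.dumps({"delete": {"query": "*:*"}})
    return url, body


def faceted_search_url(keywords, facet_fields, filters=None, base=SOLR_BASE):
    """URL of a keyword search that also asks Solr for facet counts.

    facet_fields: fields to count values for (the layered navigation options).
    filters: field -> value pairs the shopper has already selected.
    """
    params = [
        ("q", keywords),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet options with zero results
    ]
    params += [("facet.field", field) for field in facet_fields]
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))  # one filter query per selection
    return f"{base}/select?" + urlencode(params)


if __name__ == "__main__":
    print(ping_url())
    print(empty_index_request())
    print(faceted_search_url("red shoes", ["color", "manufacturer"], {"color": "red"}))
```

Each selected layered navigation option becomes its own filter query (fq), which leaves the main keyword query untouched while narrowing the result set.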

We intentionally kept the schema we use for the Solr index very close to the TYPO3 CMS Solr extension. It is not 100% compatible but it is close enough right now to make an integration between the two very easy.

We would like to invite anyone interested to have a look, try it out and provide feedback. All of the code can be found on GitHub.

Outlook

We already have our Magento Solr integration running in production at one of our clients, and another will be getting it soon. The next step is to also index Magento CMS pages, including files that may be linked from those pages. This way we can offer an integrated solution covering both products and page content. As we encounter bigger data sets, we would like to move to a dedicated Index Queue that can be controlled more easily than the built-in Magento events system. An integration with TYPO3 CMS and Adobe Experience Manager (AEM) is on our bucket list as well.
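
The Index Queue is not designed yet, so the following is only a sketch of the general idea under our own assumptions: changes are recorded in a queue instead of being indexed immediately, repeated changes to the same item are collapsed, and a worker picks up items in batches of a controllable size. The class name, the item types, and the "index"/"delete" actions are hypothetical.

```python
from collections import OrderedDict


class IndexQueue:
    """In-memory queue of pending index operations, one entry per item."""

    def __init__(self):
        # (item_type, item_id) -> action, kept in queueing order
        self._pending = OrderedDict()

    def enqueue(self, item_type, item_id, action="index"):
        """Record a change; a later change to the same item replaces the
        earlier one and moves the item to the end of the queue."""
        key = (item_type, item_id)
        self._pending.pop(key, None)
        self._pending[key] = action

    def next_batch(self, size):
        """Remove and return up to `size` of the oldest pending entries."""
        batch = []
        while self._pending and len(batch) < size:
            (item_type, item_id), action = self._pending.popitem(last=False)
            batch.append((item_type, item_id, action))
        return batch

    def __len__(self):
        return len(self._pending)


if __name__ == "__main__":
    queue = IndexQueue()
    queue.enqueue("product", 42)
    queue.enqueue("cms_page", 7)
    queue.enqueue("product", 42, "delete")  # supersedes the earlier entry
    print(queue.next_batch(10))
```

Because the batch size is chosen by the caller, a cron job or worker can decide how much indexing work to do per run, which is the kind of control that is harder to get when indexing happens directly inside event observers.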

If there is anything in particular that you would like to see, or if you have questions or other feedback, please let us know in the comments below.