I’ve been working on a project that involves full-text search within a collection of documents. Search is a pretty common problem, so I knew there must be well-established solutions available. Sphinx is one such solution, and it was our choice for implementation. It’s open source, fast, and very easy to use.
Setting up a simple search instance is straightforward. There are really only four steps:
- Install Sphinx
- Put the text you wish to search in a database (we use MySQL; other options are available)
- Set up a configuration file to point to your database. An example file is provided, and you can get up and running by modifying just a few fields
- Run the indexer to create the search index
And that's it. At this point you can start the Sphinx search daemon and query it using one of several methods. We are using the included PHP library to implement a web-based search.
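To give a feel for the configuration step, here is a rough sketch in Python that writes out a minimal config file. The layout (a source block, an index block, and a searchd block) follows the example file that ships with Sphinx, but the names `docs_src` and `docs_idx`, the database credentials, the table and column names, and all file paths are made up for illustration, so adjust them to your own setup.

```python
def render_section(kind, name, settings):
    """Render one 'kind name { key = value }' block in sphinx.conf syntax."""
    header = f"{kind} {name}".strip()
    body = "\n".join(f"    {key} = {value}" for key, value in settings.items())
    return f"{header}\n{{\n{body}\n}}\n"


def render_config(db, sql_query, index_path):
    """Build a minimal config: one MySQL source, one index, one daemon block."""
    source = {
        "type": "mysql",
        "sql_host": db["host"],
        "sql_user": db["user"],
        "sql_pass": db["password"],
        "sql_db": db["database"],
        # The first column returned must be a unique integer document ID.
        "sql_query": sql_query,
    }
    index = {"source": "docs_src", "path": index_path}
    searchd = {
        "listen": "9312",
        "log": "/var/log/sphinx/searchd.log",       # hypothetical path
        "pid_file": "/var/run/sphinx/searchd.pid",  # hypothetical path
    }
    return "\n".join([
        render_section("source", "docs_src", source),
        render_section("index", "docs_idx", index),
        render_section("searchd", "", searchd),
    ])


# Hypothetical database and table; replace with your own.
config_text = render_config(
    db={"host": "localhost", "user": "sphinx", "password": "secret", "database": "library"},
    sql_query="SELECT id, title, body FROM documents",
    index_path="/var/lib/sphinx/docs_idx",
)

with open("sphinx.conf", "w") as handle:
    handle.write(config_text)
```

With a file like this in place, running `indexer --config sphinx.conf --all` should build the index, and `searchd --config sphinx.conf` should start the daemon.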
Even though setup is simple, there are a huge number of options and features that can be used to customize search to suit your specific needs. One example that we use is the excerpt builder. It generates excerpts from the text that contain the search terms, like the snippets you see in Google search results.
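To make the idea of an excerpt concrete, here is a small stand-alone Python sketch of the general technique: find the query terms, keep a few words of context around each hit, highlight the hits, and join the pieces with a separator. This is only an illustration, not Sphinx's own algorithm, and the function `build_excerpt` is hypothetical. Its parameter names are loosely modeled on the excerpt options in the Sphinx API.

```python
import re


def build_excerpt(text, query, around=5, before_match="<b>",
                  after_match="</b>", chunk_separator=" ... "):
    """Return snippets of `text` around each query term, with hits highlighted."""
    words = text.split()
    terms = {term.lower() for term in query.split()}

    def normalize(word):
        # Strip punctuation so that "search." still matches the term "search".
        return re.sub(r"\W+", "", word).lower()

    hits = [i for i, word in enumerate(words) if normalize(word) in terms]
    if not hits:
        return ""

    # Take `around` words on each side of every hit, merging overlapping windows.
    windows = []
    for i in hits:
        start, end = max(0, i - around), min(len(words), i + around + 1)
        if windows and start <= windows[-1][1]:
            windows[-1][1] = max(windows[-1][1], end)
        else:
            windows.append([start, end])

    hit_set = set(hits)
    chunks = [
        " ".join(
            f"{before_match}{words[i]}{after_match}" if i in hit_set else words[i]
            for i in range(start, end)
        )
        for start, end in windows
    ]
    return chunk_separator.join(chunks)


sample = ("Sphinx is a full-text search engine. It indexes documents stored in "
          "a database and answers queries quickly. The indexer builds the "
          "search index.")
print(build_excerpt(sample, "search", around=2))
# prints: a full-text <b>search</b> engine. It ... builds the <b>search</b> index.
```

The real excerpt builder runs inside the search daemon and knows about the index settings, so it handles things like stemming that this toy version ignores.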
There are a lot of alternatives out there, but Sphinx has provided everything we need, along with ease of use, so we haven’t even felt the need to look at those alternatives. Definitely recommended, and we will continue using it if opportunities present themselves in the future.