Real-time indexing and searching with Sphinx 1.10.1-dev
Real-time indexing and searching is one of the major goals of web development in 2010. As of version 1.10.1, Sphinx joins the list of search engines that deliver on the promise of real-time search.
Remember, in this post I’m using the developers’ trunk, checked out from the repository. Get your own copy here:
$ svn checkout http://sphinxsearch.googlecode.com/svn/trunk sphinx
$ cd sphinx
$ ./configure
$ make
$ sudo make install
Worked for me out of the box on a Debian etch box after failing on Snow Leopard.
First you need to set up your sphinx.conf file:
index rt {
    type = rt
    path = /usr/local/sphinx/data/rt
    rt_field = message
    rt_attr_uint = message_id
}
searchd {
    log = /var/log/searchd.log
    query_log = /var/log/query.log
    pid_file = /var/run/searchd.pid
    workers = threads
    listen = 192.168.24.11:9312:mysql41
}
You need the listen directive in the searchd section to set up a MySQL-protocol server. However, a couple of quick benchmarks showed that it’s significantly slower than the regular searchd protocol.
After launching searchd, I used the MySQL protocol to index my data. Given the index structure (message, message_id), I can now connect to the server and start indexing:
$ mysql -P 9312 -h 127.0.0.1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 1.10.1-dev (r2309)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> INSERT INTO rt VALUES (1, 'this message has a body', 1);
Remember to always enclose string values in single quotes.
Querying the real-time index is just as easy as typing:
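If the INSERT statements come from a script rather than from typing at the prompt, the single-quote rule means the message text has to be escaped first. Here is a minimal Python sketch of that. It assumes backslash-escaped quotes are accepted inside single-quoted strings, which is worth checking against the docs for your revision. The helper names quote_string and build_insert are hypothetical and not part of Sphinx.

```python
def quote_string(value):
    """Wrap value in single quotes, backslash-escaping backslashes and quotes."""
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_insert(index, doc_id, message, message_id):
    """Build an INSERT for an index laid out as (id, message, message_id)."""
    # int() rejects anything that is not a plain number for the numeric columns.
    return "INSERT INTO %s VALUES (%d, %s, %d)" % (
        index, int(doc_id), quote_string(message), int(message_id))


# Reproduces the statement typed into the mysql client above.
print(build_insert("rt", 1, "this message has a body", 1))
```

The resulting string can be piped to the mysql client or sent through any MySQL client library.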
mysql> SELECT * FROM rt WHERE MATCH('message');
+------+--------+------------+
| id | weight | message_id |
+------+--------+------------+
| 1 | 1500 | 1 |
+------+--------+------------+
1 row in set (0.00 sec)
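The same query can be issued from a script. The sketch below builds the SELECT shown above and runs it on any Python DB-API connection passed in by the caller. The function names build_match_query and search are hypothetical. The escaping only covers the string-literal layer and assumes backslash escapes are accepted, so operators of the full-text query syntax inside the search text are passed through untouched.

```python
import re


def build_match_query(index, text):
    """Build a SELECT ... WHERE MATCH('...') statement for the given index."""
    # Index names cannot be quoted like values, so only allow plain identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", index):
        raise ValueError("unexpected index name: %r" % (index,))
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "SELECT * FROM %s WHERE MATCH('%s')" % (index, escaped)


def search(conn, index, text):
    """Run the query on a DB-API connection and return the rows as a list."""
    cur = conn.cursor()
    try:
        cur.execute(build_match_query(index, text))
        return list(cur.fetchall())
    finally:
        cur.close()


# Reproduces the query typed into the mysql client above.
print(build_match_query("rt", "message"))
```

Here conn would come from a MySQL client library pointed at the host and port used in the session above. Whether a given driver's connection handshake works against this searchd revision is an assumption to test.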
And that’s it, now you’re ready to start your real-time search!
All the details can be found in the doc folder of the repository. This is just a brief write-up to get you started.