Real-time indexing and searching with Sphinx 1.10.1-dev

Real-time indexing and searching is one of the major goals of web development in 2010. As of version 1.10.1, Sphinx joins the list of search engines that deliver on the promise of real-time search.

Remember, in this post I’m using the developers’ trunk, checked out from the repository. Get your own copy here:

$ svn checkout http://sphinxsearch.googlecode.com/svn/trunk sphinx
$ cd sphinx
$ ./configure
$ make
$ sudo make install

It worked for me out of the box on Debian etch, after failing on Snow Leopard.

First you need to set up your sphinx.conf file:

index rt {
	type = rt
	path = /usr/local/sphinx/data/rt
	rt_field = message
	rt_attr_uint = message_id
}

searchd {
	log = /var/log/searchd.log
	query_log = /var/log/query.log
	pid_file = /var/run/searchd.pid
	workers = threads
	listen = 192.168.24.11:9312:mysql41
}

You need the listen directive with the mysql41 protocol in the searchd section to set up a MySQL-protocol server. However, a couple of quick benchmarks showed that it’s significantly slower than the regular searchd protocol.

After launching searchd, I used the MySQL protocol to index my data. Given the structure (message, message_id), I can now connect to my server and start indexing:

$ mysql -P 9312 -h 192.168.24.11
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 1.10.1-dev (r2309)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> INSERT INTO rt VALUES (1, 'this message has a body', 1);

Remember to always use single quotes around string values.

Querying the real-time index is just as easy:

mysql> SELECT * FROM rt WHERE MATCH('message');
+------+--------+------------+
| id   | weight | message_id |
+------+--------+------------+
|    1 |   1500 |          1 |
+------+--------+------------+
1 row in set (0.00 sec)
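
If you generate these statements from an application instead of typing them by hand, the single-quote rule is easy to get wrong. Below is a minimal Ruby sketch that only builds the statement strings and does not talk to searchd. The helper names (sphinxql_quote, rt_insert, rt_select) are made up for this post, and backslash-escaping of quotes inside a value is my assumption about what the server accepts, so check it against the docs before relying on it.

```ruby
# Hypothetical helper: wrap a value in single quotes, backslash-escaping
# any backslash or single quote inside it (assumed escaping convention).
def sphinxql_quote(value)
  escaped = value.to_s.gsub(/[\\']/) { |ch| "\\#{ch}" }
  "'#{escaped}'"
end

# Build an INSERT for the rt index defined above: (id, message, message_id).
# The index name is interpolated as-is, so it must come from trusted code.
def rt_insert(index, id, message, message_id)
  "INSERT INTO #{index} VALUES (#{Integer(id)}, #{sphinxql_quote(message)}, #{Integer(message_id)})"
end

# Build a full-text SELECT against the same index.
def rt_select(index, query)
  "SELECT * FROM #{index} WHERE MATCH(#{sphinxql_quote(query)})"
end

puts rt_insert("rt", 1, "this message has a body", 1)
puts rt_select("rt", "message")
```

The two printed lines are the same statements as in the session above, minus the trailing semicolon; any MySQL client library could send them to the mysql41 listener.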

And that’s it, now you’re ready to start your real-time search!

All the details can be found in the doc folder of the repository. This is just a brief write-up to get you started.

Using redgreen gem in Ruby 1.9

We all love autotest, don’t we? It’s an essential tool for TDD software development, no doubt about it.

redgreen adds a little sugar to how your test results are displayed on the console. It doesn’t work out of the box with Ruby 1.9, but the fix is a simple tweak in your ~/.autotest file.

Here’s an example of my ~/.autotest:

require "autotest/fsevent"
require "autotest/growl"

unless ENV["RSPEC"]
  PLATFORM = RUBY_PLATFORM unless defined? PLATFORM
  require "redgreen/autotest"
end

Autotest.add_hook :initialize do |at|
  %w{.svn .hg .git vendor}.each {|exception| at.add_exception exception}
end

For Ruby 1.9, be sure to define the PLATFORM constant. If you want to use both autotest and autospec, you also need the unless ENV["RSPEC"] ... end block.
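
To see the two guards in isolation, here is a small sketch you can run with plain ruby, no autotest required. The load_redgreen? helper is hypothetical and only mirrors the unless ENV["RSPEC"] check. I am assuming that redgreen still refers to the old PLATFORM constant, which Ruby 1.9 no longer defines, and that autospec sets the RSPEC environment variable.

```ruby
# Define PLATFORM only when the interpreter does not already provide it;
# the guard avoids an "already initialized constant" warning elsewhere.
PLATFORM = RUBY_PLATFORM unless defined?(PLATFORM)

# Hypothetical helper mirroring the `unless ENV["RSPEC"]` guard: redgreen
# should be loaded only when RSPEC is absent from the environment.
def load_redgreen?(env = ENV)
  !env["RSPEC"]
end

puts PLATFORM == RUBY_PLATFORM
puts load_redgreen?({})                    # plain autotest run
puts load_redgreen?({ "RSPEC" => "true" }) # autospec run (assumed)
```

Passing a plain hash instead of ENV keeps the check easy to try out without touching your real environment.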

Fetching favicons with Ruby

With Ruby and Google, to be exact.

Favg is a simple gem that wraps one of Google’s S2 services and can be used to fetch and save websites’ favicons.

To fetch a favicon:

require "favg"
Favg.fetch "killingcreativity.com"
 => "\x89PNG\r\n\x1A\n\x00\x00\x00\rIHDR\x00\x00\x00\x10\x00\x00\x00\x10\b\x06\x00\x00\x00\x1F\xF3\xFFa\x00\x00\x00\x04sBIT\b\b\b\b|\bd\x88\x00\x00\x00OIDAT8\x8Dc\xFC\xFF\xFF\xFF\x7F\x06\n\x00\x13%\x9AG\r\x80\x00\x16\\\x12\x8F\x1E=b\xD8\xB7o\x1F\x03\x03\x03\x03\x83\x93\x93\x13\x83\x9C\x9C\x1Ci\x06\xEC\xDB\xB7\x8F\xE1\xF6\xED\xDBp~BB\x02Vu\xB4\xF3\x82\x93\x93\x13V6:`\x1CM\x89\x83\xC0\x00\x00g\xEB\x15\x94e\x1Er\x85\x00\x00\x00\x00IEND\xAEB`\x82"

To save a favicon to a file:

require "favg"
Favg.fetch "killingcreativity.com", "favicon.png"
 => #<File:favicon.png>
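
If you are curious what such a wrapper boils down to, here is a rough sketch of the request URL. This is not Favg’s actual source: the favicon_uri helper is made up, and I am assuming the S2 favicon endpoint at www.google.com/s2/favicons with a domain parameter. The sketch only builds the URL and makes no network request.

```ruby
require "uri"

# Assumed endpoint of Google's S2 favicon service.
S2_FAVICON_ENDPOINT = "https://www.google.com/s2/favicons"

# Hypothetical helper: build the request URI for a given domain,
# letting URI.encode_www_form take care of query-string escaping.
def favicon_uri(domain)
  uri = URI.parse(S2_FAVICON_ENDPOINT)
  uri.query = URI.encode_www_form(domain: domain)
  uri
end

puts favicon_uri("killingcreativity.com")
```

Fetching the resulting URI with Net::HTTP.get should return the raw image bytes, which you can then write to a file in binary mode.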

Submitting forms with JavaScript

From time to time, some of our users file bug reports about accidental form submissions. We weren’t able to track the problem down until today; it seems to be a nasty bug in some browsers (we saw it in Firefox 3.6 on Mac OS X).

We used to attach an event handler to a textarea element (we’re using Prototype; this is not the exact code, just the general idea):

$("input-textarea").observe("keyup", function(event) {
  // Submit the form when the Return key is released
  if (event.keyCode == Event.KEY_RETURN) {
    return formSubmit();
  }
});

This worked in most scenarios except for this one:

  • The user enters some text into the textarea,
  • The user clicks on the browser’s address bar and enters a URL,
  • The user presses Enter,
  • The form is submitted, then the location changes.

This behavior can be avoided by switching from the keyup event to the keydown event:

$("input-textarea").observe("keydown", function(event) {
  // Submit the form when the Return key is pressed
  if (event.keyCode == Event.KEY_RETURN) {
    return formSubmit();
  }
});

This snippet works for us as expected.

Read It Later tumblog

Read It Later is a place where I’ll be posting links that I find interesting and worth reading.

I’ll keep writing here too, so stay tuned!