Discarding non-ASCII characters in Ruby on Rails
My current Ruby on Rails project requires SEO-friendly URLs. I installed the acts_as_slugable plugin, but soon found that all my slugs contained Polish characters (in UTF-8). After some googling (Polish Ruby on Rails forum) and a few more sips of coffee, I came up with some modifications to acts_as_slugable.
Here’s lib/acts_as_slugable_ascii.rb:
require 'string'

module Multiup
  module Acts
    module Slugable
      module InstanceMethods
        private

        def create_slug
          return if self.errors.length > 0
          if self[source_column].nil? or self[source_column].empty?
            return
          end

          if self[slug_column].to_s.empty?
            test_string = self[source_column]
            # Transliterate before downcasing, so that uppercase Polish letters get lowercased too
            proposed_slug = test_string.to_ascii.strip.downcase.gsub(/[\'\"\#\$\,\.\!\?\%\@\(\)]+/, '')
            proposed_slug = proposed_slug.gsub(/&/, 'and')
            proposed_slug = proposed_slug.gsub(/[\W^-_]+/, '-')
            proposed_slug = proposed_slug.gsub(/\-{2}/, '-')

            # Append -0, -1, ... until the slug is unique within its scope
            suffix = ""
            existing = true
            acts_as_slugable_class.transaction do
              while existing != nil
                existing = acts_as_slugable_class.find(:first, :conditions => ["#{slug_column} = ? and #{slug_scope_condition}", proposed_slug + suffix])
                if existing
                  if suffix.empty?
                    suffix = "-0"
                  else
                    suffix.succ!
                  end
                end
              end
            end
            self[slug_column] = proposed_slug + suffix
          end
        end
      end
    end
  end
end

ActiveRecord::Base.class_eval do
  include Multiup::Acts::Slugable
end
The only difference from the plugin's own create_slug is the call to the to_ascii method.
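To see what the cleanup and suffix steps do in isolation, here is a minimal sketch that runs outside of ActiveRecord. The helper names normalize_slug and unique_slug and the taken array are made up for illustration; the plugin itself checks uniqueness with a database query, and the input here is assumed to be ASCII already.

```ruby
# Hypothetical standalone helpers, not part of acts_as_slugable.

# Turns an (already ASCII) title into a slug candidate,
# using the same substitutions as create_slug.
def normalize_slug(text)
  slug = text.strip.downcase
  slug = slug.gsub(/[\'\"\#\$\,\.\!\?\%\@\(\)]+/, '') # drop punctuation outright
  slug = slug.gsub(/&/, 'and')                        # spell out ampersands
  slug.gsub(/[^a-z0-9]+/, '-')                        # any other run becomes one hyphen
end

# Appends -0, -1, ... until the slug is not in the list of taken slugs.
# `taken` stands in for the database lookup (an assumption of this sketch).
def unique_slug(base, taken)
  suffix = ''
  while taken.include?(base + suffix)
    suffix = suffix.empty? ? '-0' : suffix.succ
  end
  base + suffix
end

puts normalize_slug('  Tom & Jerry!  ')
puts unique_slug('post', ['post', 'post-0'])
```

The first free suffix wins, so a third record titled "post" would end up as post-1.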
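Now we need to extend the String class. Since the file above starts with require 'string', this goes in lib/string.rb. The to_ascii method converts Polish UTF-8 characters to their ASCII equivalents and discards any character that has no CP1250 representation: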
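require 'iconv'

class String
  # Transliterates Polish letters to ASCII and drops characters
  # that cannot be represented in CP1250.
  def to_ascii
    ascii = 'acelnoszzACELNOSZZ'
    # CP1250 bytes of the lowercase Polish letters, then of their uppercase forms
    non_ascii = "\271\346\352\263\361\363\234\277\237" \
                "\245\306\312\243\321\323\214\257\217"
    to_ascii_string = self
    begin
      result = Iconv.new("CP1250", "UTF-8").iconv(to_ascii_string)
    rescue Iconv::IllegalSequence => e
      # Remove the first character iconv could not convert, then try again
      failed = e.failed.chars.split(//, 2)
      to_ascii_string = to_ascii_string.gsub(failed[0], '')
      retry
    end
    # tr rather than tr!, which returns nil when there is nothing to replace
    result.tr(non_ascii, ascii)
  end
end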
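On Ruby 2.0 and later, iconv is no longer part of the standard library, so the code above needs the iconv gem. As an alternative, here is a rough sketch that swaps the CP1250 round trip for a plain lookup table. The names POLISH_TO_ASCII and polish_to_ascii are hypothetical, and unlike to_ascii this version drops every non-ASCII character that is not in the table.

```ruby
# Hypothetical lookup-table variant; assumes Ruby 2.0+ and UTF-8 strings.
POLISH_TO_ASCII = {
  'ą' => 'a', 'ć' => 'c', 'ę' => 'e', 'ł' => 'l', 'ń' => 'n',
  'ó' => 'o', 'ś' => 's', 'ż' => 'z', 'ź' => 'z',
  'Ą' => 'A', 'Ć' => 'C', 'Ę' => 'E', 'Ł' => 'L', 'Ń' => 'N',
  'Ó' => 'O', 'Ś' => 'S', 'Ż' => 'Z', 'Ź' => 'Z'
}.freeze

# Replaces Polish letters with their ASCII look-alikes and
# discards any other non-ASCII character.
def polish_to_ascii(text)
  text.gsub(/[^\x00-\x7F]/) { |char| POLISH_TO_ASCII.fetch(char, '') }
end

puts polish_to_ascii('Zażółć gęślą jaźń')
```

A table like this is easy to extend to other alphabets, at the cost of having to list every character you want to keep.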
And we’re done!
Just add the following line to your model:
require 'acts_as_slugable_ascii'
and configure acts_as_slugable as described in its README file. Now you’re ready to face your users.