There are countless posts out there, screams in the darkness by developers claiming that this gem doesn't work with that web framework using Ruby 1.9.
In my case the problematic layers in the technology stack were Padrino 0.9.14 with Sinatra 1.0 and Typhoeus 0.1.31, plus some other String-to-byte-array conversions in our codebase.
Testing had shown that the app failed when it received input containing umlauts and other accented letters (i.e. multibyte characters). There are some lengthy and detailed explanations and discussions about how Ruby 1.8 handles character encoding differently from Ruby 1.9. In short, Ruby 1.8 treats a String as a plain sequence of bytes, while Ruby 1.9 tags every String with an encoding. It's the sort of thing I'd hoped I'd never have to read, but sometimes you just have to get informed. Hopefully this article will help shortcut or reiterate some of this info.
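To make the failure a bit more concrete, here is a minimal sketch of the kind of thing that bites you, assuming Ruby 1.9 or later. The strings and variable names are made up for illustration; they are not from our app.

```ruby
# Bytes as they might arrive from a form post: valid UTF-8, but tagged as binary.
raw = "M\xC3\xBCller".dup.force_encoding("ASCII-8BIT")

# A UTF-8 string from elsewhere in the app (written with \u escapes to keep this file ASCII).
greeting = "Gr\u00FC\u00DFe, "

# Joining two non-ASCII strings with different encoding tags blows up.
error = nil
begin
  greeting + raw
rescue Encoding::CompatibilityError => e
  error = e
end
puts "concatenation failed: #{error.class}" if error

# Re-tagging (not converting) the same bytes as UTF-8 makes the strings compatible.
fixed = raw.dup.force_encoding("UTF-8")
puts fixed.length    # 6 characters
puts fixed.bytesize  # 7 bytes
message = greeting + fixed
puts message.encoding
```

Note that force_encoding only changes the label on the String. The bytes stay exactly as they were, which is why it only helps when the data really is UTF-8 already.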
So let's begin with the rendered view of your web app. A couple of things will help ease the pain here.
First, make sure all your pages contain the correct content-type meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Secondly, make sure that all your forms use the accept-charset attribute in the HTML form tag, or in whatever you use to generate the tag:
<form accept-charset="utf-8"...
Now we know that the pages can display and post multibyte UTF-8 characters. The web application might be a different story...
One of the next quick fixes, which worked well for me, was to apply a filter on all requests that forces the encoding of all incoming String params.
A branch of RKH's fork of Sinatra helped with this. There's a utility method in lib/sinatra/base.rb which does the trick:
if defined? Encoding
  if Encoding.default_external.to_s =~ /^ASCII/
    Encoding.default_external = "UTF-8"
  end
  Encoding.default_internal ||= Encoding.default_external

  def force_encoding(data)
    return if data == self
    if data.respond_to? :force_encoding
      data.force_encoding(Encoding.default_external)
    elsif data.respond_to? :each_value
      data.each_value { |v| force_encoding(v) }
    elsif data.respond_to? :each
      data.each { |v| force_encoding(v) }
    end
  end
else
  def force_encoding(*) end
end
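Here's a rough sketch of what the helper above does to a nested params structure, assuming Ruby 1.9 or later. The method body is a trimmed copy of the 1.9 branch so the snippet runs on its own, and the params hash is a hypothetical example, not real request data.

```ruby
# Trimmed copy of the helper above (Ruby 1.9+ branch only).
def force_encoding(data)
  return if data == self
  if data.respond_to? :force_encoding
    # Strings: re-tag the bytes in place, no conversion happens.
    data.force_encoding(Encoding.default_external)
  elsif data.respond_to? :each_value
    # Hashes: recurse into the values.
    data.each_value { |v| force_encoding(v) }
  elsif data.respond_to? :each
    # Arrays and other enumerables: recurse into the elements.
    data.each { |v| force_encoding(v) }
  end
end

Encoding.default_external = Encoding::UTF_8

# Hypothetical params, with every String tagged as binary the way raw request data can be.
params = {
  "name"    => "M\xC3\xBCller".dup.force_encoding("ASCII-8BIT"),
  "address" => { "city" => "K\xC3\xB6ln".dup.force_encoding("ASCII-8BIT") },
  "tags"    => ["caf\xC3\xA9".dup.force_encoding("ASCII-8BIT")]
}

force_encoding(params)

puts params["name"].encoding
puts params["address"]["city"].length
puts params["tags"].first.valid_encoding?
```

The Strings are modified in place, so there's no need to reassign params afterwards.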
To work with this we need to set some default encodings for our web app.
In Padrino this can be done in the app.rb file:
if RUBY_VERSION < '1.9'
  $KCODE = 'u'
else
  Encoding.default_external = Encoding::UTF_8
  Encoding.default_internal = Encoding::UTF_8
end
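If you want to see what those defaults actually change, here's a small sketch, assuming a reasonably modern Ruby (Tempfile.create needs 2.1 or later). The temp file name is made up for the demo.

```ruby
require "tempfile"

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

text = nil
Tempfile.create("encoding-demo") do |file|
  # Write raw bytes (a name with a UTF-8 encoded umlaut) without any transcoding.
  File.binwrite(file.path, "M\xC3\xBCller")
  # A plain read tags the result with Encoding.default_external.
  text = File.read(file.path)
end

puts text.encoding
puts text.valid_encoding?
puts text.length
```

Strings coming from IO pick up the external default, which is why setting it early in app.rb saves a lot of grief further down the stack.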
If you are a Padrino user who doesn't wanna touch the Sinatra gem, you can create a general before filter in app.rb like this:
before do
  force_encoding(params)
end
There are likely other parts of your application stack that will not be UTF-8 friendly. Notably, your tests might complain if you attempt to add multibyte characters to your test data.
Adding the code hint
# encoding: utf-8
to the first line of the test class files will alleviate this. It's a grubby fix, but we live in a world of compromises; what can you do.
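For what it's worth, here's a tiny sketch of a file using the hint. The test data is hypothetical. On Ruby 2.0 and later UTF-8 is already the default source encoding, so the comment mainly matters on 1.9.

```ruby
# encoding: utf-8
# The magic comment has to be the first line of the file
# (or the second, if the first is a shebang).

# Hypothetical test data containing multibyte characters.
name = "Jürgen Müller"

puts __ENCODING__   # source encoding of this file
puts name.encoding
puts name.length    # 13 characters
puts name.bytesize  # 15 bytes, each umlaut takes two
```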