Server config: Mistakes with character encoding - part 2
posted: March 10th, 2007 · by: Sven
So, you know how to pipe data from the web browser through various library layers to a database and all the way back again. You know how to configure every layer of all those sophisticated web applications, frameworks, libraries, programming languages, ...
But now here’s this damned static file that seems to get totally screwed up somewhere, and you’re already starting to pull your hair out because there’s no apparent reason.
Relax and step back. Look again. Sometimes things are simple, so simple that one can’t see the wood for the trees.
Know what your server talks!
Know its configs and defaults, that is. If you’ve put a static (say) HTML file on your server, encoded as (say) utf-8 and you just can’t get it displayed the right way … look at your server’s config.
Here’s why.
Many web servers send a content-type HTTP header by default along with (e.g.) static HTML files. For example, Apache by default sends this one:
Content-Type: text/html; charset=iso-8859-1
If a content-type HTTP header that names a charset is being sent, it doesn’t matter if you’ve included a meta-tag in your HTML file, like this one:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15">
Common browsers will prefer information that’s sent via HTTP headers over information that’s embedded as meta-tags in the document itself.
You might well have put the correct meta-tag onto your page (e.g. declaring that this very document is utf-8 encoded) ... but if the server sends a content-type HTTP header along with it that declares it to be latin-1, your browser will stick to the HTTP header.
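To make the precedence concrete, here’s a minimal Ruby sketch of that decision. It’s a simplified model, not what any real browser runs (real browsers also look at byte order marks and sniff the content), and the helper names charset_from_content_type, charset_from_meta and effective_charset are made up for this illustration:

```ruby
# Simplified model: a charset in the HTTP header beats the meta-tag.
# All method names here are hypothetical, chosen for this illustration.

# Pull the charset out of a Content-Type header value, if there is one.
def charset_from_content_type(value)
  return nil if value.nil?
  match = value.match(/charset\s*=\s*"?([^\s;"]+)"?/i)
  match && match[1].downcase
end

# Pull the charset out of the first meta-tag that declares one.
def charset_from_meta(html)
  match = html.match(/<meta[^>]+charset\s*=\s*["']?([^\s"'>;]+)/i)
  match && match[1].downcase
end

# The header wins; the meta-tag is only a fallback.
def effective_charset(header_value, html)
  charset_from_content_type(header_value) || charset_from_meta(html)
end

html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

puts effective_charset("text/html; charset=iso-8859-1", html) # prints "iso-8859-1"
puts effective_charset("text/html", html)                     # prints "utf-8"
```

Note the second call: the meta-tag only gets a say when the header carries no charset at all.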
Among the top candidates: Rails’ page caching
This particular mishap might well catch you when you’re used to configuring your Ruby on Rails applications but start using Rails’ page caching mechanism for the first time. Page caching is the only one of Rails’ caching mechanisms where the web server, not Rails, sends the HTTP headers.
When you’ve configured Rails to e.g. talk utf-8 to the world, but haven’t paid attention to the headers your server is sending, they’ll most probably differ from the Rails headers.
That is: the first page that’s sent by Rails will be fine – the browser will know it as utf-8 and display everything correctly. Every subsequent request to the same page will be answered with the same data but preceded by a different content-type header, so the browser will try to display it as latin-1 ...
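If you’re wondering what “display it as latin-1” looks like, you can reproduce the effect in Ruby. This is just a sketch: the sample string is made up, and it assumes your source file is saved as utf-8 (Ruby’s default):

```ruby
# The cached page on disk: utf-8 encoded bytes.
original = "Müller"

# The browser is told "this is latin-1", so it reads each byte as one
# character. force_encoding only relabels the bytes; encode then converts
# the mislabeled text to utf-8 so we can print what the reader gets to see.
misread = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts misread # prints "MÃ¼ller"

# The bytes themselves were never damaged, so the mistake is reversible.
repaired = misread.encode("ISO-8859-1").force_encoding("UTF-8")
puts repaired == original # prints "true"
```

The two-byte utf-8 sequence for “ü” shows up as two separate latin-1 characters, which is exactly the kind of garbage you’ll see on the cached page.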
How to check this?
To find out what HTTP headers are sent for a given file you can just use curl:
curl -I http://rubyonrails.org
... will do the job.
Oh, and if you can’t use curl, here’s an online HTTP header sniffer ... there are various other tools out there, too.
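If you’d rather check this from a script, here’s a small Ruby sketch that picks the content-type line out of the header text curl prints. The method name content_type_from_headers is made up, and the sample headers below are invented for illustration (they’re not the real output for any particular site):

```ruby
# Find the Content-Type value in raw HTTP response headers, e.g. the
# text printed by `curl -I`. Returns nil if there is no such header.
def content_type_from_headers(raw_headers)
  raw_headers.each_line do |line|
    name, value = line.split(":", 2)
    next if value.nil? # status line or blank line, no "name: value" pair
    return value.strip if name.strip.casecmp("content-type").zero?
  end
  nil
end

# Invented sample input, standing in for real curl output.
sample = <<~HEADERS
  HTTP/1.1 200 OK
  Server: Apache
  Content-Type: text/html; charset=iso-8859-1
HEADERS

puts content_type_from_headers(sample) # prints "text/html; charset=iso-8859-1"
```

Header names are case-insensitive, which is why the comparison uses casecmp.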
How to fix this?
It’s usually pretty trivial to change your web server’s config to send your stuff with a different content-type header. For example, here’s a bunch of useful information about overriding Apache’s default encoding with a .htaccess file. In your .htaccess file you could use something like:
<Files "*.htm">
  ForceType "text/html; charset=UTF-8"
</Files>
This page contains a howto about configuring Mongrel’s MIME types. Basically, you add some MIME types to a YAML file like config/mongrel_mime.yml:
---
.htm: text/html; charset=UTF-8
.html: text/html; charset=UTF-8
.txt: text/plain; charset=UTF-8
... and then specify this file when you start Mongrel, like this: mongrel_rails start -m config/mongrel_mime.yml [...].
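Since a forgotten charset is exactly what bites you here, a quick sanity check of such a YAML file might be worth it. A rough Ruby sketch, with the file contents inlined as a string so it runs on its own (in practice you’d read your own file instead):

```ruby
require "yaml"

# Inlined copy of the MIME type mapping, for illustration only.
mime_yaml = <<~YAML
  ---
  .htm: text/html; charset=UTF-8
  .html: text/html; charset=UTF-8
  .txt: text/plain; charset=UTF-8
YAML

mime_types = YAML.safe_load(mime_yaml)

# Collect every extension whose MIME type doesn't declare a charset.
missing = mime_types.reject { |_ext, type| type =~ /charset=/i }.keys

if missing.empty?
  puts "all entries declare a charset"
else
  puts "no charset declared for: #{missing.join(', ')}"
end
```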
ember said March 10th, 2007 at 09:57 PM ¶
hello sven
i am lazy and prefer to set the default charset via
AddDefaultCharset utf-8
for a whole context. This can be done for a directory or a vhost.
look here: apache:adddefaultcharset
greetz,
ember
Sven said March 16th, 2007 at 03:53 PM ¶
Hi ember,
yes, that’s right. Thanks for adding this. There are several ways to configure the charset - especially with Apache. I just wanted to mention one example that would work for most people.
jack said January 23rd, 2011 at 11:43 AM ¶
Thanks for the heads-up. That’s a useful tip! I’ve never run into that, but for sure that’s something quite a few people will need a solution for.