Monday, December 30, 2013

How to make UTF-8 work with Ruby 1.9/2.0, Rails and MySQL

Storing and presenting UTF-8 text has always been a problem, especially when each component in your web development stack only partially supports UTF-8.

The data entered in the forms on the web pages was valid UTF-8, and the MySQL databases were encoded as utf8, but the Rails app with the original mysql gem (version 1) was storing latin1-encoded data in the utf8 tables! Somehow the data retrieved from MySQL was still displayed properly as UTF-8, as long as I took care to remove encoding: utf8 from the mysql gem database configuration and set the HTML character set to utf8.
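
A plausible explanation for why this worked is that the connection was treated as latin1 in both directions, so the wrong conversion on write was undone on read. The Ruby sketch below only simulates that round trip on a string, without touching a database. The method names store_via_latin1 and read_via_latin1 are hypothetical, and ISO-8859-1 stands in for MySQL's latin1, which is actually closer to Windows-1252.

```ruby
# encoding: utf-8
# Illustrative simulation only; no database is involved.

# Hypothetical helper: the UTF-8 bytes are misread as latin1 and then
# converted to UTF-8 for storage, so every non-ASCII byte becomes its
# own character.
def store_via_latin1(utf8_text)
  utf8_text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Hypothetical helper: reading over the same latin1 connection converts
# the stored text back to latin1 bytes, which happen to be the original
# UTF-8 bytes.
def read_via_latin1(stored_text)
  stored_text.encode("ISO-8859-1").force_encoding("UTF-8")
end

original  = "caf\u00E9"                 # "café" as typed into a form
stored    = store_via_latin1(original)  # what ends up in the utf8 table
read_back = read_via_latin1(stored)     # what the old setup displayed

puts stored
puts read_back
```

Under these assumptions the stored value is garbled, yet the value read back matches what was typed. A connection that really speaks utf8 skips the second conversion and shows the garbled stored value instead.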

With Ruby 1.9/2.0 and the mysql2 gem, this arrangement broke and the data was no longer displayed correctly. I found that the best way to fix the problem is to configure MySQL correctly and reinsert the data in the correct encoding.

1) First, back up your database.

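2) Add the following option to your MySQL my.cnf file. Do not add new [client] and [mysqld] headers; just put the option under each of those existing sections. (On MySQL 5.5 and later, the [mysqld] section no longer accepts default-character-set; use character-set-server = utf8 there instead.)
default-character-set = utf8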


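3) Dump the database, but prevent MySQL from converting the encoding of the data. Dumping over a latin1 connection returns the bytes as the app originally sent them, and --skip-set-charset leaves the SET NAMES statement out of the dump file.
mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DATABASE_NAME > DATABASE_NAME.sql

4) Load the dump back in, this time over a utf8 connection.
mysql -u root -p --default-character-set=utf8 DATABASE_NAME < DATABASE_NAME.sql

Don't forget to check your data thoroughly to make sure that all characters are displayed correctly.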
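
For the final check, a rough heuristic can help spot strings that are still double-encoded. The sketch below is an assumption of mine rather than part of the original procedure, and the helper name looks_double_encoded? is hypothetical. It treats a string as suspicious when its characters all fit in latin1 and the resulting bytes form valid UTF-8. It can miss cases (for example characters that MySQL's latin1 maps through Windows-1252), so it does not replace checking the data by eye.

```ruby
# encoding: utf-8

# Hypothetical helper: true when str looks like UTF-8 text that was
# misread as latin1 and converted to UTF-8 a second time.
def looks_double_encoded?(str)
  return false if str.ascii_only?   # plain ASCII is never affected
  # Undo one latin1 -> UTF-8 conversion and see whether the result
  # is itself well-formed UTF-8.
  candidate = str.encode("ISO-8859-1").force_encoding("UTF-8")
  candidate.valid_encoding?
rescue EncodingError
  # Characters outside latin1 (or invalid input) cannot be the product
  # of the double encoding described here.
  false
end

samples = ["caf\u00E9", "caf\u00C3\u00A9", "plain ascii"]
samples.each do |text|
  puts "#{text.inspect}: #{looks_double_encoded?(text)}"
end
```

In a Rails console you could run the same helper over a text column of a model and look at the rows it flags.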
