Monday, December 30, 2013

How to make UTF8 work with Ruby 1.9/2.0, Rails and Mysql

Storing and presenting UTF8 has always been a problem, especially when each component in your web development stack has only partial support for it.

The data entered in the forms on the web pages was valid utf8, and the Mysql databases were encoded as utf8, but the Rails app with the original mysql gem (version 1) was storing latin1-encoded data in the utf8 tables! Somehow the data retrieved from Mysql was still displayed properly as utf8, as long as I took care to remove the encoding: utf8 setting from the database configuration for the mysql gem and set the HTML character set to utf8.
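The round trip described above can be imitated in plain Ruby. This is only a sketch under assumptions: the old connection is taken to have run as latin1 (approximated here with ISO-8859-1, although Mysql's latin1 is closer to Windows-1252), and the server is taken to have converted the incoming bytes to utf8 on write. The two method names are made up for illustration and are not part of any gem.

```ruby
# Hypothetical helpers that imitate the server's conversions; not a real API.

# Write path: the app sends UTF-8 bytes, but the connection claims latin1,
# so each byte is reinterpreted as one latin1 character and converted to utf8.
def store_over_latin1_connection(text)
  text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Read path over the same latin1 connection: the stored text is converted
# back to latin1 bytes, which the app then treats as UTF-8 again.
def read_over_latin1_connection(stored)
  stored.encode("ISO-8859-1").force_encoding("UTF-8")
end

original = "caf\u00E9"                     # "cafe" with an acute accent on the e
stored   = store_over_latin1_connection(original)

puts stored                                # garbled text actually held in the utf8 table
puts read_over_latin1_connection(stored)   # what the old setup displayed
```

The two conversions cancel out, which would explain why the pages looked fine even though the tables held garbled text. A client that really connects as utf8 sees the stored value as it is.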

With Ruby 1.9/2.0 and the mysql2 gem, this arrangement broke and the data was no longer displayed correctly. I found that the best way to fix the problem is to configure Mysql correctly and reinsert the data in the correct encoding.

1) First, back up your database.

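2) Add the following options to the Mysql my.cnf file. If the [client] and [mysqld] sections already exist, do not add them again; just put the options under their respective sections. Restart Mysql afterwards so that the server settings take effect.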
[client]
default-character-set = utf8

[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci


3) Dump the database, but prevent Mysql from converting the encoding of the data.
mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DATABASE_NAME > DATABASE_NAME.sql

4) Reimport the dump using utf8 as the connection character set.
mysql -u root -p --default-character-set=utf8 DATABASE_NAME < DATABASE_NAME.sql

Don't forget to check your data thoroughly to make sure that all characters are displayed correctly.
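Part of that check can be scripted. The following is a rough heuristic, not a guarantee: it assumes leftover bad rows look like utf8 text that was read as latin1 and converted again, and it approximates latin1 with ISO-8859-1. The method name is hypothetical, and the sample strings stand in for values you would read from your own tables.

```ruby
# Hypothetical checker: flags strings that still look double-encoded.
def looks_double_encoded?(text)
  return false if text.ascii_only?
  # Undo one latin1 -> utf8 conversion and see whether valid UTF-8 comes out.
  repaired = text.encode("ISO-8859-1").force_encoding("UTF-8")
  repaired.valid_encoding? && repaired != text
rescue EncodingError
  # Characters outside latin1 cannot be this kind of garbling.
  false
end

samples = [
  "caf\u00E9",          # correctly stored text
  "caf\u00C3\u00A9",    # the same word after one extra latin1 -> utf8 conversion
  "plain ascii"
]

samples.each do |value|
  status = looks_double_encoded?(value) ? "SUSPECT" : "ok"
  puts "#{status}: #{value}"
end
```

A correct string that happens to contain a sequence such as a capital A with a tilde followed by a copyright sign would be flagged too, so treat hits as rows to inspect by eye rather than as proof of corruption.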
