Never been to CodeSnippets before?

Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world (or not, you can keep them private!)

Convert a db to UTF8 after upgrading to MySQL 4.1 (See related posts)

If you've ever used a UTF8 application on a pre-4.1 MySQL server, or never cared about encodings on a 4.1 setup even, you may have a non-UTF8 database containing UTF8 data. While this doesn't bother most applications (e.g. PHP weblogs), it's not clean and you can't sort properly with any non-Western characters. This procedure will fix it:

mysqldump --user=username --password=password --default-character-set=latin1 --skip-set-charset dbname > dump.sql
chgrep latin1 utf8 dump.sql
mysql --user=username --password=password --execute="DROP DATABASE dbname; CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql


The chgrep part is important, because the table definitions in your dump will likely have "latin1" preserved. If you don't have chgrep, you may use any search-and-replace capable editor, but remember that it must open and save UTF8 properly. Edit: Instead of 'chgrep', you can use 'sed' e.g.:

sed -i "" 's/latin1/utf8/g' dump.sql


Alternatively you may attach "--skip-create-options" to the mysqldump command, but that could omit some needed options (e.g. PACK_KEYS=1 etc.).

You may change the utf8_general_ci collation to whatever you need, e.g. utf8_czech_ci for my purposes.

Edit: Fixed the typo as per the comment below.

Comments on this post

vanweerd posts on Feb 20, 2006 at 14:10
Note the typo --default_character-set in first line should b --default-character-set (as it is in line 4)
xy77 posts on Jul 18, 2006 at 11:49
It worked for me to dump the db using the command
mysqldump --user=username --password=password --default-character-set=latin1 --skip-set-charset dbname > dump.sql

emptying the db:
mysql --user=username --password=password --execute="DROP DATABASE dbname; CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"

and reimporting the undmodified dump.sql using
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql


The sed command didn't work for me, I get an error:
sed: couldn`t read s/latin1/utf8/g : No such file or directory

- David
lokus posts on Jul 20, 2006 at 07:19
I believe the instruction and method here is more straight forward than the dump and import.
http://dev.mysql.com/doc/refman/4.1/en/charset-conversion.html
sirn posts on Aug 08, 2006 at 19:56
for those who can't use the provided sed command, use this instead:

sed -r 's/latin1/utf8/g' dump.sql > dump_utf.sql
UltraKnorrie posts on Aug 05, 2008 at 11:02
A Quick add-on to this useful post.

I was using PhPMyAdmin, sincee I got no direct access to the server (mysqldump was not possible).
Here's my solution: on the old server running MySQL 3.xx use the export function to grab your data & structure (do both separately).
Now, as meentioned earlier, the charset of early MySQL DBs was by default latin1_swedish_ci.
So, on your new server running MySQL 4.xx or 5.xx (and also a newer version of PhPMyAdmin), select "latin1_swedisch_ci" as connection collation in the main screen of PhPMyAdmin.
Next, create you DB and select utf8_unicode_ci as collation.
Next, import your structure using charset = latin1
Next, import your data using charset = latin1

A browse on your tables should show that all accents are correctly stored in your database now.

Best,
Sarge posts on Mar 25, 2009 at 20:23
You can also use sed like this:
sed -i 's/latin1/utf8/g' dump.sql
kristen30 posts on Oct 16, 2009 at 09:57
The sed command didn't work for me, I get an error:

custom logo | brochure design
betssonbahis posts on Oct 24, 2009 at 08:10
Hi, my name is Betssonbahis
I have a same problem and this fix is works, thanks mamash
dew_Drops posts on Oct 25, 2009 at 14:55
and reimporting the undmodified dump.sql using
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql


The sed command didn't work for me, I get an error:
sed: couldn`t read s/latin1/utf8/g : No such file or directory

Apex Professionals
Adrian123 posts on Nov 07, 2009 at 07:52
When my grandparents were alive, they had little use for gotomeeting review, as proven by today's prevailing e car insurance. That said, it might be prudent to pay more attention to Vistaprint Promo Code, lest we fall foul of the notorious Printplace Coupon Code
danabr13 posts on Nov 19, 2009 at 15:47
I like previous comment website development packages.
Oil Quotes
zagam2 posts on Nov 27, 2009 at 05:47
Need a Tapeless Backup Solution? The SonicWALL CDP now comes in (4) models. SonicWALL CDP 110, SonicWALL CDP 210, SonicWALL CDP 5040, SonicWALL CDP 6080
zagam2 posts on Dec 09, 2009 at 08:19
Nice method of converting...i need this
Free Dating Websites
..thanks
seodanny posts on Dec 09, 2009 at 19:49
Very useful code snippet - can I suggest taking a look at our website of the day.

Also, I'd like to draw your attention to a website about quarry services and quarry plant.

Good bye.
aaronkolodjski posts on Dec 09, 2009 at 23:23
Thanks for the informative content

webhosting

You need to create an account or log in to post comments to this site.