Using UTF-8 (Unicode) in Gentoo: Difference between revisions

From Elvanör's Technical Wiki
Jump to navigation Jump to search
No edit summary
Line 20: Line 20:
This is because Linux uses the Rock Ridge extensions, whereas Windows/OS X must use the poorly designed Joliet extensions. It seems that currently mkisofs (part of cdrtools) can not deal with an input-charset of UTF-8, or at least the stable version in Gentoo cannot. But I don't know if support could be added, or if it is impossible because of some Joliet limitations. Joliet seems to use UTF-16 encoding for filenames.
This is because Linux uses the Rock Ridge extensions, whereas Windows/OS X must use the poorly designed Joliet extensions. It seems that currently mkisofs (part of cdrtools) can not deal with an input-charset of UTF-8, or at least the stable version in Gentoo cannot. But I don't know if support could be added, or if it is impossible because of some Joliet limitations. Joliet seems to use UTF-16 encoding for filenames.


Anyway, the current situation is that I can have DVDs with UTF-8 filenames, but they work correctly only with Linux. Slightly annoying.
Anyway, the current situation is that I can have DVDs with UTF-8 filenames, but they work correctly only with Linux. Slightly annoying. Maybe with more recent versions of cdrtools or cdrkit (the Debian fork of cdrtools), this issue will go away.

Revision as of 17:24, 11 December 2006

Using UTF-8 in your Gentoo system is absolutely mandatory for many reasons... This short guide contains some links to the official Gentoo UTF-8 documentation and also discussions about some potential issues with UTF-8.

Most important stuff: Gentoo documentation, and setting your locale

There are several manipulations you need to do in order to have a full UTF-8 system. Basically, however, there are 3 things to do:

  • Building UTF-8 support in your kernel;
  • Generating and using a UTF-8 locale (with a UTF-8 enabled glibc);
  • Add the "unicode" flag to your USE flags in /etc/make.conf.

The two following links will explain all that in more details.

UTF-8, ISO 9660 and Joliet extensions

A current problem I have is that I did not manage to burn a CD/DVD with filenames in UTF-8. In K3b, checking the option "Generate Rock Ridge extensions" creates a working DVD under Linux (eg, the filenames appear correctly). However, under Windows and Mac OS X the same DVD does not work (filenames appear with garbage characters).

This is because Linux uses the Rock Ridge extensions, whereas Windows/OS X must use the poorly designed Joliet extensions. It seems that currently mkisofs (part of cdrtools) can not deal with an input-charset of UTF-8, or at least the stable version in Gentoo cannot. But I don't know if support could be added, or if it is impossible because of some Joliet limitations. Joliet seems to use UTF-16 encoding for filenames.

Anyway, the current situation is that I can have DVDs with UTF-8 filenames, but they work correctly only with Linux. Slightly annoying. Maybe with more recent versions of cdrtools or cdrkit (the Debian fork of cdrtools), this issue will go away.