Displaying Unicode Text in irb
If you find that Unicode text is not showing up properly in irb, try the following (instructions for Mac OS X, rbenv):
In my case, iTerm2 was set up properly and the shell happily accepted text containing Unicode characters. irb, however, did not.
Launch irb and try pasting some Unicode text...such as "Déšť dští ve Španělsku zvlášť tam kde je pláň"
$ irb
>> "D dtve panlsku zvl tam kde je "
=> "D dtve panlsku zvl tam kde je "
Notice the missing text? All the characters with diacriticals are gone.
Check what encoding irb is using for its text.
>> puts __ENCODING__
US-ASCII
=> nil
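The same check can be made a little broader. The snippet below is a minimal sketch that prints the source encoding alongside Ruby's default external and internal encodings. The helper name `encoding_report` is made up for illustration; the `Encoding` calls are standard Ruby.

```ruby
# Hypothetical helper: gather the encodings that affect how text is read and printed.
def encoding_report
  internal = Encoding.default_internal
  {
    source:   __ENCODING__.name,               # source encoding of the code being evaluated
    external: Encoding.default_external.name,  # default for data read from IO, taken from the locale
    internal: internal ? internal.name : nil   # nil unless it has been set explicitly
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Pasting this into irb, or running it as a script, shows whether a US-ASCIIresult comes from the locale or from something else.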
Maybe we should try starting irb with the --encoding flag?
$ irb -EUTF-8
>> puts __ENCODING__
UTF-8
=> nil
This sets the encoding to UTF-8, but irb is still not happy.
Try pasting "西班牙的雨大多數落在平原上" into irb
>> " "
The solution requires recompiling Ruby by linking it against a specific library, and making sure an environment variable is set properly.
On Mac OS X, irb as shipped is compiled against libedit, which means it doesn't handle Unicode text very well. We need to recompile Ruby against readline instead of libedit. Follow the instructions set out in [this blogpost](https://coderwall.com/p/wdm-_q).
In my case,
$ cd .rbenv/versions/2.0.0-p247/lib/ruby/2.0.0/x86_64-darwin12.5.0/
$ otool -L readline.bundle
produced
/usr/lib/libedit.3.dylib (compatibility version 2.0.0, current version 3.0.0)
The reference to libedit.3.dylib shows that Ruby was using libedit instead of readline. libedit doesn't know how to deal with Unicode text, apparently.
The suggested solution is to recompile Ruby using readline instead, as described in the [ruby-build wiki](https://github.com/sstephenson/ruby-build/wiki).
$ brew install readline
$ RUBY_CONFIGURE_OPTS=--with-readline-dir="$(brew --prefix readline)" rbenv install 2.0.0-p247
After this, otool reports
/usr/local/opt/readline/lib/libreadline.6.2.dylib (compatibility version 6.0.0, current version 6.2.0)
Ruby is now using readline, and irb should now display Unicode text properly.
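The result can also be checked from inside Ruby, without otool. The sketch below relies on the fact that Ruby's C readline extension reports `Readline::VERSION` as "EditLine wrapper" when it is linked against libedit. The helper name `line_editor_backend` is made up for illustration, and treating every other value as GNU readline is an assumption that fits Ruby 2.0.0-era builds; newer Rubies may load a different line editor or none at all.

```ruby
# Hypothetical helper: classify the line editor from the reported version string.
# "EditLine wrapper" means libedit; anything else is assumed to be GNU readline.
def line_editor_backend(version)
  version.to_s.include?("EditLine") ? :libedit : :readline
end

begin
  require "readline"
  puts "Readline::VERSION = #{Readline::VERSION.inspect}"
  puts "backend: #{line_editor_backend(Readline::VERSION)}"
rescue LoadError, NameError
  # The readline extension (or its VERSION constant) is not available in this Ruby.
  puts "readline extension not available"
end
```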
If the above instructions don't solve the problem, make sure that the $LANG environment variable has a sensible value.
The $LANG environment variable is normally set depending on the Language and Region settings in System Preferences (Mac OS X).
$ echo $LANG
should print out the region setting.
In my setup, however, I use a custom "Region" setting, and $LANG was blank. I edited my .zshrc file to include the following:
export LANG="en_GB.UTF-8"
and hey presto - irb now displays Unicode text!
$ irb
>> "Es blüht so grün wie Blüten blüh'n im Frühling".encoding
=> #<Encoding:UTF-8>
>> "Lenn délen édes éjen édent remélsz".unicode_titlecase
=> "Lenn Délen Édes Éjen Édent Remélsz"
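To confirm from within Ruby that the locale is what irb expects, something like the following may help. It is a minimal sketch: the helper name `utf8_locale?` is made up for illustration, and it only looks at $LANG, as the steps above do, so it ignores any LC_ALL or LC_CTYPE override.

```ruby
# Hypothetical helper: true when a locale string such as "en_GB.UTF-8"
# names a UTF-8 codeset. An unset or empty value counts as not UTF-8.
def utf8_locale?(value)
  return false if value.nil? || value.empty?
  # Take the part after the first "." and drop any "@modifier" suffix.
  codeset = value.split(".", 2)[1].to_s.split("@", 2)[0].to_s
  codeset.casecmp("UTF-8").zero? || codeset.casecmp("UTF8").zero?
end

lang = ENV["LANG"]
puts "LANG=#{lang.inspect} utf8=#{utf8_locale?(lang)}"
puts "default_external=#{Encoding.default_external}"
```

If `utf8_locale?` reports false, setting $LANG in the shell startup file as shown above is the first thing to try.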