*mbyte.txt* For Vim version 8.2. Last change: 2019 Jul 04
VIM REFERENCE MANUAL by Bram Moolenaar et al.
Multi-byte support *multibyte* *multi-byte*
*Chinese* *Japanese* *Korean*
This is about editing text in languages which have many characters that cannot
be represented using one byte (one octet). Examples are Chinese, Japanese
and Korean. Unicode is also covered here.
For an introduction to the most common features, see |usr_45.txt| in the user
manual.
For changing the language of messages and menus see |mlang.txt|.
1. Getting started |mbyte-first|
2. Locale |mbyte-locale|
3. Encoding |mbyte-encoding|
4. Using a terminal |mbyte-terminal|
5. Fonts on X11 |mbyte-fonts-X11|
6. Fonts on MS-Windows |mbyte-fonts-MSwin|
7. Input on X11 |mbyte-XIM|
8. Input on MS-Windows |mbyte-IME|
9. Input with a keymap |mbyte-keymap|
10. Input with imactivatefunc() |mbyte-func|
11. Using UTF-8 |mbyte-utf8|
12. Overview of options |mbyte-options|
NOTE: This file contains UTF-8 characters. These may show up as strange
characters or boxes when using another encoding.
==============================================================================
1. Getting started *mbyte-first*
This is a summary of the multi-byte features in Vim. If you are lucky, it works
as described and you can start using Vim without much trouble. If something
doesn't work, you will have to read the rest. Don't be surprised if it takes
quite a bit of work and experimenting to make Vim use all the multi-byte
features. Unfortunately, every system has its own way to deal with multi-byte
languages and it is quite complicated.
LOCALE
First of all, you must make sure your current locale is set correctly. If
your system has been set up to use the language, it probably works right
away. If not, you can often make it work by setting the $LANG environment
variable in your shell (csh syntax is shown here; in sh-compatible shells use
"export LANG=ja_JP.EUC"): >
setenv LANG ja_JP.EUC
Unfortunately, the name of the locale depends on your system. Japanese might
also be called "ja_JP.EUCjp" or just "ja". To see what is currently used: >
:language
To change the locale inside Vim use: >
:language ja_JP.EUC
Vim will give an error message if this doesn't work. This is a good way to
experiment and find the locale name you want to use. But it's always better
to set the locale in the shell, so that it is used right from the start.
See |mbyte-locale| for details.
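As a quick check from inside Vim, the locale can also be inspected through the
environment and the predefined variables. This is only a sketch: the values
you see depend on your system, and "ja_JP.EUC" is the example name used above,
which your system may spell differently: >
	" What the shell passed to Vim:
	:echo $LANG
	" Locale used for messages and for character types:
	:echo v:lang
	:echo v:ctype
	" Change only the character-type locale:
	:language ctype ja_JP.EUC
<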
ENCODING
If your locale works properly, Vim will try to set the 'encoding' option
accordingly. If this doesn't work you can overrule its value: >
:set encoding=utf-8
See |encoding-values| for a list of acceptable values.
The result is that all the text that is used inside Vim will be in this
encoding. Not only the text in the buffers, but also in registers, variables,
etc. This also means that changing the value of 'encoding' makes the existing
text invalid! The text doesn't change, but it will be displayed wrong.
You can edit files in an encoding other than the one 'encoding' is set to.
Vim will convert the file when you read it and convert it back when you write
it. See 'fileencoding', 'fileencodings' and |++enc|.
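For example, the following sketch reads a file in one encoding and writes it
back in another. The file name "memo.txt" and the choice of "euc-jp" are only
illustrations, and the conversion must be supported by your Vim: >
	" Check what Vim picked up from the locale:
	:set encoding?
	" Encodings to try, in order, when reading an existing file:
	:set fileencodings=ucs-bom,utf-8,euc-jp,latin1
	" Force an encoding for one file, overriding the detection:
	:edit ++enc=euc-jp memo.txt
	" Convert the buffer to UTF-8 the next time it is written:
	:setlocal fileencoding=utf-8
	:write
<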
DISPLAY AND FONTS
If you are working in a terminal (emulator) you must make sure it accepts the
same encoding that Vim is working with. If this is not the case, you can
use the 'termencoding' option to make Vim convert text automatically.
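A minimal sketch, assuming a terminal that only handles EUC-JP while Vim works
in UTF-8 internally. Both encoding names are just an illustration, and Vim
must be able to convert between them: >
	:set encoding=utf-8
	:set termencoding=euc-jp
<
When 'termencoding' is empty, the value of 'encoding' is used for the
terminal.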
For the GUI you must select fonts that work with the current 'encoding'. This
is the difficult part. It depends on the system you are using, the locale and
a few other things. See the chapters on fonts: |mbyte-fonts-X11| for
X-Windows and |mbyte-fonts-MSwin| for MS-Windows.
For GTK+ 2, you can skip most of this section. The option 'guifontset' no
longer exists. You only need to set 'guifont' and everything should "just
work". If your system comes with Xft2 and fontconfig and the current font
does not contain a certain glyph, a different font will be used automatically
if available. The 'guifontwide' option is still supported but usually you do
not need to set it. It is only necessary if the automatic font selection does
not suit your needs.
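A sketch for the GTK+ 2 GUI. The font name is an assumption and must be
replaced by a font that is installed on your system: >
	:if has('gui_gtk2')
	:  set guifont=DejaVu\ Sans\ Mono\ 12
	:endif
<
Using ":set guifont=*" brings up a font requester, which can help to find a
name that works.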
For X11 you can set the 'guifontset' option to a list of fonts that together
cover the characters that are used. Example for Korean: >
:set guifontset=k12,r12
Alternatively, you can set 'guifont' and 'guifontwide'. 'guifont' is used for
the single-width characters, 'guifontwide' for the double-width characters.
Thus the 'guifontwide' font must be exactly twice as wide as 'guifont'.
Example for UTF-8: >
:set guifont=-misc-fixed-medium-r-normal-*-18-120-100-100-c-90-iso10646-1
:set guifontwide=-misc-fixed-medium-r-normal-*-18-120-100-100-c-180-iso10646-1
You can also set 'guifont' alone; Vim will then try to find a matching
'guifontwide' for you.
INPUT
There are several ways to enter multi-byte characters:
- For X11 XIM can be used. See |XIM|.
- For MS-Windows IME can be used. See |IME|.
- For all systems keymaps can be used. See |mbyte-keymap|.
The options 'iminsert', 'imsearch' and 'imcmdline' can be used to choose
between the different input methods or to disable them temporarily.
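A sketch of how these options might be set. The values are only illustrations;
see each option for the complete list of values: >
	" Start Insert mode with no language mappings and no input method:
	:set iminsert=0
	" Let search patterns follow the value of 'iminsert':
	:set imsearch=-1
	" Do not turn the input method on when entering the command line:
	:set noimcmdline
<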
==============================================================================
2. Locale *mbyte-locale*
The easiest setup is when your whole system uses the locale you want to work
in. But it's also possible to set the locale