author | Bram Moolenaar <Bram@vim.org> | 2017-03-29 15:31:20 +0200 |
---|---|---|
committer | Bram Moolenaar <Bram@vim.org> | 2017-03-29 15:31:20 +0200 |
commit | 0c078fc7db2902d4ccba04506db082ddbef45a8c | |
tree | 7c142af9692ea6315986e3d2239e8d3f143f6881 /runtime/doc/pattern.txt | |
parent | c6cd8409c2993b1476e123fba11cb4b8d743b896 |
patch 8.0.0519: character classes are not well tested (tag: v8.0.0519)
Problem: Character classes are not well tested. They can differ between
platforms.
Solution: Add tests. In the documentation make clear which classes depend
on what library function. Only use :cntrl: and :graph: for ASCII.
(Kazunobu Kuriyama, Dominique Pelle, closes #1560)
Update the documentation.
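The "Only use :cntrl: and :graph: for ASCII" part of the solution can be pictured as a range guard in front of the library call. The following is a minimal sketch under that reading, not Vim's actual code: the helper names `ascii_iscntrl` and `ascii_isgraph` are hypothetical and exist only for illustration.

```c
#include <ctype.h>

/*
 * Hypothetical helpers (not taken from the Vim source): consult the
 * <ctype.h> function only for ASCII values, so that bytes 128..255
 * never reach the platform- and locale-dependent tables.
 */
static int ascii_iscntrl(int c)
{
    return c >= 0 && c < 128 && iscntrl(c);
}

static int ascii_isgraph(int c)
{
    return c >= 0 && c < 128 && isgraph(c);
}
```

With a guard like this, a byte such as 0x85 or 0xA0 is rejected on every platform, whatever the C library would report for it in the current locale.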
Diffstat (limited to 'runtime/doc/pattern.txt')
-rw-r--r-- | runtime/doc/pattern.txt | 43 |
1 files changed, 26 insertions, 17 deletions
diff --git a/runtime/doc/pattern.txt b/runtime/doc/pattern.txt
index 1496604983..090ca6452e 100644
--- a/runtime/doc/pattern.txt
+++ b/runtime/doc/pattern.txt
@@ -1085,25 +1085,27 @@ x A single character, with no special meaning, matches itself
         - A character class expression is evaluated to the set of characters
           belonging to that character class.  The following character classes
           are supported:
-                  Name          Contents ~
-*[:alnum:]*       [:alnum:]     ASCII letters and digits
-*[:alpha:]*       [:alpha:]     ASCII letters
-*[:blank:]*       [:blank:]     space and tab characters
-*[:cntrl:]*       [:cntrl:]     control characters
-*[:digit:]*       [:digit:]     decimal digits
-*[:graph:]*       [:graph:]     printable characters excluding space
-*[:lower:]*       [:lower:]     lowercase letters (all letters when
+                  Name        Func      Contents ~
+*[:alnum:]*       [:alnum:]   isalnum   ASCII letters and digits
+*[:alpha:]*       [:alpha:]   isalpha   ASCII letters
+*[:blank:]*       [:blank:]             space and tab
+*[:cntrl:]*       [:cntrl:]   iscntrl   ASCII control characters
+*[:digit:]*       [:digit:]             decimal digits '0' to '9'
+*[:graph:]*       [:graph:]   isgraph   ASCII printable characters excluding
+                                        space
+*[:lower:]*       [:lower:]   (1)       lowercase letters (all letters when
                                         'ignorecase' is used)
-*[:print:]*       [:print:]     printable characters including space
-*[:punct:]*       [:punct:]     ASCII punctuation characters
-*[:space:]*       [:space:]     whitespace characters
-*[:upper:]*       [:upper:]     uppercase letters (all letters when
+*[:print:]*       [:print:]   (2)       printable characters including space
+*[:punct:]*       [:punct:]   ispunct   ASCII punctuation characters
+*[:space:]*       [:space:]             whitespace characters: space, tab, CR,
+                                        NL, vertical tab, form feed
+*[:upper:]*       [:upper:]   (3)       uppercase letters (all letters when
                                         'ignorecase' is used)
-*[:xdigit:]*      [:xdigit:]    hexadecimal digits
-*[:return:]*      [:return:]    the <CR> character
-*[:tab:]*         [:tab:]       the <Tab> character
-*[:escape:]*      [:escape:]    the <Esc> character
-*[:backspace:]*   [:backspace:] the <BS> character
+*[:xdigit:]*      [:xdigit:]            hexadecimal digits: 0-9, a-f, A-F
+*[:return:]*      [:return:]            the <CR> character
+*[:tab:]*         [:tab:]               the <Tab> character
+*[:escape:]*      [:escape:]            the <Esc> character
+*[:backspace:]*   [:backspace:]         the <BS> character
           The brackets in character class expressions are additional to the
           brackets delimiting a collection.  For example, the following is a
           plausible pattern for a UNIX filename: "[-./[:alnum:]_~]\+"  That is,
@@ -1114,6 +1116,13 @@ x A single character, with no special meaning, matches itself
           regexp engine.  See |two-engines|.  In the future these items may
           work for multi-byte characters.  For now, to get all "alpha"
           characters you can use: [[:lower:][:upper:]].
+
+          The "Func" column shows what library function is used.  The
+          implementation depends on the system.  Otherwise:
+          (1) Uses islower() for ASCII and Vim builtin rules for other
+          characters when built with the |+multi_byte| feature.
+          (2) Uses Vim builtin rules
+          (3) As with (1) but using isupper()
                                                         */[[=* *[==]*
         - An equivalence class.  This means that characters are matched that
           have almost the same meaning, e.g., when ignoring accents.  This
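The rows with an empty "Func" column ([:blank:], [:digit:], [:space:], [:xdigit:]) spell out their members in the "Contents" column, which suggests they do not depend on a library function. As a rough illustration of that reading, the membership tests could be written out directly. The function names below are hypothetical and this is a sketch, not the Vim implementation.

```c
/*
 * Hypothetical helpers mirroring the "Contents" column of the table.
 * Each test lists its members explicitly, so the result is the same on
 * every platform and in every locale.
 */
static int cls_blank(int c)
{
    return c == ' ' || c == '\t';
}

static int cls_digit(int c)
{
    return c >= '0' && c <= '9';
}

static int cls_space(int c)
{
    /* space, tab, CR, NL, vertical tab, form feed */
    return c == ' ' || c == '\t' || c == '\r'
        || c == '\n' || c == '\v' || c == '\f';
}

static int cls_xdigit(int c)
{
    /* 0-9, a-f, A-F */
    return cls_digit(c)
        || (c >= 'a' && c <= 'f')
        || (c >= 'A' && c <= 'F');
}
```

Listing the members explicitly trades a little verbosity for predictability, which fits the stated goal of the patch: classes that can differ between platforms are either pinned to ASCII or documented as depending on a named library function.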