Loads of platforms appear to have old or broken Unicode character type information and are missing widths for relatively common Unicode characters (so mbtowc() works, but wcwidth() fails). So if wcwidth() returns -1, assume a width of 1 instead of ignoring the character.
author: nicm <nicm> 2016-04-27 09:36:25 +0000
committer: nicm <nicm> 2016-04-27 09:36:25 +0000
commit: 23fdbc9ea6a6f5c93f042043f0407ed5d9bd0e5b (patch)
tree: ec0c98a2478987a3bc5e1b6ce8c243e6453c61bf /utf8.c
parent: d3546cc85cc0ed80011ec35c105ae92e1e254148 (diff)
1 file changed, 8 insertions, 0 deletions
diff --git a/utf8.c b/utf8.c
index 56281aa2..22ab62c1 100644
--- a/utf8.c
+++ b/utf8.c
@@ -119,6 +119,14 @@ utf8_width(wchar_t wc)
 	width = wcwidth(wc);
 	if (width < 0 || width > 0xff) {
 		log_debug("Unicode %04x, wcwidth() %d", wc, width);
+
+		/*
+		 * Many platforms have no width for relatively common
+		 * characters (wcwidth() returns -1); assume width 1 in this
+		 * case and hope for the best.
+		 */
+		if (width < 0)
+			return (1);
 		return (-1);
 	}
 	return (width);
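
The effect of the change can be sketched outside tmux as follows. This is only an illustration under stated assumptions: width_or_fallback() and sketch_utf8_width() are hypothetical names that do not appear in utf8.c, and the logging call is left out. The sketch mirrors the patched branch: a negative result from wcwidth() becomes a width of 1, while a result above 0xff is still rejected with -1.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* asks <wchar.h> to declare wcwidth() */
#endif
#include <wchar.h>

/*
 * Hypothetical helper, not part of tmux: map a raw wcwidth() result to
 * the width that the patched utf8_width() would return.
 */
int
width_or_fallback(int raw)
{
	if (raw < 0)
		return (1);	/* no width known to the C library: assume 1 */
	if (raw > 0xff)
		return (-1);	/* implausibly large width: still rejected */
	return (raw);
}

/* Hypothetical wrapper: apply the fallback to a real wcwidth() lookup. */
int
sketch_utf8_width(wchar_t wc)
{
	return (width_or_fallback(wcwidth(wc)));
}
```

With this shape, a character that mbtowc() decodes but wcwidth() does not know about is treated as occupying one cell instead of being dropped. The guess will be wrong for any wide or zero-width character missing from the platform's tables, which is presumably what the "hope for the best" in the comment refers to.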