fix iconv mapping of big5-hkscs characters that map to two unicode chars
author Rich Felker <dalias@aerifal.cx>
Sat, 2 Jun 2018 01:50:17 +0000 (21:50 -0400)
committer Rich Felker <dalias@aerifal.cx>
Sat, 2 Jun 2018 01:50:17 +0000 (21:50 -0400)
this case is handled with a recursive call to iconv using a
specially-constructed conversion descriptor. the constant 0 was used
as the offset for utf-8, since utf-8 appears first in the charmaps
table, but the offset used needs to point into the charmap entry, past
the name/aliases at the beginning, to the byte identifying the
encoding. as a result of this error, junk was produced.

instead, call find_charmap so we don't have to hard-code a nontrivial
offset. with this change, the code has been tested and found to work
in the case of converting the affected hkscs characters to utf-8.
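the difference between offset 0 and the offset returned by a lookup can be illustrated with a toy table. this is only a sketch under stated assumptions: toy_charmaps and toy_find_charmap are hypothetical names, the table contents are made up for illustration, and the lookup uses plain strcmp rather than whatever name matching the real find_charmap performs. the layout follows the description above: names/aliases first, then the byte identifying the encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: each entry is one or more NUL-terminated
 * names/aliases, then an empty string, then one nonzero byte that
 * identifies the encoding. The values here are illustrative only. */
static const unsigned char toy_charmaps[] =
	"utf8\0char\0\0\310"
	"wchart\0\0\306"
	"ascii\0usascii\0\0\307";

/* Return the offset of the identifying byte for the entry that lists
 * `name`, or (size_t)-1 if no entry lists it. */
static size_t toy_find_charmap(const char *name)
{
	const unsigned char *s = toy_charmaps;
	while (*s) {
		int match = 0;
		/* Walk every name/alias of the current entry. */
		for (; *s; s += strlen((const char *)s) + 1)
			if (!strcmp(name, (const char *)s)) match = 1;
		/* s now points at the empty string that ends the name
		 * list; the identifying byte is the next byte. */
		if (match) return (size_t)(s + 1 - toy_charmaps);
		s += 2; /* skip the empty string and the identifying byte */
	}
	return (size_t)-1;
}
```

in this toy table, offset 0 is the letter 'u' of the name "utf8", while toy_find_charmap("utf8") yields the offset of the identifying byte after "utf8" and "char". passing 0 where the identifying byte's offset is expected therefore makes the consumer read a name character as if it were an encoding type.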

src/locale/iconv.c

index 3a34395cfad8b091a5bec186f3f22398dedbc8de..05d4209561dbbf63db5e10152fff7d52530f38c8 100644
@@ -461,7 +461,7 @@ size_t iconv(iconv_t cd, char **restrict in, size_t *restrict inb, char **restri
                                        if (totype-0300U > 8) k = 2;
                                        else k = "\10\4\4\10\4\4\10\2\4"[totype-0300];
                                        if (k > *outb) goto toobig;
-                                       x += iconv(combine_to_from(to, 0),
+                                       x += iconv(combine_to_from(to, find_charmap("utf8")),
                                                &(char *){"\303\212\314\204"
                                                "\303\212\314\214"
                                                "\303\252\314\204"