fix output size handling for multi-unicode-char big5-hkscs characters
author Rich Felker <dalias@aerifal.cx>
Sat, 2 Jun 2018 02:05:48 +0000 (22:05 -0400)
committer Rich Felker <dalias@aerifal.cx>
Sat, 2 Jun 2018 02:05:48 +0000 (22:05 -0400)
since this iconv implementation's output is stateless, it's necessary
to know before writing anything to the output buffer whether the
conversion of the current input character will fit.

previously we used a hard-coded table of the output size needed for
each supported output encoding, but failed to update the table when
adding support for conversion to jis-based encodings and again when
adding separate encoding identifiers for implicit-endianness utf-16/32
and ucs-2/4 variants, resulting in out-of-bounds table reads and
incorrect size checks. no buffer overflow was possible, but the
affected characters could be converted incorrectly, and iconv could
potentially produce an incorrect return value as a result.

remove the hard-coded table, and instead perform the recursive iconv
conversion to a temporary buffer, measuring the output size and
transferring it to the actual output buffer only if the whole
converted result fits.
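
the convert-then-measure approach described above can be sketched in isolation roughly as follows. this is only an illustration, not the code from the patch: encode_utf8 and put_char are hypothetical names, and a small utf-8 encoder stands in for the recursive iconv call. the point is that the output pointer and remaining-size counter are touched only once the full converted length is known to fit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for the recursive conversion: encode one code
 * point (assumed <= 0x10ffff) as utf-8 into buf, which must hold at
 * least 4 bytes, and return the number of bytes written. */
static size_t encode_utf8(unsigned cp, char *buf)
{
	if (cp < 0x80) {
		buf[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		buf[0] = (char)(0xc0 | cp>>6);
		buf[1] = (char)(0x80 | (cp & 0x3f));
		return 2;
	}
	if (cp < 0x10000) {
		buf[0] = (char)(0xe0 | cp>>12);
		buf[1] = (char)(0x80 | (cp>>6 & 0x3f));
		buf[2] = (char)(0x80 | (cp & 0x3f));
		return 3;
	}
	buf[0] = (char)(0xf0 | cp>>18);
	buf[1] = (char)(0x80 | (cp>>12 & 0x3f));
	buf[2] = (char)(0x80 | (cp>>6 & 0x3f));
	buf[3] = (char)(0x80 | (cp & 0x3f));
	return 4;
}

/* convert into a scratch buffer first, measure the result, and copy it
 * to the caller's buffer only if all of it fits. returns 0 on success,
 * or -1 with *out and *outb left unchanged if there is too little room. */
static int put_char(unsigned cp, char **out, size_t *outb)
{
	char tmp[4];
	size_t len = encode_utf8(cp, tmp);

	assert(len <= sizeof tmp);
	if (len > *outb) return -1;   /* nothing has been written yet */
	memcpy(*out, tmp, len);
	*out += len;
	*outb -= len;
	return 0;
}
```

because the size comes from the conversion itself rather than from a per-encoding table, there is nothing to keep in sync when new output encodings are added.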

src/locale/iconv.c

index 05d4209561dbbf63db5e10152fff7d52530f38c8..3047c27b2e8b36561dd8feba244f274492be1693 100644
@@ -458,16 +458,24 @@ size_t iconv(iconv_t cd, char **restrict in, size_t *restrict inb, char **restri
                                 * range in the hkscs table then hard-coded
                                 * here. Ugly, yes. */
                                if (c/256 == 0xdc) {
-                                       if (totype-0300U > 8) k = 2;
-                                       else k = "\10\4\4\10\4\4\10\2\4"[totype-0300];
-                                       if (k > *outb) goto toobig;
-                                       x += iconv(combine_to_from(to, find_charmap("utf8")),
+                                       union {
+                                               char c[8];
+                                               wchar_t wc[2];
+                                       } tmp;
+                                       char *ptmp = tmp.c;
+                                       size_t tmpx = iconv(combine_to_from(to, find_charmap("utf8")),
                                                &(char *){"\303\212\314\204"
                                                "\303\212\314\214"
                                                "\303\252\314\204"
                                                "\303\252\314\214"
                                                +c%256}, &(size_t){4},
-                                               out, outb);
+                                               &ptmp, &(size_t){sizeof tmp});
+                                       size_t tmplen = ptmp - tmp.c;
+                                       if (tmplen > *outb) goto toobig;
+                                       if (tmpx) x++;
+                                       memcpy(*out, &tmp, tmplen);
+                                       *out += tmplen;
+                                       *outb -= tmplen;
                                        continue;
                                }
                                if (!c) goto ilseq;