[Q] Handle bytes in the range 0x80-0xC0 better when dealing with ISO-IR 196.

Thu Nov 23 13:02:33 EST 2006

Aidan Kehoe writes:

 > Well, David's problem is an actual problem.

David's problem is also reasonably easy to work around.  Read TeX's
output as binary (which is what it is, of course), walk up the buffer
or string until you hit a legal UTF-8 first byte, then decode from
there on.  It's a little bit harder than that, but not much.
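
A rough sketch of that workaround, in Python rather than Lisp purely
for illustration.  The function names are made up for this example,
and it assumes the stray bytes sit only at the start of the data.
Treating 0xC2-0xF4 as the legal multi-byte lead bytes follows RFC 3629;
the errors="replace" fallback is one guess at the "little bit harder"
part (for example, a lead byte whose sequence is cut off at the end).

```python
def is_utf8_lead_byte(b: int) -> bool:
    """Return True if b can legally start a UTF-8 sequence.

    ASCII bytes (< 0x80) stand alone; 0xC2-0xF4 start multi-byte
    sequences.  0x80-0xBF are continuation bytes, and 0xC0, 0xC1 and
    0xF5-0xFF never appear in well-formed UTF-8.
    """
    return b < 0x80 or 0xC2 <= b <= 0xF4


def decode_from_first_lead(data: bytes) -> str:
    """Skip leading bytes until a legal first byte, then decode the rest."""
    start = 0
    while start < len(data) and not is_utf8_lead_byte(data[start]):
        start += 1
    # Assumption: anything still malformed after the skip (such as a
    # sequence truncated at the end) becomes U+FFFD instead of raising.
    return data[start:].decode("utf-8", errors="replace")
```

For instance, decode_from_first_lead(b"\xa9 caf\xc3\xa9") drops the
orphaned continuation byte 0xA9 and decodes the remainder normally.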

 > And the many ways in which we fail
 > http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt is
 > something that annoys the purist in me immensely--implementing
 > such a thing would address both.

I see how it addresses David's complaint, although it's way overkill
for that purpose, especially if you actually implement all the display
and chartable stuff.  I fail to see how it addresses failures on
Markus Kuhn's test.  We still won't be able to display that whole
file.