Bug in ISO-2022 codec

Stephen J. Turnbull stephen at xemacs.org
Tue Jan 9 20:24:03 EST 2007


The ISO-2022 codec was recently enhanced to handle UTF-8 (using DOCS,
if I'm not mistaken), but the support is broken: the encoder produces
codes that the decoder barfs on, signaling this error (tripped in
Funicode_to_char):

Wrong type argument: natnump, -205385933

(Presumably that value is specific to the particular data.)
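
For anyone who wants to look at the ungzipped sample at the byte level, here
is a minimal Python sketch. It assumes the encoder brackets UTF-8 runs with
the standard ISO 2022 DOCS escapes, ESC % G (switch to UTF-8) and ESC % @
(return to ISO 2022); whether the XEmacs codec emits exactly these sequences
is an assumption, and the function names are hypothetical.

```python
# Assumed DOCS escape sequences (not confirmed against the XEmacs source).
ESC_TO_UTF8 = b"\x1b%G"     # ESC % G: switch to UTF-8
ESC_TO_ISO2022 = b"\x1b%@"  # ESC % @: return to ISO 2022


def find_docs_runs(data: bytes):
    """Return (start, end, payload) for each ESC % G ... ESC % @ run."""
    runs = []
    pos = 0
    while True:
        start = data.find(ESC_TO_UTF8, pos)
        if start < 0:
            break
        body = start + len(ESC_TO_UTF8)
        end = data.find(ESC_TO_ISO2022, body)
        if end < 0:
            # Unterminated run: treat the rest of the stream as payload.
            runs.append((start, len(data), data[body:]))
            break
        pos = end + len(ESC_TO_ISO2022)
        runs.append((start, pos, data[body:end]))
    return runs


def check_utf8(payload: bytes):
    """Return None if payload is valid UTF-8, else the offset of the bad byte."""
    try:
        payload.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        return exc.start
```

Running check_utf8 over each payload returned by find_docs_runs would show
whether the bytes inside a run are well-formed UTF-8, which helps separate an
encoder problem from a decoder problem.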

Here is a sample file (gzipped) that I know produces the error for me
(on Mac OS X/PPC 10.4, XEmacs 21.5.27 +CVS-20060105).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: for-aidan.gz
Type: application/octet-stream
Size: 224 bytes
Desc: file that decoder barfs on
Url : http://lists.xemacs.org/pipermail/xemacs-beta/attachments/20070110/4d89d96d/for-aidan.obj
-------------- next part --------------

Here is the best I can do to reproduce the original data (gzipped)
that was encoded by XEmacs (probably a build from early December):

-------------- next part --------------
A non-text attachment was scrubbed...
Name: for-aidan.bin.gz
Type: application/octet-stream
Size: 238 bytes
Desc: file similar to the original UTF-8 stream
Url : http://lists.xemacs.org/pipermail/xemacs-beta/attachments/20070110/4d89d96d/for-aidan.bin.obj
-------------- next part --------------

Originally this was part of a multi-megabyte shell buffer that I save
as a log of my work; I don't have the original of that, so these
cut-down versions are the best I can do.

There's a different, higher-level bug as well.  If you read the
error-producing file as binary, XEmacs reads it in without complaint.
Then select the ISO 2022 DOCS-encoded text, tell XEmacs to decode it
as iso-2022-jp using decode-coding-region, and *poof* it disappears!
Neat trick, eh?  It does signal the same error as above, so the codec
bug is involved.

However, this data corruption is a higher-level problem; apparently
decode-coding-region deletes the text before decoding it, or
something like that.  This is bad for several reasons; in particular,
it violates invariants that extents and markers are supposed to
satisfy.
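
To make the ordering concern concrete, here is a minimal Python sketch of
decoding a region so that a decoder error leaves the text untouched.  It is
an illustration only, not how XEmacs implements decode-coding-region: the
buffer is modeled as a bytearray, and decode_region and its decode argument
are hypothetical names.

```python
def decode_region(buf: bytearray, start: int, end: int, decode) -> int:
    """Replace buf[start:end] with its decoded form, re-encoded as UTF-8.

    The region is decoded into a temporary first.  If decode() raises,
    buf has not been modified, so nothing disappears on error.
    """
    original = bytes(buf[start:end])
    decoded = decode(original)           # may raise; buf is still intact
    replacement = decoded.encode("utf-8")
    buf[start:end] = replacement         # splice only after success
    return len(replacement)
```

With this ordering a failing decoder propagates its error to the caller
while the region keeps its original bytes; deleting first and decoding
afterwards would lose the text exactly as described above.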


