decoding from unicode early

Ilya N. Golubev gin
Tue Nov 7 14:28:45 EST 2006


After the changes from 2006-11-02 to 2006-11-06, including the `src'
change of 2006-11-05, ucs -> emchar conversion for ucs code point
0x2500 is no longer set up as it was before.

Before the change, the following would work.

. Create `\342\224\200' text, that is, the utf-8 representation of
that character.

. Call `decode-coding-region utf-8' on it.

Before the change the result would be text containing the
`(chinese-gb2312 41 36)' emchar, not only after dumping, but even
before it, when loading `cyrillic.el'.  After the change the same
decoding operation tries to create a `jit-ucs-charset-0' emchar, which
triggers the bug described in <82y7qnfa4w.fsf at mo.msk.ru> (<unicode
jit emchars broken>).
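
As an independent check of the byte sequence used in the steps above,
here is a minimal Python sketch (not xemacs code) that decodes the
three octal-escaped bytes and looks up the matching gb2312 position.
The variable names are made up for illustration.  It assumes that the
two numbers in the emchar notation are the gb2312 bytes with the high
bit cleared; that reading is not confirmed by the report.

```python
import unicodedata

# The three octal-escaped bytes quoted in the report.
raw = bytes([0o342, 0o224, 0o200])

# Decode them as UTF-8 and recover the code point.
char = raw.decode("utf-8")
code_point = ord(char)

# Assumption: the two numbers in the emchar notation are the GB2312
# bytes with the high bit cleared (EUC-CN byte minus 0x80).
gb_bytes = char.encode("gb2312")
gb_position = tuple(b - 0x80 for b in gb_bytes)

print(hex(code_point), unicodedata.name(char), gb_position)
```

Under that assumption the pair printed last should correspond to the
`41 36' in the emchar above.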

The default sequence of elisp files dumped in 21.5 does not decode
from unicode that early.  Site packages, however, may easily do so.
(Let alone a slight modification of coding system data in dumped core
elisp files.)  I actually have one build configuration doing that.  At
the least, we need documentation saying from what point in xemacs
initialization before dumping we can rely on unicode decoding to work
without errors.  It does not happen just after loading `unicode.el',
as one might reasonably assume.  Certainly it would be more robust to
get unicode decoding running just as early as before.

Please fix.


