XEmacs cannot display U+FFFD (REPLACEMENT CHARACTER) correctly
Aidan Kehoe
kehoea at parhasard.net
Sun Jul 22 15:38:04 EDT 2007
On the twentieth day of July, Mike FABIAN wrote:
> XEmacs 21.5.x cannot display U+FFFD correctly. In "normal" text files, a
> wrong glyph is shown, in web-pages viewed with w3m.el only garbage
> Chinese characters are shown after the first occurrence of U+FFFD.
>
> The problem seems to be that XEmacs maps U+FFFD to Big5:
>
> (split-char (string-to-char (decode-coding-string "\357\277\275" 'utf-8)))
> => (chinese-big5-1 35 110)
>
> and the reason for this seems to be that BIG5.TXT in the XEmacs sources
> (which comes originally from Unicode.org) maps several Big5 characters to
> U+FFFD.
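
To illustrate the failure mode described above, here is a small Python sketch (not XEmacs code). It checks that the octal escapes in the quoted expression are the UTF-8 encoding of U+FFFD, then shows how naively inverting a table in which several Big5 codes share one Unicode target makes U+FFFD resolve to a Big5 character. The table-building logic is an assumption made for illustration; the real XEmacs table loader may work differently and may pick a different Big5 code.

```python
# The octal escapes \357\277\275 are the bytes EF BF BD, i.e. UTF-8 for U+FFFD.
raw = b"\357\277\275"
decoded = raw.decode("utf-8")

# Toy forward table: the seven Big5 codes discussed in this thread,
# all mapped to U+FFFD as in BIG5.TXT.
big5_codes = (0xA15A, 0xA1C3, 0xA1C5, 0xA1FE, 0xA240, 0xA2CC, 0xA2CE)
big5_to_ucs = {code: 0xFFFD for code in big5_codes}

# Naive inversion (an assumed strategy, for illustration only):
# seven entries collapse into one, so U+FFFD now "belongs" to Big5.
ucs_to_big5 = {ucs: code for code, ucs in big5_to_ucs.items()}

print(f"U+{ord(decoded):04X} decodes from {raw!r}")
print(f"reverse table has {len(ucs_to_big5)} entry for {len(big5_to_ucs)} Big5 codes")
```

With a table like this, the replacement character can no longer be told apart from a Big5 character once it has been decoded, and the seven Big5 codes cannot round-trip.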
Okay, for the sake of round-trip compatibility (especially with a future
Unicode-oriented XEmacs), we should map those 7 characters either to a
private-use area (and one outside of the BMP) or an area outside of
Unicode. Do you have any objections to the following mapping?
0xA15A => U+FA15A
0xA1C3 => U+FA1C3
0xA1C5 => U+FA1C5
0xA1FE => U+FA1FE
0xA240 => U+FA240
0xA2CC => U+FA2CC
0xA2CE => U+FA2CE
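
As a sanity check on the proposed mapping, here is a short Python sketch. It assumes the rule behind the list is simply "add 0xF0000 to the Big5 code"; the helper name `proposed_code_point` is hypothetical. It verifies that every target lies outside the BMP, falls inside Supplementary Private Use Area-A (U+F0000 to U+FFFFD), and that the mapping is reversible.

```python
# Big5 codes that BIG5.TXT maps to U+FFFD (the seven listed above).
BIG5_CODES = [0xA15A, 0xA1C3, 0xA1C5, 0xA1FE, 0xA240, 0xA2CC, 0xA2CE]

# Supplementary Private Use Area-A occupies plane 15.
PUA_A_START, PUA_A_END = 0xF0000, 0xFFFFD


def proposed_code_point(big5_code):
    """Hypothetical helper: place the Big5 code in plane 15 (0xA15A -> U+FA15A)."""
    return 0xF0000 + big5_code


mapping = {code: proposed_code_point(code) for code in BIG5_CODES}

for big5, ucs in mapping.items():
    assert ucs > 0xFFFF                      # outside the BMP
    assert PUA_A_START <= ucs <= PUA_A_END   # inside a private-use area
    print(f"0x{big5:04X} => U+{ucs:05X}")

# Each Big5 code gets its own target, so the mapping inverts cleanly.
reverse = {ucs: big5 for big5, ucs in mapping.items()}
assert all(reverse[mapping[code]] == code for code in BIG5_CODES)
```

Because each target is distinct and none of them is U+FFFD, the replacement character is left free of any Big5 association.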
I’m not sure that this will solve the issue with W3M; I don’t have a URL to
test that with.
--
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)