Unicodification of sources, part 1

Tue Jun 20 02:40:06 EDT 2006

 On the nineteenth of June, Stephen J. Turnbull wrote: 

 > [...] Comments?  Objections?  Obstacles I've overlooked?

The reason I haven’t proposed anything remotely like this up to now is that
once we’re at the point where the release branch has been stable for a while
and has good support for UTF-8, we can deprecate 21.4 and do all the things
you listed in a week, without the need for a parallel set of Mule
packages. Which would seem to be more economical of effort.

Now, the pragmatic issue with that is that there won’t be a release--I’m the
only person who’s remotely a release engineer candidate, and when I devote
all my spare cycles to the editor, I’m told to slow down, to take my
time. Which I do, but there’s an inconsistency in the idea of my doing
things slowly and my concurrently doing enough work for a release.

And there are several UTF-8 input methods in the GNU Emacs tree today that I
would love to import, but that will break in 21.4--e.g. latin-ltx.el, which
allows generating almost any character supported by TeX with its TeX escape,
or sgml-input.el, which is the same idea but using SGML escapes instead.
Mule-UCS doesn’t support lots of the code points used there, and I would be
very uncomfortable saying to people “yeah, here’s a great input method in
the packages tree, but it’ll break on the stable branch.”

So, I’m tentatively for that plan, given that someone else has proposed it
and is unlikely to veto it on the grounds I put forward above. Mike’s point
about the impact on 21.4 equally applies to SXEmacs; I haven’t looked into
how much work it would be, but I think the best way to support Unicode on
21.4 and SXEmacs would be to port over Ben’s 21.5 Unicode support, since
that’s much less demanding of memory and currently more complete in its
support for code points than is Mule-UCS. But Norbert can reasonably veto
that for XEmacs 21.4.

(From the SXEmacs Unix-only perspective, one alternative _might_ be to find
an ISO 2022-compatible coding system supported by GNU iconv that does what
my recent changes to the escape-quoted coding system do; that is,
represent Unicode code points that don’t have a known ISO 2022 mapping using
UTF-8 escapes. Then one could replace the guts of the Unicode coding systems
with iconv() calls. But I couldn’t find such an iconv coding system easily;
one may well exist, though.)
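
To make the "UTF-8 escapes" idea concrete, here is a minimal sketch in
Python. It is not the XEmacs escape-quoted implementation, and it uses
Python's iso2022_jp codec as a stand-in for an iconv coding system. The
function name is hypothetical, and the choice of ESC % G / ESC % @ (the
ISO 2022 sequences for switching to UTF-8 and back) is an assumption about
how such escapes could be framed.

```python
from itertools import groupby

# Assumption: frame UTF-8 runs with the ISO 2022 "other coding system"
# sequences. ESC % G switches to UTF-8, ESC % @ returns to ISO 2022.
UTF8_ON = b"\x1b%G"
UTF8_OFF = b"\x1b%@"


def _encodable(ch, codec):
    """Return True if a single character has a mapping in `codec`."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def encode_with_utf8_escapes(text, codec="iso2022_jp"):
    """Encode `text` with `codec`, wrapping unmappable runs as UTF-8.

    Hypothetical helper: consecutive characters are grouped by whether
    the base codec can represent them. Mappable runs go through the base
    codec; the rest are emitted as UTF-8 between the two escapes.
    """
    out = bytearray()
    for ok, run in groupby(text, key=lambda ch: _encodable(ch, codec)):
        chunk = "".join(run)
        if ok:
            # Python's codec returns to ASCII at the end of each call,
            # so every chunk leaves the stream in a known state.
            out += chunk.encode(codec)
        else:
            out += UTF8_ON + chunk.encode("utf-8") + UTF8_OFF
    return bytes(out)


if __name__ == "__main__":
    # The euro sign has no mapping in iso2022_jp, so it gets escaped.
    print(encode_with_utf8_escapes("a\u20acb"))
```

A real replacement built on iconv() would also need the matching decoder,
and would have to agree with whatever escape convention the existing
coding system already writes to disk.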

-- 
Santa Maradona, priez pour moi!