"set-language-environment" should set "language-unicode-precedence-list"

Mike FABIAN mfabian at suse.de
Wed Aug 1 10:18:02 EDT 2007


Aidan Kehoe <kehoea at parhasard.net> wrote:

>  On the first day of August, Mike FABIAN wrote:
>
>  > Aidan Kehoe <kehoea at parhasard.net> wrote:
>  >
>  > > The reason there isn’t a German (UTF-8) locale available when you start
>  > > in a Japanese UTF-8 locale, is that there is such a wide and legitimate
>  > > variation in the coding systems used with a given language under
>  > > Unix--English in ISO 8859-1 vs. English in ISO 8859-15 vs. English in
>  > > UTF-8 vs. English in CP1252, to pick a simple example--that it makes
>  > > more sense to pick one as canonical
>  > 
>  > But then I would like to have the UTF-8 variants as the canonical ones
>  > because UTF-8 is the default on the systems I use.
>
> Should they be the ones picked up if LANG is just "de" or "ja" ?

No. But on the systems I know, LANG=de is not allowed anyway:

mfabian at magellan:~$ LANG=de locale charmap
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
ANSI_X3.4-1968
mfabian at magellan:~$ 

> That’s what I meant by “canonical,” the one chosen if language
> information but not character set information is available from the
> environment.

Is this possible on POSIX systems? I thought the character set
information is always available from the environment on
POSIX systems: 

mfabian at magellan:~$ LANG=de_DE locale charmap
ISO-8859-1
mfabian at magellan:~$ LANG=de_DE.UTF-8 locale charmap
UTF-8
mfabian at magellan:~$ 
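
To make the distinction concrete, here is a minimal sketch (not from the original exchange) that tells a locale name with an explicit codeset apart from one without. The helper name explicit_codeset is hypothetical, and it only parses the name; for a name without a codeset suffix, the charmap still comes from the system's locale database, as "locale charmap" shows above.

```shell
#!/bin/sh
# Hypothetical helper: print the codeset spelled out in a POSIX locale
# name of the form language[_territory][.codeset][@modifier].
# Prints an empty line when the name carries no explicit codeset.
explicit_codeset() {
    name=$1
    case $name in
        *.*)
            codeset=${name#*.}      # drop everything up to the first dot
            codeset=${codeset%%@*}  # drop a trailing @modifier, if any
            printf '%s\n' "$codeset"
            ;;
        *)
            printf '\n'
            ;;
    esac
}

# Show which of these names state their codeset themselves.
for name in de de_DE de_DE.UTF-8 de_DE.ISO-8859-15@euro; do
    printf '%s -> "%s"\n' "$name" "$(explicit_codeset "$name")"
done
```

An empty result only means the name itself does not say; it does not mean the environment has no character set.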

>  > Having UTF-8 only for the language I use during startup and not for the
>  > others makes changing the language-environment totally useless because
>  > it messes up the encoding for external processes (and even for the
>  > XEmacs menu bar when opening new frames with “C-x 5 2”!).
>  > 
>  > > and create a variant if a corresponding LANG or LC_CTYPE is seen. 
>  > > If you want a German UTF-8 locale despite not having LANG as de_??.UTF-8 in
>  > > your environment at startup, call:
>  > >
>  > >   (set-language-environment
>  > >     (create-variant-language-environment "German" 'utf-8))
>  > 
>  > Nice, I can use this in site-start.el for the time being.
>
> OK. But if your users start up in the appropriate locale, as most of them
> will, that shouldn’t be necessary.

It makes switching between language environments after startup possible
without messing up the encoding. Switching the language-environment
after startup may be useful because it influences the
language-unicode-precedence-list, the input-method, ...

For example, one might want to switch between the language environments
"Japanese (UTF-8)" and "Chinese-BIG5 (UTF-8)" to change the priority of
fonts temporarily.

If all other language-environments except the one set during startup are
not available in UTF-8, changing after startup cannot work right because
it messes up the encoding.

If the user is not supposed to call set-language-environment after
startup, then why is this function interactive?



>  > > set-language-environment is supposed to be a cross-platform API, so my
>  > > feeling is that we shouldn’t do this automatic generation of new language
>  > > environments within it for what is essentially a Unix quirk;
>  > 
>  > But if it is a Unix quirk, then why not do it on Unix?
>
> For the sake of a consistent cross-platform API :-) . Unexpected behaviour 
> makes things harder to maintain, normally.

But what is the problem with doing something like

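;; Create the missing UTF-8 variants of all language environments.
;; mapc rather than mapcar: only the side effect matters here.
(mapc
 (lambda (x)
   (let ((langenv (car x)))
     ;; Skip environments that already are UTF-8 variants.
     (unless (string-match "UTF-8" langenv)
       (create-variant-language-environment langenv 'utf-8))))
 language-info-alist)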

only when starting on a Unix-like system?

>  > > the system locales under Windows are basically fixed in the coding
>  > > systems they use.  But if you think otherwise, say so, and I’ll think
>  > > about it some more.
>  > 
>  > I wouldn't mind having several possible values for language-environments
>  > for each language. Do you think that will confuse people?
>
> I have 156 POSIX locales on this FreeBSD machine; but only 41 separate
> languages are used in those locales. Maintaining around 40 language
> environments within XEmacs is more practical than maintaining around 200
> (since I’m sure FreeBSD omits plenty).

Is it necessary to maintain the UTF-8 language environments separately
when they can be generated so easily with
‘create-variant-language-environment’?

Or am I missing something? Is the call to that function not enough?

> It’s not really a question of user confusion if the user gets the correct
> language environment by default, which most of them should if their POSIX
> locale is correctly set.

Yes.

> Would adding information to the set-language-environment docstring about
> create-variant-language-environment make things easier? 

Adding this info to the docstring is a good idea.

But I still don't understand why one should not generate all "(UTF-8)"
variants by default if it can be easily done without having to maintain
the variants separately.

>  > The confusion is already there because all these locales exist
>  > under Unix, having them available in the language-environments doesn’t
>  > make it any worse.
>
> It makes it worse on Windows, though. And yes, Windows isn’t your problem.

But one could omit generating the variants on Windows. 

>  > If one absolutely wants to reduce confusion by reducing the number of
>  > choices, then I think the non-UTF-8 choices could be removed if one
>  > starts in a UTF-8 locale because in that case the non-UTF-8 ones are
>  > almost useless anyway.

-- 
Mike FABIAN   <mfabian at suse.de>   http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
I ♥ Unicode


