bug in file-name-coding-system detection [was: font-lock-fontify-* ...]
Aidan Kehoe
kehoea at parhasard.net
Wed Jan 10 04:31:08 EST 2007
On the tenth of January, Stephen J. Turnbull wrote:
> Aidan Kehoe writes:
>
> > Unless your ~/.xemacs/init.el already handles what coding system file
> > names are in, you probably don’t want to remove the package. The
> > change I made that provoked the new behaviour on your machine added
> > support for sniffing what encoding file names were in, which should be
> > beneficial for you.
>
> Excuse me? "Sniffing the encoding of a file *name*"? Surely you mean
> your patch for "determining the system's file-name-coding-system"?
Tomato, tomato :-) .
> And this is in the *locale* package? Aidan, that's wrong; the locale
> package was intended to be data-only, with only the code needed to
> load the data. This kind of basic functionality should be in
> mule-base, or even core.
>
> And now I know who to blame for the fact that suddenly I can no longer
> reliably read Japanese file names in UTF-8. (Part of the blame goes
> to Mac OS X, which doesn't set the locale, but has a whole separate
> set of internationalization functions---this confuses all Unix
> software, of course, even ls in an Apple Terminal.)
echo "(define-coding-system-alias 'file-name 'utf-8)" >> ~/.xemacs/init.el
As I said, I don’t have access to an OS X machine. Having a system-specific
hard-coding of the file-name coding-system alias is the right thing to do
there, but if I implement it without being able to test it, I’ll get it
wrong.
> The point is (as I've said before) that the POSIX locale is *not* a
> sufficiently reliable way to determine file-name-coding-system.
And as I said in lisp/mule-cmds.el and in email,
;; On Unix--with the exception of Mac OS X--there is no way to
;; know for certain what coding system to use for file names, and
;; the environment is the best guess. If a particular user's
;; preferences differ from this, then that particular user needs
;; to edit ~/.xemacs/init.el. Aidan Kehoe, Sun Nov 26 18:11:31 CET
;; 2006. OS X uses an almost-normal-form version of UTF-8.
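To make "the environment is the best guess" concrete, here is a rough shell sketch of reading a codeset out of the POSIX locale variables. It is not the code in lisp/mule-cmds.el; the function names effective_ctype and codeset_of are made up for illustration, and the only thing it relies on is the standard POSIX precedence of LC_ALL over LC_CTYPE over LANG.

```shell
# Hypothetical helpers for illustration; not part of XEmacs.

# Print the locale name that governs LC_CTYPE, using the POSIX precedence
# LC_ALL, then LC_CTYPE, then LANG (empty values count as unset).
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# Print the codeset part of a locale name such as de_DE.UTF-8@euro,
# or nothing when the name carries no codeset (C, POSIX, de_DE).
codeset_of() {
    case $1 in
        *.*) cs=${1#*.}                  # drop "language_TERRITORY."
             printf '%s\n' "${cs%%@*}"   # drop any "@modifier"
             ;;
    esac
}

# Best guess for the current environment; may print nothing at all.
codeset_of "$(effective_ctype)"
```

When the locale name has no codeset, the sketch prints nothing, which is exactly the case where a guess is weakest and an explicit setting in ~/.xemacs/init.el is the safer choice.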
> The user can set the locale but at least on Mac OS X HFS+ that doesn't
> affect the file system's encoding, it stays canonically decomposed UTF-8
> (and will barf on, eg, ISO 8859/2). On the other hand, on most Unix file
> systems, a file name is simply a binary blob, that happens to be human
> readable most of the time.
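As a small illustration of why the decomposed form matters, the following shell sketch spells the same visible name in precomposed and in decomposed UTF-8 and compares the bytes. The variable names nfc and nfd are only labels chosen here; nothing in it touches a real file system.

```shell
# The same visible name, "café", in two Unicode normalization forms.
nfc=$(printf 'caf\303\251')     # precomposed: U+00E9
nfd=$(printf 'cafe\314\201')    # decomposed:  e followed by U+0301

# Compare as raw bytes; the two spellings are not equal.
if [ "$nfc" = "$nfd" ]; then
    echo "same bytes"
else
    echo "different bytes"
fi

# Byte lengths differ as well (5 versus 6).
printf '%s' "$nfc" | wc -c
printf '%s' "$nfd" | wc -c
```

On a file system that treats names as binary blobs, these would be two distinct names that look identical on screen.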
>
> Also, something that sniffs file-name-coding-system should definitely
> *not* affect user interface.
As I followed up to Wulf, I was wrong in that. What seems to have happened
is that the improved POSIX locale handling picked up that de_DE.UTF-8 was a
German locale where it didn’t before, and mule-packages/locale just paid
attention to that. If his LC_CTYPE had been de_DE all along, he would have
had his menus in incompletely-translated German all along.
Our language environment model is not as fine-grained as that of POSIX. For
working out which language to use on Unix, we pay attention to LC_CTYPE and
nothing else. If you can suggest a better approach to this, that is also
compatible with language environment treatment on Windows, where the
granularity is different, have at it.
--
When I was in the scouts, the leader told me to pitch a tent. I couldn't
find any pitch, so I used creosote.