21.5 mule: Latin-2(polish) - wrong coding system identification
Aidan Kehoe
kehoea at parhasard.net
Wed May 14 06:15:43 EDT 2008
Hi, Krzysztof, and sorry about the delay --
This is my bug, I think. If you have a second, can you apply the following
patch, and check if the problem still happens?
diff -r 49f8ed034500 lisp/ChangeLog
--- a/lisp/ChangeLog Mon May 12 11:53:04 2008 +0200
+++ b/lisp/ChangeLog Wed May 14 12:12:09 2008 +0200
@@ -1,3 +1,9 @@ 2008-05-11 Aidan Kehoe <kehoea at parhasa
+2008-05-14 Aidan Kehoe <kehoea at parhasard.net>
+
+        * mule/mule-coding.el (make-8-bit-choose-category):
+        Control-1 characters extend from #x80 to #x9F (inclusive),
+        not from #x80 to #xBF.
+
 2008-05-11 Aidan Kehoe <kehoea at parhasard.net>
 
         * disp-table.el (make-display-table):
diff -r 49f8ed034500 lisp/mule/mule-coding.el
--- a/lisp/mule/mule-coding.el Mon May 12 11:53:04 2008 +0200
+++ b/lisp/mule/mule-coding.el Wed May 14 12:12:09 2008 +0200
@@ -533,7 +533,7 @@ disk to XEmacs characters for some fixed
   (check-argument-range (length decode-table) #x100 #x100)
   (block category
     (loop
-      for i from #x80 to #xBF
+      for i from #x80 to #x9F
       do (unless (= i (aref decode-table i))
            (return-from category 'no-conversion)))
     'iso-8-1))
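
To illustrate what the changed loop bound does, here is a minimal sketch in Python rather than XEmacs Lisp, so it can be run outside XEmacs. It assumes that Python's `iso8859_2` codec is a fair stand-in for the decode table XEmacs builds for Latin-2; the helper names `decode_table` and `choose_category` are hypothetical and only mirror the loop in the hunk above.

```python
def decode_table(codec):
    """Map each byte 0x00..0xFF to the code point `codec` decodes it to.

    Assumption: Python's codec table approximates the XEmacs decode table.
    """
    return [ord(bytes([b]).decode(codec)) for b in range(0x100)]


def choose_category(table, last):
    """Mirror of the patched loop: identity over 0x80..last gives 'iso-8-1'."""
    assert len(table) == 0x100
    for i in range(0x80, last + 1):
        if table[i] != i:
            return "no-conversion"
    return "iso-8-1"


latin2 = decode_table("iso8859_2")
print(choose_category(latin2, 0xBF))  # old bound: no-conversion
print(choose_category(latin2, 0x9F))  # patched bound: iso-8-1
```

With the old bound the check runs into the printable range starting at #xA0, where Latin-2 differs from an identity mapping (for example #xA1 is Ą), so the table is put in the wrong category. With the bound at #x9F only the Control-1 range is compared.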
Bye,
Aidan
On the nineteenth day of February, Krzysztof Rudnik wrote:
> I've already mailed to xemacs-beta but I've got no response at all.
>
> I use Mule XEmacs 21.5-b28 "fuki" (+CVS-20071205), configured for
> `i686-pc-linux', to edit a large number of Polish texts encoded in
> iso-8859-2.
>
> init.el (I found this somewhere on the list):
> (set-language-environment "Latin-2")
> (setq latin-unity-preapproved-coding-system-list '(iso-8859-2))
> (latin-unity-install)
>
>
> locale: LANG=pl_PL.UTF-8
>
> In most cases XEmacs recognizes the coding system correctly, but sometimes
> the coding system for saving the buffer is set to iso-8859-1:
> Coding system for saving this buffer:
> Latin 1 -- iso-8859-1-unix
> Default coding system (for new files):
> Latin 2 -- iso-8859-2
> Coding system for keyboard input:
> Latin 2 -- iso-8859-2
> Coding system for terminal output:
> Latin 2 -- iso-8859-2
>
> I can even get:
> Coding system for saving this buffer:
> UTF8 -- utf-8-unix
> Default coding system (for new files):
> Latin 2 -- iso-8859-2
> Coding system for keyboard input:
> Latin 2 -- iso-8859-2
> Coding system for terminal output:
> Latin 2 -- iso-8859-2
>
>
> I think the files are properly encoded (`iconv -f iso-8859-2 -t utf8` does
> not complain).
> In fact, some of them were prepared in XEmacs in a Latin-2 environment
> (usually: edit in Latin-2 env -> save -> close -> open again -> Latin-1).
>
> I reduced the problem to very small documents (a couple of letters) and got
> strange results:
>
>
> 1. if a document contains exactly one lowercase Polish letter (there are 9 of
> them) then the coding system is always Latin-1
>
> 2. if there are just 2 Polish letters then the coding system is Latin-2 unless
> these letters are separated by some string,
> for example: it is OK for "wziąć" but not for "wzią ć"
>
> 3. I could not automatically get the Latin-2 coding system for documents with
> exactly 3 Polish letters (I didn't check all possibilities).
>
> 4. I couldn't see any rule in more complicated cases.
>
> Is this a bug, or is my XEmacs not configured properly?
> Could you please help me, or at least suggest where I can get help?
>
>
> thanks in advance
> Krzysztof
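
For reference, the test strings in the report can be inspected at the byte level. The following is a small Python sketch, not a reproduction of XEmacs's detector: it only shows which bytes the nine lowercase Polish letters occupy in ISO-8859-2, and that the same bytes are also a valid Latin-1 text, which is why the two encodings cannot be told apart by validity alone. The variable names are illustrative.

```python
# The nine lowercase Polish letters and the failing test string from the report.
letters = "ąćęłńóśźż"
sample = "wzią ć"

latin2_bytes = sample.encode("iso8859_2")
print(" ".join(f"{b:02x}" for b in latin2_bytes))  # 77 7a 69 b1 20 e6

# Every byte sequence is also valid ISO-8859-1, so decoding never fails;
# it just yields different characters.
print(latin2_bytes.decode("iso8859_1"))            # wzi± æ

for ch in letters:
    byte = ch.encode("iso8859_2")[0]
    in_old_range = 0x80 <= byte <= 0xBF
    print(f"{ch} -> 0x{byte:02X}  in #x80..#xBF: {in_old_range}")
```

Five of the nine letters (ą, ł, ś, ź, ż) are encoded in the #xA0..#xBF range that the old loop bound wrongly treated as control characters.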
--
Where might my nephew Yoghurtu Nghé be now, who had to flee the village
in a hurry because of the shortage of rhinoceroses?