quail: TeX input method: change UTF8 to tex and back: solution & problems
Uwe Brauer
oub at mat.ucm.es
Tue Jan 8 09:32:23 EST 2008
>>>>> "Stephen" == Stephen J Turnbull <stephen at xemacs.org> writes:
> Aidan Kehoe writes:
>> What you need to do is rewrite the code to use
>> #'posix-search-forward, which guarantees that it will return the
>> longest match (that is, it will always try to match "\\infty" if
>> possible, and only then look for "\in".)
> I don't think so: this code is almost surely trying each car in
> order. Something like:
> (progn
>   (re-search-forward (regexp-opt (mapcar #'car replacement-alist)))
>   (goto-char (match-beginning 0))
>   (when (looking-at target-string)
>     (replace-match replacement-string)))
As I said in my mail to Aidan, this is the main part of the code
(it seems less sophisticated than yours, though):
(defun utf8symbol-translate-conventions (trans-tab)
  "Use the translation table argument to translate the current buffer."
  (save-excursion
    (let ((beg (point-min-marker)) ; see the `(elisp)Narrowing' Info node
          (end (point-max-marker)))
      (unwind-protect
          (progn
            (widen)
            (goto-char (point-min))
            (let ((buffer-read-only nil) ; (inhibit-read-only t)?
                  (case-fold-search nil))
              (while trans-tab
                (save-excursion
                  (let ((trans-this (car trans-tab)))
                    (while (search-forward (car trans-this) nil t)
                      ;; (regexp-opt trans-this) ;NEW
                      (replace-match (car (cdr trans-this)) t t)))
                  (setq trans-tab (cdr trans-tab))))))
        (narrow-to-region beg end)))))
together with the call
(defun fix-tex2utf8symbol ()
  "Replace TeX macros with the corresponding UTF-8 symbols."
  (interactive)
  ;; (if (member major-mode utf8symbol-modes-list)
  (let ((buffer-modified-p (buffer-modified-p)))
    (unwind-protect
        (utf8symbol-translate-conventions tex2utf8symbol-trans-tab)
      (set-buffer-modified-p buffer-modified-p))))
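Since the loop above tries each entry of the table in order, the "\in"
versus "\infty" clash depends on which entry comes first. A minimal
sketch of one way around that is to sort the table so that longer TeX
strings are tried before their prefixes. The helper name
`utf8symbol-sort-longest-first' is made up for this example, and the
table is assumed to be a list of (TEX-STRING REPLACEMENT) entries, as
the code above suggests.

```elisp
;; Hypothetical helper, not part of the original code.
(defun utf8symbol-sort-longest-first (trans-tab)
  "Return a copy of TRANS-TAB with the longest search strings first.
Each element is assumed to have the form (TEX-STRING REPLACEMENT)."
  ;; `sort' is destructive, so work on a copy of the table.
  (sort (copy-sequence trans-tab)
        (lambda (a b)
          (> (length (car a)) (length (car b))))))
```

It could then be used as
(utf8symbol-translate-conventions
 (utf8symbol-sort-longest-first tex2utf8symbol-trans-tab)).
This only helps for strings that are in the table: a macro such as
"\input" that is not listed would still be hit by the "\in" entry, so
some terminator check is probably still needed.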
> Note to Uwe: one problem you're facing here is that SGML entities
> have a fairly reliable terminator character, the semicolon. TeX
> does not, so you might try replacing `target-string' in the logic
> above with `(concat target-string "\\>")'.
I even thought of appending \; to each TeX string and then deleting
that character again afterwards, but that looks highly inefficient.
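The "\\>" suggestion could look roughly like the sketch below. The
function name `utf8symbol-replace-whole-macro' is hypothetical. It
assumes that the TeX string ends in a letter and that the buffer's
syntax table treats letters as word constituents; for macros such as
"\{" the "\\>" test would never match, and digits directly after a
macro (as in "\in2") would also block the match in most modes.

```elisp
;; Hypothetical helper; the name is made up for this example.
(defun utf8symbol-replace-whole-macro (tex-string replacement)
  "Replace TEX-STRING with REPLACEMENT from point to the end of the buffer.
Only occurrences not followed by a further word character are replaced."
  (let ((case-fold-search nil)
        ;; Quote the backslash etc., then require an end-of-word position.
        (regexp (concat (regexp-quote tex-string) "\\>")))
    (while (re-search-forward regexp nil t)
      ;; FIXEDCASE and LITERAL are both t, as in the original loop.
      (replace-match replacement t t))))
```

With this, "\in" followed by a space or punctuation is replaced, while
the "\in" at the start of "\infty" is left alone.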
>> You might also want to look into the #'regexp-opt function, which,
>> given a list of strings,
> regexp-opt won't work here, it doesn't know anything about
> internal grouping, so there's no way to trigger the replacement
> of "\in" rather than "\infty". In fact, I doubt that Emacs
> regexp groups can simultaneously express all the relevant string
> matches for
> #r"\in\(t\|fty\)".
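One possible alternative to regexp-opt, given only as a sketch: build a
plain alternation of the quoted TeX strings, longest first, and look up
whatever was actually matched in the table. This relies on the regexp
matcher trying the alternatives of "\\|" from left to right, which is my
understanding of how the Emacs matcher behaves; I have not checked it
against every XEmacs version. The function name
`utf8symbol-translate-single-pass' is made up, and the table format
(TEX-STRING REPLACEMENT) is assumed from the code above.

```elisp
;; Hypothetical single-pass variant; not part of the original code.
(defun utf8symbol-translate-single-pass (trans-tab)
  "Translate the buffer from point onward using TRANS-TAB in one pass.
Each element of TRANS-TAB is assumed to be (TEX-STRING REPLACEMENT)."
  (let* ((case-fold-search nil)
         ;; `mapcar' returns a fresh list, so sorting it in place is safe.
         (keys (sort (mapcar #'car trans-tab)
                     (lambda (a b) (> (length a) (length b)))))
         ;; Longest strings first, so "\\infty" is tried before "\\in".
         (regexp (mapconcat #'regexp-quote keys "\\|")))
    ;; An empty table would give an empty regexp, so skip that case.
    (when keys
      (while (re-search-forward regexp nil t)
        (replace-match (car (cdr (assoc (match-string 0) trans-tab)))
                       t t)))))
```

The buffer is scanned only once, and the matched text decides which
replacement is used. Strings that are not in the table (for example
"\int" when only "\in" is listed) are still affected, as before.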
Uwe