MLHT translation import system on genba.ffii.org
How to make sure that translations sent to mlhtimport at ffii org are processed automatically and that the corresponding web pages show up shortly thereafter. In practice this means keeping the mlht-import emacs process running on genba.
News & Chronology
- 2005-01-11 phm adds password check and PGP/GPG encryption support
- 2004-10-16 phm stops and restarts the mlht import process after minor improvements
- 2004-09-24 phm deletes outdated *-basis.el files in $SEL/mlht/init/. The valid versions are in $SEL/mlht/bas/ and had been covered by the outdated ones. This had made the system break on startup.
- 2004-09-24 phm adds instructions on how to detach the mlht import process from the initiating user's terminal.
- 2004-09 phm fixes bugs and greatly improves mlht. The database becomes the primary source for almost everything. Czech and Polish output using html and latex is perfected.
- 2004-08-29 phm adds local-variable syntax 'fmts: null' which allows sending text chunks to the database without recompiling and without becoming a translator for the whole document (useful e.g. for merely adding metadata to a document that will then appear when cited in another document)
- 2004-08-29 phm makes system largely independent of nodes.txt files, fixes various bugs
- 2004-08-26 phm makes system fold lines in txl files (after complaints from people working with bad editors)
- 2004-08-22 phm & bkaindl make mlht import system run on genba under user knecht
- 2004-08-11 import system breaks on a2e.de machine during phm's absence of 1 week, translation of Urgent Call stalls
- 2003 The mlht system accepts translations
How to tell the system to produce a new page
- send translated source text to mlhtimport at ffii org, see translation system
- the source text causes a recompilation of the page from current data even if it does not contain any (new) translated lines.
How to watch (and meddle with) the running system on genba
- log in as user knecht
- say 'screen -x'
If several screen processes are running, you will be offered a choice and then have to retype 'screen -x <process-number.label>'
- To get an overview of running knecht screen processes, it may help to type
- ps aux | grep screen
- ps <process-number>
- If no screen process is running, start one yourself, see instructions below.
- Detach the screen process from your user terminal by typing 'C-z d' within the screen/emacs process. This should put you back into the shell environment of user knecht.
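The overview step above can be sketched as a small shell helper. This is only a sketch: the function name list_sessions is hypothetical, and it assumes that 'screen -ls' prints each session on its own line as '<process-number>.<label>', which may vary between screen versions.

```shell
#!/bin/sh
# Hypothetical helper: read `screen -ls`-style output on stdin and print
# only the session identifiers (<process-number>.<label>), one per line.
list_sessions() {
    # Session lines begin with "<digits>.<label>"; header and footer
    # lines ("There are screens on:", "2 Sockets in ...") do not.
    awk '$1 ~ /^[0-9]+\./ { print $1 }'
}

# Typical use as user knecht (shown as comments, not run here):
#   screen -ls | list_sessions
#   screen -x <process-number.label>
```

The identifiers it prints are the ones that 'screen -x' and 'screen -d' ask for when several screen processes are running.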
How to keep the mlht import system running on genba.ffii.org
- log in as user knecht; consult KnechtPassEn or ask knecht-help for the password
- display the shell variable $SEL (say 'echo $SEL'). If it isn't set, say 'source .bashrc'
- invoke 'ssh-agent bash', then say 'ssh-add' in the new shell and type in the ssh passphrase. Ask knecht-help for the ssh passphrase.
- Note: The ssh passphrase is needed to ensure that the mlht import process can fetch the latest sources from the cvs archive. This is not an ideal solution, given that the latest cvs sources may be erroneous. In the long run it would be better to regularly check the cvs-tree and download only validated sources. This checking and downloading should be detached from the mlht import process and performed by a cronjob.
- invoke 'screen -e ^Zz -S mlht'
- say 'cd $SEL/mlht/sys;cvs update mlht.el'. This must work without prompting. If you encounter an 'ssh' prompt, interrupt, say 'ssh-add', enter the ssh passphrase (see above) and try once more.
- invoke 'emacs -nw -T mlht'
- everything should be running now. You should be able to access the emacs interaction from another terminal by logging in and typing 'screen -x'
- detach the screen process from your user terminal by typing 'C-z d' within the screen/emacs process. This should put you back into the shell environment of user knecht.
- you can also detach the screen process from the shell side, by saying 'screen -d' within the shell.
If several screen processes are operated by user 'knecht', the command will fail and you will be asked to specify which process you want to detach by adding one of several proposed process identifiers. Retype the command as 'screen -d <process-number.label>'.
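The startup steps above can be sketched as a shell function. It is a sketch under assumptions, not the procedure actually used on genba: the names run, start_mlht and DRY_RUN are hypothetical, the ssh key is assumed to be loaded already via 'ssh-add', and the two steps 'screen -e ^Zz -S mlht' and 'emacs -nw -T mlht' are combined by passing the emacs command to screen. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the startup sequence, meant to be run as user knecht.
# DRY_RUN is a hypothetical switch: with the default value 1 the commands
# are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

start_mlht() {
    # $SEL must be set; the instructions say to 'source .bashrc' if not.
    if [ -z "${SEL:-}" ]; then
        echo "SEL is not set; say 'source .bashrc' first" >&2
        return 1
    fi
    # Fetch the latest mlht.el. This needs the ssh key to be loaded,
    # otherwise cvs stops at an ssh prompt.
    ( cd "$SEL/mlht/sys" && run cvs update mlht.el ) || return 1
    # Start a screen session named mlht with C-z as command key and run
    # emacs inside it (assumption: combines two manual steps into one).
    run screen -e '^Zz' -S mlht emacs -nw -T mlht
}
```

Calling start_mlht with DRY_RUN=0 would execute the commands. Because emacs is started as screen's command here, the screen session ends when emacs exits, which differs from starting emacs by hand inside an already running screen.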
Bugs, Limitations
- the mail processing script expects perfect text files sent as mail bodies. A more tolerant script by Gilles Sadowski is not yet deployed.
- the emacs system does not display multibyte characters correctly within a screen/xterm based system, no matter how well this system supports utf-8. However:
- the emacs system does process utf-8 multibyte characters correctly
- the characters can be displayed when the system starts its own x window (this however is not useful for collaboration on genba)
- On genba, it is advisable to open files that contain multibyte characters (including 'mlht.el') as read-only (M-x find-file-read-only = C-x C-r)
- On some terminals and under some conditions the 'screen' program performs poorly. E.g. it can happen that the real and apparent cursor locations differ or that the bottom bar is not visible. In the former case it is best to log out of your shell and start from scratch. In the latter, it may help to set your display to "full-screen" (e.g. in settings menu of 'konsole').
Components of the system
- Emacs Lisp programs at /usr/share/emacs/21.3/site-lisp/mlht, especially sys/mlht.el, also found in CVS.ffii.org:/var/cvs/mlht
- configuration files of user knecht, e.g. ~/.bashrc, ~/.ssh
- MLHT applications = lisp-based documents
- /usr/share/emacs/21.3/site-lisp/mlht/app
- cvs.ffii.org:/var/cvs/mlht/app/: should always be kept up to date carefully. 'cvs update' is attempted before compilation
- currently out of sync: /var/www/mlht/app, should be brought together soon
- /var/spool/langtxt: directory into which the mlhtimport script places incoming mails
- names of files: 1.txt 2.txt .. 9.txt 10.txt ... 99.txt 100.txt ... upper limit to numbering
- directory contents can be erased any time, counting will then start from 1 again
- file LASTMAIL should contain the number of the last-received mail in its last line
- files such as LASTMAIL and LASTLISP use a special format from a counter perl module
- /var/www/adm/bin/mlhtimport: script that reads mail and puts it into /var/spool/langtxt: possibly soon to be replaced by another script from Gilles Sadowski (the latter may be reachable by mail at mlhtimport2 at ffii org). If you want to edit the 'mlhtimport' script itself, check out from cvs.ffii.org/var/cvs/mlht/prg/mlhtimport
- LaTeX: the tetex package
- cjk-latex for east-asian languages
- site-specific files (macros, eps logos, bibliography, ...) at /var/www/mlht/tex and partially in CVS.ffii.org/var/cvs/mlht/tex
- graphicspaths statement in el2tex.tex must be adapted to point to the paths that contain the image files (this failed at first)
- after adding site-specific stuff, run 'texhash' and notify mlht-help at ffii org
- still need to add a UTF-8 font that is currently used for various languages at a2e.de
- still need good solution for Greek
- /ul: directory for backward compatibility with some hardcoded filenames in tex files generated by mlht
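For the spool directory /var/spool/langtxt described above, the following sketch shows one way to compare the counter in LASTMAIL with the numbered files that are actually present. The function names and the SPOOL variable are hypothetical. The only fact taken from the description is that the last line of LASTMAIL should hold the number of the last-received mail; the rest of the counter file format is not interpreted.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the mail spool directory.
SPOOL="${SPOOL:-/var/spool/langtxt}"

# Print the last line of LASTMAIL, which should contain the number of
# the last-received mail.
last_mail_number() {
    tail -n 1 "$SPOOL/LASTMAIL"
}

# Print the highest N for which N.txt exists; print nothing if there is
# no such file.
highest_spool_file() {
    for f in "$SPOOL"/*.txt; do
        [ -e "$f" ] || continue
        n=$(basename "$f" .txt)
        case "$n" in
            ''|*[!0-9]*) ;;      # skip names that are not plain numbers
            *) echo "$n" ;;
        esac
    done | sort -n | tail -n 1
}
```

If the two values differ, the directory may have been erased since the counter was last written, or the counter file may need a closer look.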
