txt2po

txt2po allows you to use the same principles of PO files with normal text files. In PO only items that change are marked fuzzy and only new items need to be translated, unchanged items remain unchanged for the translation.

Usage

txt2po [options] <foo.txt> <foo.po>
po2txt [options] [-t <foo.txt>] <XX.po> <foo-XX.txt>

Where:

foo.txt

is the input plain text file

foo.po

is an empty PO file that may be translated

XX.po

is a PO file translated into the XX language

foo-XX.txt

is the foo.txt file translated into language XX

Options (txt2po):

--version

show program’s version number and exit

-h, --help

show this help message and exit

--manpage

output a manpage based on the help

--progress=PROGRESS

show progress as: dots, none, bar, names, verbose

--errorlevel=ERRORLEVEL

show errorlevel as: none, message, exception, traceback

-i INPUT, --input=INPUT

read from INPUT in *, txt formats

-x EXCLUDE, --exclude=EXCLUDE

exclude names matching EXCLUDE from input paths

-o OUTPUT, --output=OUTPUT

write to OUTPUT in po, pot formats

-S, --timestamp

skip conversion if the output file has newer timestamp

-P, --pot

output PO Templates (.pot) rather than PO files (.po)

--encoding=ENCODING

The encoding of the input file (default: UTF-8)

--flavour=FLAVOUR

The flavour of text file: plain (default), dokuwiki, mediawiki

--no-segmentation

Don’t segment the file, treat it like a single message

--duplicates=DUPLICATESTYLE

what to do with duplicate strings (identical source text): merge, msgctxt (default: ‘msgctxt’)

Options (po2txt):

--version

show program’s version number and exit

-h, --help

show this help message and exit

--manpage

output a manpage based on the help

--progress=PROGRESS

show progress as: dots, none, bar, names, verbose

--errorlevel=ERRORLEVEL

show errorlevel as: none, message, exception, traceback

-i INPUT, --input=INPUT

read from INPUT in po, pot formats

-x EXCLUDE, --exclude=EXCLUDE

exclude names matching EXCLUDE from input paths

-o OUTPUT, --output=OUTPUT

write to OUTPUT in txt format

-t TEMPLATE, --template=TEMPLATE

read from TEMPLATE in txt format

-S, --timestamp

skip conversion if the output file has newer timestamp

--encoding=ENCODING

The encoding of the template file (default: UTF-8)

-w WRAP, --wrap=WRAP

set number of columns to wrap text at

--threshold=PERCENT

only convert files where the translation completion is above PERCENT

--fuzzy

use translations marked fuzzy

--nofuzzy

don’t use translations marked fuzzy (default)

A roundtrip example

Preparing input files

With txt2po a text file is broken down into sections. Each section is separated by a line of whitespace. Each section will appear as a msgid in the PO file. Because of this simple method of breaking up the input file it might be necessary to alter the layout of your input file. For instance you might want to separate a heading from a paragraph by using whitespace.

For steps in a process you would want to leave a blank line between each step so that each step can be translated independently.

For a list of items you might want to group them together so that a translator could for example place them in alphabetic order for their translation.

Once the input file is prepared you can proceed to the next step.

Creating the POT files

This is simple:

txt2po -P TEXT_FILE text_file.pot

A translator would copy the POT file to their own PO file and then create translations of the entries. If you wish to create a PO file and not a POT file then leave off the -P option.

You might want to manually edit the POT file to remove items that should not be translated. For instance if part of the document is a license you might want to remove those if you do not want the license translated for legal reasons.

Translating

Translate as normal. However translators should be aware that writers of the text file may have used spaces, dashes, equals, underscores and other aids to indicate things such as:

* Headings and sub-headings
* Code examples, command lines examples
* Various lists
* etc

They will need to adapt these to work in their language being aware of how they will appear once they are merged with the original text document.

Creating a translated text file

With the translations complete you can create a translated text file like this:

po2txt -w 75 -t TEXT_FILE translated.po TEXT_FILE.translated

This uses the original text file as a template and creates a new translated text file using the translations found in the PO file.

The -w command allows you to reflow the translated text to N number of characters, otherwise the text will appear as one long line.

Help with Wiki syntax

dokuwiki

To retrieve the raw syntax for your dokuwiki page add ‘?do=export_raw’ to you URL. The following would retrieve the DokuWiki home page in raw dokuwiki format https://www.dokuwiki.org/dokuwiki?do=export_raw

wget https://www.dokuwiki.org/dokuwiki?do=export_raw -O txt2po.txt
txt2po --flavour=dokuwiki -P txt2po.txt txt2po.pot
# edit txt2po.pot
po2txt -t txt2po.txt fr.po fr.txt

First we retrieve the file in raw dokuwiki format, then we create a POT file for editing. We created a French translation and using po2txt plus the original file as a template we output fr.txt which is a French version of the original txt2po.txt. This file can now be uploaded to the wiki server.

MediaWiki

To retrieve the raw media wiki syntax add ‘?action=raw’ to you wiki URL. The following retrieves the Translate Toolkit page from Wikipedia in raw MediaWiki format Translate_Toolkit?action=raw.

To process follow the instructions above but substituting the MediaWiki retrieval method.