pocount will count the number of strings and words in translatable files.
Supported formats include PO and XLIFF. Almost all bilingual file formats supported by the Translate Toolkit will work with pocount, including TMX, TBX, Gettext .mo, Qt .qm and Wordfast .txt TM.
A number of other formats should become countable as the toolkit develops. Note that only multilingual formats based on the storage base class are supported, but that includes almost all storage formats.
pocount [options] <directory|file(s)>
Where:
directory | will recurse and count all files in the specified directory |
file(s) | will count all files specified |
Options:
-h, --help | show this help message and exit |
--incomplete | skip 100% translated files |
Output format:
--full | (default) statistics in full, verbose format |
--csv | statistics in CSV format |
--short | same as --short-strings |
--short-strings | statistics of strings in short format – one line per file |
--short-words | statistics of words in short format – one line per file |
pocount makes it easy to count the current state of a body of translations. The most interesting options are those that adjust the output style and decide what to count.
To count how much work is to be done in your project:
pocount project/
This will count all translatable files found in the directory project/ and output the results in --full format.
You might want to be more specific and only count certain files:
pocount *.po
This will count all PO files in the current directory but will ignore any other files that ‘pocount’ can count.
You can have full control of the files to count by using some of the abilities of the Unix command line. These may work on Mac OS X but are unlikely to work on Windows:
pocount $(find . -name "*.properties.po")
This will first find all files that match *.properties.po and then count them. That would make it easy to count the state of your Mozilla translations of .properties files.
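On systems without a Unix shell, the same file selection can be approximated in Python. The following is a minimal sketch, not part of pocount: `find_translation_files`, `build_pocount_command` and `run_pocount` are hypothetical helper names, and the sketch assumes the `pocount` executable is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def find_translation_files(root, pattern="*.properties.po"):
    """Recursively collect files under root whose names match pattern."""
    return sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())


def build_pocount_command(files, options=()):
    """Build the argument list, mirroring `pocount [options] <file(s)>`."""
    return ["pocount", *options, *files]


def run_pocount(root=".", pattern="*.properties.po", options=()):
    """Hypothetical wrapper: run pocount on the matching files, if any."""
    files = find_translation_files(root, pattern)
    if not files:
        print("no matching files found")
        return None
    # Assumption: the pocount executable is available on the PATH.
    return subprocess.run(build_pocount_command(files, options), check=False)
```

Calling `run_pocount(".")` would then play the role of the `find` command above without relying on shell expansion.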
To count what still needs to be done, ignoring what is 100% complete, you can use the --incomplete option:
pocount --incomplete --short *.xlf
We are now counting all XLIFF files by using the *.xlf expansion. We are only counting files that are not 100% complete and we are outputting string counts using the --short option.
The output options provide the following types of output:
--full is the normal, or default, mode. It produces the most comprehensive and easy-to-read data, although the amount of data may overwhelm the user. It produces the following output:
avmedia/source/viewer.po
type          strings          words (source)    words (translation)
translated:    73465 ( 99%)     538598 ( 99%)                 513296
fuzzy:            13 (  0%)        141 (  0%)                    n/a
untranslated:     53 (  0%)        602 (  0%)                    n/a
Total:         73531            539341                        513296
A grand total and a file count are provided if more than one file is counted.
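The figures in the sample can be checked by hand. The sketch below recomputes the string total and the percentages from the sample counts. It assumes percentages are truncated to whole numbers, which is an inference from the sample (73465 of 73531 is about 99.9% yet is shown as 99%) and not a statement about how pocount is implemented; `percent` is a hypothetical helper.

```python
def percent(part, total):
    """Whole-number percentage, truncated; 0 when total is 0."""
    return part * 100 // total if total else 0


# String counts copied from the sample --full output.
counts = {"translated": 73465, "fuzzy": 13, "untranslated": 53}
total = sum(counts.values())  # 73531, matching the Total row

for name, count in counts.items():
    print(f"{name}: {count} ({percent(count, total)}%)")
# translated: 73465 (99%)
# fuzzy: 13 (0%)
# untranslated: 53 (0%)
```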
The --csv format is useful if you want to reuse the data in a spreadsheet. In CSV mode the following output is shown:
Filename, Translated Messages, Translated Source Words, Translated Target Words, Fuzzy Messages, Fuzzy Source Words, Untranslated Messages, Untranslated Source Words, Review Messages, Review Source Words
avmedia/source/viewer.po, 1, 3, 3, 0, 0, 4, 22, 1, 3
Totals are not provided in CSV mode.
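Because CSV mode has no totals row, they have to be computed by whatever consumes the data. As a minimal sketch, the Python below sums each numeric column using the standard `csv` module. It assumes the column layout shown in the sample above, which may differ between pocount versions; `csv_totals` is a hypothetical helper name.

```python
import csv
import io

# Sample copied from the CSV output shown above.
SAMPLE = """\
Filename, Translated Messages, Translated Source Words, Translated Target Words, Fuzzy Messages, Fuzzy Source Words, Untranslated Messages, Untranslated Source Words, Review Messages, Review Source Words
avmedia/source/viewer.po, 1, 3, 3, 0, 0, 4, 22, 1, 3
"""


def csv_totals(stream):
    """Sum every numeric column of CSV output read from a text stream."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(stream, skipinitialspace=True)
    totals = {}
    for row in reader:
        for column, value in row.items():
            # Skip the filename and any missing or surplus fields.
            if not column or column == "Filename" or not isinstance(value, str):
                continue
            if value.strip():
                totals[column] = totals.get(column, 0) + int(value)
    return totals


totals = csv_totals(io.StringIO(SAMPLE))
print(totals["Untranslated Source Words"])  # 22
```

In practice the stream would be the saved output of a command such as `pocount --csv project/` rather than the inline sample.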
With --short-strings, the focus is on easily accessible data in a compact form. This mode counts only strings and uses a short syntax that is easy for an experienced localiser to read:
test-po/fuzzy.po strings: total: 1 | 0t 1f 0u | 0%t 100%f 0%u
The filename is followed by a word indicating the type of count; here we are counting strings. The total gives the total string count. The letters t, f and u represent ‘translated’, ‘fuzzy’ and ‘untranslated’, and here indicate the string counts for each of those categories. The counts are followed by a percentage representation of the same categories.
The output of --short-words is very similar to --short-strings above:
test-po/fuzzy.po source words: total: 3 | 0t 3f 0u | 0%t 100%f 0%u
But instead of counting strings we are now counting words, as indicated by the term ‘source words’.
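The one-line-per-file layout makes the short formats straightforward to process in a script. The sketch below parses the two sample lines with a regular expression. The pattern is derived only from those samples, so other pocount versions may need adjustments; `parse_short_line` is a hypothetical helper name.

```python
import re

# Pattern inferred from the two sample lines above (an assumption).
SHORT_LINE = re.compile(
    r"^(?P<filename>.+?)\s+(?P<kind>strings|source words):"
    r"\s+total:\s+(?P<total>\d+)"
    r"\s+\|\s+(?P<translated>\d+)t\s+(?P<fuzzy>\d+)f\s+(?P<untranslated>\d+)u"
    r"\s+\|\s+\d+%t\s+\d+%f\s+\d+%u\s*$"
)


def parse_short_line(line):
    """Return the counts from one short-format line, or None if it does not match."""
    match = SHORT_LINE.match(line)
    if match is None:
        return None
    result = match.groupdict()
    for key in ("total", "translated", "fuzzy", "untranslated"):
        result[key] = int(result[key])
    return result


strings = parse_short_line(
    "test-po/fuzzy.po strings: total: 1 | 0t 1f 0u | 0%t 100%f 0%u"
)
words = parse_short_line(
    "test-po/fuzzy.po source words: total: 3 | 0t 3f 0u | 0%t 100%f 0%u"
)
print(strings["kind"], strings["fuzzy"])  # strings 1
print(words["kind"], words["fuzzy"])      # source words 3
```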