
.. _formats:

Translation Related File Formats
********************************

These are the different storage formats for translations and files associated
with translations that are supported by the toolkit. See also
:doc:`conformance` for standards conformance.

The Translate Toolkit implements a set of :doc:`classes <base_classes>` for
handling translation files which allows for a uniform API which covers other
issues such as :doc:`quoting and escaping <quoting_and_escaping>` of text.

.. _formats#primary_translation_formats:

Primary translation formats
===========================

.. toctree::
   :maxdepth: 1

   xliff
   Gettext PO <po>

.. _formats#other_translation_formats:

Other translation formats
=========================

.. toctree::
   :maxdepth: 1
   :hidden:

   csv
   ini
   properties
   dtd
   gsi
   php
   ts
   rc
   strings
   flex
   catkeys
   android
   resx
   mozilla_lang

* :doc:`csv`
* :doc:`ini` (including Inno Setup .isl dialect)
* Java :doc:`properties` (also Mozilla derived properties files)
* Mozilla :doc:`dtd`
* OpenOffice.org :doc:`gsi` (Also called SDF)
* :doc:`php` translation arrays
* Qt Linguist :doc:`ts` (both 1.0 and 1.1 supported, 1.0 has a converter)
* Symbian localization files
* Windows :doc:`rc` files
* Mac OSX :doc:`strings` files (also used on the iPhone) (from version 1.8)
* Adobe :doc:`flex` files (from version 1.8)
* Haiku :doc:`catkeys` (from version 1.8)
* :doc:`android` (supports storage, not conversion)
* :doc:`resx` .NET Resource files (.resx)
* Mozilla :doc:`.lang <mozilla_lang>` files

.. _formats#translation_memory_formats:

Translation Memory formats
==========================

.. toctree::
   :maxdepth: 1
   :hidden:

   tmx
   wordfast

* :doc:`tmx`
* :doc:`wordfast`: TM
* Trados: .txt TM (from v1.9.0 -- read only)

.. _formats#glossary_formats:

Glossary formats
================

.. toctree::
   :maxdepth: 1
   :hidden:

   omegat_glossary
   qt_phrase_book
   tbx
   utx

* :doc:`omegat_glossary` (from v1.5.1)
* :doc:`qt_phrase_book`
* :doc:`tbx`
* :doc:`utx` (from v1.9.0)

.. _formats#formats_of_translatable_documents:

Formats of translatable documents
=================================

.. toctree::
   :maxdepth: 1
   :hidden:

   flatxml
   html
   ical
   json
   md
   odf
   subtitles
   text
   wiki
   yaml

* :doc:`flatxml` (single-level XML)
* :doc:`html`
* :doc:`ical`
* :doc:`json`
* :doc:`md`
* :wp:`OpenDocument` -- all ODF file types
* :doc:`subtitles` -- various formats (v1.4)
* :doc:`Text <text>` -- plain text with blocks separated by whitespace
* :doc:`Wiki <wiki>` -- :wp:`DokuWiki` and :wp:`MediaWiki` supported
* :doc:`yaml`

.. _formats#machine_readable_formats:

Machine readable formats
========================

.. toctree::
   :maxdepth: 1
   :hidden:

   mo
   qm

* Gettext :doc:`mo`
* Qt :doc:`qm` (read-only)

.. _formats#in_development:

In development
==============

.. _formats#unsupported_formats:

Unsupported formats
===================

Formats that we would like to support but don't currently support:

.. toctree::
   :maxdepth: 1
   :hidden:

   wml

* Wordfast:

  * `Glossary `_ tab-delimited "source,target,comment" i.e. like OmegaT but
    unsure if any extension is required.

* Apple:

  * `AppleGlot `_
  * .plist -- see :issue:`633` and `plistlib `_ for Python

* Adobe:

  * FrameMaker's Maker Interchange Format -- `MIF `_ (See also
    `python-gendoc `_, and `Perl MIF module `_)
  * FrameMaker's `Maker Markup Language `_ (MML)

* Microsoft

  * Word, Excel, etc (probably through usage of OpenOffice.org)
  * :wp:`OOXML` (at least at the text level we don't have to deal with much
    of the mess inside OOXML). See also: `Open XML SDK v1 `_
  * :wp:`Rich Text Format ` (RTF) see also `pyrtf-ng `_
  * :wp:`Open XML Paper Specification `

* XML related

  * Generic XML
  * :wp:`DocBook` (can be handled by KDE's :man:`xml2pot`)
  * `SVG `_

* :wp:`DITA `
* :wp:`PDF ` see `spec `_, `PDFedit `_
* :wp:`LaTeX` -- see `plasTeX `_, a Python framework for processing LaTeX
  documents
* `unoconv `_ -- Python bindings to OpenOffice.org UNO which could allow
  manipulation of all formats understood by OpenOffice.org.
* Trados:

  * TTX (`Reverse Engineered DTD `_, `other discussion `_)
  * Multiterm XML `TSV to MultiTerm conversion script `_ or `XSLT `_
  * .tmw
  * .txt (You can interchange using TMX) `Format explanation `_ with some
    `examples `_.

* Tcl: .msg files. `Good documentation `_
* Installers:

  * NSIS installer: `Existing C++ implementation `_
  * WiX -- MSI (Microsoft Installer) creator. `Localization instructions `_,
    `more notes on localisation `_. This is a custom XML format, another one!

* catgets/`gencat `_: precedes gettext; looking in man packages is the best
  information I could find. Also `LSB requires it `_. There is some info
  about the source (msgfile) format on the `GNU website `_
* :doc:`wml`
* `GlossML `_
* Deja Vu External View: `Instructions sent to a translator `_,
  `Description of external view options and process `_

.. _formats#unlikely_to_be_supported:

Unlikely to be supported
========================

These formats are too difficult to implement, undocumented, can be processed
using some intermediate format, or are used by too few people to justify the
effort, or some combination of these issues.

.. Mentioned but we want them at the end of the TOC or to move them to
   developer docs

.. toctree::
   :maxdepth: 1
   :hidden:

   conformance
   base_classes
   quoting_and_escaping