storage

Classes that represent various storage formats for localization.

base

Base classes for storage interfaces.

exception translate.storage.base.ParseError(inner_exc)
class translate.storage.base.TranslationStore(unitclass=None, encoding=None)

Base class for stores for multiple translation units of type UnitClass.

Extensions = None

A list of file extensions associated with this store type

Mimetypes = None

A list of MIME types associated with this store type

Name = 'Base translation store'

The human usable name of this store type

UnitClass

The class of units that will be instantiated and used by this class

alias of TranslationUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.
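
As a rough illustration of BOM-based detection, the sketch below checks the start of a byte string against the byte order marks defined in Python's standard `codecs` module. The function name `sniff_bom` and the returned encoding labels are assumptions made for this example; this is not the toolkit's own implementation.

```python
import codecs

# Longest marks first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data, default=None):
    """Return an encoding label guessed from a leading BOM, else ``default``."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


utf8_guess = sniff_bom(codecs.BOM_UTF8 + b"msgid")
utf32_guess = sniff_bom(codecs.BOM_UTF32_LE + b"m\x00\x00\x00")
no_bom_guess = sniff_bom(b"plain ascii")
```

Checking the four-byte marks before the two-byte ones matters, because a UTF-32 LE file would otherwise be misread as UTF-16 LE.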

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(data)

parser to process the given source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
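
The round trip between serialize() and parsestring() can be pictured with a small stand-in class. `LineStore` and its one-pair-per-line format are hypothetical and exist only for this sketch; real stores define their own formats.

```python
import io


class LineStore:
    """Hypothetical store: one 'source=target' pair per line, UTF-8 encoded."""

    def __init__(self):
        self.pairs = []

    def serialize(self, out):
        # Write bytes to an already-open binary file-like object.
        for source, target in self.pairs:
            out.write(f"{source}={target}\n".encode("utf-8"))

    @classmethod
    def parsestring(cls, storestring):
        # Rebuild a store from the bytes produced by serialize().
        store = cls()
        for line in storestring.decode("utf-8").splitlines():
            source, _, target = line.partition("=")
            store.pairs.append((source, target))
        return store


original = LineStore()
original.pairs.append(("Hello", "Hallo"))
buffer = io.BytesIO()
original.serialize(buffer)
restored = LineStore.parsestring(buffer.getvalue())
```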

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

suggestions_in_format = False

Indicates if format can store suggestions and alternative translation for a unit

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.
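
The interplay between addunit(), makeindex(), require_index(), findunit() and translate() might look roughly like the following. `SketchStore` and `SketchUnit` are hypothetical stand-ins written for this example, not the toolkit's classes, and the lazy index shown here is an assumption about one reasonable design.

```python
class SketchUnit:
    def __init__(self, source, target=None):
        self.source = source
        self.target = target


class SketchStore:
    """Hypothetical stand-in for a store with a lazily built source index."""

    def __init__(self):
        self.units = []
        self.sourceindex = None  # built on first lookup

    def addunit(self, unit):
        self.units.append(unit)
        # Keep an existing index in sync instead of rebuilding it.
        if self.sourceindex is not None:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def makeindex(self):
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def require_index(self):
        if self.sourceindex is None:
            self.makeindex()

    def findunit(self, source):
        self.require_index()
        matches = self.sourceindex.get(source)
        return matches[0] if matches else None

    def translate(self, source):
        unit = self.findunit(source)
        return unit.target if unit is not None else None


store = SketchStore()
store.addunit(SketchUnit("Open", "Oopmaak"))
first_lookup = store.translate("Open")       # builds the index
store.addunit(SketchUnit("View", "Bekyk"))   # added after the index exists
second_lookup = store.translate("View")
missing_lookup = store.translate("Missing")
```

Going through addunit() rather than appending to the list directly is what keeps the index consistent in this sketch.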

class translate.storage.base.TranslationUnit(source=None)

Base class for translation units.

Our concept of a translation unit is influenced heavily by XLIFF.

As such, most of the method and variable names borrow from XLIFF terminology.

A translation unit consists of the following:

  • A source string. This is the original translatable text.
  • A target string. This is the translation of the source.
  • Zero or more notes on the unit. Notes would typically be some comments from a translator on the unit, or some comments originating from the source code.
  • Zero or more locations. Locations indicate where in the original source code this unit came from.
  • Zero or more errors. Some tools (e.g. pofilter) can run checks on translations and produce error messages.
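
The five parts listed above can be modelled as a small data class. `SketchUnit` is a hypothetical illustration of the data model only; the sample location and error strings are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SketchUnit:
    """Hypothetical model of the parts of a translation unit."""

    source: str = ""
    target: str = ""
    notes: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    errors: dict = field(default_factory=dict)

    def addnote(self, text):
        self.notes.append(text)

    def addlocation(self, location):
        self.locations.append(location)

    def adderror(self, errorname, errortext):
        # Errors are keyed by a single-word name.
        self.errors[errorname] = errortext


unit = SketchUnit(source="Save", target="Stoor")
unit.addlocation("editor/menu.c:42")
unit.adderror("endpunc", "Different punctuation at the end")
```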
adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      • ‘translator’
      • ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_parsers = []

A list of functions to use for parsing a string into a rich string tree.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

benchmark

class translate.storage.benchmark.TranslateBenchmarker(test_dir, storeclass)

class to aid in benchmarking Translate Toolkit stores

clear_test_dir()

removes the given directory

create_sample_files(num_dirs, files_per_dir, strings_per_file, source_words_per_string, target_words_per_string)

creates sample files for benchmarking

parse_files(file_dir=None)

parses all the files in the test directory into memory

parse_placeables()

parses placeables

bundleprojstore

class translate.storage.bundleprojstore.BundleProjectStore(fname)

Represents a translate project bundle (zip archive).

append_file(afile, fname, ftype='trans', delete_orig=False)

Append the given file to the project with the given filename, marked to be of type ftype (‘src’, ‘trans’, ‘tgt’).

Parameters:delete_orig – If True, as set by convert_forward(), afile is deleted after appending, if possible.

Note

For this implementation, the appended file will be deleted from disk if delete_orig is True.
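
The effect of delete_orig can be sketched with the standard `zipfile` module. The helper `append_to_bundle` and the `trans/` prefix inside the archive are assumptions made for this example; the actual bundle layout is not described in this section.

```python
import os
import tempfile
import zipfile


def append_to_bundle(bundle_path, file_path, project_name, delete_orig=False):
    """Copy file_path into the zip at bundle_path under project_name."""
    # Mode "a" creates the archive if it does not exist yet.
    with zipfile.ZipFile(bundle_path, "a") as bundle:
        bundle.write(file_path, arcname=project_name)
    if delete_orig:
        os.remove(file_path)


with tempfile.TemporaryDirectory() as workdir:
    po_path = os.path.join(workdir, "messages.po")
    with open(po_path, "w", encoding="utf-8") as handle:
        handle.write('msgid "Hello"\nmsgstr ""\n')
    bundle_path = os.path.join(workdir, "project.zip")
    append_to_bundle(bundle_path, po_path, "trans/messages.po", delete_orig=True)
    with zipfile.ZipFile(bundle_path) as bundle:
        bundled_names = bundle.namelist()
    original_still_exists = os.path.exists(po_path)
```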

cleanup()

Clean up our mess: remove temporary files.

get_file(fname)

Retrieve a project file (source, translation or target file) from the project archive.

get_filename_type(fname)

Get the type of file (‘src’, ‘trans’, ‘tgt’) with the given name.

get_proj_filename(realfname)

Try and find a project file name for the given real file name.

load(zipname)

Load the bundle project from the zip file of the given name.

remove_file(fname, ftype=None)

Remove the file with the given project name from the project.

save(filename=None)

Save all project files to the bundle zip file.

sourcefiles

Read-only access to self._sourcefiles.

targetfiles

Read-only access to self._targetfiles.

transfiles

Read-only access to self._transfiles.

update_file(pfname, infile)

Updates the file with the given project file name with the contents of infile.

Returns:the results from BundleProjectStore.append_file().
exception translate.storage.bundleprojstore.InvalidBundleError

catkeys

Manage the Haiku catkeys translation format

The Haiku catkeys format is the translation format used for localisation of the Haiku operating system.

It is a bilingual base class derived format with CatkeysFile and CatkeysUnit providing file and unit level access. The file format is described here: http://www.haiku-os.org/blog/pulkomandy/2009-09-24_haiku_locale_kit_translator_handbook

Implementation

The implementation covers the full requirements of a catkeys file. The files are simple Tab Separated Value (TSV) files that can be read by Microsoft Excel and other spreadsheet programs. They use the .txt extension, which makes it more difficult to identify such files automatically.

The dialect of the TSV files is specified by CatkeysDialect.

Encoding
The files are UTF-8 encoded.
Header
CatkeysHeader provides header management support.
Escaping

catkeys seem to escape things as in C++ (strings are just extracted from the source code unchanged, it seems).

Functions allow for _escape() and _unescape().

class translate.storage.catkeys.CatkeysDialect

Describe the properties of a catkeys generated TAB-delimited file.

class translate.storage.catkeys.CatkeysFile(inputfile=None, **kwargs)

A catkeys translation memory file

UnitClass

alias of CatkeysUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

parse the given file or file source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(newlang)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.catkeys.CatkeysHeader(header=None)

A catkeys translation memory header

settargetlanguage(newlang)

Set a human readable target language

class translate.storage.catkeys.CatkeysUnit(source=None)

A catkeys translation memory unit

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      • ‘translator’
      • ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

dict

Get the dictionary of values for a catkeys line

getcontext()

Get the message context.

getdict()

Get the dictionary of values for a catkeys line

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(present=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setdict(newdict)

Set the dictionary of values for a catkeys line

Parameters:newdict (Dict) – a new dictionary with catkeys line elements
setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

translate.storage.catkeys.FIELDNAMES = ['source', 'context', 'comment', 'target']

Field names for a catkeys TU

translate.storage.catkeys.FIELDNAMES_HEADER = ['version', 'language', 'mimetype', 'checksum']

Field names for the catkeys header

translate.storage.catkeys.FIELDNAMES_HEADER_DEFAULTS = {'checksum': '', 'language': '', 'mimetype': '', 'version': '1'}

Default or minimum header entries for a catkeys file
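
To show how the field name lists map onto a tab-separated file, the sketch below reads a two-line sample with the standard `csv` module. It assumes the first row is the header and every later row is a unit; the helper `read_catkeys` and the sample content are invented for the example, and no escaping is handled.

```python
import csv
import io

FIELDNAMES = ["source", "context", "comment", "target"]
FIELDNAMES_HEADER = ["version", "language", "mimetype", "checksum"]


def read_catkeys(text):
    """Split tab-separated text into a header dict and a list of unit dicts."""
    rows = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = dict(zip(FIELDNAMES_HEADER, next(rows)))
    units = [dict(zip(FIELDNAMES, row)) for row in rows if row]
    return header, units


sample = (
    "1\tgerman\tapplication/x-vnd.Example\t12345\n"
    "Quit\tMenu\t\tBeenden\n"
)
header, units = read_catkeys(sample)
```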

cpo

csvl10n

classes that hold units of comma-separated values (.csv) files (csvunit) or entire files (csvfile) for use with localisation

class translate.storage.csvl10n.csvfile(inputfile=None, fieldnames=None, encoding='auto')

This class represents a .csv file with various lines. The default format contains three columns: location, source, target
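
The default three-column layout can be read with the standard `csv` module as in the sketch below. `read_rows` is a hypothetical helper and the sample rows are made up; csvfile itself does more than this (encoding detection and header handling, for example).

```python
import csv
import io

DEFAULT_FIELDNAMES = ["location", "source", "target"]


def read_rows(text, fieldnames=DEFAULT_FIELDNAMES):
    """Parse CSV text into one dict per row, keyed by the given field names."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fieldnames)
    return list(reader)


rows = read_rows(
    "menu.c:10,File,Datei\n"
    'menu.c:11,"Save, then quit","Speichern, dann beenden"\n'
)
```

Quoted cells keep embedded commas intact, which is why the second row still yields exactly three fields.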

UnitClass

alias of csvunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(csvsrc)

parser to process the given source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Write to file

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.csvl10n.csvunit(source=None)
add_spreadsheet_escapes(source, target)

add common spreadsheet escapes to two strings

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      • ‘translator’
      • ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
match_header()

see if unit might be a header

merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
remove_spreadsheet_escapes(source, target)

remove common spreadsheet escapes from two strings

removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(value)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

translate.storage.csvl10n.detect_header(sample, dialect, fieldnames)

Test whether the file has a header; also returns the number of columns in the first row.

translate.storage.csvl10n.valid_fieldnames(fieldnames)

Check if fieldnames are valid, that is at least one field is identified as the source.
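
A check of this kind could be as simple as the sketch below. The alias set and the function name `has_source_field` are assumptions for illustration; the names the module actually accepts as a source column are not listed in this section.

```python
# Hypothetical aliases; the real module may recognise a different set.
SOURCE_ALIASES = {"source", "original", "untranslated"}


def has_source_field(fieldnames):
    """Return True if at least one field name identifies the source column."""
    return any(name.strip().lower() in SOURCE_ALIASES for name in fieldnames)


with_source = has_source_field(["location", "Source", "target"])
without_source = has_source_field(["id", "translation"])
```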

directory

This module provides functionality to work with directories.

class translate.storage.directory.Directory(dir=None)

This class represents a directory.

file_iter()

Iterator over (dir, filename) for all files in this directory.

getfiles()

Returns a list of (dir, filename) tuples for all the file names in this directory.

getunits()

List of all the units in all the files in this directory.

scanfiles()

Populate the internal file data.

unit_iter()

Iterator over all the units in all the files in this directory.
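
The (dir, filename) pairs produced by file_iter() can be pictured with `os.walk`. This sketch assumes a recursive walk and uses a throwaway temporary directory; it is a stand-alone function, not the Directory class.

```python
import os
import tempfile


def file_iter(root):
    """Yield (dir, filename) pairs for every file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            yield dirpath, filename


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "af"))
    for relpath in ("readme.txt", os.path.join("af", "app.po")):
        with open(os.path.join(root, relpath), "w") as handle:
            handle.write("")
    found = sorted(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, name in file_iter(root)
    )
```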

dtd

Classes that hold units of .dtd files (dtdunit) or entire files (dtdfile).

These are specific .dtd files for localisation used by Mozilla.

Specifications

The following information is provided by Mozilla:

Specification

There is a grammar for entity definitions, which isn’t really precise, as the spec says. There’s no formal specification for DTD files; it’s basically just “whatever makes this work”. The whole piece is clearly not the strongest point of the XML spec.

XML elements are allowed in entity values. A number of things that are allowed will just break the resulting document; Mozilla forbids these in their DTD parser.

Dialects

There are two dialects:

  • Regular DTD
  • Android DTD

Both dialects are similar, but the Android DTD uses some particular escapes that regular DTDs don’t have.

Escaping in regular DTD

In DTD files, some characters in entities are usually escaped. To ease translation, some of those escaped characters are unescaped when reading from, or converting, the DTD, and are escaped again when saving, or converting to, a DTD.

In regular DTD the following characters are usually or sometimes escaped:

  • The % character is escaped using &#037; or &#37; or &#x25;
  • The ” character is escaped using &quot;
  • The ‘ character is escaped using &apos; (partial roundtrip)
  • The & character is escaped using &amp;
  • The < character is escaped using &lt; (not yet implemented)
  • The > character is escaped using &gt; (not yet implemented)

Besides the previous ones there are a lot of escapes for a huge number of characters. These escapes usually have the form &#NUMBER; where NUMBER represents the numerical code for the character.

There are a few particularities in DTD escaping. Some of the escapes are not yet implemented, either because they are not really necessary or because their implementation is too hard.

A special case is the ‘ escaping using &apos; which doesn’t provide a full roundtrip conversion in order to support some special Mozilla DTD files.

Also, the ” character is never escaped when the previous character is = (that is, the sequence =” is present in the string), in order to avoid escaping the ” character that marks an attribute assignment, for example in the href attribute of an a tag in HTML (anchor tag).
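
A naive sketch of the regular DTD rules described above follows. It covers only the %, double quote and & characters, reads the double quote rule as "leave every double quote alone if the sequence =" occurs anywhere in the string", and makes no attempt to protect existing entity references, so it should be taken as an illustration rather than the module's behaviour. The function names are hypothetical.

```python
def escape_dtd(text):
    """Escape &, % and double quotes for a DTD entity value (naive sketch)."""
    text = text.replace("&", "&amp;")    # first, so later escapes stay intact
    text = text.replace("%", "&#037;")
    if '="' not in text:                 # looks like an attribute: keep quotes
        text = text.replace('"', "&quot;")
    return text


def unescape_dtd(text):
    """Reverse escape_dtd, accepting all three spellings of the % escape."""
    for escape in ("&#037;", "&#37;", "&#x25;"):
        text = text.replace(escape, "%")
    text = text.replace("&quot;", '"')
    return text.replace("&amp;", "&")    # last, mirroring escape_dtd


plain = '50% & "more"'
escaped = escape_dtd(plain)
round_trip = unescape_dtd(escaped)
link = '<a href="x">go</a>'
escaped_link = escape_dtd(link)
```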

Escaping in Android DTD

It has the same escapes as regular DTD, plus these:

  • The ‘ character is escaped using \&apos; or \' or \u0027
  • The ” character is escaped using \&quot;
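
The Android-specific escapes can be sketched as below, assuming the backslash-prefixed forms \&apos;, \', \u0027 and \&quot;. Choosing \u0027 when escaping is an arbitrary pick for this example, the function names are hypothetical, and the regular DTD escapes are left out for brevity.

```python
def escape_android(text):
    """Apply only the Android-specific quote escapes (sketch)."""
    return text.replace("'", r"\u0027").replace('"', r"\&quot;")


def unescape_android(text):
    """Accept any of the three apostrophe spellings plus the quote escape."""
    for escape in (r"\&apos;", r"\'", r"\u0027"):
        text = text.replace(escape, "'")
    return text.replace(r"\&quot;", '"')


escaped_apostrophe = escape_android("It's")
unescaped_apostrophe = unescape_android(r"It\'s")
quoted_round_trip = unescape_android(escape_android('Say "hi"'))
```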
translate.storage.dtd.accesskeysuffixes = ('.accesskey', '.accessKey', '.akey')

Accesskey Suffixes: entries with this suffix may be combined with labels ending in labelsuffixes into accelerator notation
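
One way to picture the combination is the sketch below, which pairs a label entity with its accesskey entity and marks the key with an ampersand. The contents of `LABEL_SUFFIXES`, the ampersand marker and the function `combine_accesskeys` are assumptions for illustration; only the accesskey suffixes come from the listing above.

```python
ACCESSKEY_SUFFIXES = (".accesskey", ".accessKey", ".akey")
LABEL_SUFFIXES = (".label", ".title")  # assumption: not listed in this section


def combine_accesskeys(entities):
    """Merge label/accesskey entity pairs into a single accelerator string."""
    combined = {}
    for name, label in entities.items():
        for label_suffix in LABEL_SUFFIXES:
            if not name.endswith(label_suffix):
                continue
            base = name[: -len(label_suffix)]
            for key_suffix in ACCESSKEY_SUFFIXES:
                key = entities.get(base + key_suffix)
                if key and key in label:
                    position = label.index(key)
                    combined[base] = label[:position] + "&" + label[position:]
                    break
    return combined


combined = combine_accesskeys(
    {"file.label": "File", "file.accesskey": "F", "quit.label": "Quit"}
)
```

Labels without a matching accesskey entity, such as quit.label here, are simply left out of the result.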

class translate.storage.dtd.dtdfile(inputfile=None, android=False)

A .dtd file made up of dtdunits.

UnitClass

alias of dtdunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

makes self.id_index dictionary keyed on entities

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(dtdsrc)

read the source code of a dtd file in and include them as dtdunits in self.units

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Write content to file

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.dtd.dtdunit(source='', android=False)

An entity definition from a DTD file (and any associated comments).

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Set the entity to the given “location”.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’ or ‘source code’.
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

Return the entity as location (identifier).

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

getoutput()

convert the dtd entity back to string form

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isnull()

returns whether this dtdunit doesn’t actually have an entity definition

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
parse(dtdsrc)

read the first dtd element from the source code into this object, return linesprocessed

removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(new_id)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent from other unit properties like source or context.

source

gets the unquoted source string

target

gets the unquoted target string

unit_iter()

Iterator that only returns this unit.

translate.storage.dtd.labelsuffixes = ('.label', '.title')

Label suffixes: entries with this suffix can be combined with accesskeys found in entries ending with accesskeysuffixes
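
To make the label/accesskey pairing concrete, here is a rough standalone sketch of combining the two kinds of entry into ‘&’ accelerator notation. It is not the toolkit's own code: the function name combine is hypothetical, and the ACCESSKEY_SUFFIXES tuple is an assumed value, so check the module for the real one.

```python
LABEL_SUFFIXES = (".label", ".title")
# Assumed value for illustration; consult translate.storage.dtd for the real tuple.
ACCESSKEY_SUFFIXES = (".accesskey", ".accessKey", ".akey")


def combine(entities):
    """Merge `x.label` + `x.accesskey` pairs into one `x` entry.

    `entities` maps entity names to values. The accesskey is marked by
    putting '&' in front of its first occurrence in the label.
    """
    combined = {}
    for name, value in entities.items():
        for suffix in LABEL_SUFFIXES:
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                # Look for a matching accesskey entry under any known suffix.
                key = next(
                    (entities[base + s] for s in ACCESSKEY_SUFFIXES
                     if base + s in entities),
                    None,
                )
                pos = value.find(key) if key else -1
                if pos >= 0:
                    combined[base] = value[:pos] + "&" + value[pos:]
                break
    return combined
```

Labels without a matching accesskey, or whose accesskey does not occur in the label text, are left out of the result in this sketch.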

translate.storage.dtd.quoteforandroid(source)

Escapes a line for Android DTD files.

translate.storage.dtd.quotefordtd(source)

Quotes and escapes a line for regular DTD files.

translate.storage.dtd.removeinvalidamps(name, value)

Find and remove ampersands that are not part of an entity definition.

A stray & in a DTD file can break an application’s ability to parse the file. In Mozilla localisation this is very important, as stray ampersands can break the parsing of files used in XUL and thus break interface rendering. Tracking down the problem is very difficult, so by removing potentially broken ampersands and warning the user we can ensure that the output DTD will always be parsable.

Parameters:
  • name (String) – Entity name
  • value (String) – Entity text value
Return type:String
Returns:Entity value without bad ampersands
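
The idea can be sketched with a single regular expression. The following is a simplified standalone version, not the toolkit's implementation: the name remove_stray_amps is hypothetical, it takes only the value (the real function also receives the entity name), and it reports a count instead of issuing a warning.

```python
import re

# An ampersand is kept only when it starts something shaped like an entity
# reference: &name; or &#123; or &#x1F;
_STRAY_AMP = re.compile(
    r"&(?!(?:[A-Za-z_][\w.\-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def remove_stray_amps(value):
    """Return (cleaned value, number of ampersands removed)."""
    return _STRAY_AMP.subn("", value)
```

A caller could use the returned count to decide whether to warn the user that the entity value was altered.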

translate.storage.dtd.unquotefromandroid(source)

Unquotes a quoted Android DTD definition.

translate.storage.dtd.unquotefromdtd(source)

Unquotes a quoted DTD definition.

_factory_classes

Py2exe can’t find stuff that we import dynamically, so we have this file just for the sake of the Windows installer to easily pick up all the stuff that we need and ensure they make it into the installer.

factory

factory methods to build real storage objects that conform to base.py

translate.storage.factory.getclass(storefile, localfiletype=None, ignore=None, classes=None, classes_str=None, hiddenclasses=None)

Factory that returns the applicable class for the type of file presented. Specify ignore to ignore some part at the back of the name (like .gz).
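
The extension-based lookup, including the effect of ignore, can be illustrated with a small standalone sketch. The registry contents and the name pick_store are hypothetical; the real factory maps many more extensions to actual store classes and accepts further parameters.

```python
import os

# Hypothetical registry for illustration only: extension -> store class name.
REGISTRY = {"po": "pofile", "dtd": "dtdfile", "ini": "inifile", "html": "htmlfile"}


def pick_store(filename, ignore=None, registry=REGISTRY):
    """Return the registry entry matching the file's extension.

    `ignore` is a trailing part of the name (like ".gz") that is cut off
    before the extension is examined.
    """
    if ignore and filename.endswith(ignore):
        filename = filename[: -len(ignore)]
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    try:
        return registry[ext]
    except KeyError:
        raise ValueError("Unknown filetype (%s)" % filename)
```

With ignore=".gz", a name such as browser.dtd.gz is treated as browser.dtd before the lookup.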

translate.storage.factory.getobject(storefile, localfiletype=None, ignore=None, classes=None, classes_str=None, hiddenclasses=None)

Factory that returns a usable object for the type of file presented.

Parameters:storefile (file or str) – File object or file name.

Specify ignore to ignore some part at the back of the name (like .gz).

translate.storage.factory.supported_files()

Returns data about all supported files

Returns:list of tuples of the form (name, extensions, mimetypes)
Return type:list

fpo

html

module for parsing html files for translation

class translate.storage.html.POHTMLParser(includeuntaggeddata=None, inputfile=None, callback=None)
UnitClass

alias of htmlunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
buildtag(tag, attrs=None, startend=False)

Create an HTML tag

close()

Handle any buffered data.

detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

do_encoding(htmlsrc)

Return the html text properly encoded based on a charset.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

feed(data)

Feed data to the parser.

Call this as often as you want, with as little or as much text as you want (may include '\n').

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
get_starttag_text()

Return full source of start tag: ‘<…>’.

getids(filename=None)

return a list of unit ids

getpos()

Return current line number and offset.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

guess_encoding(htmlsrc)

Returns the encoding of the html text.

We look for ‘charset=’ within a meta tag to do this.
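
A lookup of this kind might be sketched as follows. This is a simplified illustration that works on already-decoded text; the name guess_charset and the exact pattern are assumptions, not the parser's own code.

```python
import re

# Matches charset=... inside a <meta ...> tag, with or without quotes.
_META_CHARSET = re.compile(
    r"""<meta\b[^>]*?charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE
)


def guess_charset(htmlsrc):
    """Return the charset named in the first <meta> tag that has one, or None."""
    match = _META_CHARSET.search(htmlsrc)
    return match.group(1) if match else None
```

Both the older http-equiv form and the shorter <meta charset="..."> form are covered; text that merely mentions charset= outside a meta tag is ignored.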

handle_charref(name)

Handle entries in the form &#NNNN; e.g. &#8417;

handle_entityref(name)

Handle named entities of the form &aaaa; e.g. &rsquo;

has_translatable_content(text)

Check if the supplied HTML snippet has any content that needs to be translated.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(htmlsrc)

parser to process the given source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

pi_escape(text)

Replaces all processing instructions with placeholders, and returns the new text and a dictionary of tags. The current implementation replaces <?foo?> with <?md5(foo)?>. The hash => code conversions are stored in self.pidict for later use in restoring the real PHP.

The purpose of this is to remove all potential “tag-like” code from inside PHP. The hash looks nothing like an HTML tag, but the following PHP:

$a < $b ? $c : ($d > $e ? $f : $g)

looks like it contains an HTML tag:

< $b ? $c : ($d >

to nearly any regex. Hence, we replace all contents of PHP with simple strings to help our regexes out.

pi_unescape(text)

Replaces the PHP placeholders in text with the real code
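
The escape/unescape round trip described for pi_escape() and pi_unescape() can be sketched as two standalone functions. This is an illustration under stated assumptions: the real methods keep the mapping in self.pidict, whereas here it is returned and passed explicitly, and the regular expression is a guess at a reasonable pattern.

```python
import hashlib
import re

# Lazily match the body of each processing instruction, across lines.
_PI = re.compile(r"<\?(.*?)\?>", re.DOTALL)


def pi_escape(text):
    """Replace each <?...?> body with its MD5 hex digest.

    Returns the escaped text and a digest -> original body mapping.
    """
    pidict = {}

    def _replace(match):
        body = match.group(1)
        digest = hashlib.md5(body.encode("utf-8")).hexdigest()
        pidict[digest] = body
        return "<?" + digest + "?>"

    return _PI.sub(_replace, text), pidict


def pi_unescape(text, pidict):
    """Put the original bodies back in place of their digests."""
    for digest, body in pidict.items():
        text = text.replace("<?" + digest + "?>", "<?" + body + "?>")
    return text
```

After escaping, the PHP body contains only hexadecimal characters, so no stray < or > is left to confuse tag-matching regexes.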

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

reset()

Reset this instance. Loses all unprocessed data.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.html.htmlfile(includeuntaggeddata=None, inputfile=None, callback=None)
INCLUDEATTRS = ['alt', 'abbr', 'content', 'standby', 'summary', 'title']

Text from these attributes is extracted

MARKINGATTRS = []

Text from tags with these attributes will be extracted from the HTML document

MARKINGTAGS = ['address', 'caption', 'div', 'dt', 'dd', 'figcaption', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li', 'p', 'pre', 'title', 'th', 'td']

Text in these tags will be extracted from the HTML document

SELF_CLOSING_TAGS = [u'area', u'base', u'basefont', u'br', u'col', u'frame', u'hr', u'img', u'input', u'link', u'meta', u'param']

HTML self-closing tags. Tags that should be specified as <img /> but might be <img>.

UnitClass

alias of htmlunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
buildtag(tag, attrs=None, startend=False)

Create an HTML tag

close()

Handle any buffered data.

detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

do_encoding(htmlsrc)

Return the html text properly encoded based on a charset.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

feed(data)

Feed data to the parser.

Call this as often as you want, with as little or as much text as you want (may include '\n').

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
get_starttag_text()

Return full source of start tag: ‘<…>’.

getids(filename=None)

return a list of unit ids

getpos()

Return current line number and offset.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

guess_encoding(htmlsrc)

Returns the encoding of the html text.

We look for ‘charset=’ within a meta tag to do this.

handle_charref(name)

Handle entries in the form &#NNNN; e.g. &#8417;

handle_entityref(name)

Handle named entities of the form &aaaa; e.g. &rsquo;

has_translatable_content(text)

Check if the supplied HTML snippet has any content that needs to be translated.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(htmlsrc)

parser to process the given source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

pi_escape(text)

Replaces all processing instructions with placeholders, and returns the new text and a dictionary of tags. The current implementation replaces <?foo?> with <?md5(foo)?>. The hash => code conversions are stored in self.pidict for later use in restoring the real PHP.

The purpose of this is to remove all potential “tag-like” code from inside PHP. The hash looks nothing like an HTML tag, but the following PHP:

$a < $b ? $c : ($d > $e ? $f : $g)

looks like it contains an HTML tag:

< $b ? $c : ($d >

to nearly any regex. Hence, we replace all contents of PHP with simple strings to help our regexes out.

pi_unescape(text)

Replaces the PHP placeholders in text with the real code

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

reset()

Reset this instance. Loses all unprocessed data.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.html.htmlunit(source=None)

A unit of translatable/localisable HTML content

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’ or ‘source code’.
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent from other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

translate.storage.html.normalize_html(text)

Remove double spaces from HTML snippets

translate.storage.html.safe_escape(html)

Escape &, < and >

translate.storage.html.strip_html(text)

Strip unnecessary html from the text.

HTML tags are deemed unnecessary if they fully enclose the translatable text, e.g. ‘<a href="index.html">Home Page</a>’.

HTML tags that occur within the normal flow of text will not be removed, e.g. ‘This is a link to the <a href="index.html">Home Page</a>.’
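
The distinction between an enclosing tag and an inline tag can be sketched as below. This is a simplified standalone illustration, not the toolkit's strip_html(): the name strip_enclosing_tag is hypothetical, and only a single outer tag pair is considered.

```python
import re

# One element wrapping the whole text: <tag ...> ... </tag>
_WRAPPED = re.compile(r"^<(\w+)(?:\s[^>]*)?>(.*)</\1\s*>$", re.DOTALL)


def strip_enclosing_tag(text):
    """Drop one tag pair if, and only if, it wraps the entire text."""
    text = text.strip()
    match = _WRAPPED.match(text)
    if not match:
        return text
    tag, inner = match.group(1), match.group(2)
    # Walk the inner markup: if the element closes before the end, the text
    # is really several sibling elements, so nothing is stripped.
    depth = 0
    for closing, name in re.findall(r"<(/?)(\w+)", inner):
        if name.lower() != tag.lower():
            continue
        depth += -1 if closing else 1
        if depth < 0:
            return text
    return inner if depth == 0 else text
```

The depth check is what keeps a string such as ‘<b>foo</b> and <b>bar</b>’ intact, even though it starts and ends with matching tags.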

ical

Class that manages iCalendar files for translation.

iCalendar files follow the RFC2445 specification.

The iCalendar specification uses the following naming conventions:

  • Component: an event, journal entry, timezone, etc
  • Property: a property of a component: summary, description, start time, etc
  • Attribute: an attribute of a property, e.g. language

The following are localisable in this implementation:

  • VEVENT component: SUMMARY, DESCRIPTION, COMMENT and LOCATION properties

While other items could be localised this is not seen as important until use cases arise. In such a case simply adjusting the component.name and property.name lists to include these will allow expanded localisation.
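
As a rough illustration of which content lines are treated as localisable, the following standalone sketch scans iCalendar text with the standard library only. It does not use vobject and is not the toolkit's parser: the name localisable_properties is hypothetical, and the sketch assumes unfolded content lines.

```python
LOCALISABLE = ("SUMMARY", "DESCRIPTION", "COMMENT", "LOCATION")


def localisable_properties(ical_text):
    """Yield (property name, value) for localisable VEVENT properties.

    Assumes unfolded content lines; real files may fold long lines.
    """
    in_event = False
    for line in ical_text.splitlines():
        if line == "BEGIN:VEVENT":
            in_event = True
        elif line == "END:VEVENT":
            in_event = False
        elif in_event and ":" in line:
            name, value = line.split(":", 1)
            # Drop attributes such as LANGUAGE=en from the property name.
            name = name.split(";", 1)[0].upper()
            if name in LOCALISABLE:
                yield name, value
```

Extending the LOCALISABLE tuple mirrors the adjustment described above for localising further properties.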

LANGUAGE Attribute
While the iCalendar format allows items to have a language attribute, this is not used. The reason is that most of the items we localise may occur only zero or one times. Thus, although ‘summary’ would ideally be present in multiple languages in one file, the format does not allow such multiple entries. This is unfortunate as it prevents the creation of a single multilingual iCalendar file.
Future Format Support
As this format uses vobject, which supports various formats including vCard, it is possible to expand this format to understand those if needed.
class translate.storage.ical.icalfile(inputfile=None, **kwargs)

An ical file

UnitClass

alias of icalunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

parse the given file or file source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.ical.icalunit(source=None, **kwargs)

An ical entry that is translatable

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’ or ‘source code’.
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent from other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

ini

Class that manages .ini files for translation

# a comment
; a comment

[Section]
a = a string
b : a string
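
The sample shows both comment markers (# and ;) and both key/value delimiters (= and :). As a quick way to see how such a file breaks down into entries, here is a sketch using Python's standard configparser. Note that this is not the parser the toolkit itself uses; the helper name read_ini_entries is hypothetical.

```python
import configparser

SAMPLE = """\
# a comment
; a comment

[Section]
a = a string
b : a string
"""


def read_ini_entries(text):
    """Return {(section, key): value} for every entry in the INI text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written
    parser.read_string(text)
    return {
        (section, key): value
        for section in parser.sections()
        for key, value in parser.items(section)
    }
```

Each (section, key) pair corresponds to one translatable entry, with the value as its text.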

class translate.storage.ini.Dialect

Base class for differentiating dialect options and functions

class translate.storage.ini.DialectDefault
class translate.storage.ini.DialectInno
class translate.storage.ini.inifile(inputfile=None, dialect='default', **kwargs)

An INI file

UnitClass

alias of iniunit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parse the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

Make sure the source index exists.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.ini.iniunit(source=None, **kwargs)

An INI file entry

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who or where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

Indicates whether this unit is obsolete.

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

translate.storage.ini.register_dialect(dialect)

Decorator that registers the dialect.

jsonl10n

Class that manages JSON data files for translation

JSON (JavaScript Object Notation) is an open standard designed for human-readable data interchange.

JSON basic types:

  • Number (integer or real)
  • String (double-quoted Unicode with backslash escaping)
  • Boolean (true or false)
  • Array (an ordered sequence of values, comma-separated and enclosed in square brackets)
  • Object (a collection of key:value pairs, comma-separated and enclosed in curly braces)
  • null

Example:

{
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber": [
         {
           "type": "home",
           "number": "212 555-1234"
         },
         {
           "type": "fax",
           "number": "646 555-4567"
         }
     ]
}
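
For orientation, this is how the basic types listed above surface in Python when a document is loaded with the standard `json` module. The document is a shortened variant of the example; the "spouse" and "active" keys are additions made here to cover null and Boolean values.

```python
import json

document = """
{
    "firstName": "John",
    "age": 25,
    "address": {"city": "New York"},
    "phoneNumber": [{"type": "home", "number": "212 555-1234"}],
    "spouse": null,
    "active": true
}
"""

data = json.loads(document)

# Record the Python type that each top-level JSON value was decoded to.
types = {key: type(value).__name__ for key, value in data.items()}
```

Only the String values are candidates for translation; the other types need to survive a parse and serialize cycle unchanged.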

TODO:

  • Handle \u and other escapes in Unicode
  • Manage data type storage and conversion. True –> “True” –> True
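
The two TODO items can be illustrated with the standard `json` module alone. This is a sketch of the underlying problem, not of how the module addresses it: converting a value to text with str() loses its type, whereas a JSON round trip keeps it, and \u escapes are decoded on load.

```python
import json

value = True

# Storing the value as plain text loses the type: the result is the str "True".
as_text = str(value)

# A JSON round trip keeps the Boolean type.
round_tripped = json.loads(json.dumps(value))

# \u escapes are decoded when loading ...
decoded = json.loads('"caf\\u00e9"')

# ... and re-created on output unless ensure_ascii is disabled.
escaped_output = json.dumps(decoded)
literal_output = json.dumps(decoded, ensure_ascii=False)
```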
class translate.storage.jsonl10n.I18NextFile(inputfile=None, filter=None, **kwargs)

The i18next v3 format: nested JSON with several additions.

See https://www.i18next.com/

UnitClass

alias of I18NextUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Add and return a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

Find the unit with a matching id by checking id_index.

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

Return a list of unit ids.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parse the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

Make sure the source index exists.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.jsonl10n.I18NextUnit(source=None, item=None, notes=None, **kwargs)

The i18next v3 format: JSON with plurals.

See https://www.i18next.com/
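
As a sketch of what "JSON with plurals" means in practice, the code below groups keys by the suffix convention used in i18next v3 files: `<key>_plural` for a simple plural and `<key>_<n>` for languages with several plural forms. The helper `group_plurals` is hypothetical and only shows the grouping idea; it is not how I18NextUnit is implemented.

```python
import json
import re

# Assumed suffix convention: "_plural" or a numeric form index.
_PLURAL_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<form>plural|\d+)$")


def group_plurals(flat):
    """Group the strings of a flat key/value mapping by their base key."""
    groups = {}
    for key, text in flat.items():
        match = _PLURAL_SUFFIX.match(key)
        base = match.group("base") if match else key
        groups.setdefault(base, []).append(text)
    return groups


sample = json.loads(
    '{"item": "one item", "item_plural": "{{count}} items", "title": "Shop"}'
)
grouped = group_plurals(sample)
```

A key that merely happens to end in `_plural` would be grouped incorrectly by this naive pattern, so a real parser needs more context than the key name alone.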

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who or where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

getvalue()

Return value to be stored in JSON file.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

Indicates whether this unit is obsolete.

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

class translate.storage.jsonl10n.JsonFile(inputfile=None, filter=None, **kwargs)

A JSON file

UnitClass

alias of JsonUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Add and return a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

Find the unit with a matching id by checking id_index.

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

Return a list of unit ids.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parse the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

Make sure the source index exists.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.jsonl10n.JsonNestedFile(inputfile=None, filter=None, **kwargs)

A JSON file with nested keys
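
One way to picture nested keys is to flatten the structure so that each translatable leaf gets a single path-like identifier. The sketch below does this with dotted keys and bracketed list positions. That naming scheme and the `flatten` helper are assumptions made for illustration; the ids that JsonNestedFile actually produces may look different.

```python
def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for position, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{position}]")
    else:
        yield prefix, node


nested = {
    "address": {"city": "New York", "state": "NY"},
    "phoneNumber": [{"type": "home"}],
}
flat = dict(flatten(nested))
```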

UnitClass

alias of JsonNestedUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Add and return a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

Find the unit with a matching id by checking id_index.

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

Return a list of unit ids.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parse the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

Make sure the source index exists.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.jsonl10n.JsonNestedUnit(source=None, item=None, notes=None, **kwargs)
adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who or where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

getvalue()

Return value to be stored in JSON file.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

Indicates whether this unit is obsolete.

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

class translate.storage.jsonl10n.JsonUnit(source=None, item=None, notes=None, **kwargs)

A JSON entry

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who or where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

getvalue()

Return value to be stored in JSON file.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

Indicates whether this unit is obsolete.

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent of other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

class translate.storage.jsonl10n.WebExtensionJsonFile(inputfile=None, filter=None, **kwargs)

WebExtension JSON file

See the following URLs for documentation:

  • https://developer.chrome.com/extensions/i18n
  • https://developer.mozilla.org/en-US/Add-ons/WebExtensions/Internationalization
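
For readers unfamiliar with the format, a WebExtension messages.json file maps each message name to an object with a "message" string and optional "description" and "placeholders" members. The sketch below reads such a file with the standard `json` module. The sample content and the `extract_messages` helper are made up for this example and do not show the class's own parsing code.

```python
import json

messages_json = """
{
    "extensionName": {
        "message": "My Extension",
        "description": "Name shown in the add-on manager."
    },
    "greeting": {
        "message": "Hello, $USER$!",
        "placeholders": {"user": {"content": "$1"}}
    }
}
"""


def extract_messages(text):
    """Return (name, message, description) tuples from messages.json content."""
    entries = json.loads(text)
    return [
        (name, entry["message"], entry.get("description", ""))
        for name, entry in entries.items()
    ]


rows = extract_messages(messages_json)
```

In this reading, the message name plays the role of the unit id, "message" holds the translatable text, and "description" is a natural source for a developer note.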

UnitClass

alias of WebExtensionJsonUnit

add_unit_to_index(unit)

Add a unit to source and location indexes

addsourceunit(source)

Add and return a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

Find the unit with a matching id by checking id_index.

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

Return a list of unit ids.

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parse the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

Make sure the source index exists.

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.jsonl10n.WebExtensionJsonUnit(source=None, item=None, notes=None, **kwargs)
adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who or where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

getvalue()

Return value to be stored in JSON file.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows IDs independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

lisa

Parent class for LISA standards (TMX, TBX, XLIFF)

class translate.storage.lisa.LISAfile(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)

A class representing a file store for one of the LISA file formats.

UnitClass

alias of LISAunit

add_unit_to_index(unit)

Add a unit to the source and location indexes.

addheader()

Method to be overridden to initialise headers, etc.

addsourceunit(source)

Adds and returns a new unit with the given string as first entry.

addunit(unit, new=True)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.
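
As an illustration of BOM-based detection, here is a minimal standard-library sketch. It is not the method's actual implementation; the function name sniff_bom, the BOMS table and the "utf-8" default are assumptions made for this example.

```python
import codecs

# Check the 4-byte UTF-32 marks before the 2-byte UTF-16 marks, because the
# UTF-32 little-endian BOM starts with the same bytes as the UTF-16 one.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data, default="utf-8"):
    """Guess an encoding from a leading byte order mark (hypothetical helper)."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # These codec names consume the BOM when decoding.
            return encoding
    return default


raw = "héllo".encode("utf-16")  # Python prepends a BOM for this codec
print(sniff_bom(raw), raw.decode(sniff_bom(raw)))
```

A BOM only identifies the Unicode encodings, so files without one fall through to the default, which is why a statistical detector such as chardet is preferred when available.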

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

initbody()

Initialises self.body so it never needs to be retrieved from the XML again.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.
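
The relationship between makeindex(), require_index(), findunit() and translate() can be pictured with a toy model. ToyUnit and ToyStore below are hypothetical stand-ins written for this sketch, not the library's classes, and the lazy invalidation in addunit() is an assumption.

```python
class ToyUnit:
    """Hypothetical stand-in for a translation unit."""

    def __init__(self, source, target=""):
        self.source = source
        self.target = target


class ToyStore:
    """Hypothetical stand-in for a store with a source index."""

    def __init__(self):
        self.units = []
        self.sourceindex = None  # built on demand

    def addunit(self, unit):
        self.units.append(unit)
        self.sourceindex = None  # assumption: drop the index and rebuild later

    def makeindex(self):
        # Map each source string to the list of units that carry it.
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def require_index(self):
        if self.sourceindex is None:
            self.makeindex()

    def findunit(self, source):
        self.require_index()
        matches = self.sourceindex.get(source)
        return matches[0] if matches else None

    def translate(self, source):
        unit = self.findunit(source)
        return unit.target if unit is not None and unit.target else None


store = ToyStore()
store.addunit(ToyUnit("Open", "Öffnen"))
store.addunit(ToyUnit("Close"))
print(store.translate("Open"), store.translate("Close"), store.translate("Quit"))
```

The point of the index is that lookups by source string become dictionary accesses instead of scans over every unit.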

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
namespaced(name)

Returns name in Clark notation.

For example namespaced("source") in an XLIFF document might return:

{urn:oasis:names:tc:xliff:document:1.1}source

This is needed throughout lxml.
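
To show why Clark notation matters, here is a small sketch using the standard library's xml.etree.ElementTree, which uses the same {namespace}tag convention as lxml. The standalone namespaced() function and the XLIFF_NS constant are assumptions for this example; the real method takes only the name and reads the namespace from the object.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def namespaced(namespace, name):
    """Return name in Clark notation, or unchanged when there is no namespace."""
    return "{%s}%s" % (namespace, name) if namespace else name


doc = ET.fromstring(
    '<xliff xmlns="%s"><file><body><trans-unit id="1">'
    "<source>Hello</source></trans-unit></body></file></xliff>" % XLIFF_NS
)

# The parser stores fully qualified tags, so a bare "source" finds nothing.
print(doc.find(".//source"))
source = doc.find(".//" + namespaced(XLIFF_NS, "source"))
print(source.tag, source.text)
```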

parse(xml)

Populates this object from the given xml string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from the source and location indexes.

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out=None)

Converts to a string containing the file’s XML

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.lisa.LISAunit(source, empty=False, **kwargs)

A single unit in the file. Provisional work is done to make several languages possible.

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      - ‘translator’
      - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

createlanguageNode(lang, text, purpose=None)

Returns an XML Element set up with the given parameters to represent a single language entry. Has to be overridden.

getNodeText(languageNode, xml_space='preserve')

Retrieves the term from the given languageNode.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlanguageNode(lang=None, index=None)

Retrieves a languageNode either by language or by index.

getlanguageNodes()

Returns a list of all nodes that contain per language information.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettarget(lang=None)

retrieves the “target” text (second entry), or the entry in the specified language, if it exists

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
namespaced(name)

Returns name in Clark notation.

For example namespaced("source") in an XLIFF document might return:

{urn:oasis:names:tc:xliff:document:1.1}source

This is needed throughout lxml.

removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows IDs independent of other unit properties such as source or context.

settarget(target, lang='xx', append=False)

Sets the “target” string (second language), or alternatively appends to the list

unit_iter()

Iterator that only returns this unit.

mo

Module for parsing Gettext .mo files for translation.

The coding of .mo files was derived from the Gettext documentation, Python's msgfmt.py, and by observing and testing existing .mo files in the wild.

The hash algorithm is implemented for MO files, which should result in faster access to the MO file. The hash is optional for Gettext and is not needed for reading or writing MO files. In this implementation it is always on, and for very small files it sometimes produces results that differ from Gettext's.
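
For orientation, the sketch below writes a minimal MO file by hand and reads it back with Python's standard gettext module. It follows the layout in the Gettext documentation (a seven-word header, a table of originals, a table of translations, then NUL-terminated strings). Unlike this module, it writes no hash table (size 0), and build_mo and MESSAGES are names invented for the example.

```python
import gettext
import io
import struct

MO_MAGIC = 0x950412DE  # magic number given in the Gettext documentation


def build_mo(messages):
    """Serialise a {msgid: msgstr} dict to MO bytes (no hash table)."""
    # Entries are stored sorted by msgid so readers can binary-search them.
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in messages.items())
    count = len(items)
    originals_offset = 7 * 4  # directly after the seven-word header
    translations_offset = originals_offset + count * 8
    data_offset = translations_offset + count * 8

    tables = []  # (length, offset) pairs: all originals, then all translations
    blob = b""
    for strings in ([k for k, _ in items], [v for _, v in items]):
        for s in strings:
            tables.append((len(s), data_offset + len(blob)))
            blob += s + b"\0"  # lengths exclude the terminating NUL

    header = struct.pack(
        "<7I", MO_MAGIC, 0, count, originals_offset, translations_offset, 0, 0
    )
    index = b"".join(struct.pack("<2I", length, offset) for length, offset in tables)
    return header + index + blob


MESSAGES = {
    "": "Content-Type: text/plain; charset=UTF-8\n",  # header entry
    "Hello": "Hallo",
    "Goodbye": "Tschüss",
}
catalog = gettext.GNUTranslations(io.BytesIO(build_mo(MESSAGES)))
print(catalog.gettext("Hello"), catalog.gettext("Goodbye"))
```

Because the hash table is optional, a reader that ignores it (as Python's gettext does) handles files written with or without one.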

class translate.storage.mo.mofile(inputfile=None, **kwargs)

A class representing a .mo file.

UnitClass

alias of mounit

add_unit_to_index(unit)

Add a unit to the source and location indexes.

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getheaderplural()

Returns the nplurals and plural values from the header.

getids(filename=None)

return a list of unit ids

getprojectstyle()

Return the project based on information in the header.

The project is determined in the following sequence:
  1. Use the ‘X-Project-Style’ entry in the header.
  2. Use the ‘Report-Msgid-Bugs-To’ entry
  3. Use the ‘X-Accelerator’ entry
  4. Use the Project ID
  5. Analyse the file itself (not yet implemented)
getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Return the target language based on information in the header.

The target language is determined in the following sequence:
  1. Use the ‘Language’ entry in the header.
  2. Poedit’s custom headers.
  3. Analysing the ‘Language-Team’ entry.
getunits()

Return a list of all units in this store.

header()

Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.

init_headers(charset='UTF-8', encoding='8bit', **kwargs)

Sets default values for PO headers.

isempty()

Return True if the object doesn’t contain any translation units.

makeheader(**kwargs)

Create a header for the given filename.

Check .makeheaderdict() for information on parameters.

makeheaderdict(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)

Create a header dictionary with useful defaults.

pot_creation_date can be None (current date) or a value (datetime or string).

po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).

Returns:Dictionary with the header items
Return type:dict of strings
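
The date rules above can be sketched as a small helper. This is only an illustration of the documented None/False/True/value cases, not the method's code: resolve_dates, po_date and the now parameter are hypothetical, the date format is the usual PO header layout, and reading "form" as the Gettext template placeholder is an assumption.

```python
from datetime import datetime, timezone

PO_DATE_FORMAT = "%Y-%m-%d %H:%M%z"
DATE_PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"  # assumed meaning of "form"


def po_date(value):
    """Format datetimes in PO header style; pass other values through as text."""
    if isinstance(value, datetime):
        return value.strftime(PO_DATE_FORMAT)
    return str(value)


def resolve_dates(pot_creation_date=None, po_revision_date=None, now=None):
    """Apply the None/False/True/value rules to both date headers."""
    now = now or datetime.now().astimezone()
    creation = po_date(now if pot_creation_date is None else pot_creation_date)
    if po_revision_date is None:
        revision = DATE_PLACEHOLDER
    elif po_revision_date is False:
        revision = creation
    elif po_revision_date is True:
        revision = po_date(now)
    else:
        revision = po_date(po_revision_date)
    return {"POT-Creation-Date": creation, "PO-Revision-Date": revision}


fixed = datetime(2024, 1, 2, 3, 4, tzinfo=timezone.utc)
print(resolve_dates(po_revision_date=False, now=fixed))
```

The identity checks (is None, is False, is True) come before the general case so that a real date value is never mistaken for a flag.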
makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
mergeheaders(otherstore)

Merges another header with this header.

This header is assumed to be the template.

parse(input)

parses the given file or file source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

parseheader()

Parses the PO header and returns the interpreted values as a dictionary.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from the source and location indexes.

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Output a string representation of the MO data file

setprojectstyle(project_style)

Set the project in the header.

Parameters:project_style (str) – the new project
setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(lang)

Set the target language in the header.

This removes any custom Poedit headers if they exist.

Parameters:lang (str) – the new target language code
translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

updatecontributor(name, email=None)

Add contribution comments if necessary.

updateheader(add=False, **kwargs)

Updates the fields in the PO style header.

This will create a header if add == True.

updateheaderplural(nplurals, plural)

Update the Plural-Forms PO header.

class translate.storage.mo.mounit(source=None, **kwargs)

A class representing a .mo translation message.

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      - ‘translator’
      - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Is this a header entry?

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Is this message translatable?

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows IDs independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

translate.storage.mo.mounpack(filename='messages.mo')

Helper to unpack Gettext MO files into a Python string

mozilla_lang

A class to manage Mozilla .lang files.

See https://github.com/mozilla-l10n/langchecker/wiki/.lang-files-format for specifications on the format.
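
As a rough picture of the format, the sketch below parses a simplified .lang file. The conventions it relies on are assumptions based on the format description linked above: a source line starts with ";", the next non-empty line is its translation, "#" lines are comments, "##" lines are file-level tags, and a trailing "{ok}" marks a translation that intentionally matches the source. parse_lang is a hypothetical helper, not part of this module.

```python
def parse_lang(text):
    """Return (source, target, comment) tuples from simplified .lang text."""
    entries = []
    comments = []
    source = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            continue  # file-level tags such as "## active ##" are skipped here
        if line.startswith("#"):
            comments.append(line[1:].strip())
        elif line.startswith(";"):
            source = line[1:]
        elif source is not None:
            target = line
            if target.endswith("{ok}"):
                # Drop the marker that flags an intentionally identical string.
                target = target[: -len("{ok}")].rstrip()
            entries.append((source, target, " ".join(comments)))
            source, comments = None, []
    return entries


SAMPLE = """## active ##
# Greeting shown on the home page
;Hello
Bonjour

;OK
OK {ok}
"""
for entry in parse_lang(SAMPLE):
    print(entry)
```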

class translate.storage.mozilla_lang.LangStore(inputfile=None, mark_active=False, **kwargs)

We extend TxtFile, since that has a lot of useful stuff for encoding

UnitClass

alias of LangUnit

add_unit_to_index(unit)

Add a unit to the source and location indexes.

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(lines)

Read in text lines and create txtunits from the blocks of text

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from the source and location indexes.

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.mozilla_lang.LangUnit(source=None)

This is just a normal unit with a weird string output

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
      - ‘translator’
      - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

getcontext()

Get the message context.

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows IDs independent of other unit properties such as source or context.

unit_iter()

Iterator that only returns this unit.

odf_io

odf_shared

omegat

Manage the OmegaT glossary format

OmegaT glossary format is used by the OmegaT computer aided translation tool.

It is a bilingual format derived from the base classes, with OmegaTFile and OmegaTUnit providing file-level and unit-level access.

Format Implementation

The OmegaT glossary format is a simple Tab Separated Value (TSV) file with the columns: source, target, comment.

The dialect of the TSV files is specified by OmegaTDialect.

Encoding
The files are either UTF-8 or encoded using the system default. UTF-8 encoded files use the .utf8 extension while system encoded files use the .tab extension.
translate.storage.omegat.OMEGAT_FIELDNAMES = ['source', 'target', 'comment']

Field names for an OmegaT glossary unit
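
Since the glossary is a three-column TSV file, it can be read with the standard csv module. The sketch below is not how OmegaTFile parses input; read_glossary is a hypothetical helper, and the dialect settings (tab delimiter, no quoting) are assumptions rather than the contents of OmegaTDialect.

```python
import csv
import io

FIELDNAMES = ["source", "target", "comment"]  # same order as OMEGAT_FIELDNAMES


def read_glossary(text):
    """Parse tab-separated glossary text into a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=FIELDNAMES,
        restval="",              # a missing comment column becomes ""
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # assumption: quote characters are literal
    )
    return [dict(row) for row in reader]


GLOSSARY = "cat\tKatze\tanimal\ndog\tHund\n"
for row in read_glossary(GLOSSARY):
    print(row)
```

Passing fieldnames explicitly matters because the file has no header row; otherwise the first glossary entry would be consumed as column names.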

class translate.storage.omegat.OmegaTDialect

Describe the properties of an OmegaT generated TAB-delimited glossary file.

class translate.storage.omegat.OmegaTFile(inputfile=None, **kwargs)

An OmegaT glossary file

UnitClass

alias of OmegaTUnit

add_unit_to_index(unit)

Add a unit to the source and location indexes.

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parses the given file or file source string.

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from the source and location indexes.

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
unit_iter()

Iterator over all the units in this store.

class translate.storage.omegat.OmegaTFileTab(inputfile=None, **kwargs)

An OmegaT glossary file in the default system encoding

UnitClass

alias of OmegaTUnit

add_unit_to_index(unit)

Add a unit to the source and location indexes.

addsourceunit(source)

Adds and returns a new unit with the given source string.

Return type:TranslationUnit
addunit(unit)

Append the given unit to the object’s list of units.

This method should always be used rather than trying to modify the list manually.

Parameters:unit (TranslationUnit) – The unit that will be added.
detect_encoding(text, default_encodings=None)

Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.

fallback_detection(text)

Simple detection based on BOM in case chardet is not available.

findid(id)

find unit with matching id by checking id_index

findunit(source)

Find the unit with the given source string.

Return type:TranslationUnit or None
findunits(source)

Find the units with the given source string.

Return type:TranslationUnit or None
getids(filename=None)

return a list of unit ids

getprojectstyle()

Get the project type for this store.

getsourcelanguage()

Get the source language for this store.

gettargetlanguage()

Get the target language for this store.

getunits()

Return a list of all units in this store.

isempty()

Return True if the object doesn’t contain any translation units.

makeindex()

Indexes the items in this store. At least .sourceindex should be useful.

merge_on

The matching criterion to use when merging on.

Returns:The default matching criterion for all the subclasses.
Return type:string
parse(input)

Parses the given file or file source string

classmethod parsefile(storefile)

Reads the given file (or opens the given filename) and parses back to an object.

classmethod parsestring(storestring)

Convert the string representation back to an object.

remove_unit_from_index(unit)

Remove a unit from source and location indexes

require_index()

make sure source index exists

save()

Save to the file that data was originally read from, if available.

savefile(storefile)

Write the string representation to the given file (or filename).

serialize(out)

Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.

setprojectstyle(project_style)

Set the project type for this store.

setsourcelanguage(sourcelanguage)

Set the source language for this store.

settargetlanguage(targetlanguage)

Set the target language for this store.

translate(source)

Return the translated string for a given source string.

Return type:String or None
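
The relationship between makeindex(), require_index(), findunit() and translate() can be sketched with a toy store. `MiniUnit` and `MiniStore` are hypothetical classes written for this illustration; they borrow the documented method names but make no claim about how the toolkit implements them.

```python
class MiniUnit:
    """Hypothetical unit with just a source and a target string."""

    def __init__(self, source, target=None):
        self.source = source
        self.target = target


class MiniStore:
    """Hypothetical store that mirrors the documented method names only."""

    def __init__(self):
        self.units = []
        self.sourceindex = None  # built on demand by require_index()

    def addunit(self, unit):
        self.units.append(unit)
        # Keep an already-built index in step with the unit list.
        if self.sourceindex is not None:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def makeindex(self):
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def require_index(self):
        if self.sourceindex is None:
            self.makeindex()

    def findunit(self, source):
        self.require_index()
        matches = self.sourceindex.get(source)
        return matches[0] if matches else None

    def translate(self, source):
        unit = self.findunit(source)
        return unit.target if unit is not None else None
```

Routing additions through addunit() is what keeps the index consistent here, which is one reason to avoid modifying the unit list directly.
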
unit_iter()

Iterator over all the units in this store.

class translate.storage.omegat.OmegaTUnit(source=None)

An OmegaT glossary unit

adderror(errorname, errortext)

Adds an error message to this unit.

Parameters:
  • errorname (string) – A single word to id the error.
  • errortext (string) – The text describing the error.
addlocation(location)

Add one location to the list of locations.

Note

Shouldn’t be implemented if the format doesn’t support it.

addlocations(location)

Add a location or a list of locations.

Note

Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().

Warning

This method might be removed in future.

addnote(text, origin=None, position='append')

Adds a note (comment).

Parameters:
  • text (string) – Usually just a sentence or two.
  • origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
classmethod buildfromunit(unit)

Build a native unit from a foreign unit, preserving as much information as possible.

dict

Get the dictionary of values for an OmegaT line

getcontext()

Get the message context.

getdict()

Get the dictionary of values for an OmegaT line

geterrors()

Get all error messages.

Return type:Dictionary
getid()

A unique identifier for this unit.

Return type:string
Returns:an identifier for this unit that is unique in the store

Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.

getlocations()

A list of source code locations.

Return type:List

Note

Shouldn’t be implemented if the format doesn’t support it.

getnotes(origin=None)

Returns all notes about this unit.

It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).

gettargetlen()

Returns the length of the target string.

Return type:Integer

Note

Plural forms might be combined.

getunits()

This unit in a list.

hasplural()

Tells whether or not this specific unit has plural strings.

infer_state()

Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.

isblank()

Used to see if this unit has no source or target string.

Note

This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.

isfuzzy()

Indicates whether this unit is fuzzy.

isheader()

Indicates whether this unit is a header.

isobsolete()

indicate whether a unit is obsolete

isreview()

Indicates whether this unit needs review.

istranslatable()

Indicates whether this unit can be translated.

This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.

istranslated()

Indicates whether this unit is translated.

This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).

makeobsolete()

Make a unit obsolete

markfuzzy(value=True)

Marks the unit as fuzzy or not.

markreviewneeded(needsreview=True, explanation=None)

Marks the unit to indicate whether it needs review.

Parameters:
  • needsreview – Defaults to True.
  • explanation – Adds an optional explanation as a note.
merge(otherunit, overwrite=False, comments=True, authoritative=False)

Do basic format agnostic merging.

multistring_to_rich(mulstring)

Convert a multistring to a list of “rich” string trees:

>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
removenotes()

Remove all the translator’s notes.

rich_source
rich_target
classmethod rich_to_multistring(elem_list)

Convert a “rich” string tree to a multistring:

>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
setcontext(context)

Set the message context

setdict(newdict)

Set the dictionary of values for an OmegaT line

Parameters:newdict (Dict) – a new dictionary with OmegaT line elements
setid(value)

Sets the unique identifier for this unit.

Only implemented if the format allows ids independent from other unit properties like source or context.

unit_iter()

Iterator that only returns this unit.

oo

Classes that hold units of .oo files (oounit) or entire files (oofile).

These are specific .oo files for localisation exported by OpenOffice.org - SDF format (previously known as GSI files).

The behaviour in terms of escaping is explained in detail in the programming comments.

translate.storage.oo.escape_help_text(text)

Escapes the help text as it would be in an SDF file.

<, >, " are only escaped in <[[:lower:]]> tags. Some HTML tags make it in in lowercase so those are dealt with. Some OpenOffice.org help tags are not escaped.

translate.storage.oo.escape_text(text)

Escapes SDF text to be suitable for unit consumption.

translate.storage.oo.makekey(ookey, long_keys)

converts an oo key tuple into a unique identifier

Parameters:
  • ookey (tuple) – an oo key
  • long_keys (Boolean) – Use long keys
Return type:str
Returns:unique ascii identifier

translate.storage.oo.normalizefilename(filename)

converts any non-alphanumeric (standard roman) characters to _
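
A minimal sketch of this kind of normalization, taking the description literally. The function name `normalize_filename_sketch` and the `keep` parameter are hypothetical; the toolkit's own function may treat some punctuation (for example path separators) as allowed, which `keep` is meant to approximate.

```python
import re


def normalize_filename_sketch(filename, keep=""):
    """Replace characters outside ASCII letters, digits and `keep` with "_".

    Illustration only; not the toolkit's implementation.
    """
    pattern = "[^0-9A-Za-z" + re.escape(keep) + "]"
    return re.sub(pattern, "_", filename)
```

For example, `normalize_filename_sketch("my file-1.oo")` gives `"my_file_1_oo"`, while passing `keep="/."` leaves slashes and dots untouched.
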

class translate.storage.oo.oofile(input=None)

this represents an entire .oo file

UnitClass

alias of oounit

addline(thisline)

adds a parsed line to the file

getoutput(skip_source=False, fallback_lang=None)

converts all the lines back to tab-delimited form

parse(input)

parses lines and adds them to the file

serialize(out, skip_source=False, fallback_lang=None)

convert to a string. double check that unicode is handled

class translate.storage.oo.ooline(parts=None)

this represents one line, one translation in an .oo file

getkey()

get the key that identifies the resource

getoutput()

return a line in tab-delimited form

getparts()

return a list of parts in this line

gettext()

Obtains the text column and handles escaping.

setparts(parts)

create a line from its tab-delimited parts

settext(text)

Sets the text column and handles escaping.

text

Obtains the text column and handles escaping.

class translate.storage.oo.oomultifile(filename, mode=None, multifilestyle='single')

this takes a huge GSI file and represents it as multiple smaller files…

createsubfileindex()

reads in all the lines and works out the subfiles

getoofile(subfile)

returns an oofile built up from the given subfile’s lines

getsubfilename(line)

looks up the subfile name for the line

getsubfilesrc(subfile)

returns the list of lines matching the subfile

listsubfiles()

returns a list of subfiles in the file

openinputfile(subfile)

returns a pseudo-file object for the given subfile

openoutputfile(subfile)

returns a pseudo-file object for the given subfile

class translate.storage.oo.oounit

this represents a number of translations of a resource

addline(line)

add a line to the oounit

getoutput(skip_source=False, fallback_lang=None)

return the lines in tab-delimited form

translate.storage.oo.unescape_help_text(text)

Unescapes normal text to be suitable for writing to the SDF file.

translate.storage.oo.unescape_text(text)

Unescapes SDF text to be suitable for unit consumption.

class translate.storage.oo.unormalizechar(normalchars)
clear() → None. Remove all items from D.
copy() → a shallow copy of D
fromkeys(S[, v]) → New dict with keys from S and values equal to v.

v defaults to None.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
has_key(k) → True if D has a key k, else False
items() → list of D's (key, value) pairs, as 2-tuples
iteritems() → an iterator over the (key, value) items of D
iterkeys() → an iterator over the keys of D
itervalues() → an iterator over the values of D
keys() → list of D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) → None. Update D from dict/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values() → list of D's values
viewitems() → a set-like object providing a view on D's items
viewkeys() → a set-like object providing a view on D's keys
viewvalues() → an object providing a view on D's values

placeables

This module implements basic functionality to support placeables.

A placeable is used to represent things like:
  1. Substitutions

    For example, in ODF, footnotes appear in the ODF XML where they are defined, so if we extract a paragraph with some footnotes, the translator will have a lot of additional XML to deal with. We therefore separate the footnotes out into separate translation units and mark their positions in the original text with placeables.

  2. Hiding of inline formatting data

    The translator doesn’t want to have to deal with all the weird formatting conventions of wherever the text came from.

  3. Marking variables

    This is an old issue - translators translate variable names which should remain untranslated. We can wrap placeables around variable names to avoid this.
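
To make the third use case concrete, the sketch below splits a string into plain text and marked variables, so that a tool could present the variables as untranslatable. The `Var` type, the `mark_variables` function and the assumed printf-style variable syntax are all hypothetical and are not part of the placeables module.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a placeable; not a class from the toolkit.
Var = namedtuple("Var", "text")

# Assumed variable syntax: printf-style, such as %s, %d or %(name)s.
VAR_RE = re.compile(r"%(?:\([A-Za-z_]\w*\))?[sd]")


def mark_variables(text):
    """Split text into plain strings and Var markers, in original order."""
    parts, pos = [], 0
    for match in VAR_RE.finditer(text):
        if match.start() > pos:
            parts.append(text[pos:match.start()])
        parts.append(Var(match.group()))
        pos = match.end()
    if pos < len(text):
        parts.append(text[pos:])
    return parts


def join_parts(parts):
    """Rebuild the original string from the output of mark_variables()."""
    return "".join(p.text if isinstance(p, Var) else p for p in parts)
```

Because the variable text is kept inside each marker, joining the parts reproduces the original string unchanged.
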

The placeables model follows the XLIFF standard’s list of placeables. Please refer to the XLIFF specification to get a better understanding.

base

Contains base placeable classes with names based on XLIFF placeables. See the XLIFF standard for more information about what the names mean.

class translate.storage.placeables.base.Bpt(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.
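
The traversal methods can be pictured with a toy tree. `Node` is a hypothetical class written for this sketch; it simplifies the leaf rule to "contains only strings" and ignores the filter argument and the restriction to the base class, so it should not be read as the StringElem implementation.

```python
class Node:
    """Hypothetical minimal tree node; `sub` holds strings or child Nodes."""

    def __init__(self, sub):
        self.sub = list(sub)

    def isleaf(self):
        # Simplified rule: a leaf contains only plain strings.
        return all(isinstance(s, str) for s in self.sub)

    def iter_depth_first(self):
        # Visit this node first, then each child sub-tree in order.
        yield self
        for s in self.sub:
            if isinstance(s, Node):
                yield from s.iter_depth_first()

    def depth_first(self):
        return list(self.iter_depth_first())

    def flatten(self):
        # Leaves only, in depth-first order.
        return [n for n in self.iter_depth_first() if n.isleaf()]
```

In this sketch depth_first() is the list form of iter_depth_first(), and flatten() is the same walk restricted to leaves.
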

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.Ept(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.Ph(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.It(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.G(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.Bx(id=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.Ex(id=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.
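
The relationship between depth-first iteration and flattening can be sketched with a toy tree. The `Node` class below is a hypothetical stand-in rather than the toolkit's StringElem, and the pre-order visiting order is an assumption made for the example.

```python
class Node:
    """Hypothetical tree node: ``sub`` holds child nodes and/or plain strings."""

    def __init__(self, sub):
        self.sub = list(sub)

    def iter_depth_first(self, filter=None):
        # Pre-order: yield this node first (if it passes), then descend.
        if filter is None or filter(self):
            yield self
        for child in self.sub:
            if isinstance(child, Node):
                yield from child.iter_depth_first(filter)

    def isleaf(self):
        # In this sketch a leaf holds only plain strings.
        return all(isinstance(child, str) for child in self.sub)

    def flatten(self):
        # Keep only the leaves, in depth-first order.
        return list(self.iter_depth_first(lambda node: node.isleaf()))


tree = Node([Node(["Click "]), Node([Node(["<a>"]), Node(["here"])])])
text = "".join(s for leaf in tree.flatten() for s in leaf.sub)
print(text)
```

Joining the strings of the flattened leaves reproduces the text in reading order, which is why flattening is a convenient basis for offset calculations.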

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.X(id=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
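
The leaf rule stated above (exact type, string-only children) might be expressed as follows. The classes here are minimal hypothetical stand-ins that share only their names' spirit with the toolkit, and the check follows the documented wording rather than the actual source.

```python
class StringElem:
    """Hypothetical minimal stand-in, not the toolkit's class."""

    def __init__(self, sub):
        self.sub = list(sub)

    def isleaf(self):
        # Exact type check: sub-classes never count as leaves.
        if type(self) is not StringElem:
            return False
        # Every sub-element must be a plain string.
        return all(isinstance(child, str) for child in self.sub)


class Placeable(StringElem):
    """Hypothetical sub-class used only to show the type rule."""


plain = StringElem(["text"])
nested = StringElem([StringElem(["text"])])
tagged = Placeable(["<br/>"])
print(plain.isleaf(), nested.isleaf(), tagged.isleaf())
```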
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.base.Sub(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
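
The undo property mentioned in the return value can be illustrated with a single-node toy. `TextNode` is a hypothetical stand-in, and treating `end_index` as exclusive is an assumption of this sketch, not something the text above confirms.

```python
class TextNode:
    """Hypothetical single-node stand-in holding one mutable string."""

    def __init__(self, text):
        self.text = text

    def delete_range(self, start_index, end_index):
        # Assumption: end_index is exclusive, as in Python slicing.
        removed = self.text[start_index:end_index]
        self.text = self.text[:start_index] + self.text[end_index:]
        # Mirror the documented triple: (removed text, parent, offset).
        return removed, self, start_index

    def insert(self, offset, text):
        self.text = self.text[:offset] + text + self.text[offset:]


node = TextNode("Save the file")
removed, parent, offset = node.delete_range(5, 9)
after_delete = node.text

# Undo only when both values are usable, as the description requires.
if parent is not None and offset is not None:
    parent.insert(offset, removed)
print(after_delete, "->", node.text)
```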
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.

general

Contains general placeable implementations, that is, placeables that do not fit into any other sub-category.

class translate.storage.placeables.general.AltAttrPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)

Placeable for the “alt=…” attributes inside XML tags.

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

classmethod parse(pstr)

A parser method to extract placeables from a string based on a regular expression. Use this function as the @parse() method of a placeable class.
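
A regular-expression-based parser of this kind could look roughly like the sketch below. The pattern, the helper name `split_on_pattern` and the tuple-based output are all assumptions chosen for illustration; the toolkit's own parser returns placeable objects and may use a different expression.

```python
import re

# Assumed pattern for illustration only; the real class may use another one.
ALT_ATTR = re.compile(r'alt="[^"]*"')


def split_on_pattern(text, pattern):
    """Split ``text`` into ("text", s) and ("placeable", s) pieces."""
    pieces, last = [], 0
    for match in pattern.finditer(text):
        if match.start() > last:
            # Plain text between the previous match and this one.
            pieces.append(("text", text[last:match.start()]))
        pieces.append(("placeable", match.group()))
        last = match.end()
    if last < len(text):
        pieces.append(("text", text[last:]))
    return pieces


pieces = split_on_pattern('<img alt="Logo" src="a.png">', ALT_ATTR)
for kind, value in pieces:
    print(kind, repr(value))
```

Keeping the unmatched text alongside the matches means the pieces can be joined back into the original string, which is what a string tree needs.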

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.general.XMLEntityPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)

Placeable handling XML entities (&xxxxx;-style entities).

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

classmethod parse(pstr)

A parser method to extract placeables from a string based on a regular expression. Use this function as the @parse() method of a placeable class.
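
For entities, the matching step might be sketched as follows. The expression is a simplified assumption covering named, decimal and hexadecimal references; it is not taken from the toolkit and may accept or reject different strings than the real class.

```python
import re

# Simplified, assumed pattern: named (&amp;), decimal (&#169;), hex (&#xA9;).
ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9.-]*|#[0-9]+|#x[0-9A-Fa-f]+);")

sample = "Fish &amp; chips &#169; 2010 &#xA9; AT&T"
found = ENTITY.findall(sample)
print(found)
```

The bare ampersand in `AT&T` is not matched because no terminating semicolon follows it.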

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.general.XMLTagPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)

Placeable handling XML tags.

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
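
A minimal sketch of applying a function to every string in a tree, assuming a hypothetical `Node` class whose `sub` list mixes child nodes and plain strings (the toolkit's own classes are not used here):

```python
class Node:
    """Hypothetical tree node: ``sub`` holds child nodes and/or plain strings."""

    def __init__(self, sub):
        self.sub = list(sub)

    def apply_to_strings(self, f):
        for i, child in enumerate(self.sub):
            if isinstance(child, Node):
                child.apply_to_strings(f)  # recurse into sub-trees
            else:
                self.sub[i] = f(child)  # replace the string in place

    def __str__(self):
        return "".join(str(child) for child in self.sub)


tree = Node(["open ", Node(["file"]), " now"])
tree.apply_to_strings(str.upper)
rendered = str(tree)
print(rendered)
```

The tree structure is untouched; only the string contents change.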
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

classmethod parse(pstr)

A parser method to extract placeables from a string based on a regular expression. Use this function as the @parse() method of a placeable class.

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.

interfaces

This file contains abstract (semantic) interfaces for placeable implementations.

class translate.storage.placeables.interfaces.BasePlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)

Base class for all placeables.

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
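
The offset lookup and its -1 result can be sketched over a flat list of leaves. `Leaf` and the free function are hypothetical, and comparing by identity is an assumption of this example. The sketch also ignores renderers, which is exactly the case the caveat above warns about.

```python
class Leaf:
    """Hypothetical leaf holding a piece of text."""

    def __init__(self, text):
        self.text = text


def elem_offset(leaves, elem):
    offset = 0
    for leaf in leaves:
        if leaf is elem:  # identity, so equal-looking leaves stay distinct
            return offset
        offset += len(leaf.text)
    return -1  # not part of this tree


a, b, c = Leaf("Press "), Leaf("<b>"), Leaf("Enter")
stray = Leaf("<b>")  # same text as b, but not in the list
offsets = [elem_offset([a, b, c], x) for x in (a, b, c, stray)]
print(offsets)
```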
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.interfaces.InvisiblePlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
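
The two customization routes named for translate(), overriding in a sub-class and replacing the method dynamically, can be contrasted in a small sketch. All class and function names below are hypothetical and unrelated to the toolkit's own classes.

```python
import types


class Elem:
    """Hypothetical element with a default, pass-through translate()."""

    def __init__(self, text):
        self.text = text

    def translate(self):
        return self.text


class UpperElem(Elem):
    # Route 1: override in a sub-class.
    def translate(self):
        return self.text.upper()


def reversed_translate(self):
    return self.text[::-1]


# Route 2: replace the method on one instance at run time.
patched = Elem("abc")
patched.translate = types.MethodType(reversed_translate, patched)

print(Elem("abc").translate(), UpperElem("abc").translate(), patched.translate())
```

Binding with `types.MethodType` affects only the one instance, so other elements keep the default behaviour.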
class translate.storage.placeables.interfaces.MaskingPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where element e starts, or -1 if e was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.interfaces.ReplacementPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.interfaces.SubflowPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.

lisa

parse

Contains the parse function that parses normal strings into StringElem-based “rich” string element trees.

translate.storage.placeables.parse.parse(tree, parse_funcs)

Parse placeables from the given string or sub-tree by using the parsing functions provided.

The output of this function is heavily dependent on the order of the parsing functions. This is because of the algorithm used.

An over-simplification of the algorithm: the leaves in the StringElem tree are expanded to the output of the first parsing function in parse_funcs. The next level of recursion is then started on the new set of leaves with the used parsing function removed from parse_funcs.

Parameters:tree (unicode|StringElem) – The string or string element sub-tree to parse.
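
The effect of parser ordering can be seen in a toy model. The sketch below does not use the toolkit: a tree is modelled as a flat list whose items are either plain strings (unparsed leaves) or (tag, text) tuples (already recognised pieces). The names make_splitter, toy_parse, numbers and words are hypothetical and exist only for this illustration; the real parse function works on StringElem trees and may differ in detail.

```python
import re


def make_splitter(tag, pattern):
    """Build a hypothetical parsing function. It splits a string into plain
    text and (tag, match) tuples, or returns None if nothing matched."""
    regex = re.compile(pattern)

    def parse_func(text):
        pieces, pos = [], 0
        for m in regex.finditer(text):
            if m.start() > pos:
                pieces.append(text[pos:m.start()])
            pieces.append((tag, m.group()))
            pos = m.end()
        if not pieces:
            return None
        if pos < len(text):
            pieces.append(text[pos:])
        return pieces

    return parse_func


def toy_parse(leaves, parse_funcs):
    """Expand string leaves with the first function, then recurse on the
    new leaves with that function removed from the list."""
    if not parse_funcs:
        return leaves
    first, rest = parse_funcs[0], parse_funcs[1:]
    out = []
    for leaf in leaves:
        if isinstance(leaf, str):
            expanded = first(leaf) or [leaf]
            out.extend(toy_parse(expanded, rest))
        else:
            out.append(leaf)  # already claimed by an earlier function
    return out


numbers = make_splitter("num", r"\d+")
words = make_splitter("word", r"\w+")

numbers_first = toy_parse(["v2 beta"], [numbers, words])
words_first = toy_parse(["v2 beta"], [words, numbers])
print(numbers_first)
print(words_first)
```

With numbers first, the digit is claimed before the word parser sees it, so "v2" is split into a word and a number. With words first, "v2" is consumed whole and the number parser finds nothing left to match.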

strelem

Contains the base StringElem class that represents a node in a parsed rich-string tree. It is the base class of all placeables.

exception translate.storage.placeables.strelem.ElementNotFoundError
class translate.storage.placeables.strelem.StringElem(sub=None, id=None, rid=None, xid=None, **kwargs)

This class represents a sub-tree of a string parsed into a rich structure. It is also the base class of all placeables.

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
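
As a rough illustration of what "all actual strings in the tree" means, the sketch below models a tree as nested Python lists, where inner lists stand in for sub-elements. This is an assumption made for the example and not the toolkit's data structure; the helper shown is a hypothetical stand-alone function, not the method itself.

```python
def apply_to_strings(sub, f):
    """Apply f to every plain string in a nested list, in place."""
    for i, item in enumerate(sub):
        if isinstance(item, str):
            sub[i] = f(item)
        else:
            apply_to_strings(item, f)  # descend into a nested "element"


tree = ["Hello ", ["big ", ["world"]], "!"]
apply_to_strings(tree, str.upper)
print(tree)  # ['HELLO ', ['BIG ', ['WORLD']], '!']
```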
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.
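
The relationship between depth_first, isleaf and flatten can be sketched with a small stand-in class. Node below is hypothetical and much simpler than StringElem: it treats any node whose sub-elements are all plain strings as a leaf, and it assumes that filter is applied to each node individually rather than pruning whole sub-trees.

```python
class Node:
    """Hypothetical stand-in for a string element: `sub` holds child
    nodes and/or plain strings."""

    def __init__(self, name, sub):
        self.name = name
        self.sub = sub

    def isleaf(self):
        # Leaf: every sub-element is a plain string.
        return all(isinstance(s, str) for s in self.sub)

    def depth_first(self, filter=None):
        # Pre-order: the node itself, then each child sub-tree in order.
        nodes = [self] if filter is None or filter(self) else []
        for child in self.sub:
            if isinstance(child, Node):
                nodes.extend(child.depth_first(filter))
        return nodes

    def flatten(self, filter=None):
        # Keep only the leaves of the depth-first walk.
        return [n for n in self.depth_first(filter) if n.isleaf()]


tree = Node("root", [
    Node("a", ["Hello "]),
    Node("b", [Node("c", ["big "]), Node("d", ["world"])]),
])
print([n.name for n in tree.depth_first()])  # ['root', 'a', 'b', 'c', 'd']
print([n.name for n in tree.flatten()])      # ['a', 'c', 'd']
```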

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
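
To make the shape of the returned dictionary concrete, here is a simplified sketch over a flat list of leaf strings. toy_index_data is a hypothetical helper, not the toolkit's implementation, and how the real method treats an index that falls exactly on the boundary between two elements is not covered here.

```python
def toy_index_data(leaves, index):
    """Locate `index` in the concatenation of `leaves` (a list of strings)."""
    start = 0
    for leaf in leaves:
        end = start + len(leaf)
        if start <= index < end:
            return {"elem": leaf, "index": index, "offset": index - start}
        start = end
    return None  # index lies outside the rendered string


leaves = ["Hello ", "big ", "world"]
info = toy_index_data(leaves, 8)
print(info)  # {'elem': 'big ', 'index': 8, 'offset': 2}
```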
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

has_content = True

Whether this string can have sub-elements.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

iseditable = True

Whether this string should be changeable by the user. Not used at the moment.

isfragile = False

Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
istranslatable = True

Whether this string is translatable into other languages.

isvisible = True

Whether this string should be visible to the user. Not used at the moment.

iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

classmethod parse(pstr)

Parse an instance of this class from the start of the given string. This method should be implemented by any sub-class that wants to be parseable by translate.storage.placeables.parse.

Parameters:pstr (unicode) – The string to parse into an instance of this class.
Returns:An instance of the current class, or None if the string is not parseable by this class.
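
A sub-class parse method following this contract might look like the sketch below. ToyElem is a stand-in base class written for this example so that the code runs without the toolkit, and NumberElem is a hypothetical placeable; neither name comes from the library. The sketch uses re.match, which only succeeds at the start of the string, to mirror the "from the start of the given string" requirement.

```python
import re


class ToyElem:
    """Stand-in for the base element class; not the toolkit's StringElem."""

    def __init__(self, sub=None):
        self.sub = sub if sub is not None else []


class NumberElem(ToyElem):
    """Hypothetical placeable that recognises a run of digits."""

    regex = re.compile(r"\d+")

    @classmethod
    def parse(cls, pstr):
        match = cls.regex.match(pstr)  # match() only succeeds at position 0
        if match is None:
            return None  # the string does not start with this placeable
        return cls([match.group()])


print(NumberElem.parse("42 files").sub)  # ['42']
print(NumberElem.parse("files: 42"))     # None
```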
print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

renderer = None

An optional function that returns the Unicode representation of the string.

sub = []

The sub-elements that make up this string.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.

terminology

Contains the placeable that represents a terminology term.

class translate.storage.placeables.terminology.TerminologyPlaceable(*args, **kwargs)

Terminology distinguished from the rest of a string by being a placeable.

apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

matchers = []

A list of matcher objects to use to identify terminology.

classmethod parse(pstr)

Parse an instance of this class from the start of the given string. This method should be implemented by any sub-class that wants to be parseable by translate.storage.placeables.parse.

Parameters:pstr (unicode) – The string to parse into an instance of this class.
Returns:An instance of the current class, or None if the string is not parseable by this class.
print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
translations = []

The available translations for this placeable.

xliff

Contains XLIFF-specific placeables.

class translate.storage.placeables.xliff.Bpt(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.xliff.Ept(sub=None, id=None, rid=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.xliff.X(id=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.

delete_range(start_index, end_index)

Delete the text in the range given by the string-indexes start_index and end_index.

Partial nodes will only be removed if they are editable.

Returns:A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
depth_first(filter=None)

Returns a list of the nodes in the tree in depth-first order.

elem_at_offset(offset)

Get the StringElem in the tree that contains the string rendered at the given offset.

elem_offset(elem)

Find the offset of elem in the current tree.

This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.

Returns:The string index where elem starts, or -1 if elem was not found.
encode(encoding='ascii')

More unicode class emulation.

find(x)

Find sub-string x in this string tree and return the position at which it starts.

find_elems_with(x)

Find all elements in the current sub-tree containing x.

flatten(filter=None)

Flatten the tree by returning a depth-first search over the tree’s leaves.

get_index_data(index)

Get info about the specified range in the tree.

Returns:A dictionary with the following items:
  • elem: The element in which index resides.
  • index: Copy of the index parameter
  • offset: The offset of index into 'elem'.
get_parent_elem(child)

Searches the current sub-tree for and returns the parent of the child element.

insert(offset, text, preferred_parent=None)

Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.

insert_between(left, right, text)

Insert the given text between the two parameter StringElems.

isleaf()

Whether or not this instance is a leaf node in the StringElem tree.

A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.

Return type:bool
iter_depth_first(filter=None)

Iterate through the nodes in the tree in depth-first order.

map(f, filter=None)

Apply f to all nodes for which filter returned True (optional).

print_tree(indent=0, verbose=False)

Print the tree from the current instance’s point in an indented manner.

prune()

Remove unnecessary nodes to make the tree optimal.

remove_type(ptype)

Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.

translate()

Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.

Returns:The transformed Unicode string representing the sub-tree.
class translate.storage.placeables.xliff.Bx(id=None, xid=None, **kwargs)
apply_to_strings(f)

Apply f to all actual strings in the tree.

Parameters:f – Must take one (str or unicode) argument and return a string or unicode.
copy()

Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.

Note

self.renderer is not copied.