storage¶
Classes that represent various storage formats for localization.
base¶
Base classes for storage interfaces.
- class translate.storage.base.DictStore(unitclass=None, encoding=None)¶
-
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
TranslationUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
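As a rough illustration of what BOM-based detection involves, here is a minimal standard-library sketch. It is not the toolkit's implementation: the helper name `sniff_bom` is hypothetical, and the returned keys simply mirror the `encoding`/`confidence` shape that chardet's `detect()` returns, which is an assumption about `EncodingDict`.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the same
# two bytes as the UTF-16 LE BOM, so order matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(text: bytes) -> dict:
    """Hypothetical helper: guess an encoding from a leading byte order mark."""
    for bom, encoding in _BOMS:
        if text.startswith(bom):
            return {"encoding": encoding, "confidence": 1.0}
    # No BOM found: report that nothing could be detected.
    return {"encoding": None, "confidence": 0.0}
```

A caller would typically fall back to a list of default encodings when the result's `encoding` is None.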
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(data) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates whether the format can store suggestions and alternative translations for a unit.
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.base.DictUnit(source=None)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
‘translator’
‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.base.EncodingDict¶
- clear()¶
Remove all items from the dict.
- copy()¶
Return a shallow copy of the dict.
- classmethod fromkeys(iterable, value=None, /)¶
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items()¶
Return a set-like object providing a view on the dict’s items.
- keys()¶
Return a set-like object providing a view on the dict’s keys.
- pop(k[, d]) v¶
Remove the specified key and return the corresponding value.
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()¶
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) None¶
Update D from mapping/iterable E and F.
If E is present and has a .keys() method, then does: for k in E.keys(): D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
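The three update paths described above can be seen with a plain dict. This sketch assumes EncodingDict behaves like an ordinary dict for these inherited methods; the keys and values are illustrative only.

```python
# A plain dict stands in for EncodingDict here (assumption: the inherited
# dict methods are unchanged).
d = {"encoding": None}

# E has a .keys() method: values are copied key by key.
d.update({"encoding": "utf-8"})

# E lacks .keys(): it is consumed as an iterable of (key, value) pairs.
d.update([("confidence", 0.5)])

# Keyword arguments (F) are applied last and override earlier values.
d.update(confidence=1.0)

# popitem() removes entries in LIFO order, so the newest key comes out first.
last_key, last_value = d.popitem()
```

After these calls, `d` holds only the `encoding` entry, and `last_key` is the most recently inserted key.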
- values()¶
Return an object providing a view on the dict’s values.
- class translate.storage.base.MetadataTranslationUnit(*args, metadata=None, **kwargs)¶
Base class for translation units that store field data in an internal dictionary.
This class provides a common implementation for storage formats (catkeys, omegat, utx, wordfast, ARB, RESJSON) that manage unit data through an internal dictionary accessible via a metadata property with getters and setters.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
‘translator’
‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getmetadata() dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- property metadata: dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- setmetadata(newdict: dict[str, Any]) None¶
Set the dictionary of metadata/field values for this unit.
- Parameters:
newdict – A new dictionary with field values
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- exception translate.storage.base.ParseError(inner_exc)¶
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class translate.storage.base.PreparedInput(data, from_handle)¶
- count(value, /)¶
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)¶
Return first index of value.
Raises ValueError if the value is not present.
- exception translate.storage.base.SerializationError¶
Raised when in-memory content cannot be serialized.
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- exception translate.storage.base.TranslateToolkitError¶
Base class for toolkit-defined storage exceptions.
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class translate.storage.base.TranslationStore(unitclass=None, encoding=None)¶
Base class for stores for multiple translation units of type UnitClass.
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
The class of units that will be instantiated and used by this class
alias of
TranslationUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(data) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates whether the format can store suggestions and alternative translations for a unit.
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.base.TranslationUnit(source=None)¶
Base class for translation units.
Our concept of a translation unit is influenced heavily by XLIFF.
As such, most of the method and variable names borrow from XLIFF terminology.
A translation unit consists of the following:
A source string. This is the original translatable text.
A target string. This is the translation of the source.
Zero or more notes on the unit. Notes would typically be some comments from a translator on the unit, or some comments originating from the source code.
Zero or more locations. Locations indicate where in the original source code this unit came from.
Zero or more errors. Some tools (e.g. pofilter) can run checks on translations and produce error messages.
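The five components listed above can be pictured as a small data structure. The following is only a conceptual sketch: `SketchUnit` is a hypothetical stand-in and not the toolkit's TranslationUnit, and its `istranslated()` is a deliberate simplification.

```python
from dataclasses import dataclass, field


@dataclass
class SketchUnit:
    """Hypothetical stand-in for the unit concept, not the real class."""

    source: str                       # original translatable text
    target: str = ""                  # translation of the source
    notes: list[str] = field(default_factory=list)      # translator/developer comments
    locations: list[str] = field(default_factory=list)  # where the source came from
    errors: dict[str, str] = field(default_factory=dict)  # check name -> message

    def istranslated(self) -> bool:
        # Simplification: any non-empty target counts as translated.
        return bool(self.target)


unit = SketchUnit(source="Open file")
unit.locations.append("src/menu.c:42")        # illustrative location
unit.notes.append("Menu item label")          # illustrative note
unit.target = "Abrir archivo"
unit.errors["examplecheck"] = "Illustrative error message"
```

Real formats decide for themselves which of these pieces they can store, which is why several of the methods below are documented as optional.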
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
‘translator’
‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers: list[Callable[[str], StringElem | list[StringElem] | None]] = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.base.get_input_name(value: object, *, include_filename_attr: bool = False) str¶
Extract a filename-like identifier from an input object.
This is used to preserve .filename metadata without reinterpreting already-read content as a path.
- translate.storage.base.is_path_input(value: object) TypeGuard[str | PathLike[str]]¶
Return whether the value should be treated as a filesystem path.
- translate.storage.base.path_input_str(value: str | PathLike[str]) str¶
Convert a path-like value to a normalized string path.
- translate.storage.base.prepare_input(value: Any, *, close_handle: bool = False) PreparedInput¶
Materialize file-like input while preserving whether it came from a handle.
Callers can use from_handle to decide whether a string result should be treated as text content or as a direct filesystem path argument.
By default this helper leaves passed-in file objects open. Callers that need compatibility with older ownership semantics can request closing after read.
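The ownership behaviour described for this helper can be sketched with the standard library alone. The names `SketchPreparedInput` and `sketch_prepare_input` are hypothetical, and treating "has a read() method" as the test for a handle is an assumption; the toolkit's own logic may differ.

```python
import io
from typing import Any, NamedTuple


class SketchPreparedInput(NamedTuple):
    """Hypothetical analogue of PreparedInput(data, from_handle)."""

    data: Any
    from_handle: bool


def sketch_prepare_input(value: Any, *, close_handle: bool = False) -> SketchPreparedInput:
    # Assumption: anything exposing read() is treated as a file-like handle.
    if hasattr(value, "read"):
        try:
            data = value.read()
        finally:
            # The handle is only closed when the caller asks for it.
            if close_handle:
                value.close()
        return SketchPreparedInput(data, True)
    # Paths and in-memory content pass through untouched.
    return SketchPreparedInput(value, False)


kept_open = io.BytesIO(b"content")
first = sketch_prepare_input(kept_open)

closed_after = io.BytesIO(b"content")
second = sketch_prepare_input(closed_after, close_handle=True)

third = sketch_prepare_input("some/path.txt")
```

Because `from_handle` is False for `third`, a caller knows the string may still be a filesystem path, whereas data read from a handle must never be reinterpreted as one.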
catkeys¶
Manage the Haiku catkeys translation format.
The Haiku catkeys format is the translation format used for localisation of the Haiku operating system.
It is a bilingual base class derived format with CatkeysFile and CatkeysUnit providing file and unit level access. The file format is described here: http://www.haiku-os.org/blog/pulkomandy/2009-09-24_haiku_locale_kit_translator_handbook
- Implementation
The implementation covers the full requirements of a catkeys file. The files are simple Tab Separated Value (TSV) files that can be read by Microsoft Excel and other spreadsheet programs. They use the .txt extension, which makes it more difficult to identify such files automatically.
The dialect of the TSV files is specified by CatkeysDialect.
- Encoding
The files are UTF-8 encoded.
- Header
CatkeysHeader provides header management support.
- Escaping
catkeys seems to escape things as in C++ (strings appear to be extracted from the source code unchanged).
The _escape() and _unescape() functions handle escaping and unescaping.
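Since the files are plain TSV, the general reading approach can be sketched with Python's csv module. This is not how CatkeysFile is implemented: the sample content, the column order (source, context, comment, target), and the treatment of the first line as the header are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample: one header line followed by two entries.
SAMPLE = (
    "1\tgerman\tx-vnd.Example-App\t12345\n"
    "File\tMainWindow\t\tDatei\n"
    "Quit\tMainWindow\tMenu item\tBeenden\n"
)

# Assumed column order for the entry lines.
FIELDS = ["source", "context", "comment", "target"]


def read_tsv_units(text: str) -> list[dict[str, str]]:
    """Parse tab-separated entry lines into dictionaries keyed by FIELDS."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    # The first row is treated as the file header and skipped.
    return [dict(zip(FIELDS, row)) for row in rows[1:]]


units = read_tsv_units(SAMPLE)
```

Quoting is disabled with `csv.QUOTE_NONE` so that quote characters in translatable strings are passed through as ordinary text.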
- class translate.storage.catkeys.CatkeysDialect¶
Describe the properties of a catkeys generated TAB-delimited file.
- class translate.storage.catkeys.CatkeysFile(inputfile=None, **kwargs)¶
A catkeys translation memory file.
- Extensions: ClassVar[list[str]] = ['catkeys']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-catkeys']¶
A list of MIME types associated with this store type
- Name = 'Haiku catkeys file'¶
The human usable name of this store type
- UnitClass¶
alias of
CatkeysUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates whether the format can store suggestions and alternative translations for a unit.
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.catkeys.CatkeysHeader(header=None)¶
A catkeys translation memory header.
- class translate.storage.catkeys.CatkeysUnit(*args, metadata=None, **kwargs)¶
A catkeys translation memory unit.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
‘translator’
‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getmetadata() dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- property metadata: dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- setmetadata(newdict: dict[str, str]) None¶
Set the dictionary of values for a catkeys line.
Overrides the parent’s setmetadata() to filter and validate field names, ensuring only valid catkeys fields are stored.
- Parameters:
newdict – a new dictionary with catkeys line elements
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.catkeys.FIELDNAMES = ['source', 'context', 'comment', 'target']¶
Field names for a catkeys TU
- translate.storage.catkeys.FIELDNAMES_HEADER = ['version', 'language', 'mimetype', 'checksum']¶
Field names for the catkeys header
- translate.storage.catkeys.FIELDNAMES_HEADER_DEFAULTS = {'checksum': '', 'language': '', 'mimetype': '', 'version': '1'}¶
Default or minimum header entries for a catkeys file
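The field filtering that setmetadata() describes can be pictured with a small standalone sketch. It assumes only the FIELDNAMES list shown above; the helper name filter_fields is hypothetical and this is not the library's implementation.

```python
# Field names as listed for translate.storage.catkeys.FIELDNAMES above.
FIELDNAMES = ["source", "context", "comment", "target"]


def filter_fields(newdict, fieldnames=FIELDNAMES):
    """Hypothetical helper: keep only keys that are recognised field names."""
    return {key: value for key, value in newdict.items() if key in fieldnames}


row = filter_fields({"source": "Open", "target": "Ouvrir", "bogus": "x"})
print(row)
```

Unknown keys such as "bogus" are dropped, so only recognised fields survive.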
cpo¶
csvl10n¶
Classes that hold units of comma-separated values (.csv) files (csvunit) or entire files (csvfile) for use with localisation.
- class translate.storage.csvl10n.DefaultDialect¶
- class translate.storage.csvl10n.csvfile(inputfile=None, fieldnames=None, encoding='auto')¶
This class represents a .csv file with various lines. The default format contains three columns: location, source, target.
- Extensions: ClassVar[list[str]] = ['csv']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['text/comma-separated-values', 'text/csv']¶
A list of MIME types associated with this store type
- Name = 'Comma Separated Value'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
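As a rough illustration of BOM-based detection, here is a minimal sketch using only the standard library. The function name bom_encoding and its return value (a codec name or None) are assumptions for illustration; the real method returns an EncodingDict whose shape is not described here.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def bom_encoding(text: bytes):
    """Hypothetical helper: return a codec name if text starts with a BOM."""
    for bom, name in _BOMS:
        if text.startswith(bom):
            return name
    return None


print(bom_encoding(b"\xef\xbb\xbfmsgid"))
```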
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(csvsrc, sample_length: int | None = 1024, *, dialect: str | None = None) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.csvl10n.csvunit(source=None)¶
- add_spreadsheet_escape(value)¶
Add a spreadsheet escape to a string when it starts like a formula.
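The idea of escaping formula-like values can be sketched as follows. The set of trigger characters and the use of a leading apostrophe are assumptions based on common spreadsheet behaviour, not a description of what csvunit actually does; escape_for_spreadsheet is a hypothetical name.

```python
# Assumption: characters that commonly make a spreadsheet treat a cell as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def escape_for_spreadsheet(value: str) -> str:
    """Hypothetical helper: prefix formula-like text with an apostrophe."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


print(escape_for_spreadsheet("=SUM(A1:A3)"))
print(escape_for_spreadsheet("Plain text"))
```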
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction and get rid of this.
- isfuzzy()¶
Indicates whether this unit is fuzzy.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- match_header()¶
See if unit might be a header.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.csvl10n.detect_header(inputfile: StringIO, dialect: str | type[Dialect], fieldnames: list[str]) tuple[list[str], bool]¶
Test if file has a header or not.
- Args:
inputfile: CSV file to read
dialect: CSV dialect to use
fieldnames: Default field names if no header found
- Returns:
Tuple of (fieldnames, has_header) where has_header is True if the first row is a valid header.
- translate.storage.csvl10n.try_dialects(inputfile: StringIO, fieldnames: list[str] | None, dialect: str | type[Dialect], has_header: bool = False) DictReader¶
Create a CSV DictReader with the appropriate dialect.
- Args:
inputfile: CSV file to read
fieldnames: Field names to use, or None to use first row
dialect: CSV dialect to use
has_header: Whether file has a header row
- translate.storage.csvl10n.valid_fieldnames(fieldnames: list[str]) bool¶
Check if fieldnames are valid.
For bilingual CSV files, at least one field should be identified as “source”. For monolingual CSV files, we accept files with “id”, “context”, or “target” fields without requiring a “source” field.
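The header check described for detect_header() and valid_fieldnames() can be approximated with the standard csv module. This is a simplified sketch, not the library's code: the helper names, the exact set of accepted column names, and the default column order (taken from the csvfile description: location, source, target) are assumptions.

```python
import csv
import io


def looks_like_valid_fieldnames(fieldnames):
    """Hypothetical check: accept a header naming source, id, context or target."""
    names = {name.strip().lower() for name in fieldnames}
    return bool(names & {"source", "id", "context", "target"})


def read_rows(text, default_fieldnames=("location", "source", "target")):
    """Read CSV text, using the first row as a header only if it looks valid."""
    first_row = next(csv.reader(io.StringIO(text)), [])
    has_header = looks_like_valid_fieldnames(first_row)
    fieldnames = first_row if has_header else list(default_fieldnames)
    rows = list(csv.DictReader(io.StringIO(text), fieldnames=fieldnames))
    # When the first row was a header, it is not a translation unit.
    return rows[1:] if has_header else rows


print(read_rows("source,target\nHello,Bonjour\n"))
print(read_rows("a.c:1,Hello,Bonjour\n"))
```

Passing explicit fieldnames to DictReader keeps both cases on one code path; the header row is then skipped by slicing rather than by a second reader.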
dtd¶
Classes that hold units of .dtd files (dtdunit) or entire files (dtdfile).
These are specific .dtd files for localisation used by Mozilla.
- Specifications
The following information is provided by Mozilla:
There is a grammar for entity definitions, which isn’t really precise, as the spec says. There’s no formal specification for DTD files, it’s just “whatever makes this work” basically. The whole piece is clearly not the strongest point of the xml spec
XML elements are allowed in entity values. A number of things that are allowed will just break the resulting document, Mozilla forbids these in their DTD parser.
- Dialects
There are two dialects:
Regular DTD
Android DTD
Both dialects are similar, but the Android DTD uses some particular escapes that regular DTDs don’t have.
- Escaping in regular DTD
In DTD files, some characters in entity values are usually escaped. To ease translation, some of those escaped characters are unescaped when reading from, or converting, the DTD, and are escaped again when saving, or converting to, a DTD.
In regular DTD the following characters are usually or sometimes escaped:
The % character is escaped using &#037; or &#37; or &#x25;
The " character is escaped using &quot;
The ' character is escaped using &apos; (partial roundtrip)
The & character is escaped using &amp;
The < character is escaped using &lt; (not yet implemented)
The > character is escaped using &gt; (not yet implemented)
Besides the previous ones there are escapes for a huge number of other characters. These escapes usually have the form &#NUMBER; where NUMBER is the numerical code of the character.
There are a few particularities in DTD escaping. Some of the escapes are not yet implemented, either because they are not really necessary or because implementing them is too hard.
A special case is the escaping of ' as &apos;, which doesn’t provide a full roundtrip conversion in order to support some special Mozilla DTD files.
Also, the " character is never escaped when the previous character is = (that is, when the sequence =" is present in the string), to avoid escaping the " that marks an attribute assignment, for example in the href attribute of an HTML a (anchor) tag.
- Escaping in Android DTD
It has the same escapes as regular DTD, plus these:
The ' character is escaped using \&apos; or \' or \u0027
The " character is escaped using \&quot;
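To make the reading direction concrete, here is a minimal sketch of unescaping a regular DTD entity value. It assumes the conventional XML spellings (&quot;, &amp; and the numeric references for %), covers only those three characters, and uses a hypothetical function name; it is not the library's unquotefromdtd().

```python
def unescape_regular_dtd(value: str) -> str:
    """Hypothetical helper: undo a few regular-DTD escapes for translators."""
    for entity in ("&#037;", "&#37;", "&#x25;"):
        value = value.replace(entity, "%")
    value = value.replace("&quot;", '"')
    # Handle &amp; last so that "&amp;quot;" becomes "&quot;" rather than '"'.
    value = value.replace("&amp;", "&")
    return value


print(unescape_regular_dtd("100&#037; &quot;done&quot; &amp; ready"))
```

Replacing &amp; last matters: doing it first would turn an escaped ampersand followed by "quot;" into a quote character on the next pass.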
- class translate.storage.dtd.DTDValidationResolver(*args: Any, **kwargs: Any)¶
Resolve only the in-memory DTD used for validation.
- translate.storage.dtd.accesskeysuffixes = ('.accesskey', '.accessKey', '.akey')¶
Accesskey suffixes: entries with this suffix may be combined with labels ending in labelsuffixes into accelerator notation
- class translate.storage.dtd.dtdfile(inputfile=None, android=False)¶
A .dtd file made up of dtdunits.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(dtdsrc) None¶
Read the source code of a dtd file in and include them as dtdunits in self.units.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.dtd.dtdunit(source='', android=False)¶
An entity definition from a DTD file (and any associated comments).
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
Return the entity as location (identifier).
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getoutput()¶
Convert the dtd entity back to string form.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank()¶
Returns whether this dtdunit doesn’t actually have an entity definition.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(new_id) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- property source¶
Gets the unquoted source string.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- property target¶
Gets the unquoted target string.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.dtd.labelsuffixes = ('.label', '.title')¶
Label suffixes: entries with this suffix can be combined with accesskeys found in entries ending with accesskeysuffixes
- translate.storage.dtd.quoteforandroid(source)¶
Escapes a line for Android DTD files.
- translate.storage.dtd.quotefordtd(source)¶
Quotes and escapes a line for regular DTD files.
- translate.storage.dtd.removeinvalidamps(name: str, value: str) str¶
Find and remove ampersands that are not part of an entity definition.
A stray & in a DTD file can break an application’s ability to parse the file. In Mozilla localisation this is very important, because such ampersands can break the parsing of files used in XUL and thus break interface rendering. Tracking down the problem is very difficult, so by removing potentially broken ampersands and warning the user we can ensure that the output DTD will always be parsable.
- Parameters:
name – Entity name
value – Entity text value
- Returns:
Entity value without bad ampersands
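One way to picture the removal of stray ampersands is a regular expression with a negative lookahead. The pattern below is an assumption about what counts as a well-formed entity reference (a name or a numeric code followed by a semicolon), and strip_stray_ampersands is a hypothetical name; the real function also takes the entity name and warns the user, which this sketch omits.

```python
import re

# An "&" that is NOT followed by a named or numeric reference ending in ";".
_STRAY_AMP = re.compile(r"&(?!(?:#\d+|#x[0-9a-fA-F]+|[A-Za-z_][\w.\-]*);)")


def strip_stray_ampersands(value: str) -> str:
    """Hypothetical helper: drop ampersands that do not start an entity."""
    return _STRAY_AMP.sub("", value)


print(strip_stray_ampersands("Save & Quit &brandName; &#38; &"))
```

Well-formed references such as &brandName; and &#38; are left untouched; only the bare ampersands are removed.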
- translate.storage.dtd.unquotefromandroid(source)¶
Unquotes a quoted Android DTD definition.
- translate.storage.dtd.unquotefromdtd(source)¶
Unquotes a quoted dtd definition.
_factory_classes¶
factory¶
Factory methods to build real storage objects that conform to base.py.
- translate.storage.factory.getclass(storefile, localfiletype=None, ignore=None, classes=None, classes_str=None, hiddenclasses=None)¶
Factory that returns the applicable class for the type of file presented. Specify ignore to ignore some part at the back of the name (like .gz).
- translate.storage.factory.getobject(storefile: str | TranslationStore, localfiletype: str | None = None, ignore: str | None = None, classes: dict | None = None, classes_str: dict | None = None, hiddenclasses: dict | None = None) TranslationStore¶
Factory that returns a usable object for the type of file presented.
Specify ignore to ignore some part at the back of the name (like .gz).
- Parameters:
storefile – File object or file name.
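The extension-based lookup with an ignored suffix can be sketched with a plain dictionary. Everything here is illustrative: PoStore, CsvStore, REGISTRY and pick_class are hypothetical stand-ins, and the real factory supports more inputs (file objects, hidden classes) than this sketch does.

```python
import os


class PoStore:
    """Hypothetical stand-in for a PO store class."""


class CsvStore:
    """Hypothetical stand-in for a CSV store class."""


# Assumed mapping from file extension to store class.
REGISTRY = {"po": PoStore, "csv": CsvStore}


def pick_class(filename, ignore=None, registry=REGISTRY):
    """Return the class registered for the file's extension."""
    if ignore and filename.endswith(ignore):
        # Drop the trailing part (like ".gz") before looking at the extension.
        filename = filename[: -len(ignore)]
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    try:
        return registry[ext]
    except KeyError:
        raise ValueError(f"Unknown filetype ({filename})") from None


print(pick_class("messages.po.gz", ignore=".gz").__name__)
```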
fpo¶
html¶
Module for parsing HTML files for translation.
- class translate.storage.html.POHTMLParser(inputfile=None, callback=None)¶
- EMPTY_HTML_ELEMENTS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr']¶
An empty element is an element that cannot have any child nodes (i.e., nested elements or text nodes). In HTML, using a closing tag on an empty element is usually invalid. Reference https://developer.mozilla.org/en-US/docs/Glossary/Empty_element
- Name = 'Base translation store'¶
The human usable name of this store type
- TRANSLATABLE_ATTRIBUTES = ['abbr', 'alt', 'lang', 'summary', 'title', 'value']¶
Text from these HTML attributes will be extracted as translation units. Note: the content attribute of meta tags is a special case.
- TRANSLATABLE_ELEMENTS = ['address', 'article', 'aside', 'blockquote', 'button', 'caption', 'dd', 'dt', 'div', 'figcaption', 'footer', 'header', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'label', 'li', 'main', 'nav', 'option', 'p', 'pre', 'section', 'td', 'th', 'title']¶
These HTML elements (tags) will be extracted as translation units, unless they lack translatable text content. In case one translatable element is embedded in another, the outer translation unit will be split into the parts before and after the inner translation unit.
- TRANSLATABLE_METADATA = ['description', 'keywords', 'og:title', 'og:description', 'og:site_name', 'og:image:alt', 'twitter:title', 'twitter:description', 'twitter:image:alt', 'video:actor:role', 'video:tag', 'article:section', 'article:tag', 'payment:description']¶
Document metadata from meta elements with these names will be extracted as translation units. Includes standard meta tags and common social media tags (Open Graph and Twitter Cards). Reference https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta/name
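The extraction rules above can be illustrated with the standard library's HTMLParser. This is a deliberately reduced sketch: it uses a small subset of the listed elements and attributes, ignores nesting of translatable elements and meta tags, and drops inline markup. The class name TextCollector is hypothetical and unrelated to the real POHTMLParser.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical collector for text from a few elements and attributes."""

    ELEMENTS = {"h1", "p", "li", "title"}  # subset of TRANSLATABLE_ELEMENTS
    ATTRIBUTES = {"alt", "title"}  # subset of TRANSLATABLE_ATTRIBUTES

    def __init__(self):
        super().__init__()
        self.units = []
        self._current = None  # tag of the element being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.ATTRIBUTES and value and value.strip():
                self.units.append(value.strip())
        if tag in self.ELEMENTS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if text:  # elements without text content produce no unit
                self.units.append(text)
            self._current = None


parser = TextCollector()
parser.feed("<h1>Welcome</h1><p>Read the <b>manual</b>.</p>")
parser.feed('<img src="logo.png" alt="Company logo"><p> </p>')
parser.close()
print(parser.units)
```

The empty paragraph yields no unit, matching the rule that elements lacking translatable text are skipped.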
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- close()¶
Handle any buffered data.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- do_encoding(htmlsrc)¶
Return the html text properly encoded based on a charset.
- static escape_attribute_value(value: str) str¶
Escape text for double-quoted HTML attributes, preserving entities.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- feed(data)¶
Feed data to the parser.
Call this as often as you want, with as little or as much text as you want (may include '\n').
- findid(id)¶
Find unit with matching id by checking id_index.
- get_starttag_text()¶
Return full source of start tag: ‘<…>’.
- getids()¶
Return a list of unit ids.
- getpos()¶
Return current line number and offset.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- guess_encoding(htmlsrc)¶
Returns the encoding of the html text.
We look for ‘charset=’ within a meta tag to do this.
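Looking for charset= inside a meta tag can be sketched with a regular expression over the raw bytes. The pattern and the function name charset_from_meta are assumptions for illustration and may differ from what guess_encoding() actually matches.

```python
import re

# Assumed pattern: "charset=" somewhere inside a <meta ...> tag.
_CHARSET = re.compile(rb"""<meta[^>]*charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE)


def charset_from_meta(htmlsrc: bytes):
    """Hypothetical helper: return the declared charset in lower case, or None."""
    match = _CHARSET.search(htmlsrc)
    return match.group(1).decode("ascii").lower() if match else None


print(charset_from_meta(b'<meta charset="UTF-8">'))
```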
- is_extraction_ignored()¶
Check if we’re currently in an ignored section.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(htmlsrc) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- reset()¶
Reset this instance. Loses all unprocessed data.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.html.htmlfile(inputfile=None, callback=None)¶
- EMPTY_HTML_ELEMENTS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr']¶
An empty element is an element that cannot have any child nodes (i.e., nested elements or text nodes). In HTML, using a closing tag on an empty element is usually invalid. Reference https://developer.mozilla.org/en-US/docs/Glossary/Empty_element
- Name = 'Base translation store'¶
The human usable name of this store type
- TRANSLATABLE_ATTRIBUTES = ['abbr', 'alt', 'lang', 'summary', 'title', 'value']¶
Text from these HTML attributes will be extracted as translation units. Note: the content attribute of meta tags is a special case.
- TRANSLATABLE_ELEMENTS = ['address', 'article', 'aside', 'blockquote', 'button', 'caption', 'dd', 'dt', 'div', 'figcaption', 'footer', 'header', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'label', 'li', 'main', 'nav', 'option', 'p', 'pre', 'section', 'td', 'th', 'title']¶
These HTML elements (tags) will be extracted as translation units, unless they lack translatable text content. In case one translatable element is embedded in another, the outer translation unit will be split into the parts before and after the inner translation unit.
- TRANSLATABLE_METADATA = ['description', 'keywords', 'og:title', 'og:description', 'og:site_name', 'og:image:alt', 'twitter:title', 'twitter:description', 'twitter:image:alt', 'video:actor:role', 'video:tag', 'article:section', 'article:tag', 'payment:description']¶
Document metadata from meta elements with these names will be extracted as translation units. Includes standard meta tags and common social media tags (Open Graph and Twitter Cards). Reference https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta/name
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- close()¶
Handle any buffered data.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- do_encoding(htmlsrc)¶
Return the html text properly encoded based on a charset.
- static escape_attribute_value(value: str) str¶
Escape text for double-quoted HTML attributes, preserving entities.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- feed(data)¶
Feed data to the parser.
Call this as often as you want, with as little or as much text as you want (may include '\n').
- findid(id)¶
Find unit with matching id by checking id_index.
- get_starttag_text()¶
Return full source of start tag: ‘<…>’.
- getids()¶
Return a list of unit ids.
- getpos()¶
Return current line number and offset.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- guess_encoding(htmlsrc)¶
Returns the encoding of the html text.
We look for ‘charset=’ within a meta tag to do this.
- is_extraction_ignored()¶
Check if we’re currently in an ignored section.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(htmlsrc) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- reset()¶
Reset this instance. Loses all unprocessed data.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.html.htmlunit(source=None)¶
A unit of translatable/localisable HTML content.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
Returns a unique identifier for this unit.
- getlocations()¶
Get the list of locations for this unit.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
ical¶
Class that manages iCalendar files for translation.
iCalendar files follow the RFC2445 specification.
The iCalendar specification uses the following naming conventions:
Component: an event, journal entry, timezone, etc
Property: a property of a component: summary, description, start time, etc
Attribute: an attribute of a property, e.g. language
The following are localisable in this implementation:
VEVENT component: SUMMARY, DESCRIPTION, COMMENT and LOCATION properties
While other items could be localised this is not seen as important until use cases arise. In such a case simply adjusting the component.name and property.name lists to include these will allow expanded localisation.
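The selection rule above (VEVENT components, four properties) can be sketched with a small line-based reader. The toolkit itself relies on vobject; the function extract_vevent_strings and the LOCALISABLE set below are assumptions for illustration, and the sketch ignores edge cases such as quoted parameter values that contain colons.

```python
# Properties treated as localisable inside a VEVENT, per the text above.
LOCALISABLE = {"SUMMARY", "DESCRIPTION", "COMMENT", "LOCATION"}


def extract_vevent_strings(ical_text):
    """Hypothetical helper: return (property, value) pairs from VEVENTs."""
    # Unfold continuation lines, which start with one space or tab.
    unfolded = []
    for line in ical_text.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    found = []
    in_event = False
    for line in unfolded:
        if line == "BEGIN:VEVENT":
            in_event = True
        elif line == "END:VEVENT":
            in_event = False
        elif in_event and ":" in line:
            head, value = line.split(":", 1)
            # Drop attributes such as ;LANGUAGE=en from the property name.
            name = head.split(";", 1)[0].upper()
            if name in LOCALISABLE:
                found.append((name, value))
    return found


sample = "\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "BEGIN:VEVENT",
    "UID:1@example.com",
    "SUMMARY;LANGUAGE=en:Team meeting",
    "LOCATION:Room 4",
    "DESCRIPTION:Quarterly",
    "  planning",
    "END:VEVENT",
    "END:VCALENDAR",
])
strings = extract_vevent_strings(sample)
```

Widening the set of localisable items in this sketch only requires adding names to LOCALISABLE, which mirrors the list-based adjustment described above.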
- LANGUAGE Attribute
While the iCalendar format allows items to have a language attribute, this is not used. The reason is that most of the items we localise may occur at most once. Thus, although ‘summary’ would ideally be present in multiple languages in one file, the format does not allow such multiple entries. This is unfortunate, as it prevents the creation of a single multilingual iCalendar file.
- Future Format Support
As this format uses vobject, which supports various formats including vCard, it is possible to expand this format to understand those if needed.
- class translate.storage.ical.icalfile(inputfile=None, **kwargs)¶
An ical file.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.ical.icalunit(source=None, **kwargs)¶
An ical entry that is translatable.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
ini¶
Class that manages .ini files for translation.
# a comment
; a comment
[Section]
a = a string
b : a string
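To see how such a file breaks down into section, key and value, the sample can be read with Python's standard configparser, which accepts both comment styles and both delimiters by default. This only illustrates the syntax; it is not how inifile parses its input, and the "[Section]key" identifier format is an assumption made for the example.

```python
import configparser

SAMPLE = """\
# a comment
; a comment
[Section]
a = a string
b : a string
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Build (identifier, text) pairs; the identifier layout is hypothetical.
entries = [
    (f"[{section}]{key}", value)
    for section in parser.sections()
    for key, value in parser[section].items()
]
```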
- class translate.storage.ini.Dialect¶
Base class for differentiating dialect options and functions.
- class translate.storage.ini.DialectDefault¶
- class translate.storage.ini.DialectInno¶
- class translate.storage.ini.inifile(inputfile=None, dialect='default', **kwargs)¶
An INI file.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parse INI data from a path or in-memory content.
Direct str and os.PathLike inputs are treated as filesystem paths and opened. In-memory content should be passed as bytes or as a readable stream. Text streams are parsed as their stream content, not reopened as paths, so raw INI text should be wrapped in StringIO instead of passed as a plain str.
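The input rules above can be pictured as a small dispatch on the argument type. The function load_ini_bytes is a hypothetical stand-in written for this example, not the library's code, and encoding text streams as UTF-8 is an assumption.

```python
import io
import os
import tempfile


def load_ini_bytes(source) -> bytes:
    """Hypothetical helper mirroring the documented input rules."""
    if isinstance(source, (str, os.PathLike)):
        # A plain str or path-like object names a file on disk.
        with open(source, "rb") as handle:
            return handle.read()
    if isinstance(source, bytes):
        # bytes are in-memory content.
        return source
    # Anything else is treated as a readable stream.
    data = source.read()
    return data.encode("utf-8") if isinstance(data, str) else data


RAW = "[Section]\na = a string\n"

from_bytes = load_ini_bytes(RAW.encode("utf-8"))
from_stream = load_ini_bytes(io.StringIO(RAW))

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.ini")
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(RAW)
    from_path = load_ini_bytes(path)
```

Passing RAW directly as a str would be interpreted as a filename in this sketch, which is the reason for wrapping raw text in StringIO.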
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.ini.iniunit(source=None, **kwargs)¶
An INI file entry.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.ini.register_dialect(dialect)¶
Decorator that registers the dialect.
jsonl10n¶
Class that manages JSON data files for translation.
JSON is an acronym for JavaScript Object Notation; it is an open standard designed for human-readable data interchange.
JSON basic types:
Number (integer or real)
String (double-quoted Unicode with backslash escaping)
Boolean (true or false)
Array (an ordered sequence of values, comma-separated and enclosed in square brackets)
Object (a collection of key:value pairs, comma-separated and enclosed in curly braces)
null
Example:¶
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}
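One way to picture how nested JSON like this turns into translatable entries is to flatten it into (key path, string) pairs. The flatten function and its dotted/indexed path syntax are assumptions for illustration and may not match the identifiers the store classes generate.

```python
import json


def flatten(node, prefix=""):
    """Hypothetical helper: yield (key path, text) for every string value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    elif isinstance(node, str):
        yield prefix, node
    # Numbers, booleans and null are skipped: they are not translatable text.


document = json.loads("""
{
    "firstName": "John",
    "age": 25,
    "address": {"city": "New York"},
    "phoneNumber": [{"type": "home"}]
}
""")
pairs = list(flatten(document))
```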
Todo:¶
Handle \u and other escapes in Unicode
Manage data type storage and conversion. True –> “True” –> True
- class translate.storage.jsonl10n.ARBJsonFile(inputfile=None, filter=None, **kwargs)¶
ARB JSON file.
See following URLs for doc:
https://github.com/google/app-resource-bundle/wiki/ApplicationResourceBundleSpecification
https://docs.flutter.dev/development/accessibility-and-localization/internationalization#dart-tools
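In an ARB file, keys starting with "@@" hold file-level attributes and a key "@name" holds attributes for the message "name". The sketch below separates the three with the standard json module; split_arb and the shape of its result are assumptions for this example, not the behaviour of ARBJsonFile.

```python
import json

ARB_TEXT = """
{
  "@@locale": "en",
  "greeting": "Hello {name}",
  "@greeting": {
    "description": "Shown on the home screen",
    "placeholders": {"name": {"type": "String"}}
  }
}
"""


def split_arb(text):
    """Hypothetical helper: return (file attributes, messages)."""
    data = json.loads(text)
    header = {key: value for key, value in data.items() if key.startswith("@@")}
    messages = {}
    for key, value in data.items():
        if key.startswith("@"):
            continue  # attribute entries are attached to their message below
        messages[key] = {"text": value, "meta": data.get(f"@{key}", {})}
    return header, messages


header, messages = split_arb(ARB_TEXT)
```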
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
ARBJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.ARBJsonUnit(source=None, item=None, notes=None, placeholders=None, metadata=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getmetadata() dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- isheader()¶
Indicates whether this unit is a header.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- property metadata: dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- setmetadata(newdict: dict[str, Any]) None¶
Set the dictionary of metadata/field values for this unit.
- Parameters:
newdict – A new dictionary with field values
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.BaseJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
A JSON entry.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or ‘developer’ (with the synonyms ‘programmer’ and ‘source code’).
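As a rough illustration of the origin synonyms described above, the sketch below maps each accepted string to a single canonical name. The helper `normalize_origin`, the choice of 'developer' as the canonical spelling, and the error raised for unknown values are all assumptions made for this example and are not part of the documented API.

```python
# Accepted origin strings mapped to one canonical name.
# Assumption: "developer" is used as the canonical spelling of its synonyms.
ORIGIN_SYNONYMS = {
    "translator": "translator",
    "developer": "developer",
    "programmer": "developer",
    "source code": "developer",
}


def normalize_origin(origin):
    """Return the canonical origin name, or None when no origin is given."""
    if origin is None:
        return None
    try:
        return ORIGIN_SYNONYMS[origin]
    except KeyError:
        # Assumption: unknown origins are rejected rather than passed through.
        raise ValueError(f"unknown note origin: {origin!r}") from None


print(normalize_origin("source code"))
```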
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike
getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.DumpArgsType¶
- clear()¶
Remove all items from the dict.
- copy()¶
Return a shallow copy of the dict.
- classmethod fromkeys(iterable, value=None, /)¶
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items()¶
Return a set-like object providing a view on the dict’s items.
- keys()¶
Return a set-like object providing a view on the dict’s keys.
- pop(k[, d]) v¶
Remove the specified key and return the corresponding value.
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()¶
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
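The removal methods above follow the semantics of the built-in dict. The short example below uses a plain dict rather than DumpArgsType itself, and the keys are made up for illustration.

```python
args = {"indent": 4, "sort_keys": True}

# pop removes the key and returns its value.
indent = args.pop("indent")
# With a default, a missing key does not raise KeyError.
missing = args.pop("separators", None)

# popitem returns the most recently inserted pair (LIFO order).
args["ensure_ascii"] = False
last = args.popitem()

# popitem on an empty dict raises KeyError.
try:
    {}.popitem()
    empty_raises = False
except KeyError:
    empty_raises = True

print(indent, missing, last, args, empty_raises)
```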
- setdefault(key, default=None, /)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) None¶
Update D from mapping/iterable E and F.
If E is present and has a .keys() method, then does: for k in E.keys(): D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
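The three update paths described above can be seen side by side in a small example. It uses a plain dict with made-up keys rather than DumpArgsType itself.

```python
d = {"indent": 2}

# E has a .keys() method: values are copied key by key.
d.update({"indent": 4, "sort_keys": True})

# E lacks .keys(): it is treated as an iterable of (key, value) pairs.
d.update([("ensure_ascii", False)])

# Keyword arguments (F) are applied last, so they win over E.
d.update({"sort_keys": True}, sort_keys=False)

print(d)
```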
- values()¶
Return an object providing a view on the dict’s values.
- class translate.storage.jsonl10n.FlatI18NextV4File(inputfile=None, filter=None, **kwargs)¶
Flat json file with support of i18next v4 format plurals.
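In the i18next v4 JSON format, plural forms of one message are stored as separate flat keys with a plural-category suffix such as `_one` or `_other`. The sketch below groups such keys back together using only the standard library. The helper `group_plurals` and the sample keys are hypothetical, and how this class performs the grouping internally is not shown here.

```python
import json

# CLDR plural categories used as key suffixes in i18next v4 JSON.
PLURAL_SUFFIXES = ("zero", "one", "two", "few", "many", "other")


def group_plurals(flat):
    """Group 'key_one' / 'key_other' style entries under their base key."""
    grouped = {}
    for key, value in flat.items():
        base, sep, suffix = key.rpartition("_")
        if sep and suffix in PLURAL_SUFFIXES:
            grouped.setdefault(base, {})[suffix] = value
        else:
            # Not a plural entry: keep the key and value unchanged.
            grouped[key] = value
    return grouped


data = json.loads(
    '{"title": "Inbox",'
    ' "item_one": "{{count}} item",'
    ' "item_other": "{{count}} items"}'
)
print(group_plurals(data))
```

The sketch does not handle a plain key that collides with a plural base key (for example `item` next to `item_one`).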
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
FlatI18NextV4Unit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
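A BOM-based check of this kind can be sketched with the constants in the standard `codecs` module. The function `detect_bom` below is hypothetical and returns a codec name or None; it does not reproduce the EncodingDict return type of the method described above.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes.
# The codec names chosen here strip the BOM when decoding.
BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def detect_bom(text: bytes):
    """Return a codec name if text starts with a known BOM, else None."""
    for bom, name in BOMS:
        if text.startswith(bom):
            return name
    return None


raw = codecs.BOM_UTF8 + "héllo".encode("utf-8")
encoding = detect_bom(raw)
print(encoding, raw.decode(encoding))
```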
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using
parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.FlatI18NextV4Unit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or ‘developer’ (with the synonyms ‘programmer’ and ‘source code’).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike
getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
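To make the shape of such a path concrete, the sketch below assembles one from a list of (tag, index) steps. The helper `build_docpath` and its input format are hypothetical; the description above does not say how any particular format derives its document path.

```python
def build_docpath(steps):
    """Join (tag, index) steps into a path such as body/h1[1]/p[2].

    An index of None means the step has no positional suffix.
    """
    parts = []
    for tag, index in steps:
        parts.append(tag if index is None else f"{tag}[{index}]")
    return "/".join(parts)


print(build_docpath([("body", None), ("h1", 1), ("p", 2)]))
```

Because the path depends only on structure, it stays the same when a translated file shifts content to different line numbers.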
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.FlatJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or ‘developer’ (with the synonyms ‘programmer’ and ‘source code’).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike
getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
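The 1-based convention can be illustrated by deriving a line number from a character offset in the source text. The helper `line_number_at` is hypothetical and only demonstrates the counting rule; it is not how any parser in this module is documented to record positions.

```python
def line_number_at(text, offset):
    """Return the 1-based line number containing offset, or None if out of range."""
    if offset < 0 or offset > len(text):
        return None
    # Newlines before the offset, plus one so that the first line is line 1.
    return text.count("\n", 0, offset) + 1


source = '{\n  "a": "x",\n  "b": "y"\n}'
print(line_number_at(source, source.index('"b"')))
```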
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.FormatJSJsonFile(inputfile=None, filter=None, **kwargs)¶
FormatJS JSON file.
See the following URL for documentation:
https://formatjs.github.io/docs/getting-started/message-extraction/
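For orientation, the sketch below reads a small file in the shape that FormatJS message extraction produces by default, where each message id maps to an object with `defaultMessage` and an optional `description`. It uses only the standard `json` module; the helper `read_messages` and the sample message are made up for illustration, and the sketch makes no claim about how this class parses the file.

```python
import json

# Sample content, assumed to follow the default FormatJS extraction layout.
SAMPLE = """
{
  "app.greeting": {
    "defaultMessage": "Hello, {name}!",
    "description": "Greeting shown on the home page"
  },
  "app.bye": {
    "defaultMessage": "Goodbye"
  }
}
"""


def read_messages(text):
    """Return (id, message, description) tuples in file order."""
    messages = []
    for message_id, entry in json.loads(text).items():
        messages.append(
            (message_id, entry["defaultMessage"], entry.get("description"))
        )
    return messages


for row in read_messages(SAMPLE):
    print(row)
```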
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
FormatJSJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
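The two input modes described above can be sketched with the standard library alone. The helper `read_all` is hypothetical: it only mirrors the documented behaviour of opening and closing a path internally and of consuming and closing a file object that is passed in, and it returns raw bytes instead of a parsed store.

```python
import io
import os
import tempfile


def read_all(storefile):
    """Return the bytes behind a path or a readable binary file object."""
    if isinstance(storefile, (str, os.PathLike)):
        # Path: open and close the file internally.
        with open(storefile, "rb") as handle:
            return handle.read()
    try:
        # File object: consume it, then close it for the caller.
        return storefile.read()
    finally:
        storefile.close()


buffer = io.BytesIO(b'{"key": "value"}')
from_object = read_all(buffer)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.json")
    with open(path, "wb") as out:
        out.write(b'{"key": "value"}')
    from_path = read_all(path)

print(from_object == from_path, buffer.closed)
```

Callers that need the file object afterwards should therefore pass a path, or reopen the file themselves.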
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using
parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.FormatJSJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or ‘developer’ (with the synonyms ‘programmer’ and ‘source code’).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike
getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.GoI18NJsonFile(inputfile=None, filter=None, **kwargs)¶
go-i18n JSON file.
See the following URLs for documentation:
https://github.com/nicksnyder/go-i18n/tree/v1
https://pkg.go.dev/github.com/nicksnyder/go-i18n
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
GoI18NJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using
parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.GoI18NJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in the future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or ‘developer’ (with the synonyms ‘programmer’ and ‘source code’).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike
getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
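As a rough illustration of what keeping the plural count in sync with the tags can mean, the sketch below pads or truncates a list of plural strings to one entry per tag. The function name sync_plural_count_sketch and the pad-with-empty-strings and truncate behaviour are assumptions made for illustration, not the library's implementation.

```python
def sync_plural_count_sketch(target, plural_tags):
    """Return a list with exactly one string per plural tag (hypothetical helper)."""
    # A single string is treated as a one-element list of plural forms.
    strings = [target] if isinstance(target, str) else list(target)
    wanted = len(plural_tags)
    if len(strings) < wanted:
        # Assumption: missing plural forms are filled with empty strings.
        strings.extend([""] * (wanted - len(strings)))
    # Assumption: surplus forms beyond the tag count are dropped.
    return strings[:wanted]


padded = sync_plural_count_sketch("1 file", ["one", "other"])
trimmed = sync_plural_count_sketch(["a", "b", "c"], ["one", "other"])
```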
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.GoI18NV2JsonFile(inputfile=None, filter=None, **kwargs)¶
go-i18n v2 JSON file.
See the following URLs for documentation:
https://github.com/nicksnyder/go-i18n
https://pkg.go.dev/github.com/nicksnyder/go-i18n/v2
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
GoI18NV2JsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
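BOM-based detection can be sketched with the standard library alone. The helper below, detect_bom_encoding, is a hypothetical stand-in that returns a plain codec name; the real method returns an EncodingDict, whose shape is not shown here.

```python
import codecs
from typing import Optional

# Longer BOMs come before their prefixes: the UTF-32 LE mark starts
# with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(text: bytes, default: Optional[str] = None) -> Optional[str]:
    """Return a codec name if the data starts with a known BOM, else the default."""
    for bom, encoding in _BOMS:
        if text.startswith(bom):
            return encoding
    return default


raw = "é".encode("utf-16")  # Python prepends a BOM for the bare "utf-16" codec
decoded = raw.decode(detect_bom_encoding(raw))
```

The codec names returned here consume the BOM on decoding, so the decoded text does not start with a stray U+FEFF character.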
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
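The two input modes described above can be pictured with a small standard-library sketch. load_json_store is a hypothetical helper, not part of the toolkit; it only mirrors the documented behaviour that a passed-in file object is consumed and closed.

```python
import io
import json
import os


def load_json_store(storefile):
    """Parse JSON from a path or a readable file object (illustrative only)."""
    if isinstance(storefile, (str, os.PathLike)):
        # A filename: open and close the file internally.
        with open(storefile, "rb") as handle:
            data = handle.read()
    else:
        # An existing file object: consume it, then close the caller's handle.
        try:
            data = storefile.read()
        finally:
            storefile.close()
    return json.loads(data)


handle = io.BytesIO(b'{"greeting": "Hello"}')
parsed = load_json_store(handle)
```

Callers that still need the handle afterwards should read the data themselves and pass it to parsestring() instead.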
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
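The contract between serialize() and parsestring() is a round trip through bytes. The class below, TinyJsonStore, is a made-up miniature used only to show that contract; it is not one of the toolkit's stores and stores plain key/value pairs instead of units.

```python
import io
import json


class TinyJsonStore:
    """Hypothetical miniature store holding a flat dict of id -> text."""

    def __init__(self, units=None):
        self.units = dict(units or {})

    def serialize(self, out):
        # Write bytes to an open, binary, file-like object.
        text = json.dumps(self.units, ensure_ascii=False, indent=4)
        out.write(text.encode("utf-8"))

    @classmethod
    def parsestring(cls, storestring):
        # Accept the bytes produced by serialize() and rebuild the store.
        if isinstance(storestring, bytes):
            storestring = storestring.decode("utf-8")
        return cls(json.loads(storestring))


store = TinyJsonStore({"hello": "Bonjour"})
buffer = io.BytesIO()
store.serialize(buffer)
restored = TinyJsonStore.parsestring(buffer.getvalue())
```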
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.GoI18NV2JsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
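To make the 1-based convention concrete, the sketch below derives a line number from a character offset in the file text. line_number_at is a hypothetical helper; how the toolkit actually tracks positions while parsing is not described here.

```python
def line_number_at(text: str, offset: int) -> int:
    """Return the 1-based line number containing the given character offset."""
    # Count the newlines before the offset; the first line is line 1.
    return text.count("\n", 0, offset) + 1


document = '{\n    "hello": "Bonjour",\n    "bye": "Au revoir"\n}'
hello_line = line_number_at(document, document.index('"hello"'))
bye_line = line_number_at(document, document.index('"bye"'))
```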
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.GoTextJsonFile(inputfile=None, filter=None, **kwargs)¶
gotext JSON file.
See the following URLs for documentation:
https://pkg.go.dev/golang.org/x/text/cmd/gotext
https://github.com/golang/text/tree/master/cmd/gotext/examples/extract/locales/en-US
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
GoTextJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
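Looking a unit up through an index, rather than scanning the unit list, can be sketched as follows. IndexedUnits and its tuple-based units are hypothetical; only the names id_index, addunit and findid echo this documentation, and how the real index is built or refreshed is an assumption.

```python
class IndexedUnits:
    """Hypothetical container that keeps a dict index alongside its unit list."""

    def __init__(self):
        self.units = []
        self.id_index = {}

    def addunit(self, unit_id, text):
        # Keep the list and the index in step on every addition.
        unit = (unit_id, text)
        self.units.append(unit)
        self.id_index[unit_id] = unit

    def findid(self, unit_id):
        # Dictionary lookup; returns None when the id is unknown.
        return self.id_index.get(unit_id)


units = IndexedUnits()
units.addunit("menu.file", "File")
units.addunit("menu.edit", "Edit")
found = units.findid("menu.edit")
```

An index like this is one reason to add units through addunit() instead of appending to the list directly: a manual append would leave the index stale.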
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.GoTextJsonUnit(source=None, item=None, notes=None, placeholders=None, comment=None, message=None, meaning=None, key=None, fuzzy=None, position=None, **kwargs)¶
- IdClass¶
alias of
GoTextUnitId
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.GoTextUnitId(parts)¶
Preserves id as stored in the JSON file.
- class translate.storage.jsonl10n.I18NextFile(inputfile=None, filter=None, **kwargs)¶
An i18next v3 format: nested JSON with several additions.
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
I18NextUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.I18NextUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
An i18next v3 format: JSON with plurals.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.I18NextV4File(inputfile=None, filter=None, **kwargs)¶
An i18next v4 format: nested JSON with several additions.
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
I18NextV4Unit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.I18NextV4Unit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
An i18next v4 format: JSON with plurals.
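For orientation, the sketch below groups plural entries in a flat key/value mapping, assuming the i18next JSON v4 convention of suffixing keys with a CLDR plural category (for example item_one and item_other). group_plural_keys is a hypothetical helper and does not reflect how this unit class stores plurals internally; it also does not handle a plain key colliding with a plural base name.

```python
# CLDR plural categories used as key suffixes (assumed v4 convention).
PLURAL_SUFFIXES = ("zero", "one", "two", "few", "many", "other")


def group_plural_keys(flat):
    """Collect key_<category> entries under their shared base key."""
    grouped = {}
    for key, value in flat.items():
        base, sep, suffix = key.rpartition("_")
        if sep and base and suffix in PLURAL_SUFFIXES:
            # Plural entry: store it under the base key by category.
            grouped.setdefault(base, {})[suffix] = value
        else:
            # Ordinary entry: keep it unchanged.
            grouped[key] = value
    return grouped


messages = {
    "title": "Inbox",
    "item_one": "{{count}} item",
    "item_other": "{{count}} items",
}
grouped = group_plural_keys(messages)
```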
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.JsonFile(inputfile=None, filter=None, **kwargs)¶
A JSON file.
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
FlatJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
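BOM-based detection can be sketched with the standard library alone. The helper below is hypothetical: it returns a plain codec name or None rather than the EncodingDict that fallback_detection is documented to return, and the table of byte order marks is an assumption about which encodings are worth checking.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding_from_bom(text: bytes):
    """Return a codec name if text starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if text.startswith(bom):
            return name
    return None


utf8_guess = guess_encoding_from_bom(codecs.BOM_UTF8 + b'{"key": "value"}')
no_bom_guess = guess_encoding_from_bom(b'{"key": "value"}')
```

Ordering matters here: checking UTF-16 LE before UTF-32 LE would misreport UTF-32 input, because the shorter mark is a prefix of the longer one.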
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.JsonNestedFile(inputfile=None, filter=None, **kwargs)¶
A JSON file with nested keys.
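To make "nested keys" concrete, the following standard-library sketch walks a nested JSON object and produces one entry per leaf string. The dot-joined path used as the identifier is an assumption for illustration and may differ from the ids that JsonNestedUnit actually reports.

```python
import json


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf in a nested dict."""
    for key, value in node.items():
        # Hypothetical id scheme: join the keys on the way down with dots.
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


data = json.loads(
    '{"menu": {"file": {"open": "Open", "save": "Save"}}, "title": "Editor"}'
)
pairs = dict(flatten(data))
```

Each leaf string becomes one translatable entry, while intermediate objects only contribute to the path.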
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
JsonNestedUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.JsonNestedUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
A nested JSON entry.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.NextcloudJsonFile(inputfile: str | bytes | TextIO | BinaryIO | None = None, filter: Any = None, **kwargs)¶
Nextcloud JSON file.
Nextcloud apps use a JSON format with translations wrapped in a “translations” key. Plurals follow gettext conventions with keys like _%n singular_::_%n plural_ and array values.
See:
https://docs.nextcloud.com/server/stable/developer_manual/basics/translations.html
https://github.com/nextcloud-libraries/nextcloud-l10n/
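The layout described above can be illustrated with a small standard-library sketch. The sample document, its German strings, and the helper `split_entry` are made up for this example; the sketch only shows how a plural key of the documented shape can be separated into its singular and plural source forms, and is not how NextcloudJsonFile itself parses files.

```python
import json

# Hypothetical sample: entries live under the "translations" key.
SAMPLE = """
{
  "translations": {
    "Files": "Dateien",
    "_%n file_::_%n files_": ["%n Datei", "%n Dateien"]
  }
}
"""


def split_entry(key, value):
    """Return (source_forms, target_forms) for one translations entry."""
    if key.startswith("_") and key.endswith("_") and "_::_" in key:
        # Plural key: strip the outer underscores and split on the separator.
        sources = key[1:-1].split("_::_")
    else:
        sources = [key]
    targets = value if isinstance(value, list) else [value]
    return sources, targets


entries = {
    key: split_entry(key, value)
    for key, value in json.loads(SAMPLE)["translations"].items()
}
```

Singular entries map one source string to one target string, while plural entries carry the source forms in the key and the target forms in the array.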
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
NextcloudJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.NextcloudJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
A Nextcloud JSON entry.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.RESJSONFile(inputfile=None, filter=None, **kwargs)¶
RESJSON (JavaScript Resource File) format.
This format uses _KEY.DATA syntax to attach metadata to translation strings.
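A small standard-library sketch can show how the _KEY.DATA convention separates strings from their metadata. The sample keys, the `comment` field, and the helper `split_resjson` are illustrative assumptions, and the sketch assumes metadata field names contain no dot; it is not the parser used by RESJSONFile.

```python
import json

# Hypothetical sample: "_greeting.comment" annotates the "greeting" string.
SAMPLE = """
{
  "greeting": "Hello",
  "_greeting.comment": "Shown on the start page",
  "farewell": "Goodbye"
}
"""


def split_resjson(data):
    """Split a RESJSON-style mapping into strings and per-key metadata."""
    strings, metadata = {}, {}
    for key, value in data.items():
        if key.startswith("_") and "." in key:
            # "_KEY.DATA": everything after the last dot is the field name.
            name, _, field = key[1:].rpartition(".")
            metadata.setdefault(name, {})[field] = value
        else:
            strings[key] = value
    return strings, metadata


strings, metadata = split_resjson(json.loads(SAMPLE))
```

Splitting on the last dot keeps string keys that themselves contain dots intact, at the cost of assuming field names never do.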
See following URL for doc:
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
RESJSONUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.RESJSONUnit(source=None, item=None, notes=None, placeholders=None, metadata=None, **kwargs)¶
A RESJSON entry with metadata support.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getmetadata() dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- property metadata: dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- setmetadata(newdict: dict[str, Any]) None¶
Set the dictionary of metadata/field values for this unit.
- Parameters:
newdict – A new dictionary with field values
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.jsonl10n.WebExtensionJsonFile(inputfile=None, filter=None, **kwargs)¶
WebExtension JSON file.
See the following URLs for documentation:
https://developer.chrome.com/extensions/i18n
https://developer.mozilla.org/en-US/Add-ons/WebExtensions/Internationalization
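For orientation, a WebExtension messages file maps each key to an object with a message and optional description and placeholders. The standard-library sketch below reads such a document into plain dictionaries; the sample content and the dictionary field names (`id`, `text`, `note`) are assumptions chosen for the example, not attributes of WebExtensionJsonUnit.

```python
import json

# Hypothetical sample in the messages.json layout.
SAMPLE = """
{
  "greeting": {
    "message": "Hello, $USER$!",
    "description": "Greeting shown after login",
    "placeholders": {
      "user": {"content": "$1", "example": "Alice"}
    }
  }
}
"""

messages = json.loads(SAMPLE)
units = [
    {
        "id": key,
        "text": entry["message"],
        # description and placeholders are optional in this layout.
        "note": entry.get("description", ""),
        "placeholders": entry.get("placeholders", {}),
    }
    for key, entry in messages.items()
]
```

The description is the natural place for a translator-facing note, and the placeholders object needs to survive a round trip unchanged.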
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
WebExtensionJsonUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.jsonl10n.WebExtensionJsonUnit(source=None, item=None, notes=None, placeholders=None, **kwargs)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or any of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- getvalue()¶
Returns dictionary for serialization.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value, unitid=None) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
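As a rough illustration of what matching the plural count to the tag definition could mean, the sketch below pads or truncates a list of plural strings so that there is one string per tag. This is an assumption about the behaviour, not the library's code: the function name is hypothetical, and the multistring case from the signature is left out.

```python
def sync_plural_count_sketch(target, plural_tags):
    """Return one string per plural tag, padding or truncating as needed."""
    # A single string is treated as a one-element list of plural forms.
    strings = [target] if isinstance(target, str) else list(target)
    wanted = len(plural_tags)
    if len(strings) < wanted:
        # Missing plural forms are filled with empty strings.
        strings.extend([""] * (wanted - len(strings)))
    # Surplus plural forms are dropped.
    return strings[:wanted]


print(sync_plural_count_sketch(["file", "files"], ["one", "few", "other"]))
```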
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
lisa¶
Parent class for LISA standards (TMX, TBX, XLIFF).
- class translate.storage.lisa.LISAfile(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶
A class representing a file store for one of the LISA file formats.
- Name = 'Base translation store'¶
The human usable name of this store type
- addsourceunit(source)¶
Adds and returns a new unit with the given string as first entry.
- addunit(unit, new=True) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
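BOM-based detection can be sketched with the standard library alone. The helper below is illustrative only: its name and its plain string return value are assumptions, and the real method returns an EncodingDict whose layout is not shown here.

```python
import codecs

# Longer BOMs are checked first, because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom_encoding(text: bytes):
    """Return an encoding name if text starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if text.startswith(bom):
            return name
    return None
```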
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
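Clark notation simply wraps the namespace URI in braces and prepends it to the local name. The sketch below uses a hypothetical helper named clark() and the standard library's ElementTree, which uses the same notation as lxml for qualified names; it is not the method's actual implementation.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace: str, name: str) -> str:
    """Return name qualified with namespace in Clark notation."""
    if namespace:
        return f"{{{namespace}}}{name}"
    # Without a namespace the plain local name is used.
    return name


doc = ET.fromstring(f'<xliff xmlns="{XLIFF_NS}"><source>Hello</source></xliff>')
# Element lookups need the fully qualified name.
source = doc.find(clark(XLIFF_NS, "source"))
print(clark(XLIFF_NS, "source"), source.text)
```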
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
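The two input modes described above might be handled roughly as follows. This is a sketch under assumptions, not the library's code: read_store_bytes is a hypothetical helper that only reads the raw bytes and mirrors the documented open/close behaviour.

```python
import io
import os


def read_store_bytes(storefile) -> bytes:
    """Read all bytes from a path or from an already open file object."""
    if isinstance(storefile, (str, os.PathLike)):
        # A filename: open and close the file internally.
        with open(storefile, "rb") as handle:
            return handle.read()
    try:
        # An existing file object: consume it ...
        return storefile.read()
    finally:
        # ... and close the caller's handle afterwards.
        storefile.close()


handle = io.BytesIO(b"msgid \"hello\"\n")
data = read_store_bytes(handle)
print(len(data), handle.closed)
```

Callers that need the handle afterwards would have to reopen it, since the handle is closed once it has been consumed.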
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.lisa.LISAunit(source, empty=False, **kwargs)¶
A single unit in the file. Provisional work is done to make several languages possible.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- copy() LISAunit¶
Make a copy of the translation unit.
Copy the XML subtree directly instead of serializing and reparsing it.
- static createlanguageNode(lang, text, purpose=None) None¶
Returns an XML element set up with the given parameters to represent a single language entry. Has to be overridden.
- getNodeText(languageNode, xml_space='preserve')¶
Retrieves the term from the given languageNode.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlanguageNode(lang=None, index=None)¶
Retrieves a languageNode either by language or by index.
- getlanguageNodes()¶
Returns a list of all nodes that contain per language information.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- gettarget(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- settarget(target, lang='xx', append=False) None¶
Sets the “target” string (second language), or alternatively appends to the list.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
markdown¶
Module for parsing Markdown files for translation.
The principles for extraction of translation units are as follows:
Extract all content relevant for translation, at the cost of also including some formatting.
One translation unit per paragraph.
Keep formatting out of the translation units as much as possible. Exceptions include phrase emphasis and inline code. Use placeholders {1}, {2}, …, as needed.
Avoid HTML entities in the translation units. Use Unicode equivalents if possible.
White space within translation units is normalized, because the PO format does not preserve white space, and the translated Markdown content may have to be reflowed anyway.
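The whitespace and placeholder principles above can be sketched in a few lines. The example is illustrative only: the function names are hypothetical, and treating inline HTML tags as the construct that becomes a placeholder is an assumption made for the sketch, since the list above does not say which constructs the module replaces.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace, including line breaks, to single spaces."""
    return " ".join(text.split())


def insert_placeholders(text: str):
    """Replace inline HTML tags with {1}, {2}, ... and return both parts."""
    placeholders = []

    def replace(match):
        placeholders.append(match.group(0))
        return "{%d}" % len(placeholders)

    return re.sub(r"<[^>]+>", replace, text), placeholders


def expand_placeholders(text: str, placeholders) -> str:
    """Put the original fragments back in place of the numbered markers."""
    for number, original in enumerate(placeholders, start=1):
        text = text.replace("{%d}" % number, original)
    return text


paragraph = "Press <kbd>Enter</kbd>\n   to *continue*."
unit_source, saved = insert_placeholders(normalize_whitespace(paragraph))
print(unit_source)
print(expand_placeholders(unit_source, saved))
```

Emphasis markers such as the asterisks stay in the unit text, in line with the exception for phrase emphasis noted in the list.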
- class translate.storage.markdown.MarkdownFile(inputfile=None, callback=None, max_line_length=None, extract_code_blocks=True, extract_frontmatter=True, no_placeholders=False)¶
-
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
MarkdownUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.markdown.MarkdownUnit(source=None)¶
A unit of translatable/localisable markdown content.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.markdown.TranslatingMarkdownRenderer(*args: Any, **kwargs: Any)¶
- expand_placeholders(fragments: Iterable[Fragment]) Iterable[Fragment]¶
Expands placeholder fragments, recursively.
- classmethod insert_placeholder_markers(fragments: Iterable[Fragment]) Iterable[Fragment]¶
Sets the text of the (top-level) placeholder fragments to “{n}”. Returns an ordered list of placeholders.
- classmethod merge_adjacent_placeholders(fragments: Iterable[Fragment]) Iterable[Fragment]¶
Replaces sequences of placeholders and whitespace with larger placeholders.
- remove_placeholder_markers(markdown: str, placeholders: Iterable[Fragment]) str¶
Replaces placeholder markers in the given markdown with placeholder content.
- span_to_lines(tokens: Iterable[span_token.SpanToken], max_line_length: int) Iterable[str]¶
Renders a sequence of span tokens to markdown, with translation.
- classmethod trim_flanking_placeholders(fragments: Iterable[Fragment]) tuple[Iterable[Fragment], Iterable[Fragment], Iterable[Fragment]]¶
Splits leading and trailing placeholders and whitespace, and the main content, into separate lists. Placeholders marked as important are kept with the main content.
mo¶
Module for parsing Gettext .mo files for translation.
The coding of .mo files was produced from the Gettext documentation, Python’s msgfmt.py, and by observing and testing existing .mo files in the wild.
The hash algorithm is implemented for MO files, which should result in faster access to the MO file. The hash is optional for Gettext and is not needed for reading or writing MO files. In this implementation it is always on, and for very small files it sometimes produces results that differ from Gettext’s.
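To make the role of the optional hash table concrete, the sketch below reads the fixed 28-byte header that the Gettext documentation describes for .mo files: seven 32-bit integers, with the hash table size and offset as the last two. The function name and the returned field names are illustrative choices, not part of this module's API.

```python
import struct

MO_MAGIC = 0x950412DE


def parse_mo_header(data: bytes) -> dict:
    """Decode the seven 32-bit header fields of a .mo file."""
    magic = struct.unpack("<I", data[:4])[0]
    if magic == MO_MAGIC:
        byteorder = "<"  # file was written little-endian
    elif magic == 0xDE120495:
        byteorder = ">"  # magic appears byte-swapped: big-endian file
    else:
        raise ValueError("not a .mo file")
    names = (
        "revision",
        "nstrings",
        "orig_table_offset",
        "trans_table_offset",
        "hash_table_size",
        "hash_table_offset",
    )
    return dict(zip(names, struct.unpack(byteorder + "6I", data[4:28])))


# A hand-built header for a one-string file without a hash table.
header = struct.pack("<7I", MO_MAGIC, 0, 1, 28, 36, 0, 44)
print(parse_mo_header(header))
```

A hash table size of 0 means the file carries no hash table, which Gettext readers accept.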
- class translate.storage.mo.mofile(inputfile=None, **kwargs)¶
A class representing a .mo file.
- Extensions: ClassVar[list[str]] = ['mo', 'gmo']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-gettext-catalog', 'application/x-mo']¶
A list of MIME types associated with this store type
- Name = 'Gettext MO file'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getheaderplural()¶
Returns the nplural and plural values from the header.
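A Gettext Plural-Forms header value has the shape nplurals=N; plural=EXPRESSION;. The sketch below shows one way the two values could be pulled out of such a string. The function name and the (int, str) return shape are assumptions for illustration; the actual method may return the values differently.

```python
import re

_PLURAL_RE = re.compile(r"nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*(.+?)\s*;?\s*$")


def parse_plural_forms(value: str):
    """Split a Plural-Forms value into (nplurals, plural expression)."""
    match = _PLURAL_RE.search(value)
    if match is None:
        # No recognisable plural information in the header value.
        return None, None
    return int(match.group(1)), match.group(2)


print(parse_plural_forms("nplurals=2; plural=(n != 1);"))
```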
- getids()¶
Return a list of unit ids.
- getprojectstyle() str | None¶
Return the project based on information in the header.
- The project is determined in the following sequence:
Use the ‘X-Project-Style’ entry in the header.
Use the ‘Report-Msgid-Bugs-To’ entry
Use the ‘X-Accelerator’ entry
Use the Project ID
Analyse the file itself (not yet implemented)
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Return the target language based on information in the header.
- The target language is determined in the following sequence:
Use the ‘Language’ entry in the header.
Poedit’s custom headers.
Analysing the ‘Language-Team’ entry.
- header()¶
Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
- init_headers(charset='UTF-8', encoding='8bit', **kwargs)¶
Sets default values for po headers.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- makeheader(**kwargs)¶
Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
- makeheaderdict(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs) dict[str, str]¶
Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
- Returns:
Dictionary with the header items
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- mergeheaders(otherstore) None¶
Merges another header with this header.
This header is assumed to be the template.
- Parameters:
otherstore – The other store to merge headers from.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- parseheader()¶
Parses the PO header and returns the interpreted values as a dictionary.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- setprojectstyle(project_style: str) None¶
Set the project in the header.
- Parameters:
project_style – the new project
- settargetlanguage(lang: str) None¶
Set the target language in the header.
This removes any custom Poedit headers if they exist.
- Parameters:
lang – the new target language code
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- updatecontributor(name: str, email: str | None = None) None¶
Add contribution comments if necessary.
- updateheader(add=False, **kwargs)¶
Updates the fields in the PO style header.
This will create a header if add == True.
- class translate.storage.mo.mounit(source=None, **kwargs)¶
A class representing a .mo translation message.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Is this message translatable?
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
mozilla_lang¶
A class to manage Mozilla .lang files.
See https://github.com/mozilla-l10n/langchecker/wiki/.lang-files-format for specifications on the format.
- class translate.storage.mozilla_lang.LangStore(inputfile=None, mark_active=False, **kwargs)¶
We extend TxtFile, since that has a lot of useful stuff for encoding.
- Extensions: ClassVar[list[str]] = ['lang']¶
A list of file extensions associated with this store type
- Name = 'Mozilla .lang'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(lines: bytes | list[bytes]) None¶
Read in text lines and create txtunits from the blocks of text.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.mozilla_lang.LangUnit(source=None)¶
This is just a normal unit with a weird string output.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
odf_io¶
omegat¶
Manage the OmegaT glossary format.
OmegaT glossary format is used by the OmegaT computer aided translation tool.
It is a bilingual base class derived format with OmegaTFile
and OmegaTUnit providing file and unit level access.
- Format Implementation
The OmegaT glossary format is a simple Tab Separated Value (TSV) file with the columns: source, target, comment.
The dialect of the TSV files is specified by OmegaTDialect.
- Encoding
The files are either UTF-8 or encoded using the system default. UTF-8 encoded files use the .utf8 extension while system encoded files use the .tab extension.
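The layout described above can be read with the standard csv module alone. The sketch below is an independent illustration, not the OmegaTFile implementation: the helper names read_glossary and encoding_for are hypothetical, and the assumption that fields are never quoted is ours rather than something stated in this documentation.

```python
import csv
import io
import locale
import os

# Mirrors the documented column order (source, target, comment).
FIELDNAMES = ["source", "target", "comment"]


def encoding_for(filename):
    """Pick an encoding from the extension: .utf8 is UTF-8, anything
    else (e.g. .tab) falls back to the system default."""
    ext = os.path.splitext(filename)[1].lower()
    return "utf-8" if ext == ".utf8" else locale.getpreferredencoding(False)


def read_glossary(text):
    """Parse tab-separated glossary text into a list of dicts.

    Assumption: no quoting is applied to fields (csv.QUOTE_NONE).
    A missing comment column is filled with an empty string.
    """
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=FIELDNAMES,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
        restval="",
    )
    return list(reader)


sample = "cat\tkat\tanimal\ndog\thond\n"
entries = read_glossary(sample)
print(entries)
```

In practice the store classes below should be preferred; the sketch only shows how little structure the format carries.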
- translate.storage.omegat.OMEGAT_FIELDNAMES = ['source', 'target', 'comment']¶
Field names for an OmegaT glossary unit
- class translate.storage.omegat.OmegaTDialect¶
Describe the properties of an OmegaT generated TAB-delimited glossary file.
- class translate.storage.omegat.OmegaTFile(inputfile=None, **kwargs)¶
An OmegaT glossary file.
- Extensions: ClassVar[list[str]] = ['utf8']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-omegat-glossary']¶
A list of MIME types associated with this store type
- Name = 'OmegaT Glossary'¶
The human usable name of this store type
- UnitClass¶
alias of
OmegaTUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
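A BOM-based fallback can be sketched with the constants in the standard codecs module. This is an illustration of the general technique, not the method's actual code: the function name guess_encoding_from_bom, its default argument, and the plain string return value are assumptions (the real method returns an EncodingDict).

```python
import codecs

# Longer BOMs are checked first, because the UTF-32 LE BOM starts
# with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(data, default="utf-8"):
    """Return a codec name based on a leading byte order mark.

    Hypothetical helper: falls back to `default` when no BOM is found.
    The returned codecs consume the BOM when decoding.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


print(guess_encoding_from_bom(codecs.BOM_UTF8 + b"hello"))
print(guess_encoding_from_bom(b"no bom here"))
```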
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.omegat.OmegaTFileTab(inputfile=None, **kwargs)¶
An OmegaT glossary file in the default system encoding.
- Extensions: ClassVar[list[str]] = ['tab']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-omegat-glossary']¶
A list of MIME types associated with this store type
- Name = 'OmegaT Glossary'¶
The human usable name of this store type
- UnitClass¶
alias of
OmegaTUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.omegat.OmegaTUnit(*args, metadata=None, **kwargs)¶
An OmegaT glossary unit.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
‘translator’
‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getmetadata() dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- property metadata: dict[str, Any]¶
Get the dictionary of metadata/field values for this unit.
- Returns:
The internal dictionary containing field values
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- setmetadata(newdict: dict[str, Any]) None¶
Set the dictionary of metadata/field values for this unit.
- Parameters:
newdict – A new dictionary with field values
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
oo¶
Classes that hold units of .oo files (oounit) or entire files (oofile).
These are specific .oo files for localisation exported by OpenOffice.org - SDF format (previously known as GSI files).
The behaviour in terms of escaping is explained in detail in the programming comments.
- exception translate.storage.oo.UnsafeOOSubfilePath¶
Raised when an OO/SDF line derives an unsafe recursive subfile path.
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- translate.storage.oo.escape_help_text(text)¶
Escapes the help text as it would be in an SDF file.
<, >, “ are only escaped in <[[:lower:]]> tags. Some HTML tags make it in in lowercase so those are dealt with. Some OpenOffice.org help tags are not escaped.
- translate.storage.oo.escape_text(text)¶
Escapes SDF text to be suitable for unit consumption.
- translate.storage.oo.makekey(ookey: tuple, long_keys: bool) str¶
Converts an oo key tuple into a unique identifier.
- Parameters:
ookey – an oo key
long_keys – Use long keys
- Returns:
unique ascii identifier
- class translate.storage.oo.normalizechar(normalchars)¶
- clear()¶
Remove all items from the dict.
- copy()¶
Return a shallow copy of the dict.
- classmethod fromkeys(iterable, value=None, /)¶
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)¶
Return the value for key if key is in the dictionary, else default.
- items()¶
Return a set-like object providing a view on the dict’s items.
- keys()¶
Return a set-like object providing a view on the dict’s keys.
- pop(k[, d])¶
Remove the specified key and return the corresponding value.
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()¶
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) None¶
Update D (the dict) from mapping/iterable E and F.
If E is present and has a .keys() method, then does: for k in E.keys(): D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
- values()¶
Return an object providing a view on the dict’s values.
- translate.storage.oo.normalizefilename(filename)¶
Converts any non-alphanumeric (standard roman) characters to _.
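The behaviour described can be approximated with a single regular expression. This is a sketch under the assumption that only ASCII letters and digits are kept; the real function may preserve further characters (for example path separators or dots), and the name normalize_name is hypothetical.

```python
import re


def normalize_name(name):
    """Replace every character that is not an ASCII letter or digit
    with an underscore. Hypothetical stand-in, not the library code."""
    return re.sub(r"[^A-Za-z0-9]", "_", name)


print(normalize_name("my file-é.txt"))
```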
- translate.storage.oo.normalizesubfilepath(pathname)¶
Normalizes OO/SDF-derived subfile paths and rejects path traversal.
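Rejecting path traversal usually means normalising the path first and then refusing anything that is absolute or climbs above the starting directory. The sketch below shows that general technique with the standard posixpath module; it is not the toolkit's implementation, and safe_subpath and UnsafePath are hypothetical names (the module's own exception is UnsafeOOSubfilePath). Windows drive letters are not handled here.

```python
import posixpath


class UnsafePath(ValueError):
    """Hypothetical error for paths that escape the base directory."""


def safe_subpath(pathname):
    """Normalise a relative path and reject traversal attempts.

    Backslashes are treated as separators so that Windows-style input
    is checked the same way.
    """
    cleaned = posixpath.normpath(pathname.replace("\\", "/"))
    if cleaned.startswith("/") or cleaned == ".." or cleaned.startswith("../"):
        raise UnsafePath(pathname)
    return cleaned


print(safe_subpath("a/./b//c"))
```

Checking after normalisation matters: an input such as a/../../b contains no leading "..", but normalises to ../b and must still be refused.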
- class translate.storage.oo.oofile(input=None)¶
This represents an entire .oo file.
- getoutput(skip_source=False, fallback_lang=None)¶
Converts all the lines back to tab-delimited form.
- class translate.storage.oo.ooline(parts=None)¶
This represents one line, one translation in an .oo file.
- getkey()¶
Get the key that identifies the resource.
- getoutput()¶
Return a line in tab-delimited form.
- getparts()¶
Return a list of parts in this line.
- gettext()¶
Obtains the text column and handles escaping.
- property text¶
Obtains the text column and handles escaping.
- class translate.storage.oo.oomultifile(filename, mode=None, multifilestyle='single')¶
This takes a huge GSI file and represents it as multiple smaller files…
- getoofile(subfile)¶
Returns an oofile built up from the given subfile’s lines.
- getsubfilesrc(subfile)¶
Returns the list of lines matching the subfile.
- listsubfiles()¶
Returns a list of subfiles in the file.
- openinputfile(subfile)¶
Returns a pseudo-file object for the given subfile.
- openoutputfile(subfile)¶
Returns a pseudo-file object for the given subfile.
- class translate.storage.oo.oounit¶
This represents a number of translations of a resource.
- getoutput(skip_source=False, fallback_lang=None)¶
Return the lines in tab-delimited form.
- translate.storage.oo.unescape_help_text(text)¶
Unescapes normal text to be suitable for writing to the SDF file.
- translate.storage.oo.unescape_text(text)¶
Unescapes SDF text to be suitable for unit consumption.
placeables¶
This module implements basic functionality to support placeables.
- A placeable is used to represent things like:
Substitutions
For example, in ODF, footnotes appear in the ODF XML where they are defined, so if we extract a paragraph with some footnotes, the translator will have a lot of additional XML to deal with. We therefore separate the footnotes out into separate translation units and mark their positions in the original text with placeables.
Hiding of inline formatting data
The translator doesn’t want to have to deal with all the weird formatting conventions of wherever the text came from.
Marking variables
This is an old issue - translators translate variable names which should remain untranslated. We can wrap placeables around variable names to avoid this.
The placeables model follows the XLIFF standard’s list of placeables. Please refer to the XLIFF specification to get a better understanding.
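To make the idea of marking variables concrete, here is a minimal sketch that splits a string into plain text and protected variable segments. It does not use the placeables classes documented below; the function split_placeables, the tuple labels, and the printf-style pattern are assumptions chosen for this example.

```python
import re

# Assumed variable syntax for the example: %(name)s, %s and %d.
VARIABLE = re.compile(r"%\([a-z_]+\)s|%[sd]")


def split_placeables(text):
    """Split text into ("text", ...) and ("placeable", ...) segments.

    Segments tagged "placeable" are the parts a translator should
    leave untouched.
    """
    parts = []
    pos = 0
    for match in VARIABLE.finditer(text):
        if match.start() > pos:
            parts.append(("text", text[pos:match.start()]))
        parts.append(("placeable", match.group()))
        pos = match.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts


segments = split_placeables("Hello %(name)s, you have %d messages")
print(segments)
```

Joining the second item of every segment reproduces the original string, so the wrapping is lossless.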
base¶
Contains base placeable classes with names based on XLIFF placeables. See the XLIFF standard for more information about what the names mean.
- class translate.storage.placeables.base.Bpt(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
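The tree operations listed for this class (depth_first, flatten, isleaf, elem_at_offset) can be pictured with a small stand-alone model. The Node class below is a hypothetical simplification written for this illustration, not StringElem itself: it assumes that all text lives in leaf nodes and it ignores renderers, editability and sub-classing rules.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a string-tree element."""

    sub: list = field(default_factory=list)  # str or Node items

    def __str__(self):
        return "".join(str(child) for child in self.sub)

    def isleaf(self):
        # A leaf holds only plain strings.
        return all(isinstance(child, str) for child in self.sub)

    def depth_first(self):
        # The node itself first, then each sub-tree in order.
        nodes = [self]
        for child in self.sub:
            if isinstance(child, Node):
                nodes.extend(child.depth_first())
        return nodes

    def flatten(self):
        # Depth-first walk restricted to leaves.
        return [node for node in self.depth_first() if node.isleaf()]

    def elem_at_offset(self, offset):
        # Find the leaf whose rendered text covers the given offset.
        pos = 0
        for leaf in self.flatten():
            end = pos + len(str(leaf))
            if pos <= offset < end:
                return leaf
            pos = end
        return None


root = Node([Node(["Hello "]), Node([Node(["%s"])]), Node(["!"])])
print(str(root))
print([str(leaf) for leaf in root.flatten()])
print(str(root.elem_at_offset(6)))
```

Offsets are resolved against the concatenated leaf text, which is why the description of elem_offset warns that a custom renderer can make such offsets unreliable.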
- class translate.storage.placeables.base.Bx(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.Ept(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.Ex(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.G(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
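The three documented keys can be pictured with a small sketch over a flat list of leaf strings. The function below is a hypothetical illustration, not the toolkit's implementation; the list-of-strings input and the "not found" return value are assumptions made for the example.

```python
def get_index_data(leaves, index):
    """Locate ``index`` within the concatenation of ``leaves``.

    ``leaves`` is a plain list of strings standing in for the rendered
    leaf elements of a tree (an assumption made for this sketch).
    """
    start = 0
    for leaf in leaves:
        if start <= index < start + len(leaf):
            return {"elem": leaf, "index": index, "offset": index - start}
        start += len(leaf)
    # Assumption: signal "not found" with elem=None and offset=-1.
    return {"elem": None, "index": index, "offset": -1}


info = get_index_data(["Hello ", "<b>", "world"], 7)
print(info)
```

Here index 7 falls inside the second leaf, one character past its start, so the offset is relative to that leaf rather than to the whole string.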
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.It(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
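The undo contract described above (removed text, parent, offset) can be sketched with a deliberately simplified, single-node model. TextNode is a hypothetical class written for this example; the real method works on a tree of elements and also takes editability into account, which this sketch ignores.

```python
class TextNode:
    """Hypothetical single-node stand-in for a string tree."""

    def __init__(self, text):
        self.text = text

    def insert(self, offset, text):
        self.text = self.text[:offset] + text + self.text[offset:]

    def delete_range(self, start_index, end_index):
        removed = self.text[start_index:end_index]
        self.text = self.text[:start_index] + self.text[end_index:]
        # Mirror the documented triple: removed text, parent, offset.
        return removed, self, start_index


node = TextNode("Hello world")
removed, parent, offset = node.delete_range(5, 11)
after_delete = node.text
if parent is not None and offset is not None:
    parent.insert(offset, removed)  # undo the deletion
after_undo = node.text
```

Checking parent and offset for None before re-inserting follows the documented case where the root itself was deleted and no undo target exists.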
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.Ph(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
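A minimal sketch of the idea, assuming the tree is modelled as nested Python lists of strings rather than real StringElem objects. The free function apply_to_strings below is hypothetical and only shows how a one-argument callable is applied to every string in place.

```python
def apply_to_strings(sub, f):
    """Apply ``f`` in place to every string in a nested list."""
    for i, item in enumerate(sub):
        if isinstance(item, str):
            sub[i] = f(item)
        else:
            # Anything that is not a string is treated as a sub-tree.
            apply_to_strings(item, f)


tree = ["Hello ", ["<b>", ["world"]], "!"]
apply_to_strings(tree, str.upper)
print(tree)
```

Any callable that takes one string and returns a string fits here, for example str.strip or a lambda that normalizes whitespace.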
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.Sub(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
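The leaf rule as worded above can be sketched with two hypothetical classes, Elem standing in for the base element and Tag for any sub-class. This follows the documented wording only and is not taken from the toolkit's source.

```python
class Elem:
    """Hypothetical base element holding strings and/or other elements."""

    def __init__(self, sub):
        self.sub = sub

    def isleaf(self):
        # Exactly the base class (not a sub-class) and only string children.
        return type(self) is Elem and all(isinstance(s, str) for s in self.sub)


class Tag(Elem):
    """Hypothetical sub-class, so never a leaf under the documented rule."""


plain = Elem(["some text"])
tagged = Tag(["<br/>"])
nested = Elem([Elem(["inner"])])
results = (plain.isleaf(), tagged.isleaf(), nested.isleaf())
```

Using type(self) is Elem rather than isinstance is what excludes sub-classes in this sketch.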
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.base.X(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = False¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = True¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
general¶
Contains general placeable implementations, that is, placeables that do not fit into any other sub-category.
- class translate.storage.placeables.general.AltAttrPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
Placeable for the “alt=…” attributes inside XML tags.
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) StringElem | list[StringElem] | None¶
Parse a string into placeables based on a regular expression.
This classmethod is provided by RegexParseMixin and is intended to be inherited by placeable subclasses that use regex-based parsing. Subclasses must define a regex class attribute (an instance of re.Pattern) which will be used to find matching segments in pstr.
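The regex-driven approach can be sketched as follows. RegexParseSketch and AltAttrSketch are hypothetical classes written for this example, the alt pattern is an assumed regular expression, and the choice to return None when nothing matches is also an assumption; none of this is copied from RegexParseMixin.

```python
import re


class RegexParseSketch:
    """Hypothetical mixin-like base: subclasses supply ``regex``."""

    regex: re.Pattern

    def __init__(self, sub):
        self.sub = sub

    @classmethod
    def parse(cls, pstr):
        """Split ``pstr`` into plain strings and ``cls`` instances."""
        parts = []
        last = 0
        for match in cls.regex.finditer(pstr):
            if match.start() > last:
                parts.append(pstr[last:match.start()])
            parts.append(cls([match.group()]))
            last = match.end()
        if not parts:
            return None  # assumption: nothing matched, nothing to return
        if last < len(pstr):
            parts.append(pstr[last:])
        return parts


class AltAttrSketch(RegexParseSketch):
    # Assumed pattern for alt="..." attributes, for illustration only.
    regex = re.compile(r'alt=".*?"')


result = AltAttrSketch.parse('<img src="a.png" alt="A logo">')
```

Each match becomes an instance of the subclass, while the text between matches stays as plain strings, so joining the pieces in order reproduces the input.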
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.general.XMLEntityPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
Placeable handling XML entities (&xxxxx;-style entities).
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More
unicodeclass emulation.
- find(x)¶
Find sub-string
xin this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing
x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which
indexresides.index: Copy of the
indexparameteroffset: The offset of
indexinto'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the
childelement.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter
StringElems.
- iseditable = False¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when
iseditable = False
- isleaf() bool¶
Whether or not this instance is a leaf node in the
StringElemtree.A node is a leaf node if it is a
StringElem(not a sub-class) and contains only sub-elements of typestrorunicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in dept-first order.
- classmethod parse(pstr: str) StringElem | list[StringElem] | None¶
Parse a string into placeables based on a regular expression.
This classmethod is provided by
RegexParseMixinand is intended to be inherited by placeable subclasses that use regex-based parsing. Subclasses must define aregexclass attribute (an instance ofre.Pattern) which will be used to find matching segments inpstr.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type
ptypewith baseStringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
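The regex-based parse classmethod described for this class can be illustrated with a small standalone sketch. It does not use the library: EntitySketch and its pattern are hypothetical stand-ins, and the real regex and return types used by XMLEntityPlaceable may differ. It only shows the general idea of a regex class attribute driving the split of pstr into plain text and matched segments.

```python
import re


class EntitySketch:
    """Hypothetical stand-in for a regex-based placeable (not the library class)."""

    # Assumed pattern for &name;-style entities; the library's real regex may differ.
    regex = re.compile(r"&[A-Za-z_][A-Za-z0-9_.-]*;")

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"EntitySketch({self.text!r})"

    @classmethod
    def parse(cls, pstr):
        """Split pstr into plain strings and EntitySketch nodes, or return None."""
        parts = []
        pos = 0
        for match in cls.regex.finditer(pstr):
            if match.start() > pos:
                parts.append(pstr[pos:match.start()])  # text before the match
            parts.append(cls(match.group()))
            pos = match.end()
        if not parts:
            return None  # nothing matched
        if pos < len(pstr):
            parts.append(pstr[pos:])  # trailing text after the last match
        return parts


print(EntitySketch.parse("Tom &amp; Jerry"))
print(EntitySketch.parse("no entities here"))
```

Returning None when nothing matches mirrors the documented return annotation, which allows None alongside a single element or a list.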
- class translate.storage.placeables.general.XMLTagPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
Placeable handling XML tags.
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) StringElem | list[StringElem] | None¶
Parse a string into placeables based on a regular expression.
This classmethod is provided by RegexParseMixin and is intended to be inherited by placeable subclasses that use regex-based parsing. Subclasses must define a regex class attribute (an instance of re.Pattern) which will be used to find matching segments in pstr.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
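The difference between depth_first() (all nodes) and flatten() (leaves only) may be easier to see on a toy tree. The Node class below is a hypothetical miniature written for this illustration, not the library's StringElem. It assumes pre-order traversal and a simplified leaf test (a node whose sub-elements are all plain strings).

```python
class Node:
    """Hypothetical miniature of a string-element node; sub holds str or Node items."""

    def __init__(self, name, sub):
        self.name = name
        self.sub = sub

    def iter_depth_first(self):
        """Yield this node, then each descendant node, in pre-order (an assumption)."""
        yield self
        for item in self.sub:
            if isinstance(item, Node):
                yield from item.iter_depth_first()

    def depth_first(self):
        """Return the same traversal as a list."""
        return list(self.iter_depth_first())

    def is_leaf(self):
        """Simplified leaf test: every sub-element is a plain string."""
        return all(isinstance(item, str) for item in self.sub)

    def flatten(self):
        """Return only the leaf nodes, in depth-first order."""
        return [node for node in self.iter_depth_first() if node.is_leaf()]


root = Node("root", [Node("a", ["Hello "]), Node("b", [Node("c", ["world"]), "!"])])
print([n.name for n in root.depth_first()])  # ['root', 'a', 'b', 'c']
print([n.name for n in root.flatten()])      # ['a', 'c']
```

Node b is not a leaf in this sketch because it contains another node alongside its plain string.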
interfaces¶
This file contains abstract (semantic) interfaces for placeable implementations.
- class translate.storage.placeables.interfaces.BasePlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
Base class for all placeables.
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
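The elem_offset() contract (a string index, or -1 when the element is absent) can be sketched on a toy tree. The Elem class here is a hypothetical miniature, not the library's StringElem, and it measures offsets against the plain str() rendering. As noted for elem_offset(), a custom renderer would make such offsets unreliable.

```python
class Elem:
    """Hypothetical miniature of a string-element node (not the library class)."""

    def __init__(self, sub):
        self.sub = sub  # list of str or Elem items

    def __str__(self):
        return "".join(str(item) for item in self.sub)

    def elem_offset(self, elem):
        """Return the string index where elem starts, or -1 if it is not in this tree."""
        if elem is self:
            return 0
        offset = 0
        for item in self.sub:
            if isinstance(item, Elem):
                inner = item.elem_offset(elem)
                if inner >= 0:
                    return offset + inner
            # Advance past this item's rendered text before checking the next one.
            offset += len(str(item))
        return -1


bold = Elem(["world"])
root = Elem(["Hello ", bold, "!"])
print(str(root))                        # Hello world!
print(root.elem_offset(bold))           # 6
print(root.elem_offset(Elem(["x"])))    # -1
```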
- class translate.storage.placeables.interfaces.InvisiblePlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.interfaces.MaskingPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
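The behaviour described for remove_type() (nodes of a given type are swapped for base elements holding the same sub-elements, with the root left alone) might look roughly like the following. Base and Tag are hypothetical classes for this sketch only, and matching on the exact type rather than with isinstance is an assumption about the library's behaviour.

```python
class Base:
    """Hypothetical base node; sub holds str or Base items."""

    def __init__(self, sub):
        self.sub = sub


class Tag(Base):
    """Hypothetical placeable type that will be stripped from the tree."""


def remove_type(node, ptype):
    """Replace descendants of type ptype with plain Base nodes; the root itself is kept."""
    for i, item in enumerate(node.sub):
        if isinstance(item, Base):
            remove_type(item, ptype)  # handle deeper levels first
            if type(item) is ptype:   # exact type match (an assumption)
                node.sub[i] = Base(item.sub)


root = Tag([Tag(["x"]), "y"])
remove_type(root, Tag)
print(type(root).__name__)         # Tag (the root is never replaced)
print(type(root.sub[0]).__name__)  # Base
```

Because the replacement happens in the parent's sub list, the root has no parent to rewrite it, which matches the note that the operation only applies below the root.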
- class translate.storage.placeables.interfaces.ReplacementPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.interfaces.SubflowPlaceable(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
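As a rough illustration of apply_to_strings(f), the sketch below walks a toy tree and replaces each plain string with the result of calling f on it. Elem is a hypothetical miniature rather than the library's StringElem, and in-place replacement of the strings is an assumption made for the example.

```python
class Elem:
    """Hypothetical miniature of a string-element node (not the library class)."""

    def __init__(self, sub):
        self.sub = sub  # list of str or Elem items

    def __str__(self):
        return "".join(str(item) for item in self.sub)

    def apply_to_strings(self, f):
        """Replace every plain string in the tree with f(string), in place."""
        for i, item in enumerate(self.sub):
            if isinstance(item, Elem):
                item.apply_to_strings(f)  # recurse into nested elements
            else:
                self.sub[i] = f(item)


tree = Elem(["Hello ", Elem(["world"]), "!"])
tree.apply_to_strings(str.upper)
print(str(tree))  # HELLO WORLD!
```

The tree structure is untouched; only the string leaves change, which is why f must accept one string and return a string.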
lisa¶
parse¶
Contains the parse function that parses normal strings into StringElem-based “rich” string element trees.
- translate.storage.placeables.parse.parse(tree: str | StringElem, parse_funcs: list[Callable[[str], StringElem | list[StringElem] | None]]) StringElem¶
Parse placeables from the given string or sub-tree by using the parsing functions provided.
The output of this function is heavily dependent on the order of the parsing functions. This is because of the algorithm used.
An over-simplification of the algorithm: the leaves in the StringElem tree are expanded to the output of the first parsing function in parse_funcs. The next level of recursion is then started on the new set of leaves with the used parsing function removed from parse_funcs.
- Parameters:
tree – The string or string element sub-tree to parse.
parse_funcs – A list of parsing functions. Each function takes one argument (a unicode string to parse) and returns a list of StringElems which, together, form the original string. If nothing could be parsed, it should return None.
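The over-simplified algorithm above, and its sensitivity to the order of parse_funcs, can be sketched without the library. Everything here is hypothetical: split_on builds toy parsing functions, tagged tuples stand in for placeables, and the tree is flattened to a list of leaves for brevity. It is not the library's implementation.

```python
import re


def split_on(pattern, label):
    """Build a hypothetical parsing function: returns a list of pieces or None."""
    regex = re.compile(pattern)

    def parse_func(text):
        pieces, pos = [], 0
        for m in regex.finditer(text):
            if m.start() > pos:
                pieces.append(text[pos:m.start()])
            pieces.append((label, m.group()))  # a tagged tuple stands in for a placeable
            pos = m.end()
        if not pieces:
            return None  # nothing could be parsed
        if pos < len(text):
            pieces.append(text[pos:])
        return pieces

    return parse_func


def parse_sketch(leaves, parse_funcs):
    """Expand plain-string leaves with the first function, then recurse with the rest."""
    if not parse_funcs:
        return leaves
    first, rest = parse_funcs[0], parse_funcs[1:]
    expanded = []
    for leaf in leaves:
        if isinstance(leaf, str):
            result = first(leaf)
            expanded.extend(result if result is not None else [leaf])
        else:
            expanded.append(leaf)  # already claimed by an earlier function
    return parse_sketch(expanded, rest)


tags = split_on(r"<[^>]+>", "tag")
numbers = split_on(r"\d+", "num")
text = "<a href='p1'>page 1</a>"
tags_first = parse_sketch([text], [tags, numbers])
numbers_first = parse_sketch([text], [numbers, tags])
print(tags_first)
print(numbers_first)
```

With tags first, the opening tag is captured whole and only the visible "1" becomes a number. With numbers first, the "1" inside the href splits the opening tag before the tag function ever sees it, so that tag is never recognised.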
strelem¶
Contains the base StringElem class that represents a node in a
parsed rich-string tree. It is the base class of all placeables.
- exception translate.storage.placeables.strelem.ElementNotFoundError¶
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class translate.storage.placeables.strelem.StringElem(sub=None, id=None, rid=None, xid=None, **kwargs)¶
This class represents a sub-tree of a string parsed into a rich structure. It is also the base class of all placeables.
- apply_to_strings(f) None¶
Apply
fto all actual strings in the tree.- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.rendereris not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes
start_indexandend_index.Partial nodes will only be removed if they are editable.
- Returns:
A
StringElemrepresenting the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from.Noneis returned for the parent value if the root was deleted. If the parent and offset values are notNone,parent.insert(offset, deleted)effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
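The difference between depth_first() and flatten() can be sketched with a small stand-in class. Node below is hypothetical and is not the toolkit's StringElem; it only assumes that flattening keeps the leaf nodes of a depth-first walk, where a leaf holds nothing but plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree node: children are str or Node."""

    sub: list = field(default_factory=list)

    def depth_first(self):
        """Return this node and all descendant nodes, parents first."""
        nodes = [self]
        for child in self.sub:
            if isinstance(child, Node):
                nodes.extend(child.depth_first())
        return nodes

    def isleaf(self):
        """A leaf contains only plain strings."""
        return all(isinstance(child, str) for child in self.sub)

    def flatten(self):
        """Return only the leaf nodes, in depth-first order."""
        return [node for node in self.depth_first() if node.isleaf()]

    def __str__(self):
        return "".join(str(child) for child in self.sub)


bold = Node(["world"])
root = Node(["Hello ", bold, "!"])
print(str(root), len(root.depth_first()), len(root.flatten()))
```

In this sketch the root is visited by depth_first() but dropped by flatten(), because it contains a child node as well as strings.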
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
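The shape of the returned dictionary can be illustrated with a simplified lookup over a list of plain strings. The function index_data_sketch is a hypothetical helper, strings stand in for leaf elements, and returning None for an out-of-range index is an assumption of this sketch rather than documented behaviour.

```python
def index_data_sketch(leaves, index):
    """Locate `index` within the concatenation of the strings in `leaves`."""
    start = 0
    for leaf in leaves:
        end = start + len(leaf)
        if start <= index < end:
            # offset is the position of `index` relative to the leaf's start.
            return {"elem": leaf, "index": index, "offset": index - start}
        start = end
    return None  # Assumed behaviour for an index past the end.


info = index_data_sketch(["Hello ", "world", "!"], 7)
print(info)
```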
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
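The contract described here (match at the start of the string, return an instance or None) can be sketched as follows. NumberToken is a hypothetical class that does not inherit from StringElem, and the regular expression is an assumption chosen for the example; it only illustrates the shape of a parse classmethod.

```python
import re


class NumberToken:
    """Hypothetical placeable-like class; not part of the toolkit."""

    pattern = re.compile(r"\d+")

    def __init__(self, text):
        self.text = text

    @classmethod
    def parse(cls, pstr):
        """Return an instance if `pstr` starts with digits, else None."""
        match = cls.pattern.match(pstr)  # re.match anchors at the start
        if match is None:
            return None
        return cls(match.group())


hit = NumberToken.parse("42 files")
miss = NumberToken.parse("files 42")
print(hit.text, miss)
```

Using re.match rather than re.search keeps the sketch faithful to "from the start of the given string": digits later in the string do not count.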
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
terminology¶
Contains the placeable that represents a terminology term.
- class translate.storage.placeables.terminology.TerminologyPlaceable(*args, **kwargs)¶
Terminology distinguished from the rest of a string by being a placeable.
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) StringElem | list[StringElem] | None¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
xliff¶
Contains XLIFF-specific placeables.
- class translate.storage.placeables.xliff.Bpt(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.Bx(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.Ept(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.Ex(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.G(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.It(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
- Returns:
The string index where elem starts, or -1 if elem was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for the child element and returns its parent.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.Ph(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element e starts, or -1 if e was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.Sub(sub=None, id=None, rid=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element e starts, or -1 if e was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.UnknownXML(sub=None, id=None, rid=None, xid=None, xml_node=None, **kwargs)¶
Placeable for unrecognized or unimplemented XML nodes. Its main purpose is to preserve all associated XML data.
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element e starts, or -1 if e was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = True¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = True¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = False¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = True¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
- class translate.storage.placeables.xliff.X(id=None, xid=None, **kwargs)¶
- apply_to_strings(f) None¶
Apply f to all actual strings in the tree.
- Parameters:
f – Must take one (str or unicode) argument and return a string or unicode.
- copy()¶
Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer is not copied.
- delete_range(start_index, end_index)¶
Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
- Returns:
A StringElem representing the removed sub-string, the parent node from which it was deleted as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
- depth_first(filter=None)¶
Returns a list of the nodes in the tree in depth-first order.
- elem_at_offset(offset)¶
Get the StringElem in the tree that contains the string rendered at the given offset.
- elem_offset(elem)¶
Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
- Returns:
The string index where element e starts, or -1 if e was not found.
- encode(encoding='utf-8')¶
More unicode class emulation.
- find(x)¶
Find sub-string x in this string tree and return the position at which it starts.
- find_elems_with(x)¶
Find all elements in the current sub-tree containing x.
- flatten(filter=None)¶
Flatten the tree by returning a depth-first search over the tree’s leaves.
- get_index_data(index)¶
Get info about the specified range in the tree.
- Returns:
A dictionary with the following items:
elem: The element in which index resides.
index: Copy of the index parameter.
offset: The offset of index into 'elem'.
- get_parent_elem(child)¶
Searches the current sub-tree for and returns the parent of the child element.
- has_content = False¶
Whether this string can have sub-elements.
- insert(offset, text, preferred_parent=None)¶
Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
- insert_between(left, right, text) bool¶
Insert the given text between the two parameter StringElems.
- iseditable = False¶
Whether this string should be changeable by the user. Not used at the moment.
- isfragile = True¶
Whether this element should be deleted in its entirety when partially deleted. Only checked when iseditable = False.
- isleaf() bool¶
Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
- istranslatable = False¶
Whether this string is translatable into other languages.
- isvisible = True¶
Whether this string should be visible to the user. Not used at the moment.
- iter_depth_first(filter=None)¶
Iterate through the nodes in the tree in depth-first order.
- classmethod parse(pstr: str) ParseResult¶
Parse an instance of this class from the start of the given string. This method should be implemented by any subclass that wants to be parseable by translate.storage.placeables.parse.
- Parameters:
pstr – The string to parse into an instance of this class.
- Returns:
An instance of the current class, or None if the string is not parseable by this class.
- print_tree(indent=0, verbose=False) None¶
Print the tree from the current instance’s point in an indented manner.
- remove_type(ptype) None¶
Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
- renderer = None¶
An optional function that returns the Unicode representation of the string.
- sub: list[str | StringElem] = []¶
The sub-elements that make up this string.
- translate()¶
Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
- Returns:
The transformed Unicode string representing the sub-tree.
php¶
Classes that hold units of PHP localisation files (phpunit) or
entire files (phpfile). These files are used in translating many
PHP-based applications.
Only PHP files written with these conventions are supported:
<?php
$lang['item'] = "value"; # Array of values
$some_entity = "value"; # Named variables
define("ENTITY", "value");
$lang = array(
'item1' => 'value1' , #Supports space before comma
'item2' => 'value2',
);
$lang = array( # Nested arrays
'item1' => 'value1',
'item2' => array(
'key' => 'value' , #Supports space before comma
'key2' => 'value2',
),
);
Nested arrays without a key for the nested array are not supported:
<?php
$lang = array(array('key' => 'value'));
The handling of PHP strings, and specifically the escaping conventions that differ between single quote (') and double quote (") characters, is implemented as outlined in the PHP documentation for the String type.
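The difference between the two quoting styles can be sketched as follows. These helpers are hypothetical and are not the module's phpencode(); they cover only a small subset of the escapes listed in the PHP manual. In single-quoted PHP strings only the backslash and the single quote need escaping, while double-quoted strings also interpret sequences such as \n, \t and \$.

```python
def encode_single_quoted(text):
    """Quote `text` as a PHP single-quoted literal (hypothetical helper)."""
    # Escape backslashes first so the ones added for quotes are not doubled.
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


# Subset of double-quote escapes chosen for this sketch; PHP defines more.
_DOUBLE_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "$": "\\$",
    "\n": "\\n",
    "\t": "\\t",
    "\r": "\\r",
}


def encode_double_quoted(text):
    """Quote `text` as a PHP double-quoted literal (hypothetical helper)."""
    return '"' + "".join(_DOUBLE_ESCAPES.get(ch, ch) for ch in text) + '"'


single = encode_single_quoted("it's")
double = encode_double_quoted('say "hi"\n')
print(single)
print(double)
```

A newline passed to encode_single_quoted() stays a literal newline, since single-quoted PHP strings have no \n escape.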
- class translate.storage.php.LaravelPHPFile(inputfile=None, **kwargs)¶
-
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
LaravelPHPUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
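BOM-based detection of this kind can be sketched with the standard library alone. The function detect_bom below is hypothetical and returns a plain codec name, whereas fallback_detection() returns an EncodingDict whose exact contents are not shown in this reference.

```python
from __future__ import annotations

import codecs

# Longer marks come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes) -> str | None:
    """Return a codec name if `data` starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(codecs.BOM_UTF8 + b"<?php"))  # utf-8-sig
```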
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
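The path-or-file-object behaviour described for parsefile() follows a common pattern, sketched here with a hypothetical read_store helper that is not part of the library: a path is opened and closed internally, while a file object handed in by the caller is consumed and then closed.

```python
import io
import os


def read_store(storefile):
    """Read bytes from a path or a readable binary file object (hypothetical)."""
    if isinstance(storefile, (str, os.PathLike)):
        # Path: open and close the file ourselves.
        with open(storefile, "rb") as handle:
            return handle.read()
    # File object: consume it, then close it even if reading fails.
    try:
        return storefile.read()
    finally:
        storefile.close()


buffer = io.BytesIO(b"<?php\n$lang['item'] = 'value';\n")
data = read_store(buffer)
print(buffer.closed)  # True
```

Callers that need the handle afterwards should therefore pass a path, or reopen the file, rather than reuse the object they handed in.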
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.php.LaravelPHPUnit(source='')¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
Return the key without the Laravel return prefix.
- getlocations()¶
Return locations without the Laravel return prefix.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getoutput(indent='', name=None)¶
Convert the unit back into formatted lines for a php file.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
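One plausible reading of this description is that the list of plural strings is padded or truncated to the number of plural tags. The sketch below works under that assumption; the function name and the choice of empty strings for padding are hypothetical, and multistring input is left out.

```python
def sync_plural_count_sketch(target, plural_tags):
    """Make the number of plural strings equal len(plural_tags) (assumed behaviour)."""
    strings = list(target) if isinstance(target, (list, tuple)) else [target]
    strings = [str(item) for item in strings]
    # Pad with empty strings when there are fewer strings than tags.
    strings += [""] * (len(plural_tags) - len(strings))
    # Drop any strings beyond the number of tags.
    return strings[: len(plural_tags)]


print(sync_plural_count_sketch(["one file"], ["one", "other"]))  # ['one file', '']
```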
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.php.PHPLexer(*args: Any, **kwargs: Any)¶
- extract_comments(end)¶
Extract comments related to given parser positions.
Must be called sequentially for consecutive statements.
- extract_name(terminator, start, end)¶
Extract current value name.
- translate.storage.php.phpdecode(text, quotechar="'")¶
Convert PHP escaped string to a Python string.
- translate.storage.php.phpencode(text, quotechar="'")¶
Convert Python string to PHP escaping.
The encoding is implemented for ‘single quote’ and “double quote” syntax.
heredoc and nowdoc are not implemented and it is not certain whether this would ever be needed for PHP localisation needs.
- class translate.storage.php.phpfile(inputfile=None, **kwargs)¶
This class represents a PHP file, made up of phpunits.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.php.phpunit(source='')¶
A unit of a PHP file: a name, a value, and any comments associated.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit, preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getoutput(indent='', name=None)¶
Convert the unit back into formatted lines for a php file.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
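One plausible reading of this description is that the target is normalised to a list with exactly one string per plural tag. The sketch below illustrates that reading with plain strings and lists only; the function name `sync_plural_count_sketch` and the pad-with-empty-strings and truncate behaviour are assumptions, not the toolkit's confirmed implementation.

```python
def sync_plural_count_sketch(target, plural_tags):
    """Return a list with exactly len(plural_tags) strings."""
    # A single string is treated as a one-element plural list.
    strings = [target] if isinstance(target, str) else list(target)
    # Drop surplus forms, then pad missing ones with empty strings.
    strings = strings[: len(plural_tags)]
    strings += [""] * (len(plural_tags) - len(strings))
    return strings


padded = sync_plural_count_sketch(["file"], ["one", "few", "other"])
trimmed = sync_plural_count_sketch(["a", "b", "c"], ["one", "other"])
```

Here `padded` has three entries and `trimmed` has two, matching the number of tags in each call.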
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.php.wrap_production(func)¶
Decorator for production functions to store lexer positions.
pocommon¶
- translate.storage.pocommon.extract_msgid_comment(text: str) str¶
The one definitive way to extract a msgid comment out of an unescaped unicode string that might contain it.
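As a rough illustration, the sketch below assumes that a msgid comment is the older KDE-style prefix of the form `_: comment` followed by a newline at the very start of the string. That convention, the regular expression, and the name `extract_msgid_comment_sketch` are assumptions made for this example rather than the function's documented behaviour.

```python
import re

# Assumed shape: "_: some comment\n" at the start of the msgid text.
_MSGID_COMMENT = re.compile(r"_: (.*?)\n")


def extract_msgid_comment_sketch(text: str) -> str:
    """Return the leading msgid comment, or an empty string if none."""
    match = _MSGID_COMMENT.match(text)
    return match.group(1) if match else ""


comment = extract_msgid_comment_sketch("_: Verb, not noun\nOpen")
no_comment = extract_msgid_comment_sketch("Open")
```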
- class translate.storage.pocommon.pofile(inputfile=None, noheader=False, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['po', 'pot']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['text/x-gettext-catalog', 'text/x-gettext-translation', 'text/x-po', 'text/x-pot']¶
A list of MIME types associated with this store type
- Name = 'Gettext PO file'¶
The human usable name of this store type
- UnitClass¶
alias of
TranslationUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
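A minimal sketch of BOM-based detection using only the standard library follows. The function name `detect_bom` and the shape of the returned dictionary are assumptions for illustration; the toolkit's EncodingDict may carry different keys.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom(text: bytes) -> dict:
    """Guess an encoding from a leading byte order mark only."""
    for bom, encoding in _BOMS:
        if text.startswith(bom):
            return {"encoding": encoding, "confidence": 1.0}
    # No BOM present: this simple approach cannot say anything.
    return {"encoding": None, "confidence": 0.0}
```

The codec names chosen here ("utf-8-sig", "utf-16", "utf-32") consume the BOM when decoding, so the mark does not end up in the decoded text.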
- findid(id)¶
Find unit with matching id by checking id_index.
- getheaderplural()¶
Returns the nplural and plural values from the header.
- getids()¶
Return a list of unit ids.
- getprojectstyle() str | None¶
Return the project based on information in the header.
- The project is determined in the following sequence:
Use the ‘X-Project-Style’ entry in the header.
Use the ‘Report-Msgid-Bugs-To’ entry
Use the ‘X-Accelerator’ entry
Use the Project ID
Analyse the file itself (not yet implemented)
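The sequence above is a first-match-wins fallback chain. The sketch below works on a plain dictionary of header fields; the lookup tables, their contents, and the function name `guess_project_style` are hypothetical and exist only to show the ordering, not the values the toolkit actually recognises.

```python
# Hypothetical lookup tables, for illustration only.
BUG_ADDRESS_STYLES = {"bugs.kde.org": "kde", "gnome.org": "gnome"}
ACCELERATOR_STYLES = {"&": "kde", "_": "gnome"}
PROJECT_ID_STYLES = {"kde": "kde", "gnome": "gnome"}


def guess_project_style(header):
    """Walk the documented sequence and return the first match."""
    # 1. An explicit style always wins.
    style = header.get("X-Project-Style")
    if style:
        return style
    # 2. Infer from the bug-report address.
    bug_address = header.get("Report-Msgid-Bugs-To", "")
    for fragment, style in BUG_ADDRESS_STYLES.items():
        if fragment in bug_address:
            return style
    # 3. Infer from the accelerator entry.
    accelerator = header.get("X-Accelerator")
    if accelerator in ACCELERATOR_STYLES:
        return ACCELERATOR_STYLES[accelerator]
    # 4. Infer from the project id.
    project_id = header.get("Project-Id-Version", "").lower()
    for fragment, style in PROJECT_ID_STYLES.items():
        if fragment in project_id:
            return style
    # 5. Analysing the file itself is not implemented.
    return None
```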
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Return the target language based on information in the header.
- The target language is determined in the following sequence:
Use the ‘Language’ entry in the header.
Poedit’s custom headers.
Analysing the ‘Language-Team’ entry.
- header()¶
Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
- init_headers(charset='UTF-8', encoding='8bit', **kwargs)¶
Sets default values for po headers.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- makeheader(**kwargs)¶
Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
- makeheaderdict(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs) dict[str, str]¶
Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
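The date rules can be sketched as a small resolver. This is an illustration under stated assumptions: “form” is taken to mean gettext's template placeholder string, the date format is the usual PO header layout, and the names `resolve_header_dates`, `PLACEHOLDER` and the `now` parameter are invented for the example.

```python
from datetime import datetime

# Assumed meaning of "form": the unfilled template value.
PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"


def _fmt(value):
    """Format datetimes in PO header style; pass strings through."""
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d %H:%M%z")
    return str(value)


def resolve_header_dates(pot_creation_date=None, po_revision_date=None, now=None):
    """Return (creation, revision) strings following the documented rules."""
    now = now or datetime.now().astimezone()
    creation = _fmt(now if pot_creation_date is None else pot_creation_date)
    if po_revision_date is None:
        revision = PLACEHOLDER       # leave the form unfilled
    elif po_revision_date is False:
        revision = creation          # same as the creation date
    elif po_revision_date is True:
        revision = _fmt(now)         # stamp with the current time
    else:
        revision = _fmt(po_revision_date)
    return creation, revision
```

Passing `now` explicitly keeps the function deterministic, which is convenient in tests.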
- Returns:
Dictionary with the header items
- property merge_on¶
The matching criterion to use when merging on.
- mergeheaders(otherstore) None¶
Merges another header with this header.
This header is assumed to be the template.
- Parameters:
otherstore – The other store to merge headers from.
- parse(data) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
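The two input kinds described here can be handled with a simple type check. The sketch below only reads raw bytes and does no parsing; the helper name `read_store_bytes` is hypothetical and the code is not taken from the toolkit.

```python
import io
import os


def read_store_bytes(storefile):
    """Read all bytes from a path or a readable binary file object."""
    if isinstance(storefile, (str, os.PathLike)):
        # A filename: open and close the file internally.
        with open(storefile, "rb") as handle:
            return handle.read()
    # An existing file object: consume it, then close the caller's handle.
    try:
        return storefile.read()
    finally:
        storefile.close()


buffer = io.BytesIO(b'msgid "Hello"\n')
data = read_store_bytes(buffer)
```

After the call, `buffer.closed` is true, so callers that need the handle afterwards should pass a filename or a copy instead.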
- parseheader()¶
Parses the PO header and returns the interpreted values as a dictionary.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out: IO[bytes]) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- setprojectstyle(project_style: str) None¶
Set the project in the header.
- Parameters:
project_style – the new project
- settargetlanguage(lang: str) None¶
Set the target language in the header.
This removes any custom Poedit headers if they exist.
- Parameters:
lang – the new target language code
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- updatecontributor(name: str, email: str | None = None) None¶
Add contribution comments if necessary.
- updateheader(add=False, **kwargs)¶
Updates the fields in the PO style header.
This will create a header if add == True.
- class translate.storage.pocommon.pounit(source=None)¶
-
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- geterrors()¶
Get all error messages.
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units, so we might want to move in that direction instead and get rid of this.
- istranslatable() bool¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review. Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.pocommon.quote_plus(text)¶
Quote the query fragment of a URL; replacing ‘ ‘ with ‘+’.
- translate.storage.pocommon.unquote_plus(text)¶
Reverse of quote_plus(), for example unquote_plus(‘%7e/abc+def’) -> ‘~/abc def’.
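The standard library's `urllib.parse` functions of the same names show the kind of round trip described here. Whether the pocommon versions treat exactly the same characters as safe is an assumption; the example path is made up.

```python
from urllib.parse import quote_plus, unquote_plus

# A made-up location containing a space.
location = "src/my file.c"

# Spaces become "+"; with the stdlib defaults "/" is percent-encoded too.
quoted = quote_plus(location)

# unquote_plus reverses both the "+" and the percent escapes.
restored = unquote_plus(quoted)
docstring_example = unquote_plus("%7e/abc+def")
```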
poheader¶
Class that handles all header functions for a header in a PO file.
- translate.storage.poheader.parseheaderstring(input)¶
Parses an input string with the definition of a PO header and returns the interpreted values as a dictionary.
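A minimal sketch of this kind of parsing, assuming each header line has the form `Key: Value`. The name `parse_header_sketch` is hypothetical and details such as the handling of malformed lines may differ from the toolkit.

```python
def parse_header_sketch(text):
    """Turn "Key: Value" lines into a dictionary, keeping their order."""
    values = {}
    for line in text.split("\n"):
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        values[key.strip()] = value.strip()
    return values


header = parse_header_sketch(
    "Project-Id-Version: demo 1.0\n"
    "POT-Creation-Date: 2024-01-02 03:04+0000\n"
    "Content-Type: text/plain; charset=UTF-8\n"
)
```

Splitting on the first colon matters for the date field, whose value itself contains a colon.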
- class translate.storage.poheader.poheader¶
This class implements functionality for manipulation of po file headers. This class is a mix-in class and useless on its own. It must be used from all classes which represent a po file.
- getheaderplural()¶
Returns the nplural and plural values from the header.
- getprojectstyle() str | None¶
Return the project based on information in the header.
- The project is determined in the following sequence:
Use the ‘X-Project-Style’ entry in the header.
Use the ‘Report-Msgid-Bugs-To’ entry
Use the ‘X-Accelerator’ entry
Use the Project ID
Analyse the file itself (not yet implemented)
- gettargetlanguage()¶
Return the target language based on information in the header.
- The target language is determined in the following sequence:
Use the ‘Language’ entry in the header.
Poedit’s custom headers.
Analysing the ‘Language-Team’ entry.
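The lookup order for the target language can be sketched over a plain header dictionary. The Poedit header name `X-Poedit-Language`, the name-to-code table, the Language-Team pattern, and the function name `guess_target_language` are assumptions for illustration only.

```python
import re

# Hypothetical table mapping Poedit language names to codes.
POEDIT_LANGUAGE_CODES = {"French": "fr", "German": "de"}
# Assumed Language-Team shape: "French <fr@li.org>".
TEAM_ADDRESS = re.compile(r"<([a-z]{2,3}(?:_[A-Z]{2})?)@li\.org>")


def guess_target_language(header):
    """Return the first language found, following the documented order."""
    # 1. The explicit Language entry.
    language = header.get("Language")
    if language:
        return language
    # 2. Poedit's custom header.
    poedit_name = header.get("X-Poedit-Language")
    if poedit_name in POEDIT_LANGUAGE_CODES:
        return POEDIT_LANGUAGE_CODES[poedit_name]
    # 3. A language code embedded in the team address.
    match = TEAM_ADDRESS.search(header.get("Language-Team", ""))
    return match.group(1) if match else None
```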
- header()¶
Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
- init_headers(charset='UTF-8', encoding='8bit', **kwargs)¶
Sets default values for po headers.
- makeheader(**kwargs)¶
Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
- makeheaderdict(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs) dict[str, str]¶
Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
- Returns:
Dictionary with the header items
- mergeheaders(otherstore) None¶
Merges another header with this header.
This header is assumed to be the template.
- Parameters:
otherstore – The other store to merge headers from.
- parseheader()¶
Parses the PO header and returns the interpreted values as a dictionary.
- setprojectstyle(project_style: str) None¶
Set the project in the header.
- Parameters:
project_style – the new project
- settargetlanguage(lang: str) None¶
Set the target language in the header.
This removes any custom Poedit headers if they exist.
- Parameters:
lang – the new target language code
- updatecontributor(name: str, email: str | None = None) None¶
Add contribution comments if necessary.
- updateheader(add=False, **kwargs)¶
Updates the fields in the PO style header.
This will create a header if add == True.
poparser¶
- From the GNU gettext manual:
WHITE-SPACE
# TRANSLATOR-COMMENTS
#. AUTOMATIC-COMMENTS
#| PREVIOUS MSGID (Gettext 0.16 - check if this is the correct position - not yet implemented)
#: REFERENCE…
#, FLAG…
#= FLAG…
msgctxt CONTEXT (Gettext 0.15)
msgid UNTRANSLATED-STRING
msgstr TRANSLATED-STRING.
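To make the entry layout concrete, the sketch below labels individual PO lines by their leading marker. It is a line classifier only, not the poparser state machine; the function name `classify_po_line` and the label strings are invented for this example.

```python
# Two-character markers must come before the bare "#".
PREFIXES = [
    ("#.", "automatic-comment"),
    ("#|", "previous-msgid"),
    ("#:", "reference"),
    ("#,", "flag"),
    ("#=", "flag"),
    ("#", "translator-comment"),
    ("msgctxt", "context"),
    ("msgid", "untranslated-string"),
    ("msgstr", "translated-string"),
]


def classify_po_line(line):
    """Return a label for one line of a PO entry."""
    stripped = line.strip()
    if not stripped:
        return "white-space"
    for prefix, kind in PREFIXES:
        if stripped.startswith(prefix):
            return kind
    # Anything else is treated as a quoted continuation line.
    return "continuation"


entry = ['# checked', '#: src/main.c:12', '#, fuzzy', 'msgid "Open"', 'msgstr ""']
labels = [classify_po_line(line) for line in entry]
```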
- exception translate.storage.poparser.PoParseError(parse_state: PoParseState, message: str | None = None, lineno: int | None = None, error_line: str | None = None)¶
- add_note(object, /)¶
Exception.add_note(note) – add a note to the exception
- with_traceback(object, /)¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
po¶
poxliff¶
XLIFF classes specifically suited for handling the PO representation in XLIFF.
This way the API supports plurals as if it was a PO file, for example.
- class translate.storage.poxliff.PoXliffFile(*args, **kwargs)¶
A file for the PO variant of XLIFF files.
- Extensions: ClassVar[list[str]] = ['xlf', 'xliff', 'sdlxliff']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-xliff', 'application/x-xliff+xml']¶
A list of MIME types associated with this store type
- Name = 'XLIFF Translation File'¶
The human usable name of this store type
- UnitClass¶
alias of
PoXliffUnit
- addsourceunit(source, filename='NoName', createifmissing=False)¶
Adds the given trans-unit to the last used body node. If the filename has changed, it uses the slow method instead (which will create the required nodes if asked). Returns success.
- addunit(unit, new=True) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- createfilenode(filename, sourcelanguage='en-US', datatype='po') lxml.etree.Element¶
Creates a filenode with the given filename. All parameters are needed for XLIFF compliance.
- creategroup(filename='NoName', createifmissing=False, restype=None)¶
Adds a group tag into the specified file.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getbodynode(filenode, createifmissing=False)¶
Finds the body node for the given filenode.
- getdatatype(filename=None)¶
Returns the datatype of the stored file. If no filename is given, the datatype of the first file is given.
- getdate(filename=None) str | None¶
Returns the date attribute for the file.
If no filename is given, the date of the first file is given. If the date attribute is not specified, None is returned.
- Returns:
Date attribute of file
- getfilenames()¶
Returns all file identifiers in this XLIFF file.
- getfilenode(filename, createifmissing=False)¶
Finds the file node with the given identifier.
- getheadernode(filenode, createifmissing=False)¶
Finds the header node for the given filenode.
- getheaderplural()¶
Returns the nplural and plural values from the header.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- header()¶
Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
- init_headers(charset='UTF-8', encoding='8bit', **kwargs)¶
Sets default values for po headers.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- makeheader(**kwargs)¶
Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
- makeheaderdict(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs) dict[str, str]¶
Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
- Returns:
Dictionary with the header items
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- mergeheaders(otherstore) None¶
Merges another header with this header.
This header is assumed to be the template.
- Parameters:
otherstore – The other store to merge headers from.
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
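Clark notation puts the namespace URI in braces in front of the local name. The sketch below uses the standard library's ElementTree, which uses the same notation as lxml; the standalone `namespaced` function and the tiny sample document are written for this example and are not the class method itself.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def namespaced(name, namespace=XLIFF_NS):
    """Return name in Clark notation: {namespace}name."""
    return f"{{{namespace}}}{name}" if namespace else name


# A tiny made-up XLIFF fragment with a default namespace.
root = ET.fromstring(
    f'<xliff xmlns="{XLIFF_NS}"><file><body>'
    "<trans-unit><source>Hello</source></trans-unit>"
    "</body></file></xliff>"
)

# Element lookups need the fully qualified tag name.
source = root.find(f".//{namespaced('source')}")
```

A lookup with the bare tag name `source` would find nothing in this document, which is why the qualified form is needed.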
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- parseheader()¶
Parses the PO header and returns the interpreted values as a dictionary.
- classmethod parsestring(storestring)¶
Parses the string to return the correct file object.
- removedefaultfile() None¶
Remove the default file tag as soon as possible if it is still present and empty.
- removeunit(unit) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- static setfilename(filenode, filename)¶
Set the name of the given file.
- property sourcelanguage¶
- suggestions_in_format = True¶
xliff units have alttrans tags which can be used to store suggestions
- switchfile(filename: str, createifmissing: bool = False) bool¶
Adds the given trans-unit (will create the nodes required if asked).
- Returns:
Success
- property targetlanguage¶
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- updatecontributor(name: str, email: str | None = None) None¶
Add contribution comments if necessary.
- updateheader(add=False, **kwargs)¶
Updates the fields in the PO style header.
This will create a header if add == True.
- class translate.storage.poxliff.PoXliffUnit(source=None, empty=False, **kwargs)¶
A class to specifically handle the plural units created from a po file.
- addalttrans(txt: str, origin: str | None = None, lang: str | None = None, sourcetxt: str | None = None, matchquality: str | None = None, context: str | None = None) None¶
Adds an alt-trans tag and alt-trans components to the unit.
- Parameters:
txt – Alternative translation of the source text.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- copy() LISAunit¶
Make a copy of the translation unit.
Copy the XML subtree directly instead of serializing and reparsing it.
- static correctorigin(node, origin)¶
Check against the node tag’s origin (e.g. note or alt-trans).
- createcontextgroup(name, contexts=None, purpose=None) None¶
Add the context group to the trans-unit. contexts is a list of (type, text) tuples describing each context.
- createlanguageNode(lang, text, purpose)¶
Returns an XML Element set up with the given parameters.
- delalttrans(alternative) None¶
Remove an alternate translation, including aggregated plural handles.
- getNodeText(languageNode, xml_space='preserve')¶
Retrieves the term from the given languageNode.
- get_rich_target(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
- getalttrans(origin=None)¶
Returns <alt-trans> for the given origin as a list of units. No origin means all alternatives.
- getautomaticcomments()¶
Returns the automatic comments (x-po-autocomment), which corresponds to the #. style po comments.
- getcontextgroups(name)¶
Returns the contexts in the context groups with the specified name.
- getcontextgroupsbyattribute(attributeName, attributeValue)¶
Returns the contexts in the context groups with the specified attributeName and attributeValue.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- geterrors()¶
Get all error messages.
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlanguageNode(lang=None, index=None)¶
Retrieves a languageNode either by language or by index.
- getlanguageNodes()¶
We override this to get source and target nodes.
- getlocations()¶
Returns all the references (source locations).
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getrestype()¶
Returns the restype attribute in the trans-unit tag.
- gettarget(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
- gettranslatorcomments()¶
Returns the translator comments (x-po-trancomment), which corresponds to the # style po comments.
- getunits()¶
This unit in a list.
- hasplural()¶
Tells whether or not this specific unit has plural strings.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isapproved()¶
States whether this unit is approved.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units, so we might want to move in that direction instead and get rid of this.
- isfuzzy()¶
Indicates whether this unit is fuzzy.
- isheader()¶
Indicates whether this unit is a header.
- isreview()¶
States whether this unit needs to be reviewed.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = [<bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.NewlinePlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.XMLTagPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.AltAttrPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.XMLEntityPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.PythonFormattingPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.JavaMessageFormatPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.FormattingPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.QtFormattingPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.UrlPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.FilePlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.EmailPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.CapsPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.CamelCasePlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.OptionPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.PunctuationPlaceable'>>, <bound method RegexParseMixin.parse of <class 'translate.storage.placeables.general.NumberPlaceable'>>]¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(id) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids that are independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
properties¶
Classes that hold units of .properties, and similar, files that are used in translating Java, Mozilla, MacOS and other software.
The propfile class is a monolingual class with propunit providing unit level access.
The .properties store has become a general key value pair class with Dialect providing the ability to change the behaviour of the parsing and handling of the various dialects.
Currently we support:
Java .properties
Mozilla .properties
Adobe Flex files
MacOS X .strings files
Skype .lang files
XWiki .properties
The following provides references and descriptions of the various dialects supported:
- Java
Java .properties are supported completely except for the ability to drop pairs that are not translated.
The following .properties file description gives a good reference to the .properties specification.
Properties file may also hold Java MessageFormat messages. No special handling is provided in this storage class for MessageFormat, but this may be implemented in future.
All delimiter types, comments, line continuations and spaces handling in delimiters are supported.
- Mozilla
Mozilla files use ‘=’ as a delimiter, are UTF-8 encoded and thus don’t need \u escaping. Any \U values will be converted to correct Unicode characters.
- Strings
Mac OS X strings files are implemented using these two articles as references.
- Flex
Adobe Flex files seem to be normal .properties files but in UTF-8 just like Mozilla files. This page provides the information used to implement the dialect.
- Skype
Skype .lang files seem to be UTF-16 encoded .properties files.
- XWiki
XWiki translation files are standard Java .properties files, but with specific escaping support for single quotes and support for missing translations. This XWiki document provides the information used to implement the dialect.
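The Mozilla entry above mentions converting escaped values into real Unicode characters. A minimal sketch of that conversion for four-digit escapes follows; the function name `decode_unicode_escapes` is hypothetical, and the sketch deliberately ignores surrogate pairs and escaped backslashes, which a real parser has to handle.

```python
import re

# Matches \uXXXX or \UXXXX with exactly four hexadecimal digits.
_UNICODE_ESCAPE = re.compile(r"\\[uU]([0-9A-Fa-f]{4})")


def decode_unicode_escapes(value):
    """Replace four-digit unicode escapes with the characters they name."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)


decoded = decode_unicode_escapes(r"caf\u00e9")
```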
A simple summary of what is permissible follows.
Comments supported:
# a comment
// a comment (only at the beginning of a line)
# The following are # escaped to render in docs
# ! is standard but not widely supported
#! a comment
# /* is non-standard but used on some implementations
#/* a comment (not across multiple lines) */
Name and Value pairs:
# Delimiters
key = value
key : value
# Whitespace delimiter
# key[sp]value
# Space in key and around value
\ key\ = \ value
# Note that the b and c are escaped for reST rendering
b = a string with escape sequences \\t \\n \\r \\\\ \\" \\' \\ (space) \u0123
c = a string with a continuation line \\
continuation line
# Special cases
# key with no value
//key (escaped; doesn't render in docs)
# value no key (extractable in prop2po but not mergeable in po2prop)
=value
# .strings specific
"key" = "value";
- class translate.storage.properties.Dialect¶
Settings for the various behaviours in key=value files.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
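As a rough illustration of "find the delimiter that appears first", the following standalone sketch scans a line for “=”, “:” or whitespace. The function name `find_first_delimiter` is hypothetical, the `-1` offset for "not found" is an assumption, and escaped delimiters are deliberately ignored, so this is not the toolkit's implementation.

```python
from __future__ import annotations


def find_first_delimiter(line: str) -> tuple[str | None, int]:
    """Return (delimiter, offset) for the earliest '=', ':' or whitespace."""
    for offset, char in enumerate(line):
        if char in "=:":
            return char, offset
        if char.isspace():
            # Whitespace directly followed by '=' or ':' is treated as
            # padding around that delimiter (a simplifying assumption).
            rest = line[offset:].lstrip()
            if rest and rest[0] in "=:":
                return rest[0], len(line) - len(rest)
            return " ", offset
    # No delimiter at all: the -1 offset is an assumption of this sketch.
    return None, -1


print(find_first_delimiter("key = value"))
print(find_first_delimiter("key value"))
```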
- class translate.storage.properties.DialectFlex¶
-
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectGaia¶
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectGwt¶
- classmethod encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectJava¶
-
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectJavaUtf16¶
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectJavaUtf8¶
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectJoomla¶
- classmethod encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- static value_strip(value)¶
Strip unneeded characters from the value.
- class translate.storage.properties.DialectMozilla¶
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectSkype¶
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.DialectStrings¶
- classmethod encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- static key_strip(key)¶
Strip unneeded characters from the key.
- static strip_inline_comments_from_line(line: str) tuple[str, list[str]]¶
Strip all C-style /* */ comments from a line, respecting quoted strings.
Returns a tuple of (line_without_comments, list_of_comments_found).
This handles comments that can appear anywhere in .strings files:
Between key and equals:
"key" /* comment */ = "value";
Between equals and value:
"key" = /* comment */ "value";
After value:
"key" = "value" /* comment */;
- class translate.storage.properties.DialectStringsUtf8¶
- classmethod encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- static key_strip(key)¶
Strip unneeded characters from the key.
- static strip_inline_comments_from_line(line: str) tuple[str, list[str]]¶
Strip all C-style /* */ comments from a line, respecting quoted strings.
Returns a tuple of (line_without_comments, list_of_comments_found).
This handles comments that can appear anywhere in .strings files:
Between key and equals:
"key" /* comment */ = "value";
Between equals and value:
"key" = /* comment */ "value";
After value:
"key" = "value" /* comment */;
- class translate.storage.properties.DialectXWiki¶
The XWiki dialect mainly follows Java properties behaviour, but with special handling of single quotes: they are escaped by doubling them when an argument of the form “{X}” is provided, X being a number.
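One reading of this quoting rule can be sketched as follows. The helper `escape_single_quotes` is hypothetical and only mirrors the description above; the real dialect may treat edge cases (already doubled quotes, nested braces) differently.

```python
import re

# Matches a numbered argument such as {0} or {12}.
_ARGUMENT = re.compile(r"\{\d+\}")


def escape_single_quotes(value: str) -> str:
    """Double single quotes only when the value contains a {N} argument."""
    if _ARGUMENT.search(value):
        return value.replace("'", "''")
    return value


print(escape_single_quotes("It's {0}"))
print(escape_single_quotes("It's fine"))
```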
- static encode(string, encoding=None)¶
Encode the string.
- classmethod find_delimiter(line: str) tuple[str | None, int]¶
Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (any whitespace character). We find the position of each delimiter, then find the one that appears first.
- Parameters:
line – A properties line
delimiters – valid delimiters
- Returns:
delimiter character and offset within line
- class translate.storage.properties.XWikiFullPage(*args, **kwargs)¶
Represents a full XWiki Page translation: this file does not contain properties but its whole content needs to be translated. More information at https://dev.xwiki.org/xwiki/bin/view/Community/XWiki%20Translations%20Formats/#HXWikiFullContentTranslation.
- Extensions: ClassVar[list[str]] = ['xml']¶
A list of file extensions associated with this store type
- Name = 'XWiki Full Page'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
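BOM-based detection of this kind can be sketched with the standard library alone. The function `detect_bom_encoding` below is hypothetical and returns a plain codec name or None; the toolkit's method returns an EncodingDict whose layout is not shown here, so this is only an approximation of the idea.

```python
from __future__ import annotations

import codecs

# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(text: bytes) -> str | None:
    """Return a codec name if the bytes start with a known BOM, else None."""
    for bom, encoding in _BOMS:
        if text.startswith(bom):
            return encoding
    return None


data = "key=value".encode("utf-16")  # Python prepends a BOM for this codec
print(detect_bom_encoding(data))
```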
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.XWikiPageProperties(*args, **kwargs)¶
Represents an XWiki Page containing translation properties as described in https://dev.xwiki.org/xwiki/bin/view/Community/XWiki%20Translations%20Formats/#HXWikiPageProperties.
- Extensions: ClassVar[list[str]] = ['xml']¶
A list of file extensions associated with this store type
- Name = 'XWiki Page Properties'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- translate.storage.properties.accesskeysuffixes = ('.accesskey', '.accessKey', '.akey')¶
Accesskey suffixes: entries with this suffix may be combined with labels ending in labelsuffixes into accelerator notation.
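The pairing of label and accesskey entries can be sketched as below. The function `combine_accelerators` is hypothetical, and the use of “&” as the accelerator marker is an assumption of this sketch; only the two suffix tuples are taken from this module's documentation.

```python
from __future__ import annotations

LABEL_SUFFIXES = (".label", ".title")
ACCESSKEY_SUFFIXES = (".accesskey", ".accessKey", ".akey")


def combine_accelerators(entries: dict[str, str]) -> dict[str, str]:
    """Merge each label with its matching accesskey into one marked string."""
    combined: dict[str, str] = {}
    for key, label in entries.items():
        for label_suffix in LABEL_SUFFIXES:
            if not key.endswith(label_suffix):
                continue
            base = key[: -len(label_suffix)]
            for akey_suffix in ACCESSKEY_SUFFIXES:
                accesskey = entries.get(base + akey_suffix)
                if accesskey:
                    position = label.find(accesskey)
                    if position != -1:
                        # Mark the accelerator character inside the label.
                        combined[base] = label[:position] + "&" + label[position:]
                    break
    return combined


print(combine_accelerators({"file.label": "File", "file.accesskey": "F"}))
```

Labels without a matching accesskey entry are left out of the result in this sketch.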
- translate.storage.properties.get_comment_end(line: str) str | None¶
Determine whether a line ends a new multi-line comment.
- Parameters:
line – A properties line
- Returns:
True if line ends a new multi-line comment
- translate.storage.properties.get_comment_one_line(line: str) str | None¶
Determine whether a line is a one-line comment.
- Parameters:
line – A properties line
- Returns:
True if line is a one-line comment
- translate.storage.properties.get_comment_start(line: str) str | None¶
Determine whether a line starts a new multi-line comment.
- Parameters:
line – A properties line
- Returns:
True if line starts a new multi-line comment
- class translate.storage.properties.gwtfile(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['properties']¶
A list of file extensions associated with this store type
- Name = 'Gwt Properties'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- translate.storage.properties.is_line_continuation(line: str) bool¶
Determine whether line has a line continuation marker.
A line in a .properties file can end with a backslash (\), indicating that the ‘value’ continues on the next line. The continuation is only valid if the line ends with an odd number of backslashes; an even number N represents N/2 literal backslashes rather than a continuation marker.
- Parameters:
line – A properties line
- Returns:
Does line end with a line continuation
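The odd/even rule can be checked by counting trailing backslashes. The sketch below uses a hypothetical name, `ends_with_continuation`, and strips line endings first as a simplifying assumption; it illustrates the rule rather than reproducing the toolkit's code.

```python
def ends_with_continuation(line: str) -> bool:
    """Return True if the line ends with an odd number of backslashes."""
    stripped = line.rstrip("\r\n")
    trailing = len(stripped) - len(stripped.rstrip("\\"))
    return trailing % 2 == 1


print(ends_with_continuation("value \\"))    # one trailing backslash
print(ends_with_continuation("value \\\\"))  # two trailing backslashes
```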
- class translate.storage.properties.javafile(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['properties']¶
A list of file extensions associated with this store type
- Name = 'Java Properties'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.javautf16file(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['properties']¶
A list of file extensions associated with this store type
- Name = 'Java Properties (UTF-16)'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.javautf8file(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['properties']¶
A list of file extensions associated with this store type
- Name = 'Java Properties (UTF-8)'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.joomlafile(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['ini']¶
A list of file extensions associated with this store type
- Name = 'Joomla Translations'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- translate.storage.properties.labelsuffixes = ('.label', '.title')¶
Label suffixes: entries with this suffix can be combined with accesskeys found in entries ending with accesskeysuffixes.
- class translate.storage.properties.propfile(inputfile=None, personality='java', encoding=None)¶
This class represents a .properties file, made up of propunits.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.proppluralunit(source='', personality='java')¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- hasplural(key=None)¶
Tells whether or not this specific unit has plural strings.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
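One plausible way to make the number of plural strings match the number of plural tags is to pad with empty strings or trim the excess. The helper `pad_or_trim_plurals` below is hypothetical and handles plain lists only; whether the toolkit's method pads and trims in exactly this way is an assumption of this sketch.

```python
from __future__ import annotations


def pad_or_trim_plurals(strings: list[str], plural_tags: list[str]) -> list[str]:
    """Return a copy of strings with exactly len(plural_tags) entries."""
    wanted = len(plural_tags)
    result = list(strings[:wanted])          # trim any extra forms
    result += [""] * (wanted - len(result))  # pad missing forms with ""
    return result


print(pad_or_trim_plurals(["one file"], ["one", "few", "other"]))
```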
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.properties.propunit(source='', personality='java')¶
An element of a properties file i.e. a name and value, and any comments associated.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getoutput()¶
Convert the element back into formatted lines for a .properties file.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static strip_missing_part(line)¶
Remove the missing prefix from the line.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
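One way to read “plural count matches tags definition” is sketched below. The function name fit_plurals, and the policy of truncating surplus forms and padding missing ones with empty strings, are assumptions; the description above does not say how the real method resolves a mismatch. The sketch also handles only plain str and list inputs.

```python
def fit_plurals(target, plural_tags):
    """Return exactly one string per plural tag (illustrative policy only)."""
    # A single string is treated as a one-element list of plural forms.
    strings = [target] if isinstance(target, str) else list(target)
    wanted = len(plural_tags)
    # Assumed policy: drop surplus forms, then pad with empty strings.
    strings = strings[:wanted]
    strings += [""] * (wanted - len(strings))
    return strings
```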
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.properties.register_dialect(dialect: type[Dialect]) type[Dialect]¶
Decorator that registers the dialect.
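The signature shows that the decorator takes a dialect class and returns a class, which is the usual shape of a registering class decorator. The following is a generic sketch of that pattern; the registry DIALECTS, the decorator register, the name attribute and DemoDialect are all hypothetical stand-ins and do not reproduce the module's internals.

```python
from __future__ import annotations

# Hypothetical registry; the real module keeps its own internal mapping.
DIALECTS: dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator: record the class under its ``name`` and return it unchanged."""
    DIALECTS[cls.name] = cls
    return cls


@register
class DemoDialect:
    # Assumed attribute used as the registry key in this sketch.
    name = "demo"


def lookup(name: str) -> type:
    """Fetch a previously registered class by name."""
    return DIALECTS[name]
```

Because the decorator returns the class itself, the decorated class stays usable under its own name while also being discoverable through the registry.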
- class translate.storage.properties.stringsfile(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['strings']¶
A list of file extensions associated with this store type
- Name = 'OS X Strings'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
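BOM-based detection of this kind can be sketched with the standard library alone. The function name sniff_bom, the returned codec names and the plain-string return type are assumptions for illustration; the documented method returns an EncodingDict whose layout is not shown here.

```python
import codecs

# Longer marks are tested first: the UTF-32 LE mark begins with the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes, default: str = "utf-8") -> str:
    """Guess a codec from a leading byte order mark, else return ``default``."""
    for bom, codec_name in BOMS:
        if data.startswith(bom):
            # These codec names consume the BOM when decoding.
            return codec_name
    return default
```

A caller could then decode with data.decode(sniff_bom(data)). A BOM check cannot tell apart encodings that carry no mark, which is why it only serves as a fallback.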
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.stringsutf8file(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['strings']¶
A list of file extensions associated with this store type
- Name = 'OS X Strings (UTF-8)'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.xwikifile(*args, **kwargs)¶
- Extensions: ClassVar[list[str]] = ['properties']¶
A list of file extensions associated with this store type
- Name = 'XWiki Properties'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
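The advice to go through addunit and removeunit rather than editing the list directly makes sense once an id index sits next to the list. The class below is a toy model written for illustration: TinyStore and its dict-based units are hypothetical, and the real store may build or refresh its id_index differently (for example lazily).

```python
class TinyStore:
    """Toy store that keeps a unit list and an id index in step."""

    def __init__(self):
        self.units = []
        self.id_index = {}

    def addunit(self, unit):
        # Updating both structures here is what keeps findid() reliable.
        self.units.append(unit)
        self.id_index[unit["id"]] = unit

    def removeunit(self, unit):
        self.units.remove(unit)
        self.id_index.pop(unit["id"], None)

    def findid(self, id):
        # Dictionary lookup instead of scanning the whole list.
        return self.id_index.get(id)

    def getids(self):
        return list(self.id_index)
```

Appending to units by hand would leave id_index stale, so findid would miss the new unit.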
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.properties.xwikiunit(source='', personality='xwiki')¶
Represents an XWiki translation unit. It differs from a propunit in two ways:
the dialect used is xwiki, for simple quote escape handling
missing translations are output with a dedicated “### Missing: ” prefix.
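The missing-translation marker can be illustrated with two standalone functions. They borrow the names of the class methods documented further down (represents_missing and strip_missing_part), but they are a sketch written from the description only; the real methods may treat whitespace or other edge cases differently.

```python
MISSING_PREFIX = "### Missing: "  # prefix quoted in the class description


def represents_missing(line: str) -> bool:
    """True when the line carries the missing-translation prefix."""
    return line.startswith(MISSING_PREFIX)


def strip_missing_part(line: str) -> str:
    """Drop the prefix if present, otherwise return the line unchanged."""
    if represents_missing(line):
        return line[len(MISSING_PREFIX):]
    return line
```

For example, strip_missing_part("### Missing: core.title=Title") yields "core.title=Title", while a line without the prefix passes through untouched.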
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getoutput()¶
Convert the element back into formatted lines for a .properties file.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
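The 1-based convention and the None fallback can be shown with a tiny lookup over key=value text. The function find_line_number and the sample layout are hypothetical; this is not how the parser records line numbers, only an illustration of the numbering.

```python
def find_line_number(text, key):
    """Return the 1-based line on which ``key=`` starts, or None if absent."""
    # start=1 makes the first line number 1, matching the documented convention.
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(key + "="):
            return number
    return None  # information not available
```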
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- classmethod represents_missing(line)¶
Return true if the line represents a missing translation.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- classmethod strip_missing_part(line)¶
Remove the missing prefix from the line.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
pypo¶
qm¶
Module for parsing Qt .qm files.
Note
Based on documentation from Gettext’s .qm implementation (see write-qt.c) and on observation of the output of lrelease.
Note
Certain deprecated section tags are not implemented. These will break and print out the missing tag. They are easy to implement and should follow the structure in 03 (Translation). We could find no examples that use these so we’d rather leave it unimplemented until we actually have test data.
Note
Many .qm files are unable to be parsed as they do not have the source text. We assume that since they use a hash table to lookup the data there is actually no need for the source text. It seems however that in Qt4’s lrelease all data is included in the resultant .qm file.
Note
We can only parse, not create, a .qm file. The main issue is that we need to implement the hashing algorithm (which seems to be identical to the Gettext hash algorithm). Unlike Gettext it seems that the hash is required, but that has not been validated.
Note
The code can parse files correctly. But it could be cleaned up to be more readable, especially the part that breaks the file into sections.
http://qt.gitorious.org/+kde-developers/qt/kde-qt/blobs/master/tools/linguist/shared/qm.cpp Plural information QLocale languages
- class translate.storage.qm.qmfile(inputfile=None, **kwargs)¶
A class representing a .qm file.
- Mimetypes: ClassVar[list[str]] = ['application/x-qm']¶
A list of MIME types associated with this store type
- Name = 'Qt .qm file'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
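The two input modes described above (a path that is opened and closed internally, or an existing handle that is consumed and closed) can be sketched as follows. The helper read_all is hypothetical and only models the file handling; it does no parsing and is not the library's implementation.

```python
import io
import os
import tempfile


def read_all(storefile) -> bytes:
    """Return the bytes from a path or from a readable binary file object."""
    if isinstance(storefile, (str, os.PathLike)):
        # Path case: the file is opened and closed inside this function.
        with open(storefile, "rb") as handle:
            return handle.read()
    try:
        # File-object case: consume the caller's handle ...
        return storefile.read()
    finally:
        # ... and close it, mirroring the documented historical behaviour.
        storefile.close()
```

A caller that still needs its handle afterwards would have to pass a path or reopen the file, since the handle is closed on return.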
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.qm.qmunit(source=None)¶
A class representing a .qm translation message.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’; or ‘developer’, ‘programmer’, ‘source code’ (synonyms).
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather move in that direction and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- translate.storage.qm.qmunpack(file_='messages.qm')¶
Helper to unpack Qt .qm files into a Python string.
qph¶
Module for handling Qt Linguist Phrase Book (.qph) files.
Extract from the Qt Linguist Manual: Translators: .qph Qt Phrase Book Files are human-readable XML files containing standard phrases and their translations. These files are created and updated by Qt Linguist and may be used by any number of projects and applications.
A DTD to define the format does not seem to exist, but the following code provides the reference implementation for the Qt Linguist product.
- class translate.storage.qph.QphFile(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶
Class representing a QPH file store.
- Extensions: ClassVar[list[str]] = ['qph']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-qph']¶
A list of MIME types associated with this store type
- Name = 'Qt Phrase Book'¶
The human usable name of this store type
- addsourceunit(source)¶
Adds and returns a new unit with the given string as first entry.
- addunit(unit, new=True) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage() str¶
Get the source language for this .qph file.
We don’t implement setsourcelanguage as users really shouldn’t be altering the source language in .qph files, it should be set correctly by the extraction tools.
- Returns:
ISO code e.g. af, fr, pt_BR
- gettargetlanguage() str¶
Get the target language for this .qph file.
- Returns:
ISO code e.g. af, fr, pt_BR
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
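Clark notation simply wraps the namespace URI in braces in front of the local name. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in for lxml, since both address namespaced elements this way; the helper clark and the sample document are illustrative and not part of the library.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace, name):
    """Return ``{namespace}name``, or the bare name when there is no namespace."""
    return f"{{{namespace}}}{name}" if namespace else name


# A minimal namespaced document, invented for this example.
root = ET.fromstring(f'<xliff xmlns="{XLIFF_NS}"><source>Hello</source></xliff>')

# Element lookups must use the qualified name; a bare "source" would not match.
source = root.find(clark(XLIFF_NS, "source"))
```

Here clark(XLIFF_NS, "source") produces the same string as the example in the method description.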
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Write the XML document to the file out.
We have to override this to mimic the Qt convention: no XML declaration.
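Serializing without an XML declaration can be sketched with the standard library. The element names (QPH, phrase, source, target) and the language attribute follow the usual Qt Linguist phrase book layout but are assumptions here, and to_text is a hypothetical helper, not the overridden serialize method.

```python
import xml.etree.ElementTree as ET

# Build a one-phrase document; names and values are illustrative only.
root = ET.Element("QPH", language="fr")
phrase = ET.SubElement(root, "phrase")
ET.SubElement(phrase, "source").text = "Open"
ET.SubElement(phrase, "target").text = "Ouvrir"


def to_text(element):
    """Serialize to a str; encoding="unicode" emits no <?xml ...?> declaration."""
    return ET.tostring(element, encoding="unicode")
```

The resulting text starts directly with the root element, which is the property the override is meant to guarantee.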
- settargetlanguage(targetlanguage: str) None¶
Set the target language for this .qph file to targetlanguage.
- Parameters:
targetlanguage – ISO code e.g. af, fr, pt_BR
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.qph.QphUnit(source, empty=False, **kwargs)¶
A single term in the qph file.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- copy() LISAunit¶
Make a copy of the translation unit.
Copy the XML subtree directly instead of serializing and reparsing it.
- createlanguageNode(lang, text, purpose)¶
Returns an xml Element setup with given parameters.
- getNodeText(languageNode, xml_space='preserve')¶
Retrieves the term from the given languageNode.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlanguageNode(lang=None, index=None)¶
Retrieves a languageNode either by language or by index.
- getlanguageNodes()¶
We override this to get source and target nodes.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- gettarget(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather move in that direction and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- settarget(target, lang='xx', append=False) None¶
Sets the “target” string (second language), or alternatively appends to the list.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
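One plausible reading of this, sketched with plain strings and lists only: surplus plural forms are dropped and missing ones are padded with empty strings so that the result has one entry per plural tag. The function name pad_or_trim_plurals and the padding rule are assumptions for illustration; the toolkit's own method may differ in detail.

```python
from typing import List, Union


def pad_or_trim_plurals(
    target: Union[str, List[str]], plural_tags: List[str]
) -> List[str]:
    """Return exactly len(plural_tags) strings taken from target."""
    # Normalise a single string into a one-element list.
    strings = [target] if isinstance(target, str) else [str(item) for item in target]
    wanted = len(plural_tags)
    # Drop surplus forms, then pad missing ones with empty strings.
    strings = strings[:wanted]
    strings.extend([""] * (wanted - len(strings)))
    return strings
```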
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
rc¶
Classes that hold units of .rc files (rcunit) or entire files
(rcfile) used in translating Windows Resources.
- translate.storage.rc.escape_to_python(string)¶
Unescape a given .rc string into a valid Python string.
- translate.storage.rc.escape_to_rc(string)¶
Escape a given Python string into a valid .rc string.
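The two functions above form an escape/unescape pair. The sketch below shows the general shape of such a round trip, assuming only backslash, newline and tab need escaping; that set of escapes and the names to_rc and to_python are assumptions for this example and are not taken from the toolkit's implementation.

```python
import re

# Assumed escape table: real character -> escaped form.
_ESCAPES = {"\\": "\\\\", "\n": "\\n", "\t": "\\t"}
_UNESCAPES = {escaped: raw for raw, escaped in _ESCAPES.items()}


def to_rc(text: str) -> str:
    """Escape backslash, newline and tab in a single pass."""
    return re.sub(r"[\\\n\t]", lambda m: _ESCAPES[m.group(0)], text)


def to_python(text: str) -> str:
    """Reverse to_rc(), again in a single pass so that an escaped
    backslash followed by 'n' is not mistaken for a newline."""
    return re.sub(r"\\[\\nt]", lambda m: _UNESCAPES[m.group(0)], text)
```

Doing each direction in one regular-expression pass avoids the ordering bugs that chained replacements can introduce.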
- translate.storage.rc.generate_dialog_caption_name(block_type, identifier) str¶
Return the name generated for a caption of a dialog.
- translate.storage.rc.generate_dialog_control_name(block_type, block_id, control_type, identifier) str¶
Return the name generated for a control of a dialog.
Return the pre-name generated for elements of a menu.
Return the name generated for a menuitem of a popup.
- translate.storage.rc.generate_popup_caption_name(pre_name) str¶
Return the name generated for a caption of a popup.
- translate.storage.rc.generate_popup_pre_name(pre_name, caption) str¶
Return the pre-name generated for subelements of a popup.
- Parameters:
pre_name – The pre-name that the popup already has.
caption – The caption (without quotes) of the popup.
- Returns:
The pre-name for the subelements, based on the pre-name of the popup and its caption.
- translate.storage.rc.generate_stringtable_name(identifier) str¶
Return the name generated for a stringtable element.
- translate.storage.rc.rc_statement() ParserElement¶
Generate an RC statement parser that can be used to parse an RC file.
- class translate.storage.rc.rcfile(inputfile=None, lang=None, sublang=None, encoding=None, **kwargs)¶
This class represents a .rc file, made up of rcunits.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
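BOM-based detection of this kind can be sketched with the constants in the standard codecs module. The function name encoding_from_bom and its return value (a codec name or None) are assumptions for illustration; the documented method returns an EncodingDict instead.

```python
import codecs
from typing import Optional

# Longer BOMs first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def encoding_from_bom(text: bytes) -> Optional[str]:
    """Return a codec name if text starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if text.startswith(bom):
            return name
    return None
```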
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
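The ownership rule described here (open and close paths internally, consume and close handles passed in) can be sketched with a stand-alone helper. read_store_bytes is a hypothetical name used only for this example; it returns raw bytes rather than a parsed store.

```python
import io
import os
from typing import BinaryIO, Union


def read_store_bytes(storefile: Union[str, os.PathLike, BinaryIO]) -> bytes:
    """Read all bytes from a path or from an already open binary handle."""
    if isinstance(storefile, (str, os.PathLike)):
        # A path: open and close the file ourselves.
        with open(storefile, "rb") as handle:
            return handle.read()
    # A file-like object: consume it and close it, as the text describes.
    try:
        return storefile.read()
    finally:
        storefile.close()


buffer = io.BytesIO(b"content")
data = read_store_bytes(buffer)
```

Callers that need the handle afterwards should therefore pass a path, or reopen the file themselves.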
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.rc.rcunit(source='', **kwargs)¶
A unit of an rc file.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
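To make the idea of a structural path concrete, the sketch below derives paths of the form body/h1[1]/p[2] from an XML tree using the standard library. The function doc_paths and the sample markup are assumptions for this example; they do not show how any particular store computes its document paths.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def doc_paths(root: ET.Element) -> List[Tuple[str, str]]:
    """Return (path, text) pairs for every descendant of root."""
    found: List[Tuple[str, str]] = []

    def walk(elem: ET.Element, prefix: str) -> None:
        counts: Dict[str, int] = {}
        for child in elem:
            # Index each child among siblings with the same tag (1-based).
            counts[child.tag] = counts.get(child.tag, 0) + 1
            path = f"{prefix}/{child.tag}[{counts[child.tag]}]"
            found.append((path, (child.text or "").strip()))
            walk(child, path)

    walk(root, root.tag)
    return found


body = ET.fromstring("<body><h1>Title</h1><p>First</p><p>Second</p></body>")
paths = doc_paths(body)
```

Because the path depends only on element order, it stays the same when a translation changes line lengths or line counts.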
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getoutput()¶
Convert the element back into formatted lines for a .rc file.
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
statistics¶
subtitles¶
Class that manages subtitle files for translation.
This class makes use of the subtitle functionality of aeidon.
- class translate.storage.subtitles.AdvSubStationAlphaFile(*args, **kwargs)¶
Specialized class for Advanced Substation Alpha files only.
- Extensions: ClassVar[list[str]] = ['ass']¶
A list of file extensions associated with this store type
- Name = 'Advanced Substation Alpha subtitles file'¶
The human usable name of this store type
- UnitClass¶
alias of
SubtitleUnit
- addsourceunit(source: str) TranslationUnit¶
Add a unit with default SSA metadata.
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
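The "trying to decode" half of this strategy can be sketched as a loop over candidate encodings that keeps the first one that decodes cleanly. The name first_working_encoding and the single-value return are assumptions for illustration; the documented method returns a tuple and may also consult chardet.

```python
from typing import Iterable, Optional


def first_working_encoding(
    text: bytes, candidates: Iterable[str]
) -> Optional[str]:
    """Return the first candidate that decodes text without errors."""
    for encoding in candidates:
        try:
            text.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong encoding, or a codec name Python does not know.
            continue
        return encoding
    return None
```

Candidate order matters: permissive single-byte encodings such as latin-1 accept any input, so they belong at the end of the list.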
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Parse the given file.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.subtitles.MicroDVDFile(*args, **kwargs)¶
Specialized class for MicroDVD files only.
- Extensions: ClassVar[list[str]] = ['sub']¶
A list of file extensions associated with this store type
- Name = 'MicroDVD subtitles file'¶
The human usable name of this store type
- UnitClass¶
alias of
MicroDVDUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Parse the given file.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.subtitles.MicroDVDUnit(source: str | None = None, **kwargs)¶
MicroDVD unit, it uses frames instead of time as start/end.
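For orientation, a MicroDVD line has the shape {start_frame}{end_frame}Text, so timing only becomes wall-clock time once a frame rate is chosen. The sketch below parses one such line and converts frames to seconds; the helper names and the 25 fps figure in the usage note are assumptions for this example, not toolkit behaviour.

```python
import re
from typing import Tuple

# {start}{end}text, with both positions given as frame numbers.
_LINE = re.compile(r"^\{(\d+)\}\{(\d+)\}(.*)$")


def parse_microdvd_line(line: str) -> Tuple[int, int, str]:
    """Split one MicroDVD line into (start_frame, end_frame, text)."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a MicroDVD line: {line!r}")
    start, end, text = match.groups()
    return int(start), int(end), text


def frames_to_seconds(frame: int, fps: float) -> float:
    """Convert a frame number to seconds for a given frame rate."""
    return frame / fps
```

At an assumed 25 frames per second, frame 50 corresponds to the 2-second mark.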
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None) str¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- set_ssa_metadata(style: str | None = None, layer: int | None = None, name: str | None = None, margin_l: int | None = None, margin_r: int | None = None, margin_v: int | None = None, effect: str | None = None) None¶
Store SSA/ASS subtitle metadata (style, layer, margins, etc.).
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
- class translate.storage.subtitles.SubRipFile(*args, **kwargs)¶
Specialized class for SubRip files only.
- Extensions: ClassVar[list[str]] = ['srt']¶
A list of file extensions associated with this store type
- Name = 'SubRip subtitles file'¶
The human usable name of this store type
- UnitClass¶
alias of
SubtitleUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Parse the given file.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.subtitles.SubStationAlphaFile(*args, **kwargs)¶
Specialized class for Substation Alpha files only.
- Extensions: ClassVar[list[str]] = ['ssa']¶
A list of file extensions associated with this store type
- Name = 'Substation Alpha subtitles file'¶
The human usable name of this store type
- UnitClass¶
alias of
SubtitleUnit
- addsourceunit(source: str) TranslationUnit¶
Add a unit with default SSA metadata.
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Parse the given file.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.subtitles.SubtitleFile(inputfile=None, **kwargs)¶
A subtitle file.
- Name = 'Base translation store'¶
The human usable name of this store type
- UnitClass¶
alias of
SubtitleUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Parse the given file.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.subtitles.SubtitleUnit(source: str | None = None, **kwargs)¶
A subtitle entry that is translatable.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings: ‘translator’, or one of the synonyms ‘developer’, ‘programmer’, ‘source code’.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None) str¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units; we might want to move in that direction instead and get rid of this method.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>,
 <StringElem([<StringElem(['bar'])>])>,
 <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- set_ssa_metadata(style: str | None = None, layer: int | None = None, name: str | None = None, margin_l: int | None = None, margin_r: int | None = None, margin_v: int | None = None, effect: str | None = None) None¶
Store SSA/ASS subtitle metadata (style, layer, margins, etc.).
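To show where these fields end up, the sketch below keeps the same seven values in a small dataclass and formats an ASS-style Dialogue line (Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text). The class SsaMetadata, the function dialogue_line and the default values are assumptions for this example; they are not the toolkit's storage or serialization code.

```python
from dataclasses import dataclass


@dataclass
class SsaMetadata:
    """Hypothetical container mirroring the set_ssa_metadata() arguments."""

    style: str = "Default"
    layer: int = 0
    name: str = ""
    margin_l: int = 0
    margin_r: int = 0
    margin_v: int = 0
    effect: str = ""


def dialogue_line(meta: SsaMetadata, start: str, end: str, text: str) -> str:
    """Format one ASS-style Dialogue event from metadata, timing and text."""
    return (
        f"Dialogue: {meta.layer},{start},{end},{meta.style},{meta.name},"
        f"{meta.margin_l},{meta.margin_r},{meta.margin_v},{meta.effect},{text}"
    )
```

Keeping this metadata alongside the unit is what lets a translated file be written back with the original styling intact.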
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
symbian¶
tbx¶
Module for handling TBX glossary files.
- class translate.storage.tbx.tbxfile(inputfile=None, sourcelanguage=None, targetlanguage=None, **kwargs)¶
Class representing a TBX file store.
- Extensions: ClassVar[list[str]] = ['tbx']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-tbx']¶
A list of MIME types associated with this store type
- Name = 'TBX Glossary'¶
The human usable name of this store type
- addsourceunit(source)¶
Adds and returns a new unit with the given string as first entry.
- addunit(unit, new=True) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
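The BOM-based fallback described above can be pictured with a small standard-library sketch. This is not the toolkit's implementation: the helper name `sniff_bom` and its plain string return value are assumptions made for illustration (the documented method returns an `EncodingDict`).

```python
import codecs
from typing import Optional

# Longer BOMs come first so UTF-32 is not mistaken for UTF-16
# (the UTF-32-LE BOM begins with the UTF-16-LE BOM bytes).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Hypothetical helper: return an encoding name if data starts with a known BOM."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding
    return None
```

Data without a BOM yields `None`, which is the point where a caller would fall back to trying candidate encodings one by one.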
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
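As a rough illustration of Clark notation, the sketch below builds a qualified name and uses it for an element lookup. It uses the standard library's `xml.etree.ElementTree`, which follows the same `{namespace}localname` convention as lxml; the helper `clark` and the tiny XLIFF fragment are assumptions for this example, not part of the toolkit.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace: str, name: str) -> str:
    """Hypothetical helper: build a Clark-notation name, {namespace}localname."""
    return f"{{{namespace}}}{name}" if namespace else name


doc = ET.fromstring(
    f'<xliff xmlns="{XLIFF_NS}"><file><body>'
    '<trans-unit id="1"><source>Hello</source></trans-unit>'
    "</body></file></xliff>"
)

# Lookups in a namespaced document must use the fully qualified name.
source = doc.find(f".//{clark(XLIFF_NS, 'source')}")
print(source.tag)   # {urn:oasis:names:tc:xliff:document:1.1}source
print(source.text)  # Hello
```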
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
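The path-or-handle contract described above can be sketched with the standard library alone. The function `read_all` is a hypothetical stand-in used only to show the pattern; it is not the toolkit's code.

```python
import io
import os


def read_all(storefile):
    """Hypothetical helper: return raw bytes from a path or a readable binary file object."""
    if isinstance(storefile, (str, os.PathLike)):
        # A path: open and close the file internally.
        with open(storefile, "rb") as handle:
            return handle.read()
    # An existing handle: consume it, then close it.
    try:
        return storefile.read()
    finally:
        storefile.close()


handle = io.BytesIO(b"msgid")
data = read_all(handle)
print(data, handle.closed)  # b'msgid' True
```

The practical consequence for callers is that a file object passed in this way should not be reused afterwards.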
- classmethod parsestring(storestring, sourcelanguage=None, targetlanguage=None)¶
Convert the string representation back to an object.
- removeunit(unit) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.tbx.tbxunit(source, empty=False, **kwargs)¶
A single term in the TBX file. Provisional work is done to make several languages possible.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- copy() LISAunit¶
Make a copy of the translation unit.
Copy the XML subtree directly instead of serializing and reparsing it.
- createlanguageNode(lang, text, purpose)¶
Returns a langset xml Element setup with given parameters.
- getNodeText(languageNode, xml_space='preserve')¶
Retrieves the term from the given languageNode.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid()¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlanguageNode(lang=None, index=None)¶
Retrieves a languageNode either by language or by index.
- getlanguageNodes()¶
Returns a list of all nodes that contain per language information.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- gettarget(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- isobsolete() bool¶
Indicate whether a unit is obsolete.
The deprecated administrative status in TBX basic maps to translate toolkit’s concept of obsolete units.
- istranslatable() bool¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value)¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- settarget(target, lang=None, append=False) None¶
Sets the “target” string (second language), or alternatively appends to the list.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
tiki¶
Class that manages TikiWiki files for translation.
Tiki files are formatted as a single large PHP array with several special sections identified by comments. Example current as of 2008-12-01:
<?php
// Many comments at the top
$lang=Array(
// ### Start of unused words
"aaa" => "zzz",
// ### end of unused words
// ### start of untranslated words
// "bbb" => "yyy",
// ### end of untranslated words
// ### start of possibly untranslated words
"ccc" => "xxx",
// ### end of possibly untranslated words
"ddd" => "www",
"###end###"=>"###end###");
?>
In addition there are several auto-generated //-style comments scattered through the page and array, some of which matter when being parsed.
This has all been gleaned from the TikiWiki source. As far as I know no detailed documentation exists for the tiki language.php files.
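To make the layout above concrete, here is a minimal parsing sketch over the sample file. It is an illustration under simplifying assumptions, not the TikiStore parser: the function name `parse_tiki`, the label "translated" for entries outside any marked section, and the regular expressions (which ignore escaped quotes) are all invented for this example.

```python
import re

SAMPLE = '''<?php
// Many comments at the top
$lang=Array(
// ### Start of unused words
"aaa" => "zzz",
// ### end of unused words
// ### start of untranslated words
// "bbb" => "yyy",
// ### end of untranslated words
// ### start of possibly untranslated words
"ccc" => "xxx",
// ### end of possibly untranslated words
"ddd" => "www",
"###end###"=>"###end###");
?>'''

SECTION_START = re.compile(r"^//\s*###\s*start of (.+)$", re.IGNORECASE)
SECTION_END = re.compile(r"^//\s*###\s*end of ", re.IGNORECASE)
# Entries in the untranslated section are commented out, so "//" is optional.
ENTRY = re.compile(r'^(//\s*)?"(.*?)"\s*=>\s*"(.*?)"')


def parse_tiki(text):
    """Return (section, source, target) tuples found in a language.php text."""
    entries = []
    section = "translated"  # assumed label for entries outside any marker
    for raw in text.splitlines():
        line = raw.strip()
        start = SECTION_START.match(line)
        if start:
            section = start.group(1).strip()
            continue
        if SECTION_END.match(line):
            section = "translated"
            continue
        entry = ENTRY.match(line)
        # The "###end###" pair is a terminator, not a real string.
        if entry and entry.group(2) != "###end###":
            entries.append((section, entry.group(2), entry.group(3)))
    return entries


entries = parse_tiki(SAMPLE)
```

Each result pairs a string with the section it was found in, which mirrors the idea that a unit's location is defined by the surrounding comments.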
- class translate.storage.tiki.TikiStore(inputfile=None)¶
Represents a tiki language.php file.
- Name = 'Base translation store'¶
The human usable name of this store type
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parse the given input into source units.
- Parameters:
input – the source, either a string or filehandle
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.tiki.TikiUnit(source=None, **kwargs)¶
A tiki unit entry.
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Location is defined by the comments in the file. This function will only set valid locations.
- Parameters:
location – Where the string is located in the file. Must be a valid location.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- getid() str¶
A unique identifier for this unit.
- Returns:
an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
- getlocations()¶
Returns a list of the location(s) of the string.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
tmx¶
module for parsing TMX translation memory files.
- class translate.storage.tmx.tmxfile(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶
Class representing a TMX file store.
- Extensions: ClassVar[list[str]] = ['tmx']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-tmx']¶
A list of MIME types associated with this store type
- Name = 'TMX Translation Memory'¶
The human usable name of this store type
- addsourceunit(source)¶
Adds and returns a new unit with the given string as first entry.
- addtranslation(source, srclang, translation, translang, comment=None, context=None) None¶
Addtranslation method for testing old unit tests.
- addunit(unit, new=True) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- translate(sourcetext, sourcelang=None, targetlang=None)¶
Method to test old unit tests.
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.tmx.tmxunit(source, empty=False, **kwargs)¶
A single unit in the TMX file.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text, origin=None, position='append') None¶
Add a note specifically in a “note” tag.
The origin parameter is ignored
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- copy() LISAunit¶
Make a copy of the translation unit.
Copy the XML subtree directly instead of serializing and reparsing it.
- createlanguageNode(lang, text, purpose)¶
Returns a langset xml Element setup with given parameters.
- getNodeText(languageNode, xml_space='preserve')¶
Retrieves the term from the given languageNode.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getcontext()¶
Get the message context.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).
- geterrors()¶
Get all error messages.
- getid()¶
Returns the identifier for this unit. The optional tuid property is used if available, otherwise we inherit .getid(). Note that the tuid property is only mandated to be unique from TMX 2.0.
- getlanguageNode(lang=None, index=None)¶
Retrieves a languageNode either by language or by index.
- getlanguageNodes()¶
Returns a list of all nodes that contain per language information.
- static getlocations() list[str]¶
A list of source code locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- getnotes(origin=None)¶
Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
- getpreviouscontext()¶
Get the context value suitable for previous-message metadata.
- gettarget(lang=None)¶
Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
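The "second entry, or the entry in the specified language" rule can be pictured with a small standard-library sketch over a TMX-style `<tu>` element, where each language variant is a `<tuv>` carrying an `xml:lang` attribute. The function `get_target` and the sample element are assumptions for illustration; this is not the tmxunit implementation.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TU = ET.fromstring(
    "<tu>"
    '<tuv xml:lang="en"><seg>Hello</seg></tuv>'
    '<tuv xml:lang="de"><seg>Hallo</seg></tuv>'
    '<tuv xml:lang="fr"><seg>Bonjour</seg></tuv>'
    "</tu>"
)


def get_target(tu, lang=None):
    """Hypothetical helper: text for lang, or the second entry when lang is None."""
    variants = tu.findall("tuv")
    if lang is None:
        node = variants[1] if len(variants) > 1 else None
    else:
        node = next((v for v in variants if v.get(XML_LANG) == lang), None)
    return None if node is None else node.findtext("seg")


print(get_target(TU))        # Hallo
print(get_target(TU, "fr"))  # Bonjour
```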
- getunits()¶
This unit in a list.
- infer_state() None¶
Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
- isblank() bool¶
Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
- istranslatable()¶
Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
- istranslated()¶
Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
- property line_number: int | None¶
Line number in the source file where this unit was found.
The line number is 1-based (first line is line 1). Returns None if the format doesn’t support line numbering or if the information is not available.
- markreviewneeded(needsreview=True, explanation=None) None¶
Marks the unit to indicate whether it needs review.
- Parameters:
needsreview – Defaults to True.
explanation – Adds an optional explanation as a note.
- merge(otherunit, overwrite=False, comments=True, authoritative=False) None¶
Do basic format agnostic merging.
- multistring_to_rich(mulstring)¶
Convert a multistring to a list of “rich” string trees.
>>> target = multistring(['foo', 'bar', 'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem(['foo'])>])>, <StringElem([<StringElem(['bar'])>])>, <StringElem([<StringElem(['baz'])>])>]
- namespaced(name)¶
Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
- property prev_context¶
Previous source context for fuzzy/reused units, if available.
- property prev_source¶
Previous source text for fuzzy/reused units, if available.
- property prev_target¶
Previous target text for formats that expose it.
- rich_parsers = []¶
A list of functions to use for parsing a string into a rich string tree.
- property rich_source¶
See also
- property rich_target¶
See also
- classmethod rich_to_multistring(elem_list)¶
Convert a “rich” string tree to a multistring.
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring('foo bar')
- set_as_previous(unit) None¶
Store another unit’s current source/context as this unit’s previous one.
- setid(value) None¶
Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
- settarget(target, lang='xx', append=False) None¶
Sets the “target” string (second language), or alternatively appends to the list.
- static sync_plural_count(target: list[str] | str | multistring, plural_tags: list[str]) list[str]¶
Ensure that plural count in string matches tags definition.
- unit_iter() Generator[Self]¶
Iterator that only returns this unit.
trados¶
Manage the Trados .txt Translation Memory format.
A Trados file looks like this:
<TrU>
<CrD>18012000, 13:18:35
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Association for Road Safety \endash Conference
<Seg L=DE_DE>Tagung der Gesellschaft für Verkehrssicherheit
</TrU>
<TrU>
<CrD>18012000, 13:19:14
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Road Safety Education in our Schools
<Seg L=DE_DE>Verkehrserziehung an Schulen
</TrU>
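The block structure shown above lends itself to a short parsing sketch. The regular expressions and the function `parse_units` below are assumptions made for illustration and handle only the fields visible in the sample; they are not the TradosTxtTmFile parser.

```python
import re

SAMPLE = r"""<TrU>
<CrD>18012000, 13:18:35
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Association for Road Safety \endash Conference
<Seg L=DE_DE>Tagung der Gesellschaft für Verkehrssicherheit
</TrU>
<TrU>
<CrD>18012000, 13:19:14
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Road Safety Education in our Schools
<Seg L=DE_DE>Verkehrserziehung an Schulen
</TrU>
"""

# One translation unit per <TrU> ... </TrU> block.
UNIT = re.compile(r"<TrU>(.*?)</TrU>", re.DOTALL)
# One segment per line: language code in the L= attribute, text after the tag.
SEGMENT = re.compile(r"^<Seg L=([^>]+)>(.*)$", re.MULTILINE)


def parse_units(text):
    """Return one {language: text} dict per translation unit."""
    return [dict(SEGMENT.findall(block)) for block in UNIT.findall(text)]


units = parse_units(SAMPLE)
print(units[1]["DE_DE"])  # Verkehrserziehung an Schulen
```

The segment text is returned as stored, so RTF control words such as `\endash` are still present at this stage.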
- translate.storage.trados.RTF_ESCAPES = {'\\-': '\xad', '\\_': '‑', '\\bullet': '•', '\\emdash': '—', '\\emspace': '\u2003', '\\endash': '–', '\\enspace': '\u2002', '\\ldblquote': '“', '\\lquote': '‘', '\\rdblquote': '”', '\\rquote': '’', '\\~': '\xa0'}¶
RTF control to Unicode map. See http://msdn.microsoft.com/en-us/library/aa140283(v=office.10).aspx
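A minimal sketch of how such a map can be applied to segment text follows. The dictionary is a hand-copied subset of the entries listed above, and the function `unescape` is a simplified assumption rather than the module's own routine.

```python
# Subset of the RTF control word map listed above.
RTF_ESCAPES = {
    "\\endash": "\u2013",  # en dash
    "\\emdash": "\u2014",  # em dash
    "\\bullet": "\u2022",  # bullet
    "\\~": "\u00a0",       # non-breaking space
}


def unescape(text: str) -> str:
    """Replace each known RTF control word in text with its Unicode character."""
    for control, char in RTF_ESCAPES.items():
        text = text.replace(control, char)
    return text


cleaned = unescape(r"Association for Road Safety \endash Conference")
print(cleaned)
```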
- translate.storage.trados.TRADOS_TIMEFORMAT = '%d%m%Y, %H:%M:%S'¶
Time format used by Trados .txt
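The format string can be exercised directly with the standard `time` module. This sketch copies the format value from above and round-trips the timestamp from the sample file; it does not use TradosTxtDate itself.

```python
import time

# Same value as translate.storage.trados.TRADOS_TIMEFORMAT, copied from above.
TRADOS_TIMEFORMAT = "%d%m%Y, %H:%M:%S"

stamp = "18012000, 13:18:35"
parsed = time.strptime(stamp, TRADOS_TIMEFORMAT)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)  # 2000 1 18

# Formatting the struct_time again reproduces the original string.
roundtrip = time.strftime(TRADOS_TIMEFORMAT, parsed)
print(roundtrip)  # 18012000, 13:18:35
```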
- class translate.storage.trados.TradosTxtDate(newtime=None)¶
Manages the timestamps in the Trados .txt format of DDMMYYYY, hh:mm:ss.
- get_time()¶
Get the time_struct object.
- get_timestring()¶
Get the time in the Trados time format.
- set_time(newtime: struct_time | None) None¶
Set the time_struct object.
- Parameters:
newtime – a new time object
- set_timestring(timestring: str) None¶
Set the time_struct object using a Trados time formatted string.
- Parameters:
timestring – A Trados time string (DDMMYYYY, hh:mm:ss)
- property time¶
Get the time_struct object.
- property timestring¶
Get the time in the Trados time format.
- class translate.storage.trados.TradosTxtTmFile(inputfile=None, **kwargs)¶
A Trados translation memory file.
- Extensions: ClassVar[list[str]] = ['txt']¶
A list of file extensions associated with this store type
- Mimetypes: ClassVar[list[str]] = ['application/x-trados-tm']¶
A list of MIME types associated with this store type
- Name = 'Trados Translation Memory'¶
The human usable name of this store type
- UnitClass¶
alias of
TradosUnit
- addunit(unit: U) None¶
Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be added.
- detect_encoding(text: bytes, default_encodings: list[str] | None = None) tuple[str | None, str | None]¶
Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
- static fallback_detection(text: bytes) EncodingDict¶
Simple detection based on BOM in case chardet is not available.
- findid(id)¶
Find unit with matching id by checking id_index.
- getids()¶
Return a list of unit ids.
- getprojectstyle()¶
Get the project type for this store.
- getsourcelanguage()¶
Get the source language for this store.
- gettargetlanguage()¶
Get the target language for this store.
- isempty()¶
Return True if the object doesn’t contain any translation units.
- property merge_on: str¶
The matching criterion to use when merging on.
- Returns:
The default matching criterion for all the subclasses.
- parse(input) None¶
Parser to process the given source string.
Note
This method should be overridden by subclasses to provide format-specific parsing.
- classmethod parsefile(storefile)¶
Read and parse the given file path or file-like object.
When passed a filename, this method opens and closes the file internally. When passed an existing readable file object, it consumes and closes that handle for compatibility with the historical API.
- classmethod parsestring(storestring)¶
Convert the string representation back to an object.
- removeunit(unit: U) None¶
Remove the given unit from the object’s list of units.
This method should always be used rather than trying to modify the list manually.
- Parameters:
unit – The unit that will be removed.
- serialize(out) None¶
Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
Note
This method should be overridden by subclasses to provide format-specific serialization.
- suggestions_in_format = False¶
Indicates if format can store suggestions and alternative translation for a unit
- unit_iter() Generator[U]¶
Iterator over all the units in this store.
- class translate.storage.trados.TradosUnit(source=None)¶
- adderror(errorname: str, errortext: str) None¶
Adds an error message to this unit.
- Parameters:
errorname – A single word to id the error.
errortext – The text describing the error.
- addlocation(location) None¶
Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
- addlocations(location) None¶
Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
- addnote(text: str, origin: str | None = None, position: Literal['append', 'replace', 'merge'] = 'append') None¶
Adds a note (comment).
- Parameters:
text – Usually just a sentence or two.
origin – Specifies who/where the comment comes from. Origin can be one of the following text strings:
- ‘translator’
- ‘developer’, ‘programmer’, ‘source code’ (synonyms)
- classmethod buildfromunit(unit: TranslationUnit) Self¶
Build a native unit from a foreign unit.
Preserving as much information as possible.
- getalttrans(origin=None)¶
Return alternate translations derived from previous metadata.
- getdocpath() str¶
A logical location path within the document structure.
Unlike getlocations(), which may include line numbers that differ between translations, the document path provides a stable structural identifier based on the logical position within the document (e.g. body/h1[1]/p[2]).