filters

Filters that can be used on translations...

autocorrect

A set of autocorrect functions that automatically fix common punctuation and spacing problems.

translate.filters.autocorrect.correct(source, target)

Runs a set of easy and automatic corrections

Current corrections include:
  • Ellipses - align the target with the source's form of ellipsis (either three dots or the Unicode ellipsis character)
  • Missing whitespace at the start or end of the target
  • Missing punctuation (.:?) at the end of the target
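
The corrections listed above can be sketched in plain Python. This is an illustration only, not the toolkit's implementation: the name correct_sketch, the order of the steps, and the choice to return None when nothing changes are all assumptions made for the example.

```python
ELLIPSIS = "\u2026"  # the single Unicode ellipsis character


def correct_sketch(source, target):
    """Return a corrected target, or None if no correction applied (assumed convention)."""
    if not source.strip() or not target.strip():
        return None
    fixed = target

    # Ellipses: use whichever form the source uses.
    if ELLIPSIS in source and "..." in fixed:
        fixed = fixed.replace("...", ELLIPSIS)
    elif "..." in source and ELLIPSIS in fixed:
        fixed = fixed.replace(ELLIPSIS, "...")

    # End punctuation: append a missing ., : or ? before any trailing whitespace.
    src_core = source.strip()
    body = fixed.rstrip()
    tail = fixed[len(body):]
    if src_core[-1] in ".:?" and body[-1] not in ".:?":
        body += src_core[-1]
    fixed = body + tail

    # Whitespace: add the source's leading/trailing whitespace if the target has none.
    lead = source[: len(source) - len(source.lstrip())]
    trail = source[len(source.rstrip()):]
    if lead and fixed == fixed.lstrip():
        fixed = lead + fixed
    if trail and fixed == fixed.rstrip():
        fixed = fixed + trail

    return fixed if fixed != target else None
```

In this sketch, a source of "Save file: " with a target of "Stoor" gains both the colon and the trailing space, while a target that already matches is left alone.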

checks

This is a set of validation checks that can be performed on translation units.

Derivatives of UnitChecker (like StandardUnitChecker) check translation units, and derivatives of TranslationChecker (like StandardChecker) check (source, target) translation pairs.

When adding a new test here, please document and explain its behaviour on the pofilter tests page.

class translate.filters.checks.CheckerConfig(targetlanguage=None, accelmarkers=None, varmatches=None, notranslatewords=None, musttranslatewords=None, validchars=None, punctuation=None, endpunctuation=None, ignoretags=None, canchangetags=None, criticaltests=None, credit_sources=None)

Object representing the configuration of a checker.

update(otherconfig)

Combines the info in otherconfig into this config object.

updatetargetlanguage(langcode)

Updates the target language in the config to the given target language.

updatevalidchars(validchars)

Updates the map that eliminates valid characters.

exception translate.filters.checks.FilterFailure(messages)

This exception signals that a Filter didn’t pass, and gives an explanation or a comment.

exception translate.filters.checks.SeriousFilterFailure(messages)

This exception signals that a Filter didn’t pass, and that the bad translation might break an application (so the string will be marked fuzzy).

class translate.filters.checks.StandardChecker(checkerconfig=None, excludefilters=None, limitfilters=None, errorhandler=None)

The basic test suite for source -> target translations.

accelerators(*args, **kwargs)

Checks whether accelerators are consistent between the two strings.

acronyms(*args, **kwargs)

Checks that acronyms that appear are unchanged.

blank(*args, **kwargs)

Checks whether a translation only contains spaces.

brackets(*args, **kwargs)

Checks that the number of brackets in both strings matches.

compendiumconflicts(*args, **kwargs)

Checks for Gettext compendium conflicts (#-#-#-#-#).

credits(*args, **kwargs)

Checks for messages containing translation credits instead of normal translations.

doublequoting(*args, **kwargs)

Checks whether doublequoting is consistent between the two strings.

doublespacing(*args, **kwargs)

Checks for bad double-spaces by comparing to original.

doublewords(*args, **kwargs)

Checks for repeated words in the translation.

emails(*args, **kwargs)

Checks that emails are not translated.

endpunc(*args, **kwargs)

Checks whether punctuation at the end of the strings matches.

endwhitespace(*args, **kwargs)

Checks whether whitespace at the end of the strings matches.

escapes(*args, **kwargs)

Checks whether escaping is consistent between the two strings.

filepaths(*args, **kwargs)

Checks that file paths have not been translated.

filteraccelerators_by_list(str1, acceptlist=None)

Filter out accelerators from str1.

functions(*args, **kwargs)

Checks that function names are not translated.

getfilters(excludefilters=None, limitfilters=None)

Returns dictionary of available filters, including/excluding those in the given lists.

kdecomments(*args, **kwargs)

Checks to ensure that no KDE style comments appear in the translation.

long(*args, **kwargs)

Checks whether a translation is much longer than the original string.

musttranslatewords(*args, **kwargs)

Checks that words configured as definitely translatable don’t appear in the translation.

newlines(*args, **kwargs)

Checks whether newlines are consistent between the two strings.

notranslatewords(*args, **kwargs)

Checks that words configured as untranslatable appear in the translation too.

numbers(*args, **kwargs)

Checks whether numbers of various forms are consistent between the two strings.

options(*args, **kwargs)

Checks that options are not translated.

printf(*args, **kwargs)

Checks whether printf format strings match.

puncspacing(*args, **kwargs)

Checks for bad spacing after punctuation.

purepunc(*args, **kwargs)

Checks that strings that are purely punctuation are not changed.

run_filters(unit, categorised=False)

Do some optimisation by caching some data of the unit for the benefit of run_test().

run_test(test, unit)

Runs the given test on the given unit.

Note that this can raise a FilterFailure as part of normal operation.

sentencecount(*args, **kwargs)

Checks that the number of sentences in both strings matches.

setconfig(config)

Sets the accelerator list.

setsuggestionstore(store)

Sets the filename that a checker should use for evaluating suggestions.

short(*args, **kwargs)

Checks whether a translation is much shorter than the original string.

simplecaps(*args, **kwargs)

Checks the capitalisation of two strings isn’t wildly different.

simpleplurals(*args, **kwargs)

Checks for English style plural(s) for you to review.

singlequoting(*args, **kwargs)

Checks whether singlequoting is consistent between the two strings.

spellcheck(*args, **kwargs)

Checks words that don’t pass a spell check.

startcaps(*args, **kwargs)

Checks that the message starts with the correct capitalisation.

startpunc(*args, **kwargs)

Checks whether punctuation at the beginning of the strings matches.

startwhitespace(*args, **kwargs)

Checks whether whitespace at the beginning of the strings matches.

tabs(*args, **kwargs)

Checks whether tabs are consistent between the two strings.

unchanged(*args, **kwargs)

Checks whether a translation is basically identical to the original string.

untranslated(*args, **kwargs)

Checks whether a string has been translated at all.

urls(*args, **kwargs)

Checks that URLs are not translated.

validchars(*args, **kwargs)

Checks that only characters specified as valid appear in the translation.

variables(*args, **kwargs)

Checks whether variables of various forms are consistent between the two strings.

xmltags(*args, **kwargs)

Checks that XML/HTML tags have not been translated.

class translate.filters.checks.StandardUnitChecker(checkerconfig=None, excludefilters=None, limitfilters=None, errorhandler=None)

The standard checks for common checks on translation units.

filteraccelerators_by_list(str1, acceptlist=None)

Filter out accelerators from str1.

getfilters(excludefilters=None, limitfilters=None)

Returns dictionary of available filters, including/excluding those in the given lists.

hassuggestion(*args, **kwargs)

Checks if there is at least one suggested translation for this unit.

isfuzzy(*args, **kwargs)

Check if the unit has been marked fuzzy.

isreview(*args, **kwargs)

Check if the unit has been marked review.

nplurals(*args, **kwargs)

Checks for the correct number of noun forms for plural translations.

run_filters(unit, categorised=False)

Run all the tests in this suite.

Return type: Dictionary
Returns: Content of the dictionary is as follows:
{'testname': { 'message': message_or_exception, 'category': failure_category } }

run_test(test, unit)

Runs the given test on the given unit.

Note that this can raise a FilterFailure as part of normal operation.
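
The interplay between run_filters, run_test-style failures, getfilters and categories can be illustrated with a small standalone sketch. Everything here is hypothetical: SketchChecker and SketchFilterFailure are stand-ins, the two tests are simplified, the category values are made-up strings, and the sketch takes a (source, target) pair rather than a unit object.

```python
class SketchFilterFailure(Exception):
    """Stand-in for FilterFailure: signals that a test did not pass."""


class SketchChecker:
    # Function names are the keys, categories are the values (values are invented).
    categories = {"untranslated": "critical", "doublespacing": "cosmetic"}

    def untranslated(self, source, target):
        if source and not target:
            raise SketchFilterFailure("Target is empty")
        return True

    def doublespacing(self, source, target):
        if "  " in target and "  " not in source:
            raise SketchFilterFailure("Double space appears only in the target")
        return True

    def getfilters(self, excludefilters=None, limitfilters=None):
        """Return {name: method}, limited to and/or excluding the given names."""
        names = limitfilters if limitfilters is not None else list(self.categories)
        excluded = excludefilters or []
        return {name: getattr(self, name) for name in names if name not in excluded}

    def run_filters(self, source, target):
        """Run every test and collect failures in the documented dictionary shape."""
        failures = {}
        for name, test in self.getfilters().items():
            try:
                test(source, target)
            except SketchFilterFailure as exc:  # a failure is normal operation
                failures[name] = {
                    "message": str(exc),
                    "category": self.categories[name],
                }
        return failures
```

A pair that passes every test yields an empty dictionary; each failing test contributes one entry keyed by its name.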

setconfig(config)

Sets the accelerator list.

setsuggestionstore(store)

Sets the filename that a checker should use for evaluating suggestions.

class translate.filters.checks.TeeChecker(checkerconfig=None, excludefilters=None, limitfilters=None, checkerclasses=None, errorhandler=None, languagecode=None)

A Checker that controls multiple checkers.

categories = {}

Categories that each checking function falls into. Function names are used as keys; categories are the values.

getfilters(excludefilters=None, limitfilters=None)

Returns a dictionary of available filters, including/excluding those in the given lists.

run_filters(unit, categorised=False)

Run all the tests in the checker’s suites.

setsuggestionstore(store)

Sets the filename that a checker should use for evaluating suggestions.

class translate.filters.checks.TranslationChecker(checkerconfig=None, excludefilters=None, limitfilters=None, errorhandler=None)

A checker that passes source and target strings to the checks, not the whole unit.

This provides some speedup and simplifies testing.

filteraccelerators_by_list(str1, acceptlist=None)

Filter out accelerators from str1.

getfilters(excludefilters=None, limitfilters=None)

Returns dictionary of available filters, including/excluding those in the given lists.

run_filters(unit, categorised=False)

Do some optimisation by caching some data of the unit for the benefit of run_test().

run_test(test, unit)

Runs the given test on the given unit.

Note that this can raise a FilterFailure as part of normal operation.

setconfig(config)

Sets the accelerator list.

setsuggestionstore(store)

Sets the filename that a checker should use for evaluating suggestions.

class translate.filters.checks.UnitChecker(checkerconfig=None, excludefilters=None, limitfilters=None, errorhandler=None)

Parent Checker class which does the checking based on functions available in derived classes.

categories = {}

Categories that each checking function falls into. Function names are used as keys; categories are the values.

filteraccelerators_by_list(str1, acceptlist=None)

Filter out accelerators from str1.

getfilters(excludefilters=None, limitfilters=None)

Returns dictionary of available filters, including/excluding those in the given lists.

run_filters(unit, categorised=False)

Run all the tests in this suite.

Return type: Dictionary
Returns: Content of the dictionary is as follows:
{'testname': { 'message': message_or_exception, 'category': failure_category } }

run_test(test, unit)

Runs the given test on the given unit.

Note that this can raise a FilterFailure as part of normal operation.

setconfig(config)

Sets the accelerator list.

setsuggestionstore(store)

Sets the filename that a checker should use for evaluating suggestions.

translate.filters.checks.batchruntests(pairs)

Runs test on a batch of string pairs.

translate.filters.checks.intuplelist(pair, list)

Tests to see if pair == (a,b,c) is in list, but handles None entries in list as wildcards (only allowed in positions “a” and “c”). We take a shortcut by only considering “c” if “b” has already matched.
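
The wildcard matching described above might look roughly like the following. This is a sketch written from the description alone, not the library's code; the name in_tuple_list and the example tag data are hypothetical.

```python
def in_tuple_list(triple, patterns):
    """Check whether (a, b, c) matches any pattern; None in position a or c is a wildcard."""
    a, b, c = triple
    for pat_a, pat_b, pat_c in patterns:
        # "b" must match exactly; no wildcard is allowed in this position.
        if pat_b != b:
            continue
        if pat_a is not None and pat_a != a:
            continue
        # "c" is only considered once "b" (and "a") have matched.
        if pat_c is None or pat_c == c:
            return True
    return False
```

With patterns such as ("a", "href", None), any value in the third position is accepted as long as the first two positions match.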

translate.filters.checks.runtests(str1, str2, ignorelist=())

Verifies that the tests pass for a pair of strings.

translate.filters.checks.tagname(string)

Returns the name of the XML/HTML tag in string.

translate.filters.checks.tagproperties(strings, ignore)

Returns all the properties in the XML/HTML tag string as (tagname, propertyname, propertyvalue), but ignore those combinations specified in ignore.

decoration

functions to get decorative/informative text out of strings...

translate.filters.decoration.countaccelerators(accelmarker, acceptlist=None)

returns a function that counts the number of accelerators marked with the given marker

translate.filters.decoration.findaccelerators(str1, accelmarker, acceptlist=None)

returns all the accelerators and locations in str1 marked with a given marker
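
As a rough illustration of locating accelerators, the sketch below scans for the marker and records the character that follows it. It is a simplification under stated assumptions: the name find_accelerators is hypothetical, the return value is a plain list of (position, character) pairs, and validity is reduced to "alphanumeric, or present in acceptlist"; the real function's return shape and validity rules may differ.

```python
def find_accelerators(text, accelmarker, acceptlist=None):
    """Return (position, character) pairs for characters that follow accelmarker."""
    found = []
    start = 0
    while True:
        pos = text.find(accelmarker, start)
        if pos == -1:
            break
        key_pos = pos + len(accelmarker)
        if key_pos >= len(text):
            break  # marker at the very end marks nothing
        key = text[key_pos]
        if acceptlist is None:
            valid = key.isalnum()  # assumed default rule
        else:
            valid = key in acceptlist
        if valid:
            found.append((key_pos, key))
        # Continue after the marked character, so a doubled marker is skipped.
        start = key_pos + 1
    return found
```

For example, "Save &As" with the marker "&" gives a single pair for the "A", while "Fish && Chips" gives none in this sketch.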

translate.filters.decoration.findmarkedvariables(str1, startmarker, endmarker, ignorelist=[])

returns all the variables and locations in str1 marked with a given marker

translate.filters.decoration.getaccelerators(accelmarker, acceptlist=None)

returns a function that gets a list of accelerators marked using accelmarker

translate.filters.decoration.getemails(str1)

returns the email addresses that are in a string

translate.filters.decoration.getfunctions(str1)

returns the functions() that are in a string, while ignoring the trailing punctuation in the given parameter

translate.filters.decoration.getnumbers(str1)

returns any numbers that are in the string

translate.filters.decoration.geturls(str1)

returns the URIs in a string

translate.filters.decoration.getvariables(startmarker, endmarker)

returns a function that gets a list of variables marked using startmarker and endmarker

translate.filters.decoration.ispurepunctuation(str1)

checks whether the string is entirely punctuation

translate.filters.decoration.isvalidaccelerator(accelerator, acceptlist=None)

returns whether the given accelerator character is valid

Parameters:
  • accelerator (character) – A character to be checked for accelerator validity
  • acceptlist (String) – A list of characters that are permissible as accelerators
Return type:

Boolean

Returns:

True if the supplied character is an acceptable accelerator

translate.filters.decoration.puncend(str1, punctuation)

returns all the punctuation from the end of the string

translate.filters.decoration.puncstart(str1, punctuation)

returns all the punctuation from the start of the string

translate.filters.decoration.spaceend(str1)

returns all the whitespace from the end of the string

translate.filters.decoration.spacestart(str1)

returns all the whitespace from the start of the string
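
The four edge-extraction helpers above are simple enough to sketch in a few lines. These are independent re-implementations for illustration (the snake_case names are hypothetical), and details such as how the real functions treat whitespace next to punctuation are not covered.

```python
def space_start(text):
    """Leading whitespace of text."""
    return text[: len(text) - len(text.lstrip())]


def space_end(text):
    """Trailing whitespace of text."""
    return text[len(text.rstrip()):]


def punc_start(text, punctuation):
    """Leading run of characters that all appear in punctuation."""
    end = 0
    while end < len(text) and text[end] in punctuation:
        end += 1
    return text[:end]


def punc_end(text, punctuation):
    """Trailing run of characters that all appear in punctuation."""
    start = len(text)
    while start > 0 and text[start - 1] in punctuation:
        start -= 1
    return text[start:]
```

A check such as endpunc could then compare punc_end(source, chars) with punc_end(target, chars).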

helpers

a set of helper functions for filters...

translate.filters.helpers.countmatch(str1, str2, countstr)

checks whether countstr occurs the same number of times in str1 and str2

translate.filters.helpers.countsmatch(str1, str2, countlist)

checks whether each element in countlist occurs the same number of times in str1 and str2
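
A minimal sketch of the two counting helpers, assuming they rely on plain substring counts (the names count_match and counts_match are hypothetical):

```python
def count_match(str1, str2, countstr):
    """True if countstr occurs equally often in str1 and str2."""
    return str1.count(countstr) == str2.count(countstr)


def counts_match(str1, str2, countlist):
    """True if every element of countlist occurs equally often in both strings."""
    return all(count_match(str1, str2, item) for item in countlist)
```

A brackets-style check, for instance, could call counts_match with a list of bracket characters.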

translate.filters.helpers.filtercount(str1, func)

returns the number of characters in str1 that pass func

translate.filters.helpers.filtertestmethod(testmethod, strfilter)

returns a version of the testmethod that operates on filtered strings using strfilter

translate.filters.helpers.funcmatch(str1, str2, func, *args)

returns whether the result of func is the same for str1 and str2

translate.filters.helpers.funcsmatch(str1, str2, funclist)

checks whether the results of each func in funclist match for str1 and str2

translate.filters.helpers.multifilter(str1, strfilters, *args)

passes str1 through a list of filters

translate.filters.helpers.multifiltertestmethod(testmethod, strfilters)

returns a version of the testmethod that operates on filtered strings using strfilters
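
The filter-wrapping helpers follow a common higher-order-function pattern, sketched below. The names are hypothetical, the extra *args of multifilter are left out, and the wrapped test is assumed to take exactly two strings.

```python
def multi_filter(text, strfilters):
    """Pass text through each filter in turn."""
    for strfilter in strfilters:
        text = strfilter(text)
    return text


def filter_test_method(testmethod, strfilter):
    """Wrap testmethod so that both strings are filtered first."""
    def filtered(str1, str2):
        return testmethod(strfilter(str1), strfilter(str2))
    return filtered


def multi_filter_test_method(testmethod, strfilters):
    """Wrap testmethod so that both strings pass through every filter first."""
    def filtered(str1, str2):
        return testmethod(multi_filter(str1, strfilters),
                          multi_filter(str2, strfilters))
    return filtered


# Usage: compare two strings while ignoring "&" accelerator markers.
def strip_ampersand(text):
    return text.replace("&", "")


def same_text(str1, str2):
    return str1 == str2


same_ignoring_accel = filter_test_method(same_text, strip_ampersand)
```

Wrapping keeps each test free of pre-processing logic; the same test can be reused with different filter chains.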

pofilter

Perform quality checks on Gettext PO, XLIFF and TMX localization files.

Snippet files are created whenever a test fails. These can be examined, corrected and merged back into the originals using pomerge.

See: http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/pofilter.html for examples and usage instructions and http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/pofilter_tests.html for full descriptions of all tests.

class translate.filters.pofilter.FilterOptionParser(formats)

A specialized Option Parser for filter tools...

add_option(Option)

add_option(opt_str, ..., kwarg=val, ...)

check_values(values : Values, args : [string])

-> (values : Values, args : [string])

Check that the supplied option values and leftover arguments are valid. Returns the option values and leftover arguments (possibly adjusted, possibly completely new – whatever you like). Default implementation just returns the passed-in values; subclasses may override as desired.

checkoutputsubdir(options, subdir)

Checks to see if subdir under options.output needs to be created, and creates it if necessary.

define_option(option)

Defines the given option, replacing an existing one of the same short name if necessary...

destroy()

Declare that you are done with this OptionParser. This cleans up reference cycles so the OptionParser (and all objects referenced by it) can be garbage-collected promptly. After calling destroy(), the OptionParser is unusable.

disable_interspersed_args()

Set parsing to stop on the first non-option. Use this if you have a command processor which runs another command that has options of its own and you want to make sure these options don’t get confused.

enable_interspersed_args()

Set parsing to not stop on the first non-option, allowing interspersing switches with command arguments. This is the default behavior. See also disable_interspersed_args() and the class documentation description of the attribute allow_interspersed_args.

error(msg : string)

Print a usage message incorporating ‘msg’ to stderr and exit. If you override this in a subclass, it should not return – it should either exit or raise an exception.

finalizetempoutputfile(options, outputfile, fulloutputpath)

Write the temp outputfile to its final destination.

format_manpage()

returns a formatted manpage

getformathelp(formats)

Make a nice help string for describing formats...

getfullinputpath(options, inputpath)

Gets the absolute path to an input file.

getfulloutputpath(options, outputpath)

Gets the absolute path to an output file.

getfulltemplatepath(options, templatepath)

Gets the absolute path to a template file.

getoutputname(options, inputname, outputformat)

Gets an output filename based on the input filename.

getoutputoptions(options, inputpath, templatepath)

Works out which output format and processor method to use...

getpassthroughoptions(options)

Get the options required to pass to the filtermethod...

gettemplatename(options, inputname)

Gets a template filename based on the input filename.

getusageman(option)

returns the usage string for the given option

getusagestring(option)

returns the usage string for the given option

initprogressbar(allfiles, options)

Sets up a progress bar appropriate to the options and files.

isexcluded(options, inputpath)

Checks if this path has been excluded.

isrecursive(fileoption, filepurpose='input')

Checks if fileoption is a recursive file.

isvalidinputname(options, inputname)

Checks if this is a valid input filename.

mkdir(parent, subdir)

Makes a subdirectory (recursively if necessary).

openinputfile(options, fullinputpath)

Opens the input file.

openoutputfile(options, fulloutputpath)

Opens the output file.

opentemplatefile(options, fulltemplatepath)

Opens the template file (if required).

opentempoutputfile(options, fulloutputpath)

Opens a temporary output file.

parse_args(args=None, values=None)

Parses the command line options, handling implicit input/output args.

parse_noinput(option, opt, value, parser, *args, **kwargs)

This sets an option to True, but also sets input to - to prevent an error.

print_help(file : file = stdout)

Print an extended help message, listing all options and any help text provided with them, to ‘file’ (default stdout).

print_manpage(file=None)

outputs a manpage for the program using the help information

print_usage(file : file = stdout)

Print the usage message for the current program (self.usage) to ‘file’ (default stdout). Any occurrence of the string “%prog” in self.usage is replaced with the name of the current program (basename of sys.argv[0]). Does nothing if self.usage is empty or not defined.

print_version(file : file = stdout)

Print the version message for this program (self.version) to ‘file’ (default stdout). As with print_usage(), any occurrence of “%prog” in self.version is replaced by the current program’s name. Does nothing if self.version is empty or undefined.

processfile(fileprocessor, options, fullinputpath, fulloutputpath, fulltemplatepath)

Process an individual file.

recurseinputfilelist(options)

Use a list of files, and find a common base directory for them.

recurseinputfiles(options)

Recurse through directories and return files to be processed.

recursiveprocess(options)

Recurse through directories and process files.

reportprogress(filename, success)

Shows that we are progressing...

run()

Parses the arguments, and runs recursiveprocess with the resulting options.

set_usage(usage=None)

sets the usage string - if usage not given, uses getusagestring for each option

seterrorleveloptions()

Sets the errorlevel options.

setformats(formats, usetemplates)

Sets the format options using the given format dictionary.

Parameters: formats (Dictionary) –

The dictionary keys should be:

  • Single strings (or 1-tuples) containing an input format (if not usetemplates)
  • Tuples containing an input format and template format (if usetemplates)
  • Formats can be None to indicate what to do with standard input

The dictionary values should be tuples of outputformat (string) and processor method.
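
To make the expected shape concrete, here is a hedged sketch of two format dictionaries. The processor function process_po and the particular format names are hypothetical and chosen only to show the key and value structure described above.

```python
def process_po(inputfile, outputfile, templatefile=None):
    """Hypothetical processor method: copies the input to the output unchanged."""
    outputfile.write(inputfile.read())
    return True


# Without templates: keys are input formats; None covers standard input.
formats_plain = {
    "po": ("po", process_po),
    "pot": ("pot", process_po),
    None: ("po", process_po),
}

# With templates: keys are (input format, template format) tuples.
formats_templated = {
    ("po", "pot"): ("po", process_po),
}

# Every value is an (outputformat, processor) tuple.
for mapping in (formats_plain, formats_templated):
    for outputformat, processor in mapping.values():
        assert isinstance(outputformat, str) and callable(processor)
```

A dictionary of the first kind would be paired with usetemplates set to a false value, the second with a true value.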

setmanpageoption()

creates a manpage option that allows the optionparser to generate a manpage

setprogressoptions()

Sets the progress options.

splitext(pathname)

Splits pathname into name and ext, and removes the extsep.

Parameters: pathname (string) – A file path
Returns: root, ext
Return type: tuple

splitinputext(inputpath)

Splits an inputpath into name and extension.

splittemplateext(templatepath)

Splits a templatepath into name and extension.
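
The behaviour documented for splitext (split into root and extension, then drop the extension separator) can be approximated with the standard library. This is a sketch; split_ext is a hypothetical name and the real method may handle edge cases differently.

```python
import os


def split_ext(pathname):
    """Split pathname into (root, ext), with the leading extsep removed from ext."""
    root, ext = os.path.splitext(pathname)
    # os.path.splitext keeps the separator ("." on common platforms); strip it once.
    ext = ext.replace(os.extsep, "", 1)
    return root, ext
```

Only the last extension is split off, so a name like "archive.tar.gz" yields the root "archive.tar".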

templateexists(options, templatepath)

Returns whether the given template exists...

warning(msg, options=None, exc_info=None)

Print a warning message incorporating ‘msg’ to stderr and exit.

translate.filters.pofilter.build_checkerconfig(options)

Prepare the checker config from the given options. This is mainly factored out for the sake of unit tests.

translate.filters.pofilter.runfilter(inputfile, outputfile, templatefile, checkfilter=None)

Reads in inputfile, filters using checkfilter, writes to outputfile.

prefilters

Filters that strings can be passed through before certain tests.

translate.filters.prefilters.filteraccelerators(accelmarker)

Returns a function that filters accelerators marked using accelmarker from a string.

Parameters: accelmarker (string) – Accelerator marker character
Return type: Function
Returns: fn(str1, acceptlist=None)

translate.filters.prefilters.filtervariables(startmarker, endmarker, varfilter)

Returns a function that filters variables marked using startmarker and endmarker from a string.

Parameters:
  • startmarker (string) – Start of variable marker
  • endmarker (string) – End of variable marker
  • varfilter (Function) – fn(variable, startmarker, endmarker)
Return type:

Function

Returns:

fn(str1)
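
The factory pattern described above, where a varfilter callback decides what replaces each variable, can be sketched with a regular expression. The assumptions are significant: variable names are taken to be runs of word characters, both markers must be non-empty, and the callback receives the bare name without its markers. All names in the sketch are hypothetical.

```python
import re


def var_name(variable, startmarker, endmarker):
    """Variable filter that keeps the variable's name."""
    return variable


def var_none(variable, startmarker, endmarker):
    """Variable filter that drops the variable entirely."""
    return ""


def filter_variables(startmarker, endmarker, varfilter):
    """Return fn(str1) that replaces each marked variable with varfilter's result."""
    pattern = re.compile(
        re.escape(startmarker) + r"(\w+)" + re.escape(endmarker)
    )

    def filtered(str1):
        return pattern.sub(
            lambda match: varfilter(match.group(1), startmarker, endmarker),
            str1,
        )

    return filtered
```

Choosing the callback controls whether a later check sees the variable name as an ordinary word or does not see it at all.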

translate.filters.prefilters.filterwordswithpunctuation(str1)

Goes through a list of known words that have punctuation and removes the punctuation from them.

translate.filters.prefilters.removekdecomments(str1)

Remove KDE-style PO comments.

KDE comments start with _:[space] and end with a literal \n. Example:

"_: comment\n"

translate.filters.prefilters.varname(variable, startmarker, endmarker)

Variable filter that returns the variable name without the marking punctuation.

Note

Currently this function simply returns variable unchanged, no matter what the startmarker and endmarker arguments are set to.

Return type: String
Returns: Variable name with the supplied startmarker and endmarker removed.

translate.filters.prefilters.varnone(variable, startmarker, endmarker)

Variable filter that returns an empty string.

Return type: String
Returns: Empty string

spelling

An API to provide spell checking for use in checks or elsewhere.