It is possible to translate any plaintext l10n file in Wordfast (or Trados, if you know what you’re doing). Here’s how:
If you’re a Wordfast user, you will already know how to do steps 3 and 4. Translating a l10n file in Wordfast is no different from translating any other file, except that Wordfast will automatically prevent you from translating certain stuff, and certain pieces of text have to be inserted using Wordfast’s “placeable” feature (which is quite simple, actually, and is described in the Wordfast user manual).
When opening a text file in MS Word, MS Word will try to interpret the file. This is not good for our purposes – we want MS Word to treat plain text as plain text. If you open an XLIFF file in MS Word, MS Word will recognise it as an XML file and parse it as such, rendering the file useful for a passive user but useless for a translator.
To prevent MS Word from parsing the file, do the following (MS Word 2000 instructions):
MS Word doesn’t seem to remember this setting, so you have to check that it is enabled every time you open a l10n file.
Normally there are many ways to open a document in MS Word, e.g. drag-and-drop,
Ctrl+O, , and double-clicking. For MS
Word to respect the setting mentioned above, the l10n file has to be opened
using the method.
Usually, when opening a l10n file, MS Word will ask you to confirm the type of file. Do not choose “HTML”, for example. Choose “Encoded Text”, and when prompted for the encoding, select the encoding applicable to the file. Do not choose the option named “Unicode” – be more specific than that. If your file is in UTF-8, choose UTF-8 as the file open type.
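Why the specific encoding matters can be seen outside MS Word as well. Here is a minimal Python sketch (Python is not part of the workflow described here, and the sample string is made up for illustration) of what happens when UTF-8 bytes are read with the wrong encoding:

```python
# Illustration only: the sample text is hypothetical, not taken from a real l10n file.
sample = "Créé par :"

# This is what sits on disk when the file is saved as UTF-8.
raw_bytes = sample.encode("utf-8")

# Reading the bytes with the matching encoding restores the text exactly.
decoded_correctly = raw_bytes.decode("utf-8")

# Reading the same bytes as Windows-1252 produces garbage instead of an
# error, which is why a wrong choice is easy to miss.
decoded_wrongly = raw_bytes.decode("cp1252")

print(decoded_correctly)
print(decoded_wrongly)
```

The second line printed contains stray characters in place of each accented letter, and that damage would end up in the translated file.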
Once the file is open in MS Word, save it as a DOC file by pressing
and selecting the DOC option.
At the end of the translation process, if you’re certain that no hidden codes
are left in the file, you can save it as plain text again, by pressing
F12 and selecting “Encoded Text” as the file save type, and choosing the
correct encoding (again, do not choose simply “Unicode” but be more specific).
MS Word might complain that you will lose formatting, but hey, that’s exactly
what you want.
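If you are unsure whether the translated text will survive in the encoding you are about to choose, a quick check outside MS Word can help. The following Python sketch is only an illustration; the function name `can_save_as` is made up for this example:

```python
def can_save_as(text: str, encoding: str) -> bool:
    """Return True if every character in text can be written in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding.
        return False


print(can_save_as("Tuisblad: naïef", "utf-8"))
print(can_save_as("Tuisblad: naïef", "ascii"))
```

A translation that contains accented letters passes the UTF-8 check but fails the ASCII one, so saving it as plain ASCII would damage those characters.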
To understand how Wordfast knows which text should be translated and which shouldn’t, one has to understand the concept of styles.
A word processor like MS Word (and OpenOffice.org) uses a concept called “styles” to simplify document formatting, although very few people use it. You can apply a style to any piece of text, and that text will look and act according to how the style is defined.
You can create a style named “text123” and define it as “red, bold, Times New Roman”. If you then select any text and choose the style “text123”, the text will become red, bold and Times New Roman. In addition, the text will carry the hidden label “text123”.
Once text is marked with a certain style name, you can do all sorts of things with it. You can tell MS Word to delete all “text123” text, and it will delete only the text marked in that style, even if there are other pieces of text that look exactly the same (red, bold, Times New Roman). You can also change the style’s definition to, say, “red, italics, Times New Roman”, and all text marked in the “text123” style will automatically become italic and non-bold.
For Wordfast, what a style looks like is irrelevant, but the name of the style is important. Wordfast specifically looks for two styles, called tw4winExternal and tw4winInternal. The former is usually grey, the latter is usually red.
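The idea that the style name matters and the appearance does not can be modelled in a few lines. This Python sketch is a toy model for illustration only; it is not how MS Word or Wordfast store documents, and the function names are made up:

```python
# Each style name maps to its appearance. Changing a definition changes
# the look of every run that carries that name.
styles = {
    "Normal": {"colour": "black", "bold": False, "font": "Times New Roman"},
    "text123": {"colour": "red", "bold": True, "font": "Times New Roman"},
}

# A document is modelled as a list of (text, style name) runs.
document = [
    ("Chapter one. ", "Normal"),
    ("Warning! ", "text123"),
    ("Read this first.", "Normal"),
]


def delete_style(runs, style_name):
    """Drop every run labelled with style_name, whatever it looks like."""
    return [(text, name) for text, name in runs if name != style_name]


def translatable_runs(runs):
    """Keep only the text that is not marked with one of the two tw4win styles."""
    protected = {"tw4winExternal", "tw4winInternal"}
    return [text for text, name in runs if name not in protected]


# Redefine the style once: all "text123" text becomes italic and non-bold.
styles["text123"] = {"colour": "red", "italic": True, "font": "Times New Roman"}

print(delete_style(document, "text123"))
```

Note that `translatable_runs` never looks at the `styles` dictionary at all, only at the names, which is the same position Wordfast is in.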
The easiest way to create Wordfast styles is to open a document that has Wordfast styles in it, and then add those styles to your computer’s normal.dot generic template. If you do this, your MS Word’s Wordfast styles will look like those of everyone else in the world (grey or red, hidden, Courier New, etc).
You can download a prepared tw4winstyles.doc document.
You can create the required styles in MS Word yourself, manually, but you may find that the styles do not behave exactly the way that standard Wordfast styles behave. For example, if you forget to specify “No Proofing” as an attribute, you may find that MS Word tries to spellcheck the raw l10n code.
To do this manually, you need to know what the exact style names are, because that’s all Wordfast really cares about. Here’s how:
The style is now part of the current document. To have the style available for other documents, you should add it to the normal.dot template, described above in “the easy way”. You may also create the style every time you open a new l10n file, if you enjoy doing that.
We suggest the following formatting for the tw4win styles:
If you know how to install external macros (i.e. if you know where you should copy a file in MS Windows’ hidden folder structure), you can install AndoTools into MS Word, which has a function to insert all tw4win styles into any document easily. Once you’ve installed AndoTools, in MS Word go . Click “Add tw4win styles” to add them to the current document.
The concept of preparing a l10n file for Wordfast is actually quite simple. All you need to do is mark text that shouldn’t be translated as tw4winExternal, and possibly any text that may be moved around as tw4winInternal. What’s more, tw4winInternal is really only for advanced, complex stuff like certain types of XML. And even if a document can use tw4winInternal, not having it will not make a difference as long as the translator knows which pieces of text he should and shouldn’t change.
For example, in the following line:
| The <bold>quick</bold> brown fox... |
the translator should know that <bold> and </bold> should not be translated, but kept in “English”. These two pieces of text can be marked as tw4winInternal, to help a translator copy them more easily, but it isn’t absolutely necessary.
Marking tw4winInternal is a lot more work than marking tw4winExternal, so don’t bother, to begin with.
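To get a feel for what marking tw4winInternal involves, here is a small Python sketch that splits the example line into plain text and inline tags. It assumes that an inline tag is anything between “<” and “>”, which is a simplification for this example, and the function name is made up:

```python
import re

# Assumption for this sketch: an inline tag is anything between "<" and ">".
INLINE_TAG = re.compile(r"(<[^>]+>)")


def split_inline_tags(line):
    """Split a line into (text, style) pairs, labelling tags as tw4winInternal."""
    pieces = []
    for part in INLINE_TAG.split(line):
        if not part:
            continue  # re.split yields empty strings next to adjacent tags
        style = "tw4winInternal" if INLINE_TAG.fullmatch(part) else None
        pieces.append((part, style))
    return pieces


for text, style in split_inline_tags("The <bold>quick</bold> brown fox..."):
    print(repr(text), style)
```

Every tag becomes its own piece that the translator can move around but should not change, while the pieces without a style are the ones to translate.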
I’m going to show how to prepare a file the hard way because it offers a useful introduction to MS Word’s advanced find/replace functions. MS Word can do limited regular expressions, with certain types of backreferences, which can be quite useful.
What we’re going to do is mark a document with tw4winExternal. It is
assumed that either normal.dot or the document itself will have a style called
tw4winExternal already defined. The easiest document to practice on is a
Mozilla DTD file called about.dtd. Open
about.dtd in MS Word as described above. The encoding is UTF-8.
The file looks like this:
<!ENTITY about "About">
<!ENTITY version "Version:">
<!ENTITY createdBy "Created By:">
<!ENTITY homepage "Home Page:">
The stuff that needs translating is between the quotes. The quotes themselves
should not be translated – they do not form part of the “translatable” text.
Therefore, we must mark everything from <!ENTITY up to and including the
opening quote as tw4winExternal, and everything that is
"> should also be marked as tw4winExternal.
Here’s how we do it:
Press Ctrl+H (the find/replace box). Click “More” to open advanced features.
The result should look like this or like this:
This DOC file can now be sent to a Wordfast user, who can translate it without having to worry about which texts he should touch and which not, because Wordfast will only prompt him to translate the black text.
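The same tagging can be mimicked outside MS Word to check which parts of a DTD file would end up marked. The Python sketch below is not part of the Wordfast workflow. It only models the idea with a regular expression, and the function name `tag_dtd` is made up for this example:

```python
import re

# Everything from <!ENTITY up to and including the opening quote is external,
# and so is the closing ">. The text in between is left for the translator.
ENTITY = re.compile(r'(<!ENTITY.*?")(.*?)(">)')


def tag_dtd(text):
    """Return (text, style) runs; style is "tw4winExternal" or None."""
    runs = []
    position = 0
    for match in ENTITY.finditer(text):
        if match.start() > position:
            # Whatever lies between two entities (e.g. a line break) stays unmarked.
            runs.append((text[position:match.start()], None))
        runs.append((match.group(1), "tw4winExternal"))
        runs.append((match.group(2), None))
        runs.append((match.group(3), "tw4winExternal"))
        position = match.end()
    if position < len(text):
        runs.append((text[position:], None))
    return runs


sample = '<!ENTITY about "About">\n<!ENTITY version "Version:">'
for run_text, style in tag_dtd(sample):
    print(repr(run_text), style)
```

Joining all the runs back together gives the original text unchanged, which is a handy sanity check that the tagging has not eaten or duplicated anything.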
The DTD file above had a very simple structure, and it was simple to tag using find/replace. However, some formats are more complex, requiring many, many steps of finding and replacing. Luckily, MS Word allows us to record a number of steps and save it as a macro. The ideal would therefore be to create a macro for each type of l10n file, and simply use the macro.
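The “one macro per file type” idea can be sketched as a simple lookup table. In this Python illustration, only the DTD pattern reflects the find/replace described in this article. The key=value pattern for “.properties” files is a hypothetical addition to show how a second format would slot in, and all names are made up:

```python
import re

# One pattern per file type. Group 2 is always the translatable text.
PATTERNS = {
    # Based on the DTD find/replace described in this article.
    ".dtd": re.compile(r'(<!ENTITY.*?")(.*?)(">)'),
    # Hypothetical example: simple key=value lines, one per line.
    ".properties": re.compile(r"^([^=\n]+=)(.*)$", re.MULTILINE),
}


def translatable_parts(filename, text):
    """Pick the pattern by file extension and return the translatable strings."""
    for extension, pattern in PATTERNS.items():
        if filename.endswith(extension):
            return [match.group(2) for match in pattern.finditer(text)]
    raise ValueError(f"no tagging pattern for {filename}")


print(translatable_parts("about.dtd", '<!ENTITY about "About">'))
print(translatable_parts("menu.properties", "title=Home Page\nok=OK"))
```

Supporting a new format then means adding one entry to the table, much like recording one more macro.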
In MS Word, a macro can be embedded in a document so that it can be transported and included into another document (or ideally in the project manager’s normal.dot template).
Let’s add the following macro to MS Word’s normal.dot.
' Recorded macro: marks the non-translatable parts of a DTD file as tw4winExternal.
Selection.HomeKey Unit:=wdStory

' Reset the find/replace settings (left over from the macro recorder).
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
    .Text = ""
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With

' Pass 1: mark everything from <!ENTITY up to the opening quote (wildcard search).
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("tw4winExternal")
With Selection.Find
    .Text = "(\<\!ENTITY)(*)(\"")"
    .Replacement.Text = "\1\2\3"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = True
End With
Selection.Find.Execute Replace:=wdReplaceAll

' Pass 2: mark every closing "> (plain search, style only).
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("tw4winExternal")
With Selection.Find
    .Text = """>"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
This macro was recorded, and I’m sure any Visual Basic programmer could trim it down to fewer lines.
To add the above macro, do the following:
Press Ctrl+S to save, and exit the macro writer.
The macro is now added to normal.dot, and can be used for any document that is opened in MS Word. Incidentally, the above macro does exactly what we did in the advanced find/replace operation above.
Adding a macro to normal.dot from an existing document is similar to what we did in the “easy way” for adding styles. I assume you have a document with a macro embedded in it. I’ve embedded the above macro for you, in a document.
To add the macro to normal.dot, here’s how:
And that’s it. Now a macro called apple.apple is part of normal.dot, and can be used on any document you open in MS Word.
When running the macros described above, it is assumed that you have tw4winExternal as a style in normal.dot, or in the document that you’re about to tag. What we’re going to do, is to run the macro apple or apple.apple, which will perform the find/replace operation mentioned previously. This will mark the necessary text as “untranslatable”, so that Wordfast will ignore it.
If everything went well, your document should now be tagged, as per the images above.
(Next: write a short intro, plus upload a number of macros for XLIFF, TMX, PO, etc.)