
Ah, that's why Word 2007 ended up with »Keine Gliederung« (»no (document) outline«) for the setting where you can select a shape outline colour ...

In all seriousness, translators get the context wrong all the time; this applies to professionals and amateurs alike, perhaps to different degrees. But in my eyes that's not really a problem with the translators, but rather with the tools we use. A long list of strings with at best a vague context is horrible UX for the translator. Who would fault them for translating »outline« as »Gliederung« in a word-processing application? A string table is probably the most convenient format for programmers, but much less so for those maintaining translations.

Qt gets a bit of this right. If the translator has access to the source code and uses Qt Linguist, at least the strings that are directly in .ui files are shown with a mock-up of the respective window where the control with that text is highlighted [0]. That already helps a lot with context errors. Of course, it does nothing for text in the source code, and so our translator went ahead and translated »Breite« (meant as line width) as »latitude« because the application in question was, after all, related to maps, geography and GPS; it just happened to have a setting to change the width of certain lines, and in German both meanings map to the same word.

Qt also has a nice way of handling plurals in that both Linguist and the QTranslator class are aware of the languages and the special rules concerning plural forms in them. You create a translatable string like »Searched %n directories« and then create translations for that. English, German and lots of others just map to two forms (1, rest). Russian gets three (1/21/31/..., 2-4/22-24/..., and the rest, which includes 0, 5-20 and in particular 11-14), and so on. The downside is that you only get to handle a single plural in a string [1], and you have to create an en→en translation as well (to account for multiple forms in the source language). But generally it's quite a nice implementation. Gettext has something similar, where the plural rule is written as an arithmetic expression in the catalog's Plural-Forms header, but most translators I met don't really want to write math.
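To make the plural rules concrete, here is a minimal Python sketch of what such a selector does. It is not Qt's or gettext's actual code; the function names and the FORMS table are made up for illustration, and the Russian forms are shortened to the noun only.

```python
# Plural-form selectors in the style of a gettext Plural-Forms expression.
# Each function maps a count n to the index of the form to use.

def plural_index_en(n: int) -> int:
    # English (and German): one form for exactly 1, another for everything else.
    return 0 if n == 1 else 1

def plural_index_ru(n: int) -> int:
    # Russian: three forms, chosen by the last one or two digits of n.
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ...
    return 2      # 0, 5-20, 25-30, ...

# Hypothetical translation table for one source string with a %n placeholder.
FORMS = {
    "en": (plural_index_en, ["Searched %n directory", "Searched %n directories"]),
    "ru": (plural_index_ru, ["%n каталог", "%n каталога", "%n каталогов"]),
}

def render(lang: str, n: int) -> str:
    # Pick the form for this count, then substitute the number.
    select, forms = FORMS[lang]
    return forms[select(n)].replace("%n", str(n))
```

Note that the en entry needs two forms as well, which is exactly the en→en translation mentioned above.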

Visual Studio with Windows Forms has a mode of creating the string tables for translation where you can just change the language in the Form's properties and then proceed to change each control text. This nicely solves the context problem in that the translator edits the window directly and it looks like it normally does. But it also creates a whole bunch of other problems: translators need VS, they need to edit the project directly, and they can accidentally mess up the UI with a mouse twitch when selecting a button to translate. They also might miss things that are buried in menus, since there is no real measure of completion or of what's still missing. I've seen various projects, especially in web environments, adopt a similar custom-written approach, though, where you can edit the UI directly in the application to translate it. That still leaves the problem that translators might miss hard-to-find and buried strings (one might argue that you should get rid of buried and obscure places in your UI anyway, though).

Long ramblings ... it was a topic I considered for my Diploma thesis while studying. But I couldn't think of a good way that could retain context for the translator in the general case, or at least most of the time. I thought about replacing each and every translatable string in a program with a custom identifier and then later trying to find those again via UI Automation or maybe screenshots and OCR, to be able to map strings to parts of screenshots of the program. That would have required running the program once with those custom identifiers and once in normal mode, and somehow matching up the identifier-tagged screenshots, the normal screenshots and the string tables. And it would still have the problem that you'd need to manually go through each and every dialog and menu, including context menus, messages that only appear in certain states, etc.
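Roughly what I had in mind for the identifier run, as a Python sketch. Everything here is hypothetical: the string table, the [[S0001]] marker format, and the `captured` dump, which stands in for whatever UI Automation or OCR would actually deliver.

```python
import re

# Hypothetical string table: identifier -> translatable source string.
STRINGS = {
    "S0001": "Line width",
    "S0002": "Latitude",
    "S0003": "Searched %n directories",
}

# Assumption: the marker survives rendering and text extraction unchanged.
MARKER = re.compile(r"\[\[(S\d{4})\]\]")

def marked_table(strings):
    """Table for the identifier run: every string is replaced by its marker."""
    return {sid: f"[[{sid}]]" for sid in strings}

def locate(captured):
    """Map each identifier to the (window, control) pairs whose text showed it.

    `captured` is a list of (window, control, text) tuples.
    """
    found = {}
    for window, control, text in captured:
        for sid in MARKER.findall(text):
            found.setdefault(sid, []).append((window, control))
    return found

def unseen(strings, found):
    """Identifiers that never appeared, i.e. strings still without context."""
    return sorted(set(strings) - set(found))
```

The `unseen` list is where the manual work shows up: every entry in it is a dialog, menu or state that nobody has visited yet.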

Perhaps there just is no good solution, except maybe for developers to properly annotate each translatable string they use. In the »Breite«/»width«/»latitude« case I went through the source and added translator comments detailing the meaning of the word for every instance, but with large applications having thousands of translatable strings that could become unwieldy quickly.
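As a sketch of what such an annotation buys you, here is the »Breite« case with a catalog keyed by context plus source text, written in Python. The catalog, the context names and the `translate` helper are invented for illustration and are not any framework's API; Qt's disambiguation argument to tr() and gettext's msgctxt serve a similar purpose.

```python
# Hypothetical German -> English catalog, keyed by (context, source string)
# so that the same source word can carry two different translations.
CATALOG = {
    ("line style", "Breite"): "Width",
    ("coordinates", "Breite"): "Latitude",
}

def translate(context: str, source: str) -> str:
    """Look up a translation by context and source text.

    Falls back to the untranslated source string if there is no entry.
    """
    return CATALOG.get((context, source), source)
```

The developer still has to supply the context at every call site, so the effort scales with the number of strings.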

__________

[0] http://geoinformatics.fsv.cvut.cz/wiki/images/thumb/3/39/Qt_...

[1] http://stackoverflow.com/q/5348990/73070



Translation is like dates: hard to get right.

I like the Visual Studio approach. I usually resort to running the software I'm translating to get context. Sometimes it's very hard to get some of the text to show up.

I can only think of one way to give translators context: Comments, the exact same way you give other programmers context for your code.


> But I couldn't think of a good way that could retain context for the translator in a general case, or at least most of the time.

How about assigning identifiers to all strings, then adding tooltips for those identifiers in the application? So the translator can hover over a menu item to determine which string they are supposed to translate.

This requires the translator to exercise the whole application which is kind of difficult, as well, of course.
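A minimal Python sketch of the tooltip idea, with a hypothetical string table and a made-up `label_for` helper instead of any real toolkit. It also records which identifiers were rendered, so you at least know what the translator has not seen yet.

```python
# Hypothetical string table: identifier -> text currently shown in the UI.
STRINGS = {
    "menu.file.open": "Open",
    "menu.file.quit": "Quit",
    "settings.line.width": "Width",
}

# Identifiers that were actually rendered during this session.
seen = set()

def label_for(string_id, translator_mode=False):
    """Return (text, tooltip) for a control.

    The tooltip carries the identifier only in translator mode,
    so normal users never see it.
    """
    seen.add(string_id)
    text = STRINGS.get(string_id, string_id)
    tooltip = string_id if translator_mode else ""
    return text, tooltip

def not_yet_visited():
    """Identifiers the translator has not had on screen yet."""
    return sorted(set(STRINGS) - seen)
```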


The goal in my mind was to try generating context information for translators as automatically as possible while retaining the usual workflow developers would use in a given framework for localisable resources.

But yes, the requirement to cover every control, menu and dialog, as well as every possible code path that uses a localisable string from the source code, makes the whole endeavour very impractical. With dialogs built in markup, e.g. Qt's .ui or XAML, it's easy enough to give context, but the hard part is strings in code, where you never know where they'll end up.



