Translation Quality Assessment Matrix

Rajesh Ranjan, the man behind the FUEL Project, came up with the idea of building a formula to help assess the quality of translations. To fulfill this need, we worked together to prepare the Translation Quality Assessment Matrix, TQAM.

TQAM was released at the recent FUEL GILT Conference 2013 under the GNU GPL v3 license. To my knowledge, TQAM is the first of its kind in the field of Open Source Localization.

TQAM is available here: http://fuelproject.org/tqam/index

What is TQAM? – A Concept

TQAM is a web-based tool that helps you assess a translation.

Why would you need TQAM? – The Advantage

If the answer to any of the following questions is yes, you may need it.

  • Want to assess translations?
  • Want to identify translators’ strengths or improvement areas?
  • Want to figure out whether a particular translation issue is critical or a blocker?

Examples:

  • I have been a Gujarati translator for the Fedora Project for the past couple of years, and a new member has requested to join the Gujarati localization efforts. How do I know whether the new member is really capable of translating? What are their areas of improvement? Should I approve or deny the request?
  • I am the Hindi language coordinator for the LibreOffice localization project and I would like to assign “Translator”, “Reviewer” and “Committer” roles to different contributors in my group. How do I identify and differentiate their strengths and areas of improvement and assign roles accordingly?
  • I am a localization lead for the Mozilla localization project and a member has requested to start a new language that is unknown to me. How do I know whether that person is really a native speaker of that language?

Of course, you can make a rough judgment and decide, but there is no concrete, shared basis behind such decisions. TQAM provides a common tool and process across evaluators and brings consistency to translation assessment.

How to use TQAM? – The Process

TQAM assesses translations using a formula built from Error Categories, Assessment Parameters & Sub-Parameters, and Error Points.

Error Categories:
Trivial – can ignore
Venial – can forgive
Critical – must be fixed
Blocker – unacceptable

Parameters:
Accuracy
Language
Technical Validation
Locale
Style & Structure
Terminology

and various sub-parameters under each of these parameters.

Definitions and descriptions of each of these parameters and sub-parameters can be found on the TQAM web tool.

Steps:

  1. Enter the word count (of the source) in the “Total Words” section
  2. Go through the translations manually
  3. Enter the number of errors found in the translation against the respective parameters/sub-parameters
  4. Hit Enter or click the “Calculate” button when you are done
  5. At the end of the assessment, TQAM produces a result in terms of marks out of 100
  6. TQAM also gives you a score for each parameter (the sketch after these steps illustrates the general idea)
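
The exact formula and weights live in the TQAM tool itself, so nothing below is authoritative; it is only a minimal Python sketch of the general idea, i.e. error counts weighted by category and normalized against the total word count. The weights, names and input structure are my own assumptions for illustration, not TQAM's actual values.

    # Illustrative sketch only: the weights, names and formula below are
    # assumptions, not TQAM's actual values, which are defined by the tool.
    CATEGORY_POINTS = {"trivial": 0.25, "venial": 0.5, "critical": 1.0, "blocker": 2.0}

    def tqam_like_score(total_words, errors):
        """Return (overall_score, per_parameter_scores), each out of 100.

        `errors` maps a parameter name (e.g. "Accuracy", "Terminology") to a
        dict of {category: error_count} found while reviewing the translation.
        """
        per_parameter = {}
        total_penalty = 0.0
        for parameter, counts in errors.items():
            penalty = sum(CATEGORY_POINTS[cat] * n for cat, n in counts.items())
            # Normalize the penalty against the size of the source text.
            per_parameter[parameter] = max(0.0, 100.0 * (1 - penalty / total_words))
            total_penalty += penalty
        overall = max(0.0, 100.0 * (1 - total_penalty / total_words))
        return overall, per_parameter

    # Example: a 500-word source text reviewed against two parameters.
    score, breakdown = tqam_like_score(
        500,
        {"Accuracy": {"critical": 2, "venial": 3},
         "Terminology": {"trivial": 4}},
    )
    print(round(score, 1), breakdown)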

Inputs/Feedback/Suggestions:

I have already started receiving feedback at https://fedorahosted.org/fuel/wiki/fuel_tqam/feedback. You can jump in there too, or do not hesitate to drop an e-mail to the mailing list: https://fedorahosted.org/mailman/listinfo/fuel-discuss

Thank You!
Ankit Patel

Contextualization for Software Localization

The localization industry has been revamping itself and coming up with lots of improvements these days. The number of languages in most projects’ portfolios keeps growing. Lots of good news around! At the same time, there are a number of challenges yet to overcome, e.g. data encoding, file formats, machine translation, online translation tools, translation memory, spell checkers, metadata, insufficient information/reference about translation messages, and you know there are many more on this list…

My focus in this blog post is to highlight and help overcome the issue of insufficient information/reference about translation messages, in other words, the lack of context for translation messages. Lack of context for translation messages is a critical issue for the whole localization industry today, not just for translators! For example, how would you translate the message “Empty Trash” in your native language? As a statement that the trash is empty, or as a command to make the trash empty? Unless you see the actual message in the user interface and use the functionality, you cannot translate these kinds of ambiguous messages correctly!
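
One existing mechanism for keeping such readings apart is gettext’s message context. The snippet below is only a sketch, in Python, of how a hypothetical application could mark the two meanings of “Empty Trash” differently; the application name, locale directory and variable names are placeholders, not from any real project.

    import gettext

    # Load the application's catalog; "myapp" and the locale directory are
    # placeholders for this sketch. fallback=True keeps the code working
    # even when no catalog is installed.
    translation = gettext.translation("myapp", localedir="locale", fallback=True)
    pgettext = translation.pgettext  # context + message, available since Python 3.8

    # The same English text carries two different meanings; a short context
    # string keeps them apart for translators.
    menu_item = pgettext("menu action", "Empty Trash")   # command: make the trash empty
    status = pgettext("trash state", "Empty Trash")      # statement: the trash is empty

When the strings are extracted (for example with xgettext and --keyword=pgettext:1c,2), each context becomes a separate msgctxt entry in the PO file, so the translator sees two distinct messages instead of one ambiguous one.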

Projects like Deckard are a great help in such cases; however, I think that is not the only, or the real, solution to the problem we are facing today!

During the Open Source Language Summit organized at the Red Hat office in Pune, I got a chance to meet the awesome members of the Wikimedia Language Engineering team. I came to know that Siebrand Mazeland reviews almost every message of the source material that comes to translatewiki.net before it goes to translators. We call it i18n review. Hats off, Siebrand! Not an easy job! Certainly, this is very helpful to translators.

For localizers translating the many open source projects where nobody reviews translatable messages or provides context, translation becomes one of the most challenging jobs at times!

How can we fix the problem?

Here is another solution; not an easy one, though I would like to give it a try!

Concept: Review and modify the source code of software applications to add comments, references or context for the translatable messages, making them more meaningful and easier for translators to understand, which will help us get the best localization in the industry!
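
For gettext-based projects, one concrete way to add such context in the source itself is a translator comment placed right above the marked string: xgettext copies comments that start with a chosen tag (commonly TRANSLATORS:) into the PO file when run with --add-comments. A minimal Python sketch, with hypothetical function and file names:

    from gettext import gettext as _

    def empty_trash_button_label():
        # TRANSLATORS: This is a button label, i.e. a command asking the user
        # to empty (clear out) the trash folder. It is NOT a statement that
        # the trash is already empty.
        return _("Empty Trash")

Running something like xgettext --add-comments=TRANSLATORS -o myapp.pot myapp.py places that comment next to the msgid as an extracted (“#.”) comment, which PO-based translation tools then show to the translator. Patches that add exactly this kind of comment are what the group described below would produce.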

Skill-sets required:

  • Software Engineer – to help us create patches for software source code and communicate with upstream developers about them
  • Quality Engineer – to help us understand the functionality of the software and the usage of the translatable messages in it
  • Technical Writer – to help us write better translatable messages

Strategy:

  1. Form a group of people covering each of the skill-sets mentioned above!
  2. List important open source software projects from Fedora, GNOME, KDE, Firefox, LibreOffice, etc.
  3. Start approaching them with the concept and idea behind the initiative
  4. Figure out contextual information for each and every translatable message
  5. Accordingly, provide source code updates or patches wherever possible
  6. Ensure that the contextual information goes into the translatable file formats provided to translators
  7. Translators then receive the contextual information in the translation files they work with
  8. And we receive good quality translations!
  9. That’s pretty much all I guess!

I believe there is always scope for further improvements and changes in the strategy or approach; we can settle on the details as soon as we form a group.

If we are able to convince even a single software application’s developers of the idea and get their buy-in, I think we have a ray of hope that the movement can begin!

Interested in doing something for software localization? Got a better idea than this one? Have suggestions to make? Please do provide your comments, and let’s work together to start fixing software localization issues one by one!

Thank You for reading!
Ankit Patel

“Unicode Text Rendering Reference System” brings six more Indic languages

The “Unicode Text Rendering Reference System” is also referred to as UTRRS.

A few months ago, the UTRRS developer community announced a development instance of UTRRS, which is hosted on Red Hat’s OpenShift and can be found here: http://utrrs-testing.rhcloud.com/

Initially, UTRRS covered just five languages (Bengali, Gujarati, Hindi, Punjabi and Tamil). With the addition of six more Indic languages (Assamese, Kannada, Malayalam, Marathi, Oriya and Telugu), UTRRS now covers a total of 11 Indic languages for developers, localizers and native language users.

This effort helps reduce the gap in understanding of script combinations between developers and language experts. It also serves as a standard reference point for anybody looking for accurate native-script grammar data.

Please spend some time going through it and provide your valuable feedback on the UTRRS mailing list.

Thanks for reading,
Ankit

Fedora Language Testing Group

I have been engaged in localization activities for various projects including Fedora, but nowhere have I noticed a dedicated group of volunteers who not only translate or localize software but also test their languages in a professional manner.

At the beginning of March, a new group called “FLTG” (Fedora Language Testing Group) was formed as one of Fedora’s SIGs (Special Interest Groups).

FLTG folks take care of developing the testing strategy, infrastructure and test cases, filing and triaging bugs, helping testers, and many more things associated with language testing for Fedora. You may want to join this movement and contribute. The initial announcement and more information about the group are available here: http://lists.fedoraproject.org/pipermail/trans/2012-March/009703.html

  • If you are a localizer for Fedora, this is the time you want to join this group and help test your language.
  • If you are a native language user of Fedora, this is the time to test your language and get help from FLTG

Show some love for your language and help testing languages in Fedora!

/Ankit

Ruby on Rails Application Deployment with Passenger on Fedora

Installation:

  • Install apache development libraries
    • yum -y install httpd-devel
  • Install Passenger (Ruby on Rails Server used for Deployment)
    • gem install passenger
  • Install the Passenger Apache module (Passenger binding with Apache); the installer prints the LoadModule/PassengerRoot/PassengerRuby lines used in the next step
    • passenger-install-apache2-module

Configure a virtual host for your RoR apps on the Apache server

  • cd /var/www/html
  • ln -s ~/myapp/public myapp
  • edit /etc/httpd/conf/httpd.conf
    • LoadModule passenger_module /usr/lib/ruby/gems/1.8/gems/passenger-3.0.6/ext/apache2/mod_passenger.so
    • PassengerRoot /usr/lib/ruby/gems/1.8/gems/passenger-3.0.6
    • PassengerRuby /usr/bin/ruby
    • <VirtualHost *:80>
    •       RailsBaseURI /myapp
    •       <Directory /var/www/html/myapp>
    •            Options -MultiViews +FollowSymLinks
    •       </Directory>
    • </VirtualHost>

Modify the Apache configuration on the proxy server

  • edit /etc/httpd/conf/httpd.conf
  • add the following lines
    • <VirtualHost *:80>
    •       ProxyPass /myapp http://myhost/myapp
    •       ProxyPassReverse /myapp http://myhost/myapp
    • </VirtualHost>

Some quick tips

  • Use Rails helper tags instead of direct HTML tags (e.g. ‘image_tag’ in place of ‘img src’, ‘link_to’ or ‘url_for’ in place of ‘a href’, etc.)
  • Use javascript_include_tag and stylesheet_link_tag when stylesheets, images or JavaScript files are not loading.
  • Make sure the public directory of ‘myapp’ has the necessary permissions for the apache user
  • If you are dealing with an SELinux-enabled server, you might want to take care of SELinux permissions for Passenger as well (with SELinux enabled, permissive mode runs without any error! Hint!!!)
  • The sub-URI has to be the same on the host machine and on the proxy when you are doing ProxyPass and ProxyPassReverse.
  • When you get the “NameError: uninitialized constant ActiveSupport::Dependencies::Mutex” error with Rails 2.3.x, add the line require ‘thread’ to the script/server file
  • Helpful documentation: http://www.modrails.com/documentation/Users%20guide%20Apache.html

Feedback, expert comments welcome!

Thank You!
Ankit