[LON-CAPA-dev] Internationalization opinion

Jeremy Bowers lon-capa-dev@mail.lon-capa.org
Mon, 22 Sep 2003 15:18:39 -0500


As I'm looking at internationalizing the helper, I'm wondering whether it 
would be OK to go ahead and create some descriptive identifiers for the 
messages and program the messages into an "en.pm" file. The reason is 
that the helpers have messages like this:

     <message>
       <p>LON-CAPA has live chat functionality.  This course will receive
          its own chat room.  You may deny students, TAs, or instructors
          the right to access the chat room.</p>
       </message>

complete with the leading spaces and the HTML tags. Getting that 
*exactly* right in the [lang].pm files is going to be one PITA... well, 
actually, quite a lot of PITAs.

I'm thinking the best way forward for the helpers is to "emulate" the 
&mt calls. Right now the contents of the <message> tag are just picked up 
and dropped onto the user's screen. I'm thinking of adding a 
pre-processing function to lonlocal that runs a regex substitution over 
a string, something like


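sub fakeMT {
	my ($text) = @_;
	# Replace each embedded &mt("...") or &mt('...') with its translation.
	$text =~
	    s/&mt\(
	      \s*(['"])   # opening quote
	      (.*?)       # the message key
	      \1\s*       # matching closing quote
	      \)
	     /&mt($2)/xseg;
	return $text;
}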

(though I haven't tested that yet)

Then run all the strings in the helper through the fakeMT function.

Thus, the above message would become

     <message>
       <p>&mt("LON-CAPA has chat functionality")</p>
       </message>

When the lonhelper.pm code automatically runs this through the fakeMT 
function, it outputs the right message.
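
To make the round trip concrete, here is a minimal, self-contained sketch. 
The %lexicon hash and the mt stub below are stand-ins I made up for 
illustration; the real &mt and the real en.pm layout may well differ. It 
also assumes the key inside &mt(...) is a plain quoted literal with no 
escaped quotes in it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an en.pm lexicon: short key => full text.
my %lexicon = (
    'LON-CAPA has chat functionality' =>
        'LON-CAPA has live chat functionality.  This course will receive '
      . 'its own chat room.  You may deny students, TAs, or instructors '
      . 'the right to access the chat room.',
);

# Stub for the real &mt (an assumption): return the lexicon entry,
# or the key itself when there is no entry.
sub mt {
    my ($key) = @_;
    return exists $lexicon{$key} ? $lexicon{$key} : $key;
}

# Expand every &mt("...") or &mt('...') embedded in a string.
sub fakeMT {
    my ($text) = @_;
    $text =~ s/&mt\(\s*(['"])(.*?)\1\s*\)/mt($2)/gse;
    return $text;
}

my $message = '<p>&mt("LON-CAPA has chat functionality")</p>';
print fakeMT($message), "\n";
```

The helper XML then only has to carry the short key, and the whitespace 
and markup problems stay confined to one place per language.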

Now, I realize this decreases the ability of a translator to just pick 
up a message file and create a translation, but IMHO they need to know 
the context of the message anyhow. The question is which is better: 
requiring the translator to get the context and write a complete 
translation from the en.pm file, OR subjecting the translator to this:

      "      <p>LON-CAPA has live chat functionality.  This course will 
receive\n         its own chat room.  You may deny students, TAs, or 
instructors\n         the right to access the chat room.</p>"
      => "<p>Der LON-CAPA has unt liven chatten unctionalityfay. Cette 
corsage (etc. etc.)",

with absolutely no margin for error in the spacing or formatting (with 
the \n's). It won't be "trivial" either way. (And of course it won't be 
wrapped in the source code, unless somebody manually wraps it, which is 
one more little programming bit the translator needs to understand...)

Also would it be worth it to write a little program to iterate over our 
source files and pick out calls to &mt() automatically, and update the 
[lang].pm files? (Maybe somebody has already written one out there on 
the 'net?) That way we could also automatically annotate where the text 
came from, AND provide a complete file for a new translator to work with.

This would result in output like:

      # loncommon line 884:
      'Perplexing the student' =>
      'Lorem ipsum etc.',

      # loncommon line 1024:
      # UNTRANSLATED
      'Afflicting the student with a wide variety of skin ailments' =>
      'Afflicting the student with a wide variety of skin ailments',

      # location unknown
      # APPARENTLY DEPRECATED
      'The aptly named Sir Phrase-not-appearing-in-this-program' =>
      'silly gibberish here'

Of course we'd take the current files in as input (as shown in this 
snippet) to keep the original translations.
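
A rough sketch of what such a script could look like, just to show the 
idea is small. Everything here (find_mt_calls, annotate_lexicon, quote, 
the sample data) is a hypothetical name of mine, not existing LON-CAPA 
code. It assumes each &mt call takes a quoted literal on a single line, 
and it treats "no entry in the existing file" as untranslated.

```perl
use strict;
use warnings;

# Find &mt('...') / &mt("...") calls with literal string arguments.
# Returns a list of [key, line_number] pairs.
sub find_mt_calls {
    my ($source) = @_;
    my @found;
    my $line_no = 0;
    for my $line (split /\n/, $source) {
        $line_no++;
        while ($line =~ /&mt\(\s*(['"])(.*?)\1\s*\)/g) {
            push @found, [$2, $line_no];
        }
    }
    return @found;
}

# Single-quote a string for output, escaping backslashes and quotes.
sub quote {
    my ($s) = @_;
    $s =~ s/([\\'])/\\$1/g;
    return "'$s'";
}

# Merge the calls found in one file with the existing translations
# (key => translation) and return annotated lexicon text.
sub annotate_lexicon {
    my ($file, $calls, $existing) = @_;
    my %seen;
    my $out = '';
    for my $call (@$calls) {
        my ($key, $line_no) = @$call;
        next if $seen{$key}++;    # annotate the first occurrence only
        $out .= "# $file line $line_no:\n";
        my $translation = $existing->{$key};
        if (!defined $translation) {
            $out .= "# UNTRANSLATED\n";
            $translation = $key;
        }
        $out .= quote($key) . " =>\n" . quote($translation) . ",\n\n";
    }
    # Keys in the old file that no longer appear in the source.
    for my $key (sort grep { !$seen{$_} } keys %$existing) {
        $out .= "# location unknown\n# APPARENTLY DEPRECATED\n";
        $out .= quote($key) . " =>\n" . quote($existing->{$key}) . ",\n\n";
    }
    return $out;
}

# Made-up sample input for demonstration.
my $source = <<'END';
my $title = &mt('Perplexing the student');
print &mt("Afflicting the student");
END
my %existing = (
    'Perplexing the student' => 'Lorem ipsum etc.',
    'Old phrase'             => 'silly gibberish here',
);
print annotate_lexicon('loncommon', [find_mt_calls($source)], \%existing);
```

A real version would need to walk all the source files and cope with 
multi-line or computed arguments to &mt, which this sketch simply misses.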

Yes, I'm volunteering if it's deemed worthwhile.