XML and Internationalization...

Walter Smith WalterS at ile.com
Fri Nov 13 00:27:39 GMT 1998


> -----Original Message-----
> From: Deke Smith [mailto:deke at tallent.com]
> Sent: Monday, November 09, 1998 11:45 AM
> To: xml-dev at ic.ac.uk
> Subject: Re: XML and Internationalization...
> 
><snip/>
> 
> Here's my question: 
> 
> As I understand it, TMX is a format for translation 
> "dictionaries" -- or lists of equivalent words, phrases, 
> sentences or paragraphs in different languages. TMX also 
> allows the preservation of formatting within phrases, such as
> boldface, italic, etc.
> 
> I always judge tools by what *I* need from them and that is 
> what I need from TMX. Is it meant to do more than what I have 
> asked it to do? Is this "dictionary" concept something TMX is 
> *meant* for?
> 
> I am under the impression that TMX can also have embedded 
> "macros" within phrases. By "macro", I mean processing 
> commands that may be understood only by a specific scripting 
> language. Am I right?

We served as technical chair of the group of localization companies that
participated in the creation of the TMX format. As Tony said in an earlier
email, it was conceived as a Translation Memory Exchange format. In the
translation/localization biz, we use these translation memory tools (really
nothing more than bi-text databases) to capture prior translation effort and
reuse it wherever applicable. Most of the data/file types that
localization companies traditionally encounter are _not_ native SGML/XML. As
such, vendors are left to their own devices to decide how to process the
plethora of proprietary formats (resource files, DTP files, etc.) to
efficiently access the embedded translatable text. Needless to say, everyone
has come up with very different ways of doing this. The TMX format simply
seeks to provide a pragmatic way of exchanging TM data among disparate
environments, and really nothing more.
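To make the "bi-text database" idea concrete, here is a minimal sketch in
Python of loading a TMX-style document into a source-to-target lookup table.
The tu/tuv/seg element names come from the TMX specification; the sample
sentences, the trimmed header, the load_memory helper, and the choice of
xml:lang (some versions of the format use a plain "lang" attribute instead)
are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Illustrative TMX-style sample. Header attributes are trimmed for brevity,
# and the sentences are made up for this sketch.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrez le fichier.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Save changes?</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrer les modifications ?</seg></tuv>
    </tu>
  </body>
</tmx>"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def load_memory(tmx_text, src, tgt):
    """Map each source-language segment to its target-language segment."""
    memory = {}
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Accept either attribute spelling (assumption, see lead-in).
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                # Concatenate all text inside <seg>, including any text
                # carried by nested inline elements.
                segs[lang] = "".join(seg.itertext())
        if src in segs and tgt in segs:
            memory[segs[src]] = segs[tgt]
    return memory


memory = load_memory(SAMPLE_TMX, "en", "fr")
```

A translation tool would then consult memory before translating a new
segment; real tools also do fuzzy matching, which this exact-match lookup
leaves out.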

Another translation-related tag set is the OpenTag format
(http://www.opentag.org), which we launched over a year ago and have been
collaborating on with others in the localization industry. It specifically
seeks to provide a common method to mark up textual data that is extracted
from functional/presentational/structural markup for the purposes of
language translation or any sort of NLP activity. The OpenTag schema was
specifically designed to abstract source differences at the element level,
while disambiguating context issues at the attribute level. Again, the data
types we're dealing with are primarily everything but markup languages
(MLs). It's flexible enough to handle just about any data you can parse and
extract from your source environment, and there are even elements and
attributes that can be employed to introduce additional information into
the data. While it wasn't
originally conceived to be a tag set for data creation, you may find the
flexibility you're looking for. You would then get the added bonus that your
data would already be ready for processing by any OpenTag-aware translation
tools and environments.
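
As a rough illustration of "abstract source differences at the element
level, disambiguate context at the attribute level", the Python sketch below
wraps strings extracted from two different source formats in one common
element and records where each came from in attributes. The element and
attribute names (extracted, unit, origin, context, id), the sample entries,
and the extract_units helper are all hypothetical; they are NOT OpenTag's
actual tag set, which is defined at the URL above.

```python
import xml.etree.ElementTree as ET


def extract_units(entries):
    """Wrap (source_format, context, identifier, text) tuples in common markup."""
    root = ET.Element("extracted")  # hypothetical root element
    for source_format, context, ident, text in entries:
        # The same element is used whatever the source format was...
        unit = ET.SubElement(root, "unit")
        # ...and attributes carry the origin and context of the text.
        unit.set("origin", source_format)
        unit.set("context", context)
        unit.set("id", ident)
        unit.text = text
    return root


# Made-up strings standing in for text pulled from a resource file and a
# DTP file by format-specific parsers (not shown).
entries = [
    ("rc-file", "menu-item", "IDM_OPEN", "Open..."),
    ("dtp-file", "heading", "h1-3", "Getting Started"),
]

doc = extract_units(entries)
xml_text = ET.tostring(doc, encoding="unicode")
```

The point of the design is that a downstream translation tool only has to
understand one element, and can use the attributes to decide, for example,
that a menu item and a heading need different handling.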

Cheers,
Walter
----------------------------------------------
Walter L. Smith (walters at ile.com)
Emerging Technology Analyst
International Language Engineering Corporation
5700 Flatiron Parkway
Boulder, CO 80301
303-245-7584 (vox)
303-596-7343 (cel)
303-245-7973 (fax)

