Mark which parts to ignore in a translation

  • Question

  • Sometimes Translator seems to translate English text inside HTML tags into the target language, which breaks the tags. What markers or tags can I use to specify which parts of a text should NOT be translated (i.e., left as-is)?
    Monday, September 26, 2011 1:08 AM

Answers

  • Hi David,

    In an HTML document, you can mark an element as not to be translated with its:translate="no" or class=notranslate.
    Example:

    <p><span class=notranslate>Do not translate this</span>Do translate this.</p>

    If your text is not HTML originally, you can make it HTML by enclosing it in the appropriate tags.

    Let us know if this helps,
    Chris Wendt
    Microsoft Translator


    Tuesday, April 15, 2014 11:18 PM
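
As an illustration of the answer above, here is a minimal Python sketch of turning plain text into an HTML fragment in which user-marked sections are wrapped in a notranslate span. The function name `to_html_with_notranslate` and the idea of passing protected sections as (start, end) character offsets are assumptions made for this example; they are not part of the Translator API. The result would be submitted with the HTML content type.

```python
import html


def to_html_with_notranslate(text, protected_spans):
    """Return an HTML paragraph in which every protected (start, end)
    character range of `text` is wrapped in <span class="notranslate">.

    Hypothetical helper for illustration only.
    """
    parts = []
    cursor = 0
    for start, end in sorted(protected_spans):
        if start < cursor or end > len(text) or start >= end:
            raise ValueError("protected spans must be non-empty, in range, and non-overlapping")
        # Escape the translatable text so stray '<' or '&' cannot break the markup.
        parts.append(html.escape(text[cursor:start]))
        parts.append(
            '<span class="notranslate">' + html.escape(text[start:end]) + "</span>"
        )
        cursor = end
    parts.append(html.escape(text[cursor:]))
    return "<p>" + "".join(parts) + "</p>"


fragment = to_html_with_notranslate("Meet on 03/31/14 at noon.", [(8, 16)])
print(fragment)
```

After translation, the wrapper markup has to be stripped again and the HTML entities unescaped if the final output is meant to be plain text.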

All replies

  • Hi Sime,

    If your text is HTML, specify ContentType = "text/html" in the parameters to the Translate() method. Make sure the text you submit contains the complete HTML element you are translating. Also avoid putting a complete sentence in any of the element's attributes.

    If you still have trouble, please post your source text here, along with the method you are calling.

    Chris Wendt
    Microsoft Translator

    Monday, September 26, 2011 6:10 AM
  • We have text that is not HTML that we'd like to translate. However, the user can mark a section of this text as not to be translated.
    We then wrap that section in angle brackets, in the hope that the Translator service will ignore it.
    Unfortunately, some items in the "not-to-be-translated" section are still being translated. For example, dates are changed from mm/dd/yy to dd/mm/yy. There are other cases too.

    Is there a way to mark an entire segment of a translation request so that the translator ignores it?

    Or will we have to do something different, such as comparing each "not-to-be-translated" section of the original with the corresponding segment in the translation, and then replacing that segment with the original value?


    David Hollowell

    Monday, March 31, 2014 5:15 PM
  • Thank you Chris!

    David Hollowell

    Sunday, April 20, 2014 7:42 PM
  • I have a similar issue with text that I do not want translated. I am using JSON. I have the following segment:

    Father had finished for the day, switched off the shop lights and closed the <bx ctype="italic" id="2" rid="2"/>shutters;

    I don't want the content between the angle brackets translated, but Microsoft returns the following:

    Vater hatte für den Tag beendet, die Shop-Lichter ausgeschaltet und geschlossen die < Bx Ctype = 'kursiv' Id = '2' rid = '2' / > Fensterläden;

    Any suggestions on how to tell Microsoft not to translate the formatting tag?

    Thanks,


    Saul

    Wednesday, December 3, 2014 9:33 PM
  • I know this post is very old, but someone may come across the same issue, so here goes:

    Saul_R, you may want to make an array of all the tag prefixes, such as "<bx" and "<something", and then write a function that wraps each matching tag using Chris's idea (comment above).

    Original String: <bx ctype="italic" id="2" rid="2"/>

    New String: <span class="notranslate"> <bx ctype="italic" id="2" rid="2"/> </span>

    I know it's not the best approach, but at least it worked for me. :)

    Friday, July 24, 2015 11:13 PM
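
The wrapping idea in the post above could be sketched in Python roughly as follows. The tag names in `TAG_NAMES` and the helper names `protect_tags` and `unprotect_tags` are hypothetical, chosen for this example. The sketch also assumes the service returns the wrapper span exactly as it was sent, which this thread does not guarantee.

```python
import re

# Hypothetical list of inline tag names to protect; adjust to your own format.
TAG_NAMES = ["bx", "ex"]

WRAP_OPEN = '<span class="notranslate">'
WRAP_CLOSE = "</span>"


def _tag_pattern(names):
    """Match an opening, closing, or self-closing tag with one of the given names."""
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(r"</?(?:%s)\b[^>]*>" % alternation)


def protect_tags(text, names=TAG_NAMES):
    """Wrap every matching tag in a notranslate span before translation."""
    return _tag_pattern(names).sub(
        lambda m: WRAP_OPEN + m.group(0) + WRAP_CLOSE, text
    )


def unprotect_tags(text):
    """Remove the wrapper spans again, assuming they came back unchanged."""
    pattern = re.compile(re.escape(WRAP_OPEN) + r"(.*?)" + re.escape(WRAP_CLOSE), re.DOTALL)
    return pattern.sub(lambda m: m.group(1), text)


source = 'closed the <bx ctype="italic" id="2" rid="2"/>shutters;'
wrapped = protect_tags(source)
print(wrapped)
print(unprotect_tags(wrapped) == source)
```

Note that `unprotect_tags` strips every notranslate span, including any that were already present in the source, so it only suits input that has none of its own.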
  • If you are using contentType="text/html", mark the element you don't want translated with a class=notranslate attribute.

    If you are using contentType="text/plain", replace the section you don't want translated with a Twitter-style hashtag such as #section1, and restore the original text after translation.

    Be careful when using non-HTML elements, such as generic XML, in HTML mode. That will produce unpredictable results, because Translator doesn't know whether your non-HTML tag is sentence-breaking or not. You can write an XSL transform from your original XML to HTML, translate, and then transform back. If it doesn't _look_ right when viewed as HTML, it won't translate right.

    HTH,
    Chris Wendt
    Microsoft Translator

    Saturday, July 25, 2015 12:16 AM
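
For the text/plain route described above, a minimal Python sketch of the escape and unescape steps might look like this. The helper names `escape_sections` and `unescape_sections`, and the use of (start, end) character offsets to identify the protected sections, are assumptions for illustration. The "translated" string in the usage lines is a hand-written stand-in, not real service output.

```python
import re


def escape_sections(text, spans):
    """Replace each protected (start, end) range with a token like #section1.

    Returns the escaped text and a token-to-original mapping.
    Hypothetical helper; offsets must not overlap.
    """
    parts = []
    mapping = {}
    cursor = 0
    for index, (start, end) in enumerate(sorted(spans), start=1):
        if start < cursor or end > len(text) or start >= end:
            raise ValueError("protected spans must be non-empty, in range, and non-overlapping")
        token = "#section%d" % index
        parts.append(text[cursor:start])
        parts.append(token)
        mapping[token] = text[start:end]
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts), mapping


def unescape_sections(translated, mapping):
    """Put the original sections back in place of their tokens."""
    # Match the whole number so that #section10 is never read as #section1.
    return re.sub(
        r"#section\d+",
        lambda m: mapping.get(m.group(0), m.group(0)),
        translated,
    )


escaped, mapping = escape_sections("Meet on 03/31/14 at noon.", [(8, 16)])
print(escaped)

# Stand-in for the text returned by the translation call.
translated = "Treffen am #section1 mittags."
print(unescape_sections(translated, mapping))
```

This only works if the token comes back unchanged, so it is worth checking after each call that every token in the mapping is still present in the translated text.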