Incorrect translation returned when content has mixed languages

  • Question

  • Hi,

    We are using `Translate` and `Detect`, on a paid subscription, to translate content on our website. Since most of this content is user-generated, there are many instances where the content our users submit contains mixed languages: English and non-English characters.

    We assumed that the Translate API would translate the English and non-English characters respectively, but this doesn't seem to work for us. Below is the sample text we have:

    Please send us a message to confirm the availability and rates before making a booking request. You can do so by this website's online messaging system!\n\nStyle of classic European elegance and grandeur with modern functionalities, this apartment comprises 2 bedrooms, a living room, fully furnished residence with separate balcony, dining, kitchen and bedroom areas.\n\nIt also has a fully equipped kitchenette, Cookers, Tea set and Port. Features also a washing machine, microwave, toaster and a refrigerator.Complimentary high speed wi-fi access, television, Comprehensive cable channels.\nBath Room: Bath Amenities\n\nSurrounded by traditional shop houses and a local food haven, guests will be treated with a vibrant atmosphere be it day or night, allowing an opportunity for all to immerse in the Singapore culture. \n\nThe bus stop at the gate. Just walking distance from Eunos MRT (Mass Rapid Transit) and 15 minutes travel time to Singapore Changi Airport as well as the city, It provides easy access to the various destinations well-known in Singapore.\n它坐落于富有丰富的历史和文化的东部如切区。全新的公寓大厦Tivoli Granda,附设一系列的设施,包括健身房,游泳池和烧烤坑。公寓内部的装修设计及家具用品,也都是全新的。\n「迦南套房」以欧洲经典和优雅高贵,并配合现代化功能的设计风格为主。业者秉持以客为尊的人性化经营方式,并提供一个精致高雅的全方位度假公寓服务,目的是要让住客在访新加坡期间,享有一个美好的旅程及回忆。\n四周环绕着具特色的商店及古迹建筑物,同时本地美食遍布四周。无论昼夜,住客都可在此享受一个富有活力及魅力并充满古迹的特色文化区里。\n\n迦南套房位于交通便利地区。步行即可到达购物中心及地铁站。15分钟的车程抵达新加坡机场和乌节路购物中心区。门口即有巴士站,可通往各地区。

    When we pass this text to the `Detect` language API it returns zh-CHS. But when I tried to `Translate` this to zh-CHT or zh-CHS (since we want the English part to be translated as well) it returns the same string. This happens when calling your API via our code or by using this tool https://datamarket.azure.com/dataset/explore/bing/microsofttranslator.

    However, when I tried to translate this to "fra" both English and non-English parts get translated.

    Is there any way around this issue? Also, do you have any documentation on how Translator detects the language of a given text, especially if it has both English and non-English characters?

    Hope to hear from you soon.



    • Edited by beverly555 Thursday, August 21, 2014 7:36 AM
    Thursday, August 21, 2014 5:20 AM

All replies

  • Hello Bev,

    Thanks for using the Microsoft Translator API.

    The detect function acts on the entire string passed to it, and will return the single language with the highest confidence. In your example this is zh-CHS. When you translate this string without specifying the source language, the language will be detected as zh-CHS, and because you specified zh-CHS as target language, Translator does nothing.

    In order to translate the English portion, specify en as source language. The Chinese already in it will pass through.
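
    A minimal sketch of such a request in Python. It assumes the V2 HTTP endpoint `http://api.microsofttranslator.com/V2/Http.svc/Translate` with `text`, `from`, and `to` query parameters; please check the endpoint and parameter names against the documentation for your subscription. `build_translate_url` is a hypothetical helper, and authentication and the actual HTTP call are left out.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify it against the Translator documentation.
TRANSLATE_ENDPOINT = "http://api.microsofttranslator.com/V2/Http.svc/Translate"


def build_translate_url(text, target, source=None):
    """Build a Translate request URL (hypothetical helper).

    If `source` is omitted, the `from` parameter is left out and the
    service is expected to detect the source language itself.
    """
    params = {"text": text, "to": target}
    if source is not None:
        params["from"] = source  # explicit source language, e.g. "en"
    # urlencode percent-encodes newlines and non-ASCII characters as UTF-8.
    return TRANSLATE_ENDPOINT + "?" + urlencode(params)


# Mixed English/Chinese input: name "en" as the source explicitly.
mixed_text = "The bus stop is at the gate.\n门口即有巴士站。"
url = build_translate_url(mixed_text, target="zh-CHS", source="en")
```

    Naming the source explicitly avoids relying on detection, which returns one language for the whole string.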

    Let us know if this helps,
    Chris Wendt
    Microsoft Translator

    Wednesday, September 3, 2014 12:43 PM
  • Hi Chris,

    Thanks for your reply. The thing is, since our content is user-generated, we won't know which characters it contains or which language is more likely to be detected as the source.

    In the test I did previously, I passed zh-CHS as both the source and target language, but it still didn't translate the English string correctly. You can also test this with the translator tool, using the string I mentioned earlier as the text: https://datamarket.azure.com/dataset/explore/bing/microsofttranslator

    Is this a valid/known issue?



    Monday, September 8, 2014 1:42 AM
  • Hi Bev,

    Please follow the documentation at http://aka.ms/translatormsdn. The API works as I described.

    You can also verify with http://www.bing.com/translator.

    Chris Wendt
    Microsoft Translator.

    Monday, September 8, 2014 10:28 PM