Increasing maximum character limit per request

  • Question

  • Hi,

    Is there a way for developers to submit a large text for translation by Microsoft Translator? I am trying to translate an HTML page of roughly 20k characters. When I called the Translator API over SOAP, it returned an error response saying that the maximum allowed is 10240 characters.

    I tried several workarounds that break the content into smaller pieces (up to 3000 characters each), but that hurt the accuracy of the translated result. Say I have a massive text of about 10k characters and I break it into 3 chunks of roughly 3000 characters each. One of the chunks ends up with unclosed HTML tags, e.g. <div><ul><li>test. When this chunk is translated, the returned result has the HTML tags closed automatically (i.e. <div><ul><li>test</li></ul></div>), which is exactly the problem: I do not want the tags closed, because the content continues in the next chunk.

    If it is not possible to increase the maximum character limit per request, is it possible to turn off this automatic HTML tag completion in Microsoft Translator?

    Thank you in advance for your help.
    Wednesday, January 11, 2012 4:06 AM

Answers

  • Hi Nick,

    The maximum size of a request is 10000 characters. Any text bigger than that needs to be split into multiple requests. When translating HTML, the input needs to be well-formed and complete. You will want to walk the document down to the individual <p> elements and translate them one by one, or use the TranslateArray() method to translate several <p> elements at once.

    Chris Wendt
    Microsoft Translator

    Monday, January 30, 2012 5:19 AM
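
To illustrate the approach in the answer above, here is a minimal Python sketch using only the standard library. It collects each complete <p> element from a document and then packs those elements into batches that stay under the 10000-character limit, so that each batch could be passed to one TranslateArray() call. The names ParagraphCollector and batch_fragments are hypothetical, the actual service call is deliberately left out, and the sketch only gathers <p> elements (text in other elements would need similar handling).

```python
from html.parser import HTMLParser

MAX_REQUEST_CHARS = 10000  # request limit quoted in the answer above


class ParagraphCollector(HTMLParser):
    """Collect each complete <p>...</p> element as its own string."""

    def __init__(self):
        # Keep entities such as &amp; untouched so the markup round-trips.
        super().__init__(convert_charrefs=False)
        self.paragraphs = []
        self._buffer = None  # None while the parser is outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p" and self._buffer is None:
            self._buffer = []
        if self._buffer is not None:
            self._buffer.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> are copied through verbatim.
        if self._buffer is not None:
            self._buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._buffer is None:
            return
        self._buffer.append(f"</{tag}>")
        if tag == "p":
            self.paragraphs.append("".join(self._buffer))
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_entityref(self, name):
        if self._buffer is not None:
            self._buffer.append(f"&{name};")

    def handle_charref(self, name):
        if self._buffer is not None:
            self._buffer.append(f"&#{name};")


def batch_fragments(fragments, limit=MAX_REQUEST_CHARS):
    """Greedily group whole fragments so each batch totals at most `limit` characters."""
    batches, current, size = [], [], 0
    for fragment in fragments:
        if len(fragment) > limit:
            raise ValueError("fragment exceeds the request limit; split it further")
        if current and size + len(fragment) > limit:
            batches.append(current)
            current, size = [], 0
        current.append(fragment)
        size += len(fragment)
    if current:
        batches.append(current)
    return batches
```

Because every fragment is a complete element, no batch ends in the middle of an open tag, which avoids the auto-closing problem described in the question. A single <p> longer than the limit raises an error here rather than being cut at an arbitrary position.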

All replies

  • How is it that the widget can translate an entire page, yet the API is unable to handle more than 10,000 characters? I have recursively split the HTML elements down to 10,000 characters using the HTTP API, and now my HTML pages render incredibly slowly! Any advice on how to use the API so that it works more like the widget? The widget handles the pages fine. The pages are about 50,000k, and we're using WebForms with third-party controls. Any help would be greatly appreciated!

    Thanks,

    Jared Rainey

    Friday, September 27, 2013 7:38 PM
  • Hi Jared,

    The widget translates in chunks of just under 2000 characters, and it issues multiple requests concurrently. The widget renders each translated element as soon as it is translated, which is why you see the page showing something quickly, before the entire document is translated.

    Let us know if this helps,
    Chris Wendt
    Microsoft Translator

    Friday, September 27, 2013 8:46 PM
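
The behaviour described in the reply above (small chunks, several requests in flight, each result shown as soon as it arrives) can be approximated in your own code. The following Python sketch is only an illustration under stated assumptions: translate is a hypothetical caller-supplied function that sends one chunk to the translation service and returns the translated text, and on_result is a hypothetical callback that renders a finished chunk. Neither is part of the Translator API, and this is not the widget's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def translate_concurrently(chunks, translate, on_result, max_workers=4):
    """Translate chunks in parallel and report each one as soon as it is done.

    `translate(chunk)` is a caller-supplied (hypothetical) function that
    performs one request. `on_result(index, text)` is called as each chunk
    completes, which may happen out of order.
    Returns the translated chunks in their original order.
    """
    results = [None] * len(chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the position of its chunk in the document.
        futures = {pool.submit(translate, chunk): i for i, chunk in enumerate(chunks)}
        for future in as_completed(futures):
            index = futures[future]
            results[index] = future.result()  # re-raises if the request failed
            on_result(index, results[index])
    return results
```

The index passed to on_result lets the caller place each translated chunk back into the right element even when responses arrive out of order. The max_workers value is an arbitrary choice for the sketch; a suitable number depends on the service's own limits.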
  • Hi Jared,

    You may consider the HTML Agility Pack to break your HTML document into chunks smaller than 10K: http://www.nuget.org/packages/HtmlAgilityPack.

    Chris Wendt
    Microsoft Translator

    Sunday, September 29, 2013 10:18 PM