locked
msnbot not honoring robots.txt

  • Question

  • I haven't personally experienced it (yet), but webmasters are reporting it left and right, all over the place.

    Is there any particular contact point for webmasters to communicate the issue, if they're experiencing it on their site? Not all will want to post their URL publicly.
    Friday, December 7, 2007 9:52 PM

Answers

  • This is correct. The cloaking detection tool uses the robots.txt file in the same way that MSNBot does. However, because it acts like a browser, it may download all linked elements that are required to render a page. These could include video, CSS, or JavaScript files.

    regards,
    nate
    Tuesday, December 11, 2007 11:58 PM
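
    To see which linked elements a browser-like client might request for a page, a rough sketch along these lines can help. It is only an illustration of the idea described in the answer, not how the cloaking detection tool actually works; the class name, the sample HTML, and the choice of tags are all assumptions.

```python
from html.parser import HTMLParser


class LinkedResourceCollector(HTMLParser):
    """Collect URLs of sub-resources a browser would typically fetch to render a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Scripts, images, and media are referenced through a src attribute.
        if tag in ("script", "img", "video", "source") and attrs.get("src"):
            self.resources.append(attrs["src"])
        # Stylesheets are referenced through <link rel="stylesheet" href="...">.
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower()
            if "stylesheet" in rel:
                self.resources.append(attrs["href"])


# Made-up page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="/js/menu.js"></script>
</head><body>
<a href="/about.html">About</a>
<video src="/media/intro.mpg"></video>
</body></html>
"""

collector = LinkedResourceCollector()
collector.feed(SAMPLE_PAGE)
resources = collector.resources
print(resources)
```

    The ordinary hyperlink to /about.html is not collected, because a browser does not need it to render the page; only the stylesheet, script, and video are.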

All replies

  • I think you found the place, although the communication is a bit one-sided. And if you don't think you've experienced it, search your logs for "LIVSOP"; any hit is probably this bot. I've gotten 15 so far today, and it's a slow day.

    Sunday, December 9, 2007 1:26 AM
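
    The log search suggested above could be sketched as follows. This is a minimal example that assumes an access log where the client IP is the first whitespace-separated field; the function name and the sample log lines (including the referrer URL) are invented for illustration and are not real traffic.

```python
from collections import Counter


def count_marker_hits(log_lines, marker="LIVSOP"):
    """Count log lines containing the marker, grouped by the first field (assumed client IP)."""
    hits = Counter()
    for line in log_lines:
        if marker in line:
            fields = line.split()
            ip = fields[0] if fields else "unknown"
            hits[ip] += 1
    return hits


# Invented sample lines; a real log would be read from a file instead.
sample_log = [
    '192.0.2.10 - - [09/Dec/2007:01:02:03 +0000] "GET /page.html HTTP/1.1" 200 512 '
    '"http://search.example/results?q=test&FORM=LIVSOP" "Mozilla/4.0"',
    '198.51.100.7 - - [09/Dec/2007:01:05:00 +0000] "GET /index.html HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0"',
    '192.0.2.10 - - [09/Dec/2007:01:09:41 +0000] "GET /css/site.css HTTP/1.1" 200 300 '
    '"http://search.example/results?q=other&FORM=LIVSOP" "Mozilla/4.0"',
]

hits = count_marker_hits(sample_log)
for ip, count in hits.most_common():
    print(ip, count)
```

    For a real log, pass an open file object instead of the list, for example `count_marker_hits(open("access.log"))`, where the file name is whatever your server uses.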
  • Marcia,

    Hi, I would love to help you, and this is the public forum for these issues. What kind of problem are you experiencing with MSNBot? Can you describe what you are seeing? I will be glad to look into it, but I will need more detail.

    thanks,
    Jeremiah

    Sunday, December 9, 2007 3:15 AM
  • Hey,

    I hate to speak out of turn, and correct me if I'm wrong, Marcia, but I believe this question is most likely about the new bot that fakes referrer strings, the one involving LIVSOP. It doesn't follow robots.txt when an external resource (often JavaScript or CSS) is referenced inside a web page. This is pretty well documented.

    Sunday, December 9, 2007 5:52 AM
  • Jeremiah Andrick - Microsoft wrote:

    Marcia,

    Hi, I would love to help you, and this is the public forum for these issues. What kind of problem are you experiencing with MSNBot? Can you describe what you are seeing? I will be glad to look into it, but I will need more detail.

    thanks,
    Jeremiah

    Jeremiah, I know you would. I've been around forums a lot for a long time, and I have to say that there's been a very warm, welcoming presence developing in these forums.

    It isn't me with the robots.txt issue or the MSNBot problem. I never exclude bots, except maybe an occasional cgi-bin (which I don't use) and maybe an /includes/ folder for security reasons, since I got hacked and was told by my new host that using any scripting, even PHP includes for navigation, can open a site up to such vulnerabilities. (But those networks with NO robots.txt, allowing those tracking URLs to get indexed, are a different story; that's negligent on their part.)

    I just posted because I'd seen numerous posts and much whining in forums about the bot/robots.txt issue, but it doesn't concern me personally.

    Wednesday, December 12, 2007 5:07 AM
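
    For anyone who wants to check what exclusions like the cgi-bin and /includes/ ones mentioned above would mean for a compliant crawler, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the host www.example.com, and the helper name `allowed` are assumptions made up for the example; this models a well-behaved crawler, not what any particular bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excluding only the two folders mentioned in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /includes/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="msnbot"):
    """Return True if a compliant crawler with this user agent may fetch the path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/index.html", "/includes/nav.php", "/cgi-bin/form.pl"):
    print(path, allowed(path))
```

    Keep in mind that robots.txt is only a request to crawlers. It does not protect a folder from being fetched, so it is not a substitute for server-side access control.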
  • Please can you help me get my robots.txt file validated? It validated before. I made no changes, but now it does not validate.

    I received two bits of information. The first said that the URL was not identified, so I added the following, as per the example:

    "robots txt. file for Http:// then my url"

    I then tried to validate it again. This time a new, different warning appeared: "warning: 'sitemap' - tag isn't specified"

    I would greatly appreciate help with an example of exactly what I should add to my robots.txt file. Thank you.

    Wednesday, December 24, 2008 2:56 PM
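
    A possible shape for such a file is sketched below, on the assumption that the validator's warning refers to the `Sitemap:` line defined by the Sitemaps protocol. What this particular validator expects cannot be confirmed from the thread. The host www.example.com and the sitemap file name are placeholders. Note that a descriptive header line needs a leading `#` to count as a comment; without it, parsers may treat the line as an unrecognized rule. The Python part only checks that the standard-library parser can read the file (`site_maps()` requires Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt: allows everything and declares one sitemap.
ROBOTS_TXT = """\
# robots.txt for http://www.example.com/
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# An empty Disallow value means nothing is excluded.
page_allowed = parser.can_fetch("msnbot", "http://www.example.com/page.html")
# site_maps() returns the declared sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()

print(page_allowed, sitemaps)
```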