Please help, msnbot is practically attacking my site - pages being recached every few seconds

  • Question

  • For some reason, msnbot has a fascination with my /video/ directory, and it turned up when we investigated why our site has been using so much bandwidth lately. It turns out that since Feb 1st, msnbot has requested the /videos/ page over 240,000 times, using over 25GB of bandwidth. I have changed robots.txt to disallow it from indexing the folder, but it still persists. Here is a clipping of our raw access log for just the last 10 minutes.

    http://www.rpgamers.net/accesslog.txt

    If anyone can advise why it has such great interest in this page, and how to allow it to index the page without continuously flooding it, I would appreciate your recommendations.

    Thank you.

    Tuesday, February 10, 2009 11:44 AM
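
The kind of tally described in the question (requests and bandwidth for one path, per client) can be reproduced from a raw access log. What follows is only a rough sketch: it assumes the Apache "combined" log format, and the `summarize` helper, the `SAMPLE` lines, and the 192.0.2.x addresses are invented for illustration and are not taken from the poster's log.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines, path_prefix, agent_substring):
    """Count hits and bytes per client IP for one path prefix and one user agent."""
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like combined-format entries
        if not m["path"].startswith(path_prefix):
            continue
        if agent_substring.lower() not in m["agent"].lower():
            continue
        hits[m["ip"]] += 1
        if m["size"] != "-":  # "-" means no body was sent
            bytes_sent[m["ip"]] += int(m["size"])
    return hits, bytes_sent


# Made-up sample lines (documentation-range IPs), not real log data.
SAMPLE = [
    '192.0.2.10 - - [10/Feb/2009:11:30:01 -0500] "GET /videos/ HTTP/1.1" 200 1000 "-" "msnbot/1.1"',
    '192.0.2.10 - - [10/Feb/2009:11:30:04 -0500] "GET /videos/ HTTP/1.1" 200 1000 "-" "msnbot/1.1"',
    '192.0.2.11 - - [10/Feb/2009:11:30:07 -0500] "GET /videos/ HTTP/1.1" 304 - "-" "msnbot/1.1"',
    '192.0.2.10 - - [10/Feb/2009:11:30:09 -0500] "GET /index.html HTTP/1.1" 200 700 "-" "msnbot/1.1"',
    '192.0.2.20 - - [10/Feb/2009:11:30:11 -0500] "GET /videos/ HTTP/1.1" 200 1000 "-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    hits, bytes_sent = summarize(SAMPLE, "/videos/", "msnbot")
    for ip, count in hits.most_common():
        print(ip, count, "requests,", bytes_sent[ip], "bytes")
```

To run this against a real log, pass an open file object instead of `SAMPLE`. A per-IP tally like this is one way to decide which addresses to report or block.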

Answers

  • Hi,

    My apologies for the issues you are experiencing. I'll get our team to look at this and post back as soon as I get a response. Also, have you tried setting a crawl delay? That might help.


    Brett
    Program Manager, Live Search Webmaster Tools
    • Marked as answer by Brett Yount Wednesday, February 25, 2009 4:57 PM
    Wednesday, February 11, 2009 8:34 PM
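
To make the crawl-delay suggestion concrete, here is a minimal sketch of a robots.txt that combines a `Crawl-delay` with a `Disallow` for msnbot, checked with Python's standard `urllib.robotparser`. The file contents, the delay value of 10, and the `check` helper are assumptions made for illustration. The check only shows how Python's parser reads the file; it says nothing about how msnbot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the delay value and paths are illustrative only.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10
Disallow: /videos/

User-agent: *
Disallow:
"""


def check(robots_txt, agent, url):
    """Return (allowed, crawl_delay) for an agent and URL under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)


if __name__ == "__main__":
    print(check(ROBOTS_TXT, "msnbot", "/videos/clip.html"))
    print(check(ROBOTS_TXT, "msnbot", "/index.html"))
    print(check(ROBOTS_TXT, "otherbot", "/videos/clip.html"))
```

Parsing the file locally like this at least rules out a syntax problem in robots.txt before concluding that the crawler is ignoring it.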

All replies

  • Okay, the bot has requested robots.txt various times since I first posted, and it is clearly refusing to listen to it.

    I have now denied the offending IP at the .htaccess level.

    A response on what the heck is going on would be good, or a link to where to report this abusive behaviour.

    The bot has actually increased in hostility overnight, demanding pages more often now.

    65.55.25.142 is the offending IP.

    Tuesday, February 10, 2009 5:30 PM
  • Hi Brett,

    I searched for Crawl Delay on this forum, and although I've looked at several threads, they all seem to end with you recommending setting the crawl delay to a number of seconds. Well, over the last few years I've tried setting the crawl delay to values from 5 to 250000, and the MSN bot still hits my site every minute or so.

    I wouldn't mind but it's sucking the guts out of my bandwidth with very few hits from live.com or our many ads placed with Adcenter.

    Does your bot actually read the robots.txt file? Or should I rename it to something else your bot will recognise?

    Is it true that if I disallow the msnbot it will still spider my site?

    Yours, looking for answers.....

    Chris
    Thursday, May 28, 2009 12:45 PM