Microsoft Suspected Bot Identifying itself as IE7

  • General discussion

    On a forum/website I help administer, I've been logging the user agent, IP, and page/query string viewed for all guests who aren't identifying themselves as known spiders.

    (This is because we have been hit by content scrapers, hackers and hijackers, and also to identify new spiders and to discount spider pageviews from the stats for accuracy.)

     

    However, today I noticed 50+ page requests from about a dozen IPs in the 65.55.232.* range (one of which was 65.55.232.38). They appeared one after another in my logs, so they must have been very close together; otherwise they would have been split up by other guests' requests.

     

    The same user agent was sent for all of them:

    Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)

    Presumably it is being faked.

    Given that a dozen IPs were used and how close together the page requests were, I found it unlikely to be a human.
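
    The kind of check described above (many requests from neighbouring IPs sharing one user agent) could be sketched roughly as follows. This is only an illustration: the function name `requests_per_subnet`, the `(ip, user_agent)` tuple layout, and the sample entries are assumptions, not the actual log format used on the site.

```python
from collections import Counter
import ipaddress

def requests_per_subnet(entries, prefix_len=24):
    """Count requests per (subnet, user agent) pair.

    entries: iterable of (ip, user_agent) tuples (hypothetical log layout).
    """
    counts = Counter()
    for ip, user_agent in entries:
        # strict=False lets us pass a host address and get its enclosing network
        subnet = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[(str(subnet), user_agent)] += 1
    return counts

UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"

# Made-up sample entries for illustration only
sample = [
    ("65.55.232.38", UA),
    ("65.55.232.40", UA),
    ("65.55.232.41", UA),
    ("10.0.0.1", "SomeOtherBrowser/1.0"),
]

burst_counts = requests_per_subnet(sample)
```

    A (subnet, user agent) pair with an unusually high count over a short window would be a candidate for closer inspection rather than an automatic ban.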

     

    I was about to add the range to my .htaccess ban list when I ran a whois check:

    http://www.dnsstuff.com/tools/whois.ch?ip=65.55.232.38

     

    It resolves back to Microsoft?

    OrgName:    Microsoft Corp
    OrgID:      MSFT
    Address:    One Microsoft Way
    City:       Redmond
    StateProv:  WA
    PostalCode: 98052
    Country:    US

    NetRange:   65.52.0.0 - 65.55.255.255
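
    As a quick sanity check before banning anything, the NetRange above can be compared against the logged IPs with Python's standard `ipaddress` module. This is a minimal sketch; the helper name `in_net_range` is made up for illustration, and only the range and the sample IP come from the post.

```python
import ipaddress

# NetRange reported by the whois lookup above
NET_START = ipaddress.ip_address("65.52.0.0")
NET_END = ipaddress.ip_address("65.55.255.255")

def in_net_range(ip: str) -> bool:
    """Return True if ip falls inside the whois NetRange."""
    return NET_START <= ipaddress.ip_address(ip) <= NET_END

# Express the same range as CIDR blocks, which is the form most ban lists expect
cidrs = [str(net) for net in ipaddress.summarize_address_range(NET_START, NET_END)]

print(in_net_range("65.55.232.38"))
print(cidrs)
```

    Note that a ban on the whole range would cover every address in the block, not just the dozen IPs seen in the logs.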

    Does anyone have any idea what's going on?

    Could it be msnbot? To me it seems too aggressive for msnbot.

     

    Why would a fake user agent be used?

    Wednesday, April 30, 2008 10:32 PM

All replies

  • Hi Steve,

     

    Sorry for the delay. We upgraded MSNbot about that time and received a few complaints about the bot being overly aggressive. If this is still happening, I suggest adding a crawl delay to your robots.txt file.
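
    For reference, a crawl delay in robots.txt might look something like the fragment below. The value of 10 (seconds between requests) is only an illustrative assumption, and it applies only to requests where the crawler identifies itself as msnbot and reads robots.txt.

```
User-agent: msnbot
Crawl-delay: 10
```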

     

    Friday, May 30, 2008 7:27 PM
  •  Brett Yount wrote:

    Hi Steve,

     

    Sorry for the delay. We upgraded MSNbot about that time and received a few complaints about the bot being overly aggressive. If this is still happening, I suggest adding a crawl delay to your robots.txt file.

     




    Hold on for just a minute here - are you saying that MSN is crawling sites with a spoofed user agent? That just does not sound like a responsible thing to be doing...  Is it at least still checking for robots.txt?
    Thursday, June 5, 2008 5:37 PM