Site not crawled, plus other problems....

  • Question

  • I added my site to the MSN webmaster tools about a week ago. One part of the site has been crawled and shows pages, but for some reason they aren't the pages it was supposed to have crawled, and another part of the site has been crawled but nothing shows up.
     
    When I look in Crawl Issues, there seems to be a major problem with each of the links in the drop-down menu.
     
    It shows an error 404, and we've tried to change the robots.txt file to allow MSN to crawl, but it still shows the error.
     
    It's been blocked by REP, and again, we have tried to change the robots.txt file to allow us to be crawled.
     
    It says we have long links which could send the search engine into a loop, but they aren't overly long; I have permalinks set up.
     
    The strangest part is that it is telling me we have malware on the site. Now, I know we don't, as I have had Google check the site with their webmaster tools and it's not showing the same, plus I contacted my host, who have done a scan and found no malware. That's really frustrating me, as I don't understand why MSN should be saying that I have malware.
     
    All these things are really frustrating, as I would like to be able to get my site crawled.
     
    My site is

    www.hamstercareforum.co.uk

     
    Would anybody know why I should be having these problems?
     
    Thank you.
     
    John
    Saturday, December 13, 2008 10:30 PM

Answers

  • Hi,

     

    So, right now, your robots.txt says to crawl and index your entire site. You should put a link to your sitemap in your robots.txt file:

     

     

    sitemap: http://yourdomain.com/sitemap.xml

     

     

    Site crawling/indexing speeds are based on a lot of different factors. I suggest reading the FAQs located at the top of this forum and at the top of the Ranking forum.

     

     

    Brett

    Wednesday, December 17, 2008 8:10 PM
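    A rough way to sanity-check the suggestion above before uploading the file is to parse a draft robots.txt locally. This is only a sketch: the file contents below are an assumed example of an "allow everything" robots.txt, the domain is the placeholder from the reply, and "msnbot" is used here as the crawler's user-agent name. It relies on Python's standard urllib.robotparser module (site_maps() needs Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: allow every crawler everywhere and
# advertise the sitemap. The domain is the placeholder used in the reply above.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# An empty Disallow line blocks nothing, so any crawler may fetch any path.
allowed = parser.can_fetch("msnbot", "http://yourdomain.com/phpbb/")

# site_maps() returns the Sitemap URLs found in the file, or None if there are none.
sitemaps = parser.site_maps()

print(allowed, sitemaps)
```

    If allowed came back False, or sitemaps came back None, the draft file would not be saying what the reply describes.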

All replies

  • Is there nobody who can help here, please? Until I can get this sorted, I am not able to get crawled, or even show up on Live Search. I have contacted Microsoft help, and they redirect me here, so I'm stuck until I can get help.

    Thank you.

    John.
    Tuesday, December 16, 2008 4:41 PM
  • Hi John,

     

    I'm happy to help.  You say a part of your site has been crawled, but not the correct part. Did you get this information from your server logs? I don't show that you have been crawled at all. I also didn't find a robots.txt file or meta tags excluding directories or pages. Could you please provide a link to this information?

     

    How old is your site? I see that most (if not all) of your backlinks are from your blog. You may want to read our FAQs located at the top of the ranking and crawling forums.

     

    In regards to these other crawl issues, when we find an issue with one of your pages, we will explicitly state which page has the issue. This information is located near the bottom of the page under the "Web Page" column (bold heading). Are any of your URLs listed and if so, can you provide the URLs so that I may investigate what the issue might be?

     

    Thanks,

     

    Brett 

     

     

    Tuesday, December 16, 2008 6:39 PM
  • Hi, I got the information from logging into the webmaster centre. It came from the drop-down links in the Crawl Issues section.

    A lot of the site has been transferred from the old blog, but I have added quite a few pages of new stuff as well. Nothing has actually come up with URLs, which is strange, and the biggest problem is that it says I have malware on my site, which I know I don't.

    I am not sure, but would you need access to my webmaster tools so you can have a look? That might help. I can send you the details for my webmaster centre account, which might be easier. I have looked at the robots.txt file, and it's in the files; could it be in the wrong place?

    John.



    Tuesday, December 16, 2008 7:35 PM
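    On the "wrong place" question: under the Robots Exclusion Protocol, crawlers look for the file at the root of the host, under the exact name robots.txt, not in a subdirectory. A minimal sketch of where a crawler would look for any page on a site is below; the helper name robots_url is made up for illustration, and the /phpbb/ page address is taken from the forum path mentioned later in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the address a crawler would request robots.txt from for this page."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.hamstercareforum.co.uk/phpbb/"))
# prints http://www.hamstercareforum.co.uk/robots.txt
```

    A file uploaded anywhere else, or named robot.text, would not be found at that address.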
  • I think what you are seeing are the default values. If there are no URLs listed under the column I mentioned, then you do not have any issues.

     

    Example:

     

    Crawl issues

    Use this tool to find out about any issues Live Search discovered while crawling and indexing your site. This information can help you track down missing or bad links, find pages blocked from our index by your robots.txt file, identify URLs that may be too long, and isolate pages with content types that are not supported by Live Search. Learn more

    Use the filter to exclude results from a domain or show only results from a single domain. You can also specify a domain (live.com), a sub-domain (webmaster.live.com), or a top-level domain (.com).

     

    Select Issue Type:
      Learn more
      

     

    Results for File Not Found (404)

    Live Search encountered a "404 File Not Found" HTTP status code when last attempting to download these URLs. This could either mean the page has been removed from your site, or that there is a misspelled URL somewhere on the web pointing to your site. We recommend you use this report to find a list of URLs and either repair the broken link or create a 301 redirect from the URL to a more appropriate location. Learn more

    Web Page

     

    If there are crawl issues, the URL will be listed right here under "Web Page"

     

     

    Could you provide the link to your robots.txt file?

     

    Thanks,

     

    Brett

    Tuesday, December 16, 2008 7:42 PM
  • Oh right, so why isn't the site being crawled at all? I have been through everything to try to get it fixed, and my web designer says that the robots.txt file has been updated to allow MSN to crawl. He's looking at the file now, and says it should work. Where are you not finding the file?

    My apologies as well, getting the link to the file now, one sec.
    Tuesday, December 16, 2008 7:47 PM
  • Hi, here is the link to the text file

    www.hamstercareforum.co.uk/robots.txt

    Hope that helps.

    John.

    edit.....that's strange, giving an error 404

    trying to work out why....
    Tuesday, December 16, 2008 7:51 PM
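    A quick way to see what a crawler gets back for a file like this is to request it and look at the HTTP status code. The sketch below uses Python's standard urllib; the function names fetch_status and describe are made up for illustration, and the opener parameter exists only so the check can be tried without a network connection. Note that urlopen follows redirects, so a 301 shows up as the status of the final page rather than as 301.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code a client receives when requesting url."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code


def describe(status):
    """Turn a status code into a short plain-English note."""
    if status == 200:
        return "found: crawlers can read the file"
    if status == 404:
        return "not found: the file is missing or in the wrong place"
    return f"unexpected status {status}"
```

    For example, describe(fetch_status("http://www.hamstercareforum.co.uk/robots.txt")) would report whether the file is reachable at the moment it is run.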
  • Please try now, it works.
    Tuesday, December 16, 2008 8:07 PM
  • OK, I have gone through it all again: reset the robots.txt file, regenerated a sitemap, and added it to the correct pages in the webmaster tools, and still nothing shows up. How long does it take? Can you tell from your end if I've done it correctly now?
    Tuesday, December 16, 2008 8:46 PM
  • Can somebody help sort this out, please? I still need to find out if I have things set up correctly, as my site still hasn't been crawled yet.
    Wednesday, December 17, 2008 2:05 PM
  • Hi, I have re-added a regenerated sitemap.

    It's funny, I have another site, connected to this one, that I entered at the same time, and that was indexed straight away. Mind you, having said that, it only has 5 pages indexed, and there are a lot more than that.
    Wednesday, December 17, 2008 9:01 PM
  • I've just realised what has been happening here. I have one site and two domains, which are attached to each other. They both come under the name Furry Critters.

    Furry Critters Blog has the URL         www.furrycritters.co.uk

    Furry Critters has the URL                www.hamstercareforum.co.uk  which contains an information page, plus a forum which has /phpbb on the end of the URL.

    Neither URL has been crawled since I added them to the webmaster tools. The reason I say this is that the URL furrycritters.co.uk is showing all the links for hamstercareforum.co.uk, which shows that neither site has been crawled since I posted them.

    What you are seeing in the furrycritters.co.uk webmaster tools is actually all the links to the hamstercareforum URL. You really need to look in my webmaster tools to see what is happening.

    If you actually have a look, you will see what I mean.

    John.
    Saturday, December 20, 2008 12:48 AM
  • What do you mean when you say you have 1 site and 2 domains?

     

    Tuesday, December 23, 2008 5:35 PM
  • It's been explained above, and if you had taken a look at it you would see what I was talking about. Also, this thread talks about it: http://forums.microsoft.com/webmaster/ShowPost.aspx?PostID=4255583&SiteID=79

    I don't know what else I can say.

    John.
    Sunday, December 28, 2008 1:47 PM