1,000 link limit in Sitemapindex.xml

  • Question

  • We are about to go over the 1,000 sitemap file limit in the Sitemapindex.xml file.  Is there a way to create another file?

    How do we get around this issue?

    Tuesday, February 12, 2008 8:48 PM

Answers

  • Sitemaps.org provides the following; the text is taken from:

    http://sitemaps.org/protocol.php

    Using Sitemap index files (to group multiple sitemap files)

    You can provide multiple Sitemap files, but each Sitemap file that you provide must have no more than 50,000 URLs and must be no larger than 10MB (10,485,760 bytes). If you would like, you may compress your Sitemap files using gzip to stay within 10MB and reduce your bandwidth requirement. If you want to list more than 50,000 URLs, you must create multiple Sitemap files.
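
    As a rough sketch of the limits above, the following Python splits a URL list into Sitemap files of at most 50,000 URLs each and gzips every file. The names chunk_urls and build_sitemap_gz and the example.com URLs are made up for illustration. The size check is applied to the gzipped bytes, following the quoted wording about using gzip to stay within 10MB; that is an assumption about how a given search engine measures the limit.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000   # limit quoted from sitemaps.org
MAX_SITEMAP_BYTES = 10_485_760  # 10MB, as quoted above


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_gz(urls):
    """Render one Sitemap file and return it gzip-compressed (bytes)."""
    if len(urls) > MAX_URLS_PER_SITEMAP:
        raise ValueError("too many URLs for one Sitemap file")
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    # escape() handles &, < and > so the URLs are safe inside XML text.
    lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in urls]
    lines.append("</urlset>")
    data = gzip.compress("\n".join(lines).encode("utf-8"))
    # Assumption: the limit is checked against the compressed size.
    if len(data) > MAX_SITEMAP_BYTES:
        raise ValueError("Sitemap file exceeds 10MB even after gzip")
    return data


# Hypothetical site with 120,000 pages, which needs three Sitemap files.
urls = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(urls)
files = [build_sitemap_gz(chunk) for chunk in chunks]
```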

    If you do provide multiple Sitemaps, you should then list each Sitemap file in a Sitemap index file. Sitemap index files may not list more than 1,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes). The XML format of a Sitemap index file is very similar to the XML format of a Sitemap file.
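
    Putting the two limits together, a single index file can reach at most 1,000 × 50,000 = 50,000,000 URLs. Past 1,000 Sitemap files, the list has to be spread over more than one index file. Here is a minimal sketch of that split; split_into_indexes and the file names are hypothetical, and the quoted text does not say whether a given search engine accepts more than one index file per site, so check that with the engine's webmaster tools.

```python
MAX_SITEMAPS_PER_INDEX = 1_000  # limit quoted from sitemaps.org
MAX_URLS_PER_SITEMAP = 50_000


def split_into_indexes(sitemap_urls, size=MAX_SITEMAPS_PER_INDEX):
    """Group Sitemap file URLs into batches, one batch per index file."""
    return [sitemap_urls[i:i + size] for i in range(0, len(sitemap_urls), size)]


# Hypothetical: 2,300 Sitemap files need three index files.
sitemaps = [f"http://www.example.com/sitemap{i}.xml.gz" for i in range(1, 2301)]
batches = split_into_indexes(sitemaps)
for n, batch in enumerate(batches, start=1):
    print(f"sitemap_index{n}.xml lists {len(batch)} Sitemaps")

# Upper bound on URLs reachable through a single index file.
max_urls_per_index = MAX_SITEMAPS_PER_INDEX * MAX_URLS_PER_SITEMAP
print(max_urls_per_index)
```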

    The Sitemap index file must:

    • Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
    • Include a <sitemap> entry for each Sitemap as a parent XML tag.
    • Include a <loc> child entry for each <sitemap> parent tag.

    The optional <lastmod> tag is also available for Sitemap index files.
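
    The tag rules above can be sketched with Python's standard xml.etree.ElementTree module (Python 3.8 or later for the xml_declaration argument). This is only an illustration: build_sitemap_index is a made-up helper name, and the 1,000-entry check simply mirrors the limit quoted earlier.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Build a Sitemap index from (loc, lastmod-or-None) tuples.

    Returns UTF-8 encoded bytes, including the XML declaration.
    """
    if len(entries) > 1000:
        raise ValueError("a Sitemap index may list at most 1,000 Sitemaps")
    root = ET.Element("sitemapindex", xmlns=NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")   # parent tag per Sitemap
        ET.SubElement(sitemap, "loc").text = loc   # required child
        if lastmod:                                # optional child
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


index_xml = build_sitemap_index([
    ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
    ("http://www.example.com/sitemap2.xml.gz", None),
])
print(index_xml.decode("utf-8"))
```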

    Note: A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file. For example, http://www.yoursite.com/sitemap_index.xml can include Sitemaps on http://www.yoursite.com but not on http://www.example.com or http://yourhost.yoursite.com. As with Sitemaps, your Sitemap index file must be UTF-8 encoded.
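
    A quick way to sanity-check the same-site rule before publishing an index file is to compare the scheme and host of each Sitemap URL with those of the index URL. The sketch below does only that; same_site is a hypothetical helper, and treating "same site" as "same scheme and host" is an assumption based on the examples in the note.

```python
from urllib.parse import urlsplit


def same_site(index_url, sitemap_url):
    """True if both URLs share scheme and host (case-insensitive)."""
    a, b = urlsplit(index_url), urlsplit(sitemap_url)
    return (a.scheme.lower(), a.netloc.lower()) == (b.scheme.lower(), b.netloc.lower())


index = "http://www.yoursite.com/sitemap_index.xml"
print(same_site(index, "http://www.yoursite.com/sitemap1.xml.gz"))       # True
print(same_site(index, "http://www.example.com/sitemap1.xml.gz"))        # False
print(same_site(index, "http://yourhost.yoursite.com/sitemap1.xml.gz"))  # False
```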

    Sample XML Sitemap Index

    The following example shows a Sitemap index that lists two Sitemaps:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <sitemap>
          <loc>http://www.example.com/sitemap1.xml.gz</loc>
          <lastmod>2004-10-01T18:23:17+00:00</lastmod>
       </sitemap>
       <sitemap>
          <loc>http://www.example.com/sitemap2.xml.gz</loc>
          <lastmod>2005-01-01</lastmod>
       </sitemap>
    </sitemapindex>
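
    To check what an index file like the sample actually lists (for example, to count entries against the 1,000 limit), it can be read back with Python's standard library. This is a sketch only; list_sitemaps is a made-up name, and the "sm" prefix is just a local alias for the Sitemap namespace.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml.gz</loc>
      <lastmod>2004-10-01T18:23:17+00:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml.gz</loc>
      <lastmod>2005-01-01</lastmod>
   </sitemap>
</sitemapindex>"""


def list_sitemaps(xml_text):
    """Return (loc, lastmod) pairs; lastmod is None when the tag is absent."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        (s.findtext("sm:loc", namespaces=NS),
         s.findtext("sm:lastmod", namespaces=NS))
        for s in root.findall("sm:sitemap", NS)
    ]


entries = list_sitemaps(SAMPLE)
print(len(entries), "Sitemaps listed")
for loc, lastmod in entries:
    print(loc, lastmod)
```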

    Friday, February 15, 2008 10:13 AM

All replies

  • 1,000 URLs should not be a problem, and I can verify this with Live. We broke 1,000 pages when we rolled out the recycled promotional items category, and they were all promptly indexed by Live. The threshold as posted is 50,000 URLs per Sitemap file, plus the limit on file size, which can be overcome by using gzip.

    Friday, February 15, 2008 4:17 PM
  • Good to know, as I'm coming very close to the limit.

    Monday, April 14, 2008 2:49 PM
  • Hi,
    I'm bothered by the same issue, and strangely enough no one seems to be answering the question posted (no offense).
    The question is about sitemapINDEX.xml, not sitemap.xml. Also, the question shows that the poster is already aware of the 1,000-entry limit in the sitemap INDEX file, so there was no need to explain it again.

    Anyway, the question still remains: is there a way to submit multiple sitemapINDEX files in MSN? (You can submit multiple files in Google.)

    Thanks.
    Tuesday, October 7, 2008 4:27 PM
  •  Vegan Fanatic wrote:
    The top-level, main sitemap must be named sitemap.xml but may contain both types of records as desired. If you have several segments with 100,000 items, then you may need to use several submaps to stay within the spec limits.


    ???
    Where did you get the idea that the "main sitemap" must be named sitemap.xml???
    What is the "main sitemap" anyway?? Do you mean the sitemap index? If so, then NO, it cannot contain both types of records! (It can only contain Sitemap URLs.)

    100,000??? Submaps??? Please refer to http://www.sitemaps.org/protocol.php to get your information straight.
    Wednesday, October 8, 2008 4:26 PM
  • On our site www.catanich.com we have over 38,000 pages, half of them dynamic, and we use gSiteCrawler to handle our indexing needs. It has no upper limit and handles the multiple sitemap issue easily. All three SEs read and index the same output.
    Thursday, October 16, 2008 8:00 PM