What is a shard?

  • Question

  • I keep seeing this term online in discussions of DBMS design. What exactly does the term 'shard' mean?
    Saturday, September 29, 2007 12:42 AM


  • What is sharding?

    While working at Auction Watch, Dathan got the idea to solve their scaling problems by creating a database server for a group of users and running those servers on cheap Linux boxes. In this scheme the data for User A is stored on one server and the data for User B is stored on another server. It's a federated model. Groups of 500K users are stored together in what are called shards.
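    To make the routing idea concrete, here is a minimal sketch of how an application might decide which server holds a given user's data. The post does not say how users are assigned to shards, so mapping contiguous user ID ranges to shards is an assumption, and the host names and function names are hypothetical.

```python
USERS_PER_SHARD = 500_000  # group size mentioned in the post

# Hypothetical host names, one database server per shard.
SHARD_HOSTS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
]


def shard_index(user_id: int) -> int:
    """Return the index of the shard that owns this user.

    Assumes users are grouped by contiguous ID range:
    IDs 0..499999 -> shard 0, 500000..999999 -> shard 1, and so on.
    """
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id // USERS_PER_SHARD


def shard_host(user_id: int) -> str:
    """Return the (hypothetical) host that stores this user's data."""
    index = shard_index(user_id)
    if index >= len(SHARD_HOSTS):
        raise LookupError(f"no shard provisioned for user {user_id}")
    return SHARD_HOSTS[index]
```

    With this layout, every query for a single user goes to exactly one server. Other assignment schemes, such as hashing the user ID or keeping a lookup table, would serve the same purpose.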

    The advantages are:

  • High availability. If one box goes down, the others still operate, so only the users on the failed shard are affected.

  • Faster queries. Smaller amounts of data in each user group mean faster querying.

  • More write bandwidth. With no master database serializing writes, you can write in parallel, which increases your write throughput. Writing is a major bottleneck for many websites.

  • You can do more work. A parallel backend means you can do more work simultaneously. You can handle higher user loads, especially when writing data, because there are parallel paths through your system. Load-balanced web servers reach the shards over different network paths, and each shard processes its requests with its own CPUs, its own RAM cache, and its own disk I/O paths. Very few shared bottlenecks remain to limit your work.
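
    The advantages above can be sketched with a toy model. This is an illustration under stated assumptions, not a real database client: the `Shard` class is an in-memory stand-in for a database server, `write_all` is a hypothetical helper, and users are again assumed to map to shards by contiguous ID range. It shows only the structure: writes fan out to independent shards, and a failed shard affects only its own users.

```python
from concurrent.futures import ThreadPoolExecutor

USERS_PER_SHARD = 500_000  # group size mentioned in the post


class Shard:
    """In-memory stand-in for one database server (hypothetical)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.rows = {}

    def write(self, user_id, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.rows[user_id] = value


def write_all(shards, updates):
    """Send each (user_id, value) pair to its shard; return failed user IDs."""

    def write_one(item):
        user_id, value = item
        try:
            shards[user_id // USERS_PER_SHARD].write(user_id, value)
            return None
        except ConnectionError:
            return user_id

    # One worker per shard, so writes to different shards can overlap.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        results = list(pool.map(write_one, updates))
    return [user_id for user_id in results if user_id is not None]


# shard1 is marked as down; the other two keep accepting writes.
shards = [Shard("shard0"), Shard("shard1", up=False), Shard("shard2")]
failed = write_all(shards, [(42, "a"), (600_000, "b"), (1_100_000, "c")])
```

    Here only the write for user 600000 fails, because that user lives on the shard that is down. The writes for the other two users succeed on their own shards.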


Saturday, September 29, 2007 7:43 AM