« The I.H.S.D.F. Theorem: A Proposed Theorem for the Trade-offs in Horizontally Scalable Systems | Main | Alternative Memcache Usage: A Highly Scalable, Highly Available, In-Memory Shard Index »

December 28, 2008



I'm curious which technique you usually use =)

Also, if you implement option #3 (offset in auto_increment), for how much shards do you plan? You wouldn't want to have to update all the id's if you have to add a new shard...

Max Indelicato

Hi Evert,

I tend to use the Globally Unique Identifier Keys method over the others when I can, as it offers the greatest amount of flexibility.

You're correct about the Auto-Increment Offset and Increment method. The single biggest issue is that it is fairly inflexible and requires a re-balancing of the sharded data or the sharded data's keys. One way to work around this is to create large number of sub-shards on each actual shard.

For example, we might predefine 100 databases, each working as if they are their own shard, regardless of the actual machine they are currently on. If we have, say 10 servers, and 100 shards, we would put 10 shards on each server. As necessary, we would add another couple of servers and redivide the shards, splitting 100 shards among 12 servers. The idea here is to predefine as many shards as you think you might ever need. You can predefine 1000 of them if you think that is what you need, or even 10000.

Max Indelicato

Just to be clear, the point of predefining a large number of shards is that it is much easier to rebalance servers by moving entire database instances (shards) than to rebalance the sharded data itself. I'm not sure that was made clear in my last comment.

Yuriy Zubarev

Thank you. Never got such a detailed response to a question :)

Donny V

Remember if your going to be using GUIDs as unique identifiers and indexing on them you should use COMB GUIDs. They will prevent the fragmentation that comes with naturally generated GUIDs when indexed.

Here is a good article about COMB GUIDs and code on generating them.

Max Indelicato

Hi Donny,

Good point, thanks for the link.


Hi Max,

Excellent blog. I found you through the (also excellent) High Scalability blog, and have added you to my reader.

In my experience, strategy #1 is accomplished by using the "primary" primary key that determines the shard, and not the shard ID directly.

For example, a user-centric service that shards by users would use the user Id. Since you always have to query the index table (user_lookup, in your example) anyway, this adds no overhead other than a larger primary key in the sharded tables.

It also has the benefit that you can make your URLs a bit more friendly, as in


This way, moving data between shards is completely invisible and cost effective.

I've also seen the distributed "hi-lo" strategy employed, where the index node also contains a table that stores the next available primary key. The application nodes reserve IDs in batches, cache them, and hand them out as new records are created.

Jamie Hall

The simplest answer is to include whatever is the basis for the logical segmentation of your shards as part of your lookups. For example, you can choose to shard by customers, with a million customers on each shard, each with the invoices belonging to those customers. To look up an invoice all you need is a customer ID and an invoice ID, the combination of which is unique across all your shards. You can have a master lookup table for determining which customers are on which shards, or use a simpler, yet ultimately less flexible technique based on a hash of the customer ID or whatever.

Ryan Schneider

With tools like hibernate that have built-in unique key generation my personal experience has shown that option #2 is "no brainer".


good article,

but I recommend using PostgreSQL for partitioning and clustering with PL/Proxy, PgBouncer and WAL.

we have now 2 x PgBouncer, 12 x nodes and 12 x WAL backup servers and everything works fine.

it's only one open source solution at this time.

or if you have a lot of money i recommend using Oracle, where partitioning works like a charm. ;-)

good article with examples from skype: https://developer.skype.com/SkypeGarage/DbProjects/SkypePostgresqlWhitepaper

The comments to this entry are closed.


  • Max Indelicato. Chief Software Architect and technology enthusiast.

Other Services I Use

January 2009

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31