
December 21, 2008


Listed below are links to weblogs that reference The I.H.S.D.F. Theorem: A Proposed Theorem for the Trade-offs in Horizontally Scalable Systems:

Comments

John Allspaw

Great material here. Any type of ethic that points to silver bullets being fiction is good, IMHO.

Having been through a fairly large architectural change from a monolithic to a federated database, I'm not totally in agreement that the decrease in flexibility has to be roughly proportional. On an individual-node basis, monitoring doesn't necessarily have to change. I'd say that more systemic monitoring becomes more valuable, but as time goes on, the dynamics inherent within the system can better inform what you should be monitoring.

For example: before, you have Nagios (or whatever) watching the MySQL instances on your individual nodes. Then you do the horizontal thing and go with your preferred method of doing that. Your original monitoring should obviously stay in place, but other patterns emerge (objects stored per shard, what thresholds of objects dictate what a 'hot' shard would be, etc.) that give cluster-wide or application-level monitoring more value.
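As a rough illustration of the per-shard pattern mentioned above, here is a minimal Python sketch of a cluster-wide check that flags 'hot' shards by object count. The function name `find_hot_shards`, the shard names, and the 1.5x-of-mean threshold are all assumptions made for the example, not anything taken from a real deployment; a real threshold would come from observing the system over time.

```python
from statistics import mean


def find_hot_shards(objects_per_shard, ratio=1.5):
    """Return the shards whose object count exceeds ratio * the cluster mean.

    objects_per_shard: dict mapping a shard name to its stored object count.
    ratio: hypothetical threshold multiplier; tune it from observed behavior.
    """
    if not objects_per_shard:
        return []
    average = mean(objects_per_shard.values())
    return sorted(
        name
        for name, count in objects_per_shard.items()
        if count > ratio * average
    )


# Hypothetical snapshot of object counts gathered from each shard.
counts = {"shard-a": 1000, "shard-b": 1100, "shard-c": 2900, "shard-d": 1000}
hot = find_hot_shards(counts)
print(hot)
```

A check like this sits alongside the existing per-node checks rather than replacing them: the node-level monitoring says whether each MySQL instance is healthy, while the cluster-level view says whether the data is spread evenly.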

Either way, you're right: monitoring becomes more complex, but I'd also say that difficulties that come from that complexity would have been just the same if you hadn't horizontally scaled.

Max Indelicato

Hi John,

You've made some good points.

My reference to monitoring was meant to be more specific to a system's feature set (a web service that performs function X) than to a system's components (MySQL instances, etc.). Or, as you put it: application-level monitoring. Basically, custom-built monitoring becomes more difficult in my experience as the size of the system increases, regardless of how automated the mechanism is. For example, monitoring a system of 10,000 servers is just plain complicated! I probably should have explicitly noted the difference there.

It's funny, after writing this up last night, I started mulling over adding an addendum to this theorem describing in further detail how I arrived at the proportional measure of increased horizontal scalability to decreased flexibility. I kind of gave myself an "out" in that I said "approximately proportional", but that's hardly scientific.

You're right, monitoring becomes more complex regardless of the reasons why a system grows larger, but I thought it important to differentiate growth by horizontal versus vertical scaling. In my experience, vertically scaling a system actually has little impact on the flexibility of the system from a developer's perspective. That's the extent of my reasoning behind explicitly stating that this theorem is specific to Horizontal Scalability.

John Allspaw

"In my experience, vertically scaling a system actually has little impact on the flexibility of the system from a developers perspective."

Indeed! I might even say that it's easy with 'diagonal scaling' as well:
http://clarification.wordpress.com/2008/06/05/diagonal-scaling-and-the-law-of-diminishing-returns/

Regardless, I've got you in my RSS sights. :) Not many people post with such insight about these topics.

Max Indelicato

Agreed, diagonal scaling has the added benefit of easing a developer's life with regard to flexibility too. Interesting post you linked to; it looks like a great resource on diagonal scaling.

And thanks for the kind words. I'm certainly hoping to post regularly on these kinds of topics.

Niklas

I must say I love your blog. I've just recently begun scaling applications, and I've found your entries to be a very good resource on theoretical (as well as hands-on) scaling.

Thank you!

roppert

Another article I wish management would read and understand.

Nati Shalom

I recently wrote a summary of a pattern for addressing the monitoring bottleneck in one of my recent posts: Data aggregation pattern for effective monitoring (http://natishalom.typepad.com/nati_shaloms_blog/2008/11/managing-application-on-the-cloud-using-a-jmx-fabric-1.html)

The general idea is that the monitoring data is stored in an in-memory cloud and offloaded asynchronously to the database.
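To make the idea concrete, here is a small Python sketch of the general shape: metrics are recorded into an in-memory queue and a background thread offloads them to a sink in batches. This is not the implementation from the linked post; the class `MetricBuffer`, the `sink` callable (standing in for a database write), and the batch size are all hypothetical, and a single in-process queue is a much simpler stand-in for a distributed in-memory store.

```python
import queue
import threading


class MetricBuffer:
    """Buffer metrics in memory and offload them to a sink asynchronously."""

    def __init__(self, sink, batch_size=100):
        self._queue = queue.Queue()
        self._sink = sink              # hypothetical stand-in for a DB writer
        self._batch_size = batch_size
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, name, value):
        # Only enqueues; the caller does not wait for the sink.
        self._queue.put((name, value))

    def close(self):
        # A None sentinel tells the worker to flush what is left and stop.
        self._queue.put(None)
        self._worker.join()

    def _drain(self):
        batch = []
        while True:
            item = self._queue.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) >= self._batch_size:
                self._sink(batch)
                batch = []
        if batch:
            self._sink(batch)


# Usage: a plain list plays the role of the database here.
stored = []
buffer = MetricBuffer(stored.extend, batch_size=2)
buffer.record("cpu", 0.5)
buffer.record("cpu", 0.7)
buffer.record("mem", 0.3)
buffer.close()
print(stored)
```

The point of the design is that `record` stays cheap for the monitored application, while the cost of writing to the database is paid off the hot path and amortized over batches.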

BTW are you familiar with Space Based Architecture? (http://en.wikipedia.org/wiki/Space_based_architecture)
It's a pattern for achieving linear scalability in stateful transactional applications. You may find it relevant to some of the scenarios you described in your blog.

Nati S.

Max Indelicato

Hi Nati,

I'm a huge fan of Space-based Architectures. In fact, a while back I started coding a web-service version that I intended to release to the dev community as an IaaS offering. GigaSpaces was a HUGE inspiration and much of my implementation was based on the GigaSpaces feature set. It ended up morphing into more of a simple tuplespace service after I realized just how much work was involved in developing a truly robust and scalable SBA web-service. Alas, I dropped it after Amazon came out with SimpleDB. The big gray cloud of discouragement got the better of me.

Anyway, I have a list of write-ups that I'd like to do, and there's one about SBA on there not too far from the top. Thanks for commenting, I'm actually a frequent reader of your blog - lots of good stuff on there.

The comments to this entry are closed.

About

  • Max Indelicato. Chief Software Architect and technology enthusiast.
