First, what is horizontal and vertical scaling?

Scaling is the action taken to increase the capacity of a software environment to address performance problems. (If you're dealing with web services, you might want to know about scaling since there are many performance concerns with that technology set.) Suppose your order entry system is running slowly [1]. To vertically scale that system, you would go out and buy more CPUs and put them in the order entry server to make it go faster. To horizontally scale, you go out and buy more machines and add them to the network of machines that run the order entry system. The trick is that you can’t always vertically scale, and you can’t always horizontally scale. A blend of each is usually appropriate.

Footnote 1: The authors once worked on a consulting engagement where the system had chronic performance problems. The internal IT community was actually trying to resolve the performance problems by applying a great deal of effort toward enhancing the application with a cartoon image of a running panther that would appear while end users were waiting. We suggested instead that a dialogue be presented to the end user with a dynamically updated query against that same software company’s currently plummeting stock price. Unfortunately the company mysteriously collapsed before either technique could be implemented.

So what's "fragmentation" and what does that have to do with my project going horrifically wrong?

Fragmentation is an important concept to understand when horizontally scaling. Fragmentation occurs because companies have a strong tendency to horizontally scale by adding specialty servers rather than by adding all-purpose servers. That is, not all machines in the environment service the same requests. For example, in web-based environments, one group of machines might service requests for one URL while another group of machines services requests for a different URL. The peak loads for different parts of the enterprise often occur at different times during the day. Consequently, in the morning, one group of machines may be overloaded with work while other machines stand by relatively idle. In the evening, the reverse condition may exist. With specialty servers, the usable capacity of the system is said to be “fragmented”.
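
To make the morning/evening picture concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two workload names ("orders" and "reports"), the demand figures, the server counts, and the per-server capacity. It simply compares how much load goes unserved when each workload is locked to its own group of specialty servers versus when every server can take any request.

```python
# Hypothetical demand (requests/sec) for two parts of an enterprise.
# All names and numbers are invented for illustration only.
MORNING = {"orders": 900, "reports": 200}
EVENING = {"orders": 200, "reports": 900}

SERVERS_PER_GROUP = 5       # assumed: 5 specialty servers per workload
CAPACITY_PER_SERVER = 120   # assumed: requests/sec one server can handle


def unmet_specialty(demand):
    """Unserved load when each workload may only use its own server group."""
    group_capacity = SERVERS_PER_GROUP * CAPACITY_PER_SERVER
    return sum(max(0, load - group_capacity) for load in demand.values())


def unmet_pooled(demand):
    """Unserved load when every server can handle any kind of request."""
    total_capacity = len(demand) * SERVERS_PER_GROUP * CAPACITY_PER_SERVER
    return max(0, sum(demand.values()) - total_capacity)


for label, demand in (("morning", MORNING), ("evening", EVENING)):
    print(label,
          "| specialty unmet:", unmet_specialty(demand),
          "| pooled unmet:", unmet_pooled(demand))
```

With these made-up numbers the total hardware is identical in both cases (ten servers), yet the specialty layout leaves load unserved at each peak while half the machines sit mostly idle. That stranded capacity is what "fragmented" means here.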

Vertical scaling is less expensive than horizontal scaling, but as additional CPUs are added, the law of diminishing returns kicks in. Each CPU must share the system resources (bus, cache, disk, etc.) with the other CPUs. Obviously there’s a much greater proportional improvement upgrading from 1 CPU to 2 than there is from 98 CPUs to 99.
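
The arithmetic behind that last sentence can be sketched in a few lines of Python. The first function is just the best-case proportional gain from adding CPUs. The second is a toy contention model that is not from the original text: it assumes each extra CPU adds a fixed fraction `sigma` of shared-resource overhead (the same shape as Amdahl's law), and the value 0.05 is an arbitrary assumption, not a measurement.

```python
def proportional_gain(cpus_before, cpus_after):
    """Best-case fractional gain, assuming capacity scales linearly with CPUs."""
    return cpus_after / cpus_before - 1.0


def contended_capacity(cpus, sigma=0.05):
    """Toy model (an assumption): relative capacity of `cpus` processors when
    each additional CPU adds a fraction `sigma` of contention for shared
    resources such as the bus, cache, and disk."""
    return cpus / (1.0 + sigma * (cpus - 1))


for before, after in ((1, 2), (98, 99)):
    ideal = proportional_gain(before, after)
    extra = contended_capacity(after) - contended_capacity(before)
    print(f"{before} -> {after} CPUs: "
          f"ideal gain {ideal:.1%}, "
          f"modeled extra capacity {extra:.3f} CPU-equivalents")
```

Even in the best case, going from 1 CPU to 2 doubles the processor count while going from 98 to 99 adds about 1%. Once contention is modeled, the 99th CPU contributes only a small fraction of what the 2nd one did. Real systems will have different curves, so treat the model as a way to think about the trend rather than a sizing tool.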


So how do you know when you should be adding more machines instead of adding more CPUs? 

OK, this excerpt is pretty short, mostly because it isn’t too hard to understand what scaling is. However, understanding when to horizontally scale and when to vertically scale requires a lot more explanation. The analysis portion of the scaling chapter is about six times as long as the initial overview (offered on this web page). It's critical to know when you should be adding more machines instead of adding more CPUs. You'll also need to know how those guidelines change between different operating systems and hardware, and what kinds of applications lend themselves better to one kind of scaling than the other. We cover those topics too. You can save your organization a lot of money and save yourself a lot of headaches if you make the right choices. The price of our book is roughly comparable to the price of a few bottles of aspirin, but aspirin won't solve your scaling problems. Our book is an easy read and it's a business expense. Click the “Buy Now” button below before that migraine sets in!