SCALABILITY will no longer work without PERFORMANCE.
The (dire) consequences of the CPU frequency halt
ARTICLE UPDATE:
PHP users wanted a comparison between G-WAN C scripts and PHP. Here it is, against Apache/PHP (using the latest XAMPP on Windows with ZendOptimizer)**:
ab -k -c 500 -t 1
http://10.10.2.6/loan.php?name=Eva&amount=10000&rate=3.5&term=1
http://10.10.2.6/csp?loan&name=Eva&amount=10000&rate=3.5&term=1
SINGLE-CORE: on average, G-WAN is 118x faster (up to 7,301x)
MULTI-CORE: on average, G-WAN does 4.4x more with 4 Cores
G-WAN's loan.c, unlike loan.php, formats large numbers with thousands separators ("1,000,000").
G-WAN makes vertical scalability work, and horizontal scalability fly.
How many can afford to ignore a way to use far fewer servers for Cloud computing or Software as a Service is unclear. What is clear is the outcome for G-WAN users: they will make more money, faster.
And with a PHP-to-C script translator, PHP users will benefit from G-WAN too.
ORIGINAL ARTICLE:
Recently, Facebook reported that newly deployed multi-core servers did not bring the benefits it expected. We will show why.
As the CPU frequency race halted, users have to learn how to exploit the power of multi-core CPUs (parallelism), or prepare to face skyrocketing costs.
Scalability: ability (a) to handle growing workloads or (b) to be enlarged.
(a) = capacity: great because it does not require new hardware.
(b) = modularity: great because (a) necessarily has its limits.
Modularity can be vertical (more CPUs), and horizontal (more servers).
Systems are designed for modularity rather than capacity because the latter is much more expensive to achieve (instead of just adding more cores to a CPU, you have to build a more capable CPU).
We will see that (a) is better not only because it saves money, but also because, as (b) also has its limits, (a) may be the only way for (b) to work.
Consider Apache/Python (Facebook uses PHP, but Python's case is simpler).
As Python is not thread-safe, CPUs are either waiting or busy with (unproductive) inter-process communication (like FastCGI).
So Apache will (moderately) benefit from several CPU cores, but Python will not (unless you make Apache scale even less by adding the overhead of a FastCGI pipeline).
Now, let's say that this particular Python program takes 90% of the total request execution time, so Apache accounts for only 10%.
In this situation, only Apache's 10% can be spread over more cores, so Amdahl's Law says that the maximum speedup achieved by using more CPUs is only 1/(1 - 0.10) = 1.11 times.
Horror: 100 billion CPUs would not give Apache/Python visible speed gains.
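The arithmetic above can be checked with a minimal sketch of Amdahl's Law, speedup = 1 / ((1 - p) + p / n). It assumes, as the text does, that only Apache's 10% share of the request time benefits from extra cores. The function name and the list of core counts are illustrative choices, not part of the original benchmark.

```python
def amdahl_speedup(parallel_fraction: float, cores: float) -> float:
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumption taken from the text: Apache (10% of the request time)
# can use more cores, the Python program (90%) cannot.
APACHE_SHARE = 0.10

# Illustrative core counts, ending with the "100 billion CPUs" case.
for cores in (1, 2, 4, 64, 100_000_000_000):
    speedup = amdahl_speedup(APACHE_SHARE, cores)
    print(f"{cores:>15,} cores -> {speedup:.3f}x")
```

However many cores are added, the result stays below 1/(1 - 0.10), which is the 1.11 ceiling quoted above.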
Vertical scalability (more CPUs) failed because of poor design and performance.
Thanks to Moore's Law, new, faster CPUs made it possible to serve more requests. Periodically replacing servers was a cheap way to sustain growth.
No longer: a CPU frequency wall was hit. New CPUs are not faster: instead, they use parallelism (multiple cores) to deliver more power (not more speed).
Today, if your applications do not use parallelism well, you have to deploy more servers to serve more requests. That's horizontal scalability (more servers).
To keep up with Moore's Law as in the past, Python users would have to double their number of servers every two years (64, 128, 256...) or accept the unproductive inter-process communication overhead tax. This will cost much more than just replacing servers.
The business model of many Web players will no longer work, as FastCGI makes Apache/PHP even slower and there are no faster CPUs to cope with this additional load.
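The doubling sequence (64, 128, 256...) can be sketched as follows. The starting fleet of 64 servers comes from the text; the function name and the six-year horizon are hypothetical choices made for illustration.

```python
def servers_needed(initial_servers: int, years: int,
                   doubling_period_years: int = 2) -> int:
    """Servers required if capacity must double every period and
    per-server throughput no longer improves (assumption from the text)."""
    completed_periods = years // doubling_period_years
    return initial_servers * 2 ** completed_periods


# Hypothetical horizon: print the fleet size every two years.
for year in range(0, 7, 2):
    print(f"year {year}: {servers_needed(64, year)} servers")
```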
G-WAN's code scales linearly, and in these tests even better than linearly. Here, the ideal speedup from using 4 cores is 1/(1/4) = 4 times; the measured gain is 4.4 times* (70.4 times with 64 cores if the same rate held).
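As a rough check of these figures, the sketch below separates the ideal speedup for fully parallel code from the measured 4.4x, then extends the measured per-core gain to 64 cores. The 64-core value is a straight-line extrapolation, not a measurement, and the names used here are illustrative. Note also that the footnote (*) compares two different machines, so the 4.4x is not a pure core-count effect.

```python
MEASURED_SPEEDUP_4_CORES = 4.4  # figure reported in the article
MEASURED_CORES = 4

# Gain per core implied by the measurement.
PER_CORE_GAIN = MEASURED_SPEEDUP_4_CORES / MEASURED_CORES


def ideal_speedup(cores: int) -> float:
    """Amdahl's Law for code that is 100% parallel: 1 / (1 / n) = n."""
    return 1.0 / (1.0 / cores)


def extrapolated_speedup(cores: int) -> float:
    """Assumes the measured per-core gain holds at any core count."""
    return PER_CORE_GAIN * cores


for cores in (4, 64):
    print(f"{cores} cores: ideal {ideal_speedup(cores):.1f}x, "
          f"extrapolated {extrapolated_speedup(cores):.1f}x")
```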
While others must add servers, you can just upgrade them (adding CPU cores): vertical scalability works.
A reverse-proxy balances the load on G-WAN C scripts: no FastCGI needed.
Nothing beats performance: instead of resolving problems, it avoids them.
(*) See top of page: on average over the 1-1,000 concurrency range, G-WAN does 4.4x better on a Core2 Quad 3GHz (without hyper-threading) than on a Single-Core 3.06GHz (with hyper-threading).
(**) Single-Core: Dell Precision 650 (Oct. 2003), Intel Xeon 3.06GHz, 1GB RAM @ 266MHz, Windows XP Pro SP3.
Multi-Core: Acer G7700 (Aug. 2008), Intel Core2 Quad X9650 3GHz, 4GB RAM @ 800MHz, Vista Ultimate 64-bit.