Drupal is one of the most popular content management systems (CMS) written in PHP. It is used as a back-end system for at least 1.5% of all websites worldwide. It is also one of the slowest systems of this kind on the Internet.
There have been many suggestions for improving Drupal performance. Some recommend the APC module, data caching, or even compiling the entire system with HipHop for PHP. While the first two solutions have been implemented successfully, no one had managed to complete the build process.
After many battles with the compiler and the Drupal code, I present the results of the first successful translation of Drupal 7 to C++.
Introduction
All tests were conducted on a modified version of Drupal. The changes were necessary to ensure compatibility with the HipHop translator.
You can download the modified source code from this link.
The system was installed in the minimal version and then launched in three different ways:
- as a standard PHP script
- as a PHP script with APC opcode caching enabled
- in the form of a compiled program
Due to hardware limitations, the MySQL server is located on the same machine as the web server. More advanced tests using multiple servers will be performed in the near future.
Testing platform
Processor: Intel(R) Core(TM)2 Duo CPU E7600 @ 3.06GHz
Memory: 2.5GB RAM
System: Fedora 12 (64bit)
Kernel: 2.6.32.26-175.fc12.x86_64 #1 SMP
The server was used exclusively for testing purposes – it was running only the services associated with the benchmark.
Test Configuration
Apache version: 2.2.15
MySQL Server Version: 5.1.47
PHP Version: 5.3.3
HipHop for PHP Version: 806ee06
Drupal Version: 7.0
Drupal was compiled with this command:
cd ~drupal
date && ~/hiphop/hiphop-php/src/hphp/hphp --keep-tempdir=1 \
    --log=3 \
    --input-list=files.full.list \
    --include-path="." --force=1 \
    --cluster-count=240 \
    -v "AllDynamic=true" \
    -v "AllVolatile=true" \
    -o /tmp/drupal \
    --parse-on-demand=0 \
    --sync-dir=/tmp/sync
And launched with this command:
cd ~drupal
/tmp/drupal/program -m server -p 80 \
    -v "Server.SourceRoot=`pwd`" \
    -v "Server.DefaultDocument=index.php" \
    -c $HPHP_HOME/bin/mime.hdf \
    -v "Log.File=/tmp/errors" \
    -v "ErrorHandling.AssertActive=true" \
    -v "ErrorHandling.AssertWarning=true" \
    -v "ErrorHandling.WarningFrequency=10000" \
    -v "ErrorHandling.NoticeFrequency=10000" &
CPU usage
The following test examines the performance of Drupal by simulating the concurrent activity of many visitors on the Drupal home page.
I found that on a dual-core server four was the optimal number of concurrent users, so I used the ab program as the benchmark tool and launched it with the following command:
ab -n 300 -c 4 http://achilles.webtutor/
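For readers who have not used ab, here is a minimal Python sketch of what such a run measures: a fixed number of requests issued by a fixed number of concurrent workers, timed from start to finish. It is only an illustration, not the tool used in this benchmark. The names fetch_url and run_benchmark are hypothetical, and it uses a thread pool, which is an assumption of the sketch rather than a description of how ab works internally.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url):
    """Fetch one page and return the number of bytes received."""
    with urllib.request.urlopen(url) as response:
        return len(response.read())


def run_benchmark(fetch, total_requests, concurrency):
    """Call fetch() total_requests times using `concurrency` worker threads.

    Returns the number of completed calls, the wall-clock time and the
    resulting throughput in requests per second.
    """
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fetch) for _ in range(total_requests)]
        for future in futures:
            # result() re-raises any exception from a failed request.
            future.result()
    elapsed = time.perf_counter() - started
    return {
        "completed": len(futures),
        "seconds": elapsed,
        "requests_per_second": (
            len(futures) / elapsed if elapsed > 0 else float("inf")
        ),
    }


# Rough equivalent of "ab -n 300 -c 4 http://achilles.webtutor/":
# run_benchmark(lambda: fetch_url("http://achilles.webtutor/"), 300, 4)
```

Passing the request as a callable keeps the timing logic separate from the network code, so the same function can be tried against any URL or a stub.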
The first result shows the CPU usage of a regular PHP script during the execution of the 300 HTTP requests started by 4 concurrent users:
sy = system CPU usage (gradient color), us = user CPU usage (solid color)
As you can see, Drupal is indeed a very demanding system. The test took almost 20 seconds, which gave a disappointing result of 15 requests per second.
Let’s see what we get after enabling the APC module. Results are as follows:
Drupal performance improved dramatically. The test was completed in 6 seconds, 3 times faster than with traditional PHP! Due to the size of Drupal code, however, this result is not shocking. While the APC module will not speed up the script itself, its opcode cache eliminates the delay caused by having to parse PHP code on every HTTP request.
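The idea behind an opcode cache can be illustrated with a small Python analogy. This is a conceptual sketch only: it is not how APC is implemented, and the class name OpcodeCache and its stat option are hypothetical names chosen for the example. It keeps the compiled form of each script in memory and recompiles only when the file's modification time changes.

```python
import os


class OpcodeCache:
    """Keeps compiled code objects in memory, keyed by file path."""

    def __init__(self, stat=True):
        # When stat is True, the file's mtime is checked on every load.
        # When False, a cached entry is trusted without touching the disk.
        self.stat = stat
        self._entries = {}  # path -> (mtime in ns, code object)
        self.compiles = 0   # how many times a file was actually parsed

    def load(self, path):
        entry = self._entries.get(path)
        if entry is not None:
            cached_mtime, code = entry
            if not self.stat or os.stat(path).st_mtime_ns == cached_mtime:
                return code  # cache hit: no parsing needed
        # Cache miss or stale entry: read, parse and compile the file.
        with open(path, "r", encoding="utf-8") as handle:
            source = handle.read()
        code = compile(source, path, "exec")
        self.compiles += 1
        self._entries[path] = (os.stat(path).st_mtime_ns, code)
        return code
```

Loading the same unchanged file twice compiles it once. The execution of the compiled code is just as fast or slow as before, which is why the cache helps most for applications with a large code base.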
Interestingly, Drupal is not able to use the full computing power of the test server. The exact cause is unknown, but it may be caused by some kind of internal locking in the APC module.
Since we know how much the opcode cache improves performance by skipping the PHP parser, it's time to test how much we can accelerate the PHP code itself. We check this by translating the Drupal source code to C++ and compiling the application:
The compiled Drupal application is about five times faster than the regular script, and about 1.6 times faster than the script served from the opcode cache!
Let’s compare the results. The first is the detailed comparison of CPU usage:
And the overall CPU usage:
Here are the results taken directly from the ab tool:
Type of environment | Execution time [300 req] |
Regular PHP | 19.873 sec |
PHP + APC | 6.396 sec |
HipHop for PHP | 3.896 sec |
Concurrency benchmark
In this scenario I measured the number of requests served per second. The summary results are as follows:
Type of environment | Requests per second [#/sec] | Time per request [ms] | Req/sec ratio [%] |
Regular PHP | 15.10 | 66.242 | 100% |
PHP + APC (opcode cache) | 46.90 | 21.321 | 310% |
HipHop for PHP | 77.01 | 12.985 | 510% |
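The figures in this table follow directly from the total execution times reported by ab for 300 requests. The short Python sketch below shows the arithmetic. The function name summarize is hypothetical, and the last digit can differ slightly from the table because ab works with more precise timings than the rounded values used here.

```python
TOTAL_REQUESTS = 300

# Total time for 300 requests at concurrency 4, as reported by ab.
ELAPSED_SECONDS = {
    "Regular PHP": 19.873,
    "PHP + APC": 6.396,
    "HipHop for PHP": 3.896,
}


def summarize(elapsed, total_requests, baseline="Regular PHP"):
    """Derive throughput, mean time per request and ratio to the baseline."""
    baseline_rps = total_requests / elapsed[baseline]
    rows = {}
    for name, seconds in elapsed.items():
        rps = total_requests / seconds
        rows[name] = {
            "requests_per_second": rps,
            # Mean time per request across all concurrent requests.
            "ms_per_request": 1000.0 * seconds / total_requests,
            "ratio_percent": 100.0 * rps / baseline_rps,
        }
    return rows


for name, row in summarize(ELAPSED_SECONDS, TOTAL_REQUESTS).items():
    print(
        f"{name:<16} {row['requests_per_second']:7.2f} req/s "
        f"{row['ms_per_request']:7.3f} ms {row['ratio_percent']:5.0f}%"
    )
```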
Other concurrency levels
Out of curiosity, I decided to investigate how Drupal's performance is affected by the number of concurrent users. To do so, I tested the system with seven different workloads simulating 1, 2, 4, 8, 16, 32 and 64 concurrent users.
Here are the results:
Tabular version:
Users | PHP [req/sec] | APC [req/sec] | HipHop [req/sec] |
1 | 8.68 | 28.11 | 40.23 |
2 | 13.34 | 38.25 | 56.12 |
4 | 15.28 | 46.82 | 74.12 |
8 | 14.76 | 49.96 | 72.12 |
16 | 14.04 | 49.09 | 74.12 |
32 | 12.35 | 45.00 | 77.67 |
64 | 5.22 | 39.17 | 73.02 |
Please note: this test is heavily CPU-bound and should really be run on a multicore server.
As you can see, with both APC and the HipHop for PHP translator Drupal scales quite well up to 8 simultaneous users on a dual-core system. Unfortunately, the same cannot be said of the regular PHP interpreter, which is much slower in every tested scenario.
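To make the scaling behaviour easier to read, the sketch below takes the measured values from the table above and computes, for each environment, the concurrency level with the highest throughput, as well as the speedup over regular PHP at every level. The function names peak_concurrency and speedup_over_php are hypothetical; the numbers are the ones measured in this test.

```python
ENVIRONMENTS = ("PHP", "APC", "HipHop")

# Concurrent users -> requests per second for (PHP, APC, HipHop).
THROUGHPUT = {
    1: (8.68, 28.11, 40.23),
    2: (13.34, 38.25, 56.12),
    4: (15.28, 46.82, 74.12),
    8: (14.76, 49.96, 72.12),
    16: (14.04, 49.09, 74.12),
    32: (12.35, 45.00, 77.67),
    64: (5.22, 39.17, 73.02),
}


def peak_concurrency(table):
    """Return {environment: (users, req/sec)} at the highest throughput."""
    peaks = {}
    for index, name in enumerate(ENVIRONMENTS):
        users = max(table, key=lambda level: table[level][index])
        peaks[name] = (users, table[users][index])
    return peaks


def speedup_over_php(table):
    """Return {users: (APC speedup, HipHop speedup)} relative to PHP."""
    return {
        users: (apc / php, hiphop / php)
        for users, (php, apc, hiphop) in table.items()
    }


for name, (users, rps) in peak_concurrency(THROUGHPUT).items():
    print(f"{name}: peak of {rps:.2f} req/s at {users} users")
for users, (apc, hiphop) in speedup_over_php(THROUGHPUT).items():
    print(f"{users:>2} users: APC x{apc:.2f}, HipHop x{hiphop:.2f}")
```

The gap to regular PHP widens sharply at 64 users, mostly because the interpreter's throughput collapses there while the other two stay close to their peaks.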
Different optimization levels in GCC compiler
Without any optimization option, the compiler’s goal is to reduce the cost of compilation and to make debugging produce the expected results.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
Let’s see the results of different optimization levels switched on during the Drupal compilation:
Optimization level | Requests per second [#/sec] |
default | 74.12 |
-O2 | 87.11 |
-O3 | 90.04 |
The difference of about 13 req/sec between the default and -O2 builds is quite big: it almost equals the 15 req/sec achieved by the interpreted PHP script!
What's more, with -O3 optimization Drupal is almost 6 times faster than in a pure PHP environment.
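The relative gains can be worked out from the measurements with a few lines of Python. This is only a sketch of the arithmetic: the helper relative_gain is a hypothetical name, the flags are written as -O2 and -O3 (GCC spells these levels with a capital O), and the 15.10 req/sec baseline is the regular PHP result from the concurrency benchmark.

```python
# Requests per second for each GCC optimization level used for the build.
REQ_PER_SEC = {"default": 74.12, "-O2": 87.11, "-O3": 90.04}

# Regular (interpreted) PHP result from the concurrency benchmark.
PHP_BASELINE = 15.10


def relative_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return 100.0 * (new - old) / old


for level in ("-O2", "-O3"):
    absolute = REQ_PER_SEC[level] - REQ_PER_SEC["default"]
    percent = relative_gain(REQ_PER_SEC[level], REQ_PER_SEC["default"])
    versus_php = REQ_PER_SEC[level] / PHP_BASELINE
    print(
        f"{level}: +{absolute:.2f} req/s over default ({percent:.1f}%), "
        f"{versus_php:.2f}x regular PHP"
    )
```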
Summary
As I mentioned at the outset, Drupal is not the fastest system on the Internet. However, after several changes in the code to add compatibility with HipHop for PHP, it becomes a very effective tool in the hands of every webmaster.
Other articles about HipHop for PHP
- HipHop for PHP: benchmark
- Strong data typing in PHP, part II: autoboxing and indesctructible objects
- Drupal and HipHop for PHP – part I: compilation
- Mod_rewrite and HipHop for PHP on Apache Web Server
This should become standard practice with Drupal. Not sure how this would affect upgrade paths, though. Great work!
What would happen if this were to run on the Zend Optimizer?
@Bjordan: automatic updates are of course unsupported. You have to update files manually and then recompile the entire source code. There is also another solution: run Drupal 7 in a dev box as a pure PHP application, do the automatic update and then recompile the source code and copy the executable file onto the production servers.
————————————-
@wally: Zend Optimizer would of course… slow everything down.
Pure PHP is faster than Zend Optimizer because this module does not have the opcode cache. It is used only to run PHP applications encrypted with the Zend Encoder.
Many people confuse Zend Optimizer with Zend Accelerator (a part of Zend Platform).
Unfortunately I do not have enough free time to test this solution, but I think that Zend Accelerator is as fast as APC or eAccelerator. Both APC and Zend Accelerator provide an opcode cache and apply some minor tweaks to the cached PHP code (the bad news is that the APC module has these optimizations disabled in its newest versions).
Hi,
Just to clarify, Zend Optimizer+ IS an opcode cache bundled with Zend Server, and Zend Optimizer is not an opcode cache, as you mentioned.
Btw I’m very surprised by the system cpu used by Drupal.
Thanks for the benchmark.
Hi,
Glad to know HipHop is working so well with Drupal!
Just FYI, though, Drupal is the third most popular CMS worldwide (after WordPress and Joomla!).
@Eric, the probable cause of this behavior (high system CPU usage) is described in my other post, see this link for details: http://php.webtutor.pl/en/2011/06/02/running-php-on-nfs-huge-performance-problems-and-one-simple-solution/ . The article describes problems with NFS, but I think it is also valid for other scenarios with Drupal.
Do you have the settings you used for the APC benchmark – or could you share the whole benchmark configuration on github?
@Mats, as far as I remember, the APC and HipHop for PHP settings were both left at their defaults. I'll post them tomorrow, because the test server used in this benchmark is currently offline.
Thanks for the nice article, keep up the good work.
this is just using Drupal core, though, right? How does PHP handle code that isn’t compiled? e.g. would it be possible to compile and run the Drupal core with Hip Hop, then leverage standard PHP/APC caching for contrib/custom modules?
@CraigMC:
There is an experimental "phpi" application. It has the features of a regular PHP interpreter, but converts the PHP files into C++ code. Unfortunately I have not tested it, so I don't know how good it is. There is no option for a mixed PHP/HipHop for PHP mode, or at least I'm unaware of one.
When I run a CMS with hphpi it works fine, but when I run the compiled binary (program), the static pages generated by the CMS are all 0 KB. Why?
Thanks!
Thanks for the benchmarks. Was APC stat (filemtime) turned off?
Because if not, it was not a best case comparison.
Please print what APC settings were used?
Try apc.stat = 0 and apc.slam_defense = 0
also give APC as large of a memory segment as it needs.
Hi _ck_
I used default APC configuration which can be found here:
http://www.php.net/manual/en/apc.configuration.php
This may not be the best case comparison, but keep in mind that I also used the default HipHop for PHP configuration during compilation.
I see no point in using anything other than the default (stable) configuration, because it would give an unfair advantage to one of the two solutions (depending on which one I know better).
Sure, I could also disable atime and mtime on my local EXT3 filesystem (this would speed up APC more than HipHop), use FastCGI rather than mod_php, lighttpd rather than Apache, and put everything on a ramdisk or an SSD. But this would be an edge case.
btw. you can download a VMware image with preinstalled HipHop for PHP compiler from my other article and check your own settings.
Slow with HipHop. I'm a C++ guy and I found many problems related to memory leaks. Unless you are prepared to work with scheduled service startups and to write additional bash scripts, it's not worth going with HipHop for PHP.
One more thing: a CMS generally has an execution path of roughly 10% server (landing the request, parsing the HTML GET, rewriting URLs, etc.), 40% rendering (server pages) and 50% DB performance. Most benchmark results don't explain the amount of data, the complexity of the plugins, the number of queries or the other dependent calls in the background.
From my experience overall, you can gain 10-15% in overall performance over the best-optimized PHP setup (Nginx/Lighttpd + APC + Memcache + static file generation plugins + CDN support).
Facebook has limited scope for caching because it is a stream/timeline service, so for them it does make sense; their scale gives them a decent budget to carry such maintenance overhead on top of development cost.
The rest is your choice. I'm pro HipHop, but for the right solution.
N.
Yes, you're right about the execution paths. From my experience, 60-70% of execution time is consumed by DB processing and communication.
Unfortunately, on larger setups and busy sites even an extra 10%-30% can make a huge difference. Correct me if I'm wrong, but in most cases performance degrades exponentially (especially when there are more parallel requests than available CPUs, etc.).
I'm also not suggesting that HipHop for PHP is a silver bullet for everyone. Just look at my other articles about HH, where regular PHP is sometimes up to 10x faster than compiled applications.