A faster Web server: ripping out Apache for Nginx

Sometimes Apache can be overkill. Here's one man's tale of replacing his …

Lee Hutchinson
Coaxing 10,000 requests per second from Nginx, on a laptop Credit: Photograph by ludwig van standard lamp

I am, at best, a fly-by-night sysadmin. I grew to adult nerdhood doing tech support and later admin work in a Windows shop with a smattering of *nix, most of which was attended to by bearded elders locked away in cold, white rooms. It wasn’t until I started managing enterprise storage gear that I came to appreciate the power of the bash shell, and my cobbled-together home network gradually changed from a Windows 2003 domain supporting some PCs to a mixture of GNU/Linux servers and OS X desktops and laptops.

Like so many others, I eventually decided to put my own website up on the Internets, and I used the Apache HTTP server to host it. Why? I had an Ubuntu server box sitting in front of me, and Apache was the Web server I’d heard about the most. If Apache was good enough for big sites, it should be good enough for my little static personal site. Right?

But it wasn’t quite right for me. Here’s why—and what I learned when I spent a weekend ripping out my Apache install and replacing it with a lightweight speed demon of a Web server called Nginx.

Old and busted

Apache was easy to set up. I almost typed “trivially” easy, but going into an Apache setup with nothing more than a plucky attitude and the knowledge that “Apache is some software that hosts websites” means you’re going to face a learning curve. Still, after no more than an hour or two searching Google for help and poking through Apache’s conf files, I had a website, and it was on the Internet! A few months later, Ars ran a piece on getting free SSL/TLS certificates. I immediately wanted to try it—not because I had any real need for it, but just to see how certs worked. Less than a day after the piece ran, I had a class 2 wildcard SSL/TLS certificate for my domain, and my Web server was rocking the https.

Things ran well this way for a couple of years, but as I started doing more with the Web server, it became apparent that my setup, while perfectly workable, could be better. In particular, adding Tectonicus (a Minecraft map renderer which generates millions of tiny tiles and stitches them together with a Google Maps-style interface) to the Web server showed me that things were less than optimal. Even over my local network, Apache struggled to serve the map at a suitably snappy pace. The Web server is a dual-core AMD E-350 with 2GB of RAM and a Vertex 2 solid state drive (SSD), and it would serve the site’s static images instantly. But the htop tool showed that the Apache processes went CPU-crazy any time the Tectonicus map was being served; both cores shot to 100 percent usage as the screen slowly filled with tiles.

Additionally, I began running a small wiki on the same box. This used Dokuwiki, a wiki server which can be skinned to closely resemble MediaWiki but which stores its data in flat files rather than requiring a database. Dokuwiki requires PHP, a widely used scripting language that runs on a huge number of Web servers around the world, so this meant I needed to install some manner of PHP package into my current setup.

There were many paths to take. Since I had installed Apache on Ubuntu the easy way, by typing “sudo aptitude install apache2,” I got what is known as the Apache MPM Prefork version. This is the most commonly installed version of Apache, and it works by launching a number of separate Apache processes to handle Web requests. It does not use multiple threads, but instead parcels work out to child Apache processes (for a good refresher on the difference between a thread and a process, check out this Ask Ars feature on the topic). Prefork is the default Apache installation because Apache is an extensible Web server that can be customized to do all sorts of useful things by adding modules, and some of the modules that people might want to install don’t work well when run in a multithreaded fashion.

The drawback to doing everything with processes is that Apache prefork can be a bit of a memory hog, especially under load. Another precompiled flavor of Apache can be installed as an alternative: Apache MPM worker. “Worker” differs from “prefork” in that worker’s processes are multithreaded, giving them the ability to service more requests with fewer system resources. This can translate into faster pages served with less RAM and CPU. However, because some Apache modules don’t necessarily work well when run under multithreaded Apache, you have to specifically select this version to install on Ubuntu and on other GNU/Linux distros with package management.

A bit of searching showed that Apache worker could go a long way toward making Tectonicus serve its tons of tiles faster, but switching would cause some issues with PHP. The built-in Apache PHP module, “mod_php,” is one of those modules that can have issues running multi-threaded. I was faced with quite a bit of software ripping and replacing to switch from mod_php to a standalone PHP.

A post by Ars forum member Blacken00100, however, pushed me in a new direction entirely. Apache with standalone PHP might prove far less optimal than a lightweight event-driven Web server like Nginx with standalone PHP. My mental wheels began turning. I figured that, so long as I was going to be doing some work, I might as well go all the way and see if I could set up what is widely regarded as the fastest Web server around.

The new hotness

Nginx (pronounced “engine-ex”) is a lightweight Web server with a reputation for speed, speed, speed. It differs from Apache in a fundamental way—Apache is a process- and thread-driven application, but Nginx is event-driven. The practical effect of this design difference is that a small number of Nginx “worker” processes can plow through enormous stacks of requests without waiting on each other and without synchronizing; they just “close their eyes” and eat the proverbial elephant as fast as they can, one bite at a time.

Apache, by contrast, approaches large numbers of requests by spinning off more processes to handle them, typically consuming a lot of RAM as it does so. Apache looks at the elephant and thinks about how big it is as it tucks into its meal, and sometimes Apache gets a little anxious about the size of its repast. Nginx, on the other hand, just starts chomping.

The difference is summed up succinctly in a quote by Chris Lea on the Why Use Nginx? page: “Apache is like Microsoft Word, it has a million options but you only need six. Nginx does those six things, and it does five of them 50 times faster than Apache.”

Nginx particularly excels at serving static files—like the Tectonicus map tile images. For larger websites, it’s often employed as a front-end Web server to quickly dish up unchanging page content, while passing on requests for dynamic stuff to more complex Apache Web servers running elsewhere. However, I was interested in it purely as a fast single Web server.

Like everything else mentioned in this article, Nginx is available from the Ubuntu package repositories with a quick “sudo aptitude install nginx.” After stopping Apache, I had Nginx installed in moments. Building further on Blacken’s advice, I also installed php5-fpm, which is a heavily modified PHP package with built-in FastCGI capabilities. Blacken recommended php5-fpm over the older and better-known php5-cgi bundle because fpm can spawn or retire PHP processes as dictated by server load, making it a much smarter and more powerful package; it consumes fewer resources while scaling transparently under load and maintaining speed.

If your needs are simple, like mine, then getting an operational PHP installation with php5-fpm is an easy affair. The main configuration file (/etc/php5/fpm/php-fpm.conf under Ubuntu 11.10) didn’t need to be altered at all, while the pool configuration file (/etc/php5/fpm/pool.d/www.conf) only needed some slight adjustment. The pool conf file defines how php5-fpm will accept CGI requests from the Web server; by default, php5-fpm listens on TCP port 9000 for requests from the Web server, but I changed this to use a Unix socket file instead, since running CGI requests through a local TCP port introduces some tiny amount of latency. It likely won’t matter unless your website will be churning out lots of pages, but I wanted to do things the “correct” way. Additionally, the pool conf file lets you specify the user and group that the pool processes will run as—it’s a good idea to set this to the same user and group that your Web server uses.

Most importantly, the pool conf file lets you define the minimum and maximum number of PHP processes that will be spawned if php-fpm is configured in “dynamic” mode. This lets you start with only one or two active processes to serve PHP requests, but you can tell php-fpm that it’s allowed to spawn more processes as needed. The only real limit is the amount of RAM and CPU you have to spare. For my tiny website, I set php-fpm to start with a single process, with the option of spawning up to 10. Finally, the pool conf file lets you specify traditional PHP configuration values, like maximum memory usage, maximum upload size, the location of your sendmail binary, and so on.
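The settings described above can be sketched in a short pool conf excerpt. This is illustrative, assuming the stock Ubuntu paths; the socket location and process counts are the ones I used, not requirements:

```ini
; /etc/php5/fpm/pool.d/www.conf (excerpt)

; run the pool processes as the same user/group as the Web server
user = www-data
group = www-data

; listen on a Unix socket instead of the default TCP port 9000
listen = /var/run/php5-fpm.sock

; "dynamic" mode: start with one process, spawn up to 10 under load
pm = dynamic
pm.start_servers = 1
pm.max_children = 10
pm.min_spare_servers = 1
pm.max_spare_servers = 3
```

In dynamic mode, php-fpm also retires idle processes back down toward pm.min_spare_servers once the load subsides, so the pool only stays large while it needs to be.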

Configuring Nginx

After saving the configuration file and bouncing the daemon via its init script, I had a fully functional PHP environment, and I was ready to turn my attention to the Web server. How could I adapt an existing Apache configuration, one that included functional SSL/TLS support, to Nginx?

Turns out it’s actually pretty darn easy, since Nginx doesn’t have nearly the configuration richness of Apache. For small sites, this is a very good thing! Nginx, when installed under Ubuntu via a package manager, uses an Apache-like directory structure. Everything user-configurable resides under /etc/nginx; there’s an nginx.conf file for holding all the global settings, a conf.d directory for placing additional conf files that will be parsed and included in the running configuration, and sites-available and sites-enabled directories for defining the actual websites and their specific configurations.

The contents of /etc/nginx will be somewhat familiar to Apache users

It’s notable what Nginx doesn’t use for configuration—there’s no support for .htaccess files. Any configuration you want done on specific subdirectories must be handled in either a conf file or one of the site definition files. If you’re thinking of switching to Nginx and your site relies heavily on .htaccess trickery, either for defining access or for adding rewrite rules or anything else, you need to reassess and see if whatever you’re doing can be recreated in the conf files instead. This also means that some Web applications that depend specifically on the presence of .htaccess files probably won’t work well (or at all) under Nginx.

In spite of this, my site adapted well to Nginx, and the main conf file required almost no editing. The most important setting in the main file is the “worker_processes” setting, which defines how many Nginx processes will run. Since one worker process can handle thousands of simultaneous requests, a good rule of thumb is to have one worker process per CPU core. In my case, I set this to 2. The file also lets you specify which user the Nginx processes will run as; since I installed Nginx via a package manager, this was pre-configured to run as the www-data user, just like Apache.
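The handful of global settings involved can be sketched as follows (worker_connections is left at a typical Ubuntu default; treat the values as illustrative):

```nginx
# /etc/nginx/nginx.conf (excerpt)
user www-data;
worker_processes 2;   # rule of thumb: one worker per CPU core

events {
    # each worker can service this many simultaneous connections
    worker_connections 768;
}
```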

The remainder of the configuration was done in the sites-available directory. Much like Apache, with Nginx you create site definitions in the sites-available directory and then symlink them into the sites-enabled directory, which Nginx parses through at startup. However, unlike Apache, where I had one separate file for non-SSL and another for SSL, the “default” file included in sites-available has both HTTP and HTTPS virtual hosts defined within.

Nginx carries forward the Apache concept of virtual hosts, and offers enough configuration options to satisfy most sites. You define a virtual host name, a Web root where the files to be served are located, and then call out any specific location permissions and directives. Rewrites are also handled in the same file, rather than potentially being called out in multiple places as in Apache. This can lead to a more complex site definition file than you might have for an Apache site, but the configuration is centralized.

There’s a bit of a difference in setting up SSL, too, as Nginx doesn’t support separate chain certificates like Apache. If your site’s certificate requires an intermediate certificate bundle, you’ll have to concatenate the bundle onto the end of your site’s certificate before you can use it. Further, when serving files with SSL/TLS, Nginx defaults to using the extremely secure but also extremely slow DHE-RSA-AES256-SHA cipher on encrypted connections. This is good if you need very tight encryption for your pages, but not so good if you need your pages served quickly, as the Diffie-Hellman key exchange it performs is computationally intensive. If you’re going to be serving encrypted pages, it might be a good idea to disallow the DHE-RSA-AES256-SHA cipher and have Nginx fall back to using plain AES256-SHA. The method for doing so, as well as some more information about the issue, can be found on this page.

In order to make Nginx correctly pass PHP files to the standalone php-fpm installation, all you have to do is set up a handler under each virtual host that needs to use PHP, like this:

location ~ \.php$ {
    try_files $uri =404;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
}

This tells Nginx that files located anywhere under the Web root which end in .php should be passed via FastCGI through the socket at /var/run/php5-fpm.sock, which is where the php5-fpm processes are listening for work to show up.

One important tip—there’s a well-known potential vulnerability that exists when combining Nginx with FastCGI and PHP that can allow malicious visitors to send non-PHP files through the PHP handler. The vulnerability takes advantage of PHP attempting to be as helpful as possible with the things it gets handed by the Web server; it is possible to trick PHP into executing files that don’t actually end in .php by feeding it a crafted request path that does end in .php. There is an option in php.ini that can be set to stop this from happening, but a good number of popular PHP applications actually depend on this overly helpful behavior, so it’s easier to address from the server side. That’s what the “try_files $uri =404;” line is for—it instructs Nginx to first try to serve the exact URI it’s been given and, if that URI does not actually exist, to report a 404 error rather than passing the URI on to PHP.

The vast majority of Nginx + PHP tutorials on the Web gloss over this configuration pitfall, even though it’s been around for nearly two years. (I mention it here because it’s an easy-to-avoid problem!)

The remainder of the configuration file work I had to do was with rewrites, all of which are concerned with cleaning up Dokuwiki URLs and making them easier to read. Translating from Apache’s extremely rich rewrite language to Nginx’s took a small amount of guesswork, and if your site relies on serious rewriting, you likely don’t want to use Nginx. Its rewrite engine is flat-out not as powerful as Apache’s.

A set of .htaccess-housed rewrite rules for my hosted wiki, to clean up URLs and deal with secure log-ons…

…and most of the equivalent rewrite rules in Nginx
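As a rough illustration of what such a translation involves (these lines are a generic Dokuwiki-style example, not my exact rules), an .htaccess directive like "RewriteRule ^([^/]+)$ doku.php?id=$1 [QSA,L]" becomes something along these lines in the Nginx site definition:

```nginx
location / {
    # serve real files and directories first, then fall through to the wiki
    try_files $uri $uri/ @dokuwiki;
}

location @dokuwiki {
    # hand anything else to doku.php as a page id
    rewrite ^/(.*) /doku.php?id=$1 last;
}
```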

Smooth sailing

And with that, I was done. So how well does it all work?

It works very well indeed. Putting aside the PHP bits for a moment, the main reason I wanted to switch was because of flat file serving speed. In my entirely subjective tests, Nginx destroys Apache here. The same Tectonicus map that formerly took seconds to fully populate the screen now appears instantly, and is also instantly responsive to mouse input. Dragging and zooming are fluid and fast, unlike the same map under Apache, which would lag and skip like it was being hosted on a hamster-powered server in Antarctica instead of a LAN box on the other end of a gigabit Ethernet connection. I am extremely pleased.

The two Nginx worker processes and the lone PHP-FPM pool process, using only about 14MB of physical RAM

The improvement is a bit murkier with the PHP-based wiki, but it’s most decidedly not worse than it was before. There are several image-heavy pages on the wiki, and loading them through Apache would take perhaps 3-5 seconds; the same pages load in roughly the same amount of time on Nginx. However, the RAM footprint of the Nginx + php-fpm setup is considerably smaller than the large Apache prefork configuration I had originally, and the CPU utilization while servicing a dozen or so page loads is much lower.

The switch took most of a Saturday between a few false starts and a lot of Googling, with a big chunk of the time burned up trying to fix the rewrite rules and make the wiki happy. I learned a lot from the experience, not the least of which is that Nginx does what it says on the tin—it’s a damn fast Web server.
