Profiling WordPress Performance

Performance is key for high-traffic WordPress sites, and there are two main ingredients for great WordPress performance – caching and profiling. In this short tutorial we’ll focus on profiling, how it complements caching, and why all of it really matters.

Just Cache It

When your WordPress site starts to feel slow, and you google the topic, the most popular advice you’ll see is to get some form of page caching. Indeed, page caching tremendously helps serve pages faster, and in many cases it’s as simple as activating a plugin or adding a few of lines to your web server config. That’s great advice, you should definitely do that.

But caching is only one piece of the puzzle. It doesn’t answer the question of why your site is slow, and arguably, neither does it make your site faster — it simply hides the fact that some portion of your application code is slow, and makes it seem faster by not executing that portion. Which is okay.

However, in some scenarios you can’t rely on caching, for example:

  • Logged in requests, including all of wp-admin
  • Pages with dynamic content: WooCommerce carts, checkout flows, etc.
  • Unique search queries
  • Requests with unique PHP sessions or other cookies

Membership sites, BuddyPress networks, bbPress forums, WooCommerce and other e-commerce sites, all suffer from poor cache hit rates (or cacheability) because of the nature of how they work. Furthermore, most page caching plugins will stop serving cached requests to any user that has left a comment on your blog. Though luck.

At this point you should resort to profiling.

Profiling

Profiling is the process of analyzing the execution of an application, measuring the frequency and duration of function calls, memory consumption and more. It helps you figure out where exactly is the bottleneck in your application.

Functions Profiling Metrics
Functions Profiling Metrics

To get started with profiling you’ll need to have a profiler available on your server, which usually comes in form of a PHP extension. The most popular choice by far is Xdebug, which is a fully fledged debugging and profiling suite for PHP. It comes pre-installed on many local development environments.

Note: If you’re hosted with Pressjitsu, you already have access to our integrated profiling tools available on all servers (screenshot above). If you need Xdebug activated on your staging server, just open a support request.

Tideways

However, in this tutorial we will work with the simpler Tideways extension, which is a very lightweight profiler for PHP and, unlike Xdebug, can be used in production environments.

To install Tideways on a Debian/Ubuntu box, you’ll need:

echo 'deb http://s3-eu-west-1.amazonaws.com/qafoo-profiler/packages debian main' > /etc/apt/sources.list.d/tideways.list
wget -qO - https://s3-eu-west-1.amazonaws.com/qafoo-profiler/packages/EEB5E8F4.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install tideways-php

There are pre-compiled packages for other distributions as well. The tideways-php package is enough to get you started — no need for the daemon, unless you’re planning to use the paid Tideways.io service.

Generating a Profile

Once you installed a profiler, you’ll need to learn how to start and stop it. The Tideways extension has a whole library of useful classes and functions, most of which are geared towards their paid service, but the core extension exposes two simple functions that you truly need:

tideways_enable(); // start profiling
tideways_disable(); // stop profiling

If you’d like to generate a profile for the complete request, you’ll want to start the profiler as early as possible, and stop it as late as possible. Starting early for WordPress means the wp-config.php file, right before loading wp-settings.php. Stopping the profiler can be done in a registered shutdown function:

tideways_enable( TIDEWAYS_FLAGS_MEMORY + TIDEWAYS_FLAGS_CPU );
register_shutdown_function( function() {
    $results = tideways_disable();
    // Save these results
});

Note that even though Tideways is pretty lightweight, a profiling request will still be slower than a regular request, and will consume more server resources, so don’t attempt to profile each and every request, especially on a busy site. The simplest way to limit profiling to your requests is to use a query string, so in wp-config.php:

if ( isset( $_GET['profile'] ) && $_GET['profile'] === 'secret-string' ) {
    tideways_enable( TIDEWAYS_FLAGS_MEMORY + TIDEWAYS_FLAGS_CPU );
    register_shutdown_function( ... );
}

Then to run a profiling request on your site you could simply go to example.org?profile=secret-string from your browser.

Saving the Profile

The tideways_disable() function returns a PHP array with all the information about your function calls. The easiest way to read through the results, is to save the array into a file and use a user interface, which will sort and aggregate the data, and allow you to dig through it.

The most common format is .xhprof, which is simply the array as a serialized PHP string. So in your shutdown function:

$results = tideways_disable();
file_put_contents( '/tmp/my-profile.xhprof', serialize( $results ) );

This will write a my-profile.xhprof file to your temporary directory. There are plenty of GUI tools out there to then view this file, including the original xhprof_html, xhgui, uprofiler.

Here’s the complete code for your wp-conifg.php file to generate and store profiling data:

if ( isset( $_GET['profile'] ) && $_GET['profile'] === 'secret-string' ) {
    tideways_enable( TIDEWAYS_FLAGS_MEMORY + TIDEWAYS_FLAGS_CPU );
    register_shutdown_function( function() {
        $results = tideways_disable();
        file_put_contents( '/tmp/my-profile.xhprof', serialize( $results ) );
    });
}

Interpreting the Results

Once you can view the results in a GUI, you’ll want to start making sense of it all. You should always start with a list of functions ordered by their exclusive wall time – that’s the time spent inside of the function itself, but not in any of its descendants.

WordPress Profile on xhprof_html
WordPress Profile on xhprof_html

The top few in the list are your main concern, but knowing which function is causing the slowness in your WordPress site is not nearly enough. You’ll want to know where exactly that function is called from — which plugin or theme is responsible.

Luckily, all of these GUIs allow you to dig into each function and look at its parents. By digging through the parents and looking at the inclusive wall time, you should be able to easily spot where things go wrong.

Most developers will prefix their functions with their plugin name, so it’ll be easy to spot and then search your codebase. Also keep an eye out on the memory usage metrics. A big spike in memory usage is often a result of poorly written code.

WordPress-specific Examples

Over the years here at Pressjitsu, we’ve profiled thousands of slow requests, and below are some great examples of things we often see, and some tips on how to solve them.

curl_exec()

This is an HTTP request, likely to a remote server which may be taking a long time to respond. By following through the function’s parents (WP_Http methods, wp_remote_* functions), you’ll be able to find the initiator of such a request.

Note that there may be several calls to curl_exec() and not all of them may be slow, so make sure you watch the inclusive time when digging through.

When you do find the initiator, try and figure out what’s going on by inspecting the plugin/theme code, perhaps you could cache the results of the request for a while in a transient or elsewhere, so that you wouldn’t have to perform an HTTP request on every visit.

mysqli_query()

This signals that there’s a slow MySQL query somewhere. You can follow the chain of parents to try and figure out the initiator, but often times you’ll land on core’s main query routine. This usually means that a plugin or theme altered the main query (through pre_get_posts or other means) and made it slower — very typical for featured posts, related posts, sorting/filtering options, etc.

In this case it’s best to look at the query itself, which you can either find from your MySQL query log, slow log, or using tools such as Debug Bar or Query Monitor.

Debug Bar for WordPress
Debug Bar for WordPress

For example, if the query contains a JOIN on wp_postmeta which looks for meta_key = '_foobar', then you’ll want to search your codebase for “_foobar” and likely land on a meta_query or WP_Meta_Query. Unfortunately there isn’t an easy fix for when you do find it, you’ll have to figure out a way to simplify that query, move it to custom table, Elasticsearch, or just get rid of it completely.

wp_load_alloptions()

If wp_load_alloptions() is anywhere near the top of your slowest functions or functions that consume most memory, you’re in trouble. This core function loads all data from the wp_options table that’s marked to autoload.

The solution to this problem is to thoroughly clean up your wp_options table, remove transients, options from unused plugins/themes and other things. We have a pretty detailed guide on how to optimize wp_options for speed.

Where to Next

After implementing a fix for a problem that you’ve found through profiling, you should run the same profile again and compare the results, just to make sure that you’ve actually fixed the issue, and haven’t introduced a new one. Rinse and repeat until you’re satisfied with the results and move on to a different request.

When you’re confident enough to run the profiler on requests from actual visitors, make sure you limit yourself to small samples and not profile every request. For example, to profile one in every 100 requests to WooCommerce’s /my-account/, you could:

if ( $_SERVER['REQUEST_URI'] === '/my-account/' && rand( 1, 100 ) === 1 ) {
    tideways_enable( ... );
    register_shutdown_function( ... );
}

You could also start a timer and save profiles only for requests that are above a certain threshold, for example longer than 1 second. Keep in mind that profiling requests are always going to be slower than regular ones, so don’t beat yourself over absolute time metrics.

In some cases, unfortunately, you won’t be able to get to the bottom of what’s slow from analyzing such profiles. Unlike true hierarchical profilers, Tideways and XHProf do not keep the full call stack at every gathering point, hence the low overhead. So if you feel like you’re stuck, it’s often a good idea to resort back to Xdebug.

An Xdebug-generated Profile in KCacheGrind
An Xdebug-generated Profile in KCacheGrind

If you’re hosted with Pressjitsu, we have a built-in profiler available on all servers, and we’ll be happy to help you run some profiles, help understand the results and implement fixes. Just open a support request through your account panel.