The WordPress i18n (internationalization) API is a pretty inefficient one, sorry. While the translation files appear to be in the standard PO/MO format and can be parsed by any gettext implementation, you may be surprised to find out that WordPress uses its own gettext implementation written entirely in PHP. The reason is that the gettext extension is not built into PHP by default, so many deployments out there simply don't have it.
Since the PHP implementation is several times slower than the native extension, there have been several attempts to support the latter. None of them has come close to being committed into core. The efforts are concentrated in ticket #17268, but there are serious hurdles to overcome: major differences turned out to exist between the behavior of the native extension and the POMO library bundled with WordPress.
Until that's resolved, we have to come up with other ways to optimize translations. Let's discuss why they're so slow. If you're not running a localized website you'll never notice, but on localized ones, loading translations takes anywhere from 100 ms to 500 ms depending on how many plugins are installed. Why so much? To answer that, we need a high-level understanding of how translations work in WordPress.
If you’re here for the download and don’t want to bother with the technical background for our solution, then head over to our GitHub repository.
Why so slow?
Every plugin, theme and major part of the core loads its compiled language files via the load_plugin_textdomain function. The compiled files are parsed and loaded into memory, and this is the operation that takes quite a bit of time. Since the compiled files are binary, there are a lot of low-level bit and byte operations, transforming lookup tables that were meant to be scanned on the fly into PHP arrays in memory. This pretty much defeats the whole purpose of having compiled translation files.
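To give a feel for the kind of work the pure-PHP parser does, here is an illustrative sketch (not the actual POMO code): decoding a little-endian 32-bit integer out of a binary string, which is how every offset in a compiled .mo file's lookup tables has to be read.

```php
<?php
// Illustrative sketch, not the actual POMO code: this is the kind of
// low-level work the pure-PHP parser repeats for every entry when it
// decodes a compiled .mo file.

function read_uint32( string $bytes, int $offset ): int {
    // 'V' = unsigned 32-bit, little-endian -- the byte order used for
    // offsets and lengths in typical WordPress .mo files
    $unpacked = unpack( 'V', substr( $bytes, $offset, 4 ) );
    return $unpacked[1];
}

// A .mo file begins with the gettext magic number 0x950412de.
$header = pack( 'V', 0x950412de );
printf( "0x%x\n", read_uint32( $header, 0 ) ); // 0x950412de
```

Multiply one `unpack()` call per table entry by thousands of strings and dozens of text domains, and the cost adds up fast.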
Moreover, since the above happens during the loading phase of every active plugin, the core loads all the strings it may ever need into memory. There is no lazy loading. Plugin developers don't even differentiate between backend strings and frontend ones. So all of these strings sit in memory, and you've paid the CPU cycles to get them there, for maybe 10% of the strings that actually end up on screen if you're lucky. On frontend pages, you'll often be using 1% of them.
This is quite awful, and some WordPress developers have tried to address it in different ways, mostly by caching the decompiled translation data in a less resource-intensive format. Some have done this with PHP serialization, others with JSON serialization. Both serializers are implemented in native C code and run much faster than the POMO library, for a decent improvement of probably around 20-30%.
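The serialization-cache approach boils down to something like the following sketch (function and file names are ours, not from any particular plugin): parse the .mo once, then persist the resulting msgid-to-msgstr map as JSON so later requests skip the POMO parser entirely.

```php
<?php
// Sketch of the JSON serialization-cache approach (names are ours):
// decode the .mo once, cache the resulting map, read it back cheaply.

function write_translation_cache( array $entries, string $cache_file ): void {
    file_put_contents( $cache_file, json_encode( $entries ) );
}

function read_translation_cache( string $cache_file ): ?array {
    if ( ! is_readable( $cache_file ) ) {
        return null; // cache miss: fall back to the slow POMO parse
    }
    return json_decode( file_get_contents( $cache_file ), true );
}

$cache_file = sys_get_temp_dir() . '/my-plugin-de_DE.json';
write_translation_cache( array( 'Save' => 'Speichern' ), $cache_file );
$strings = read_translation_cache( $cache_file );
echo $strings['Save']; // Speichern
```

The win comes from `json_decode()` being native C code; the loss is that you still pay for a disk read and a full decode on every request.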
We set out to break the speed of sound with our own translation cache solution. The first thing we noticed is that serialization is still expensive. While JSON is faster than PHP's serialize (at least on our nodes, with their underlying JSON libraries), getting rid of serialization completely would bump performance up another notch. So our first move was to leverage the Redis cache that comes preinstalled on all our servers. We already use it for full-page caching in WordPress and thought it would be a nice shortcut.
The plan was to store every string in the key-value store, always ready to fetch and already unserialized. And since fetches were done ad hoc, we didn't have to load everything into memory up front. Unfortunately, this didn't work out too well. The overhead of a couple of milliseconds per translation call added up over the thousands of requested strings, negating any gains from not touching the disk, not unserializing, and not taking up memory. (Memory was still being taken up, of course – just in Redis.)
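The back-of-the-envelope math makes the failure obvious. Using the rough figures from the paragraph above (both numbers are illustrative, not measurements):

```php
<?php
// Why per-string Redis fetches lose: tiny per-call latency, multiplied
// by every translation call on the page. Figures are illustrative.
$latency_per_call_ms = 2;    // "a couple of milliseconds per translation call"
$strings_requested   = 1000; // "thousands of requested strings" per page
echo $latency_per_call_ms * $strings_requested . " ms\n"; // 2000 ms
```

Even with a far more optimistic latency per round trip, the total dwarfs the 100-500 ms we were trying to eliminate in the first place.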
Back to the drawing board. The second iteration of our translation caching prototype dumped executable PHP code instead of serialized data. By including the cache file we skipped the unserialization step altogether. This was a major improvement of up to 50-60%. It still wasn't enough, though. We wanted more.
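In sketch form, the second iteration looks like this (again, the names are ours): dump the translation map as executable PHP with `var_export()`, so that including the cache file returns a ready-made array with no unserialization step at all.

```php
<?php
// Sketch of the code-dumping approach (names are ours): write the
// translation map out as a PHP file that returns the array directly.

function write_php_cache( array $entries, string $cache_file ): void {
    $code = '<?php return ' . var_export( $entries, true ) . ';';
    file_put_contents( $cache_file, $code );
}

$cache_file = sys_get_temp_dir() . '/my-plugin-de_DE-cache.php';
write_php_cache( array( 'Save' => 'Speichern' ), $cache_file );

// include returns the array directly -- no json_decode, no unserialize
$strings = include $cache_file;
echo $strings['Save']; // Speichern
```

The array literal still has to be parsed by PHP's own compiler, which is where the remaining cost hides, and where the next step comes in.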
Looking through our performance profiles, the next bottleneck was disk access. The answer was turning on opcache. This confirmed the hunch that led us to dump PHP code in the first place: an included file can have its compiled bytecode cached in shared memory, so repeat requests skip both the disk read and the parse. The gains were enormous!
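For reference, this is the kind of php.ini configuration involved; a minimal sketch, and the exact values will depend on your stack:

```ini
; php.ini -- make sure OPcache is on, so generated cache files are
; compiled once and then served from shared memory
opcache.enable=1
opcache.memory_consumption=128
; each generated translation cache file counts toward this limit,
; so leave headroom for one file per text domain and locale
opcache.max_accelerated_files=10000
```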
With some more polishing, micro-optimizations and lazy loading we were able to achieve performance that beat all existing solutions out there. And we called it pomodoro, which expands to "POMO d'Oro" – the golden POMO cacher ;)
After almost a year of controlled alpha and beta testing we are happy with the solution. It is now installed on localized WordPress sites on all our servers. If you're not hosted with us, don't worry: the code is open source, along with its full development history, which you can browse in the Git log on our GitHub page. We're always open to new ideas and contributions, so go ahead and send in those PRs.
While the plugin is mature enough to be used in production, there are a few limitations and things to look out for:
- opcache is a must; without it your gains will be around 50% lower
- using it alongside Loco Translate may yield unintended results in some rare cases
- the temporary directory has to be accessible for reads/writes and
- in some cases, if the cache files become corrupted, the whole site will crash with a syntax error (we've seen it happen about once in a million, when disk space is low)
But don’t let these scare you. The plugin is being used on thousands of sites with no issues.
Questions, comments? Feel free to reach out via e-mail at firstname.lastname@example.org. And do let us know how much faster your WordPress site becomes!
May performance be ever in your favor.