/dev/blog
Bez Hermoso, Software Engineer @ Square
Some back-story: In the course of contributing some Swagger-specific features
to the awesome NelmioApiDocBundle,
the Symfony\Component\Config\ConfigCache
class was brought to my attention. To give you an idea how this fits in, you
should know that the bundle generates an HTML page
documenting your REST API. It gets the needed information from metadata declared as @ApiDoc
annotations in controllers
in the Symfony app. On top of that, the bundle also processes metadata from other libraries:
integration with JmsSerializerBundle,
Symfony’s Validator
and Routing components,
and FOSRestBundle is built in.
All these libraries do a good amount of caching on their end. However, NelmioApiDocBundle
does not.
This means that every time the documentation page is viewed, all documentation metadata is rebuilt.
Although this did not present any significant performance issues in the beginning, things could clearly speed up a bit if we skipped
all these steps whenever none of the configuration regarding routes, serialization, or validation has changed.
I mean, how often do they change in production, anyway?
I raised this concern on a PR, and the project lead, Will Durand, broached the subject of
using ConfigCache
to address this. So I scurried
to learn how it works. Sure enough,
I found nice, succinct documentation for it.
I decided to write this article to provide a comprehensive example of how to utilize ConfigCache,
in case some of you out there still
require a little bit of help after reading the official documentation.
In a previous blog post, I detailed a way to locate resources within all registered bundles prior to or during
the compile stage of the service container. Let us build on top of that in a way that illustrates how to utilize the ConfigCache
class as well.
I’ll illustrate these topics by telling a story of a certain fictional bundle named TheHunt/SitemapBundle, which offers this set of functionality:
- Provides the ability to generate a sitemap.xml file
- Creates a page or a template partial that lists links (for footer or sidebar menus, for example)
- Requires very minimal to zero configuration
By “very minimal to zero” configuration, I mean it would require very little work to add a link to the list of links, and most importantly does not require changing any of
the major configuration files within app/config/*. In fact, let us define how exactly we can specify links for this bundle to use:
- Links defined within sitemap.yml files scattered across different bundles.
- Via annotations in controller actions.
Both of these styles of defining metadata seem to be the most popular within the Symfony community. For this article, let’s focus on defining metadata via YAML files within bundles. Let’s set aside configuring by annotations for another blog post…
Building the bundle
Resources – sitemap.yml files
Let’s define how sitemap metadata can be defined in sitemap.yml
files:
# Simple routes
registration_page:
    title: Join Us!
    route: fos_security_register
    sections: [ sitemap, footer ]
about_us:
    title: About Us
    route: about_us_page
    sections: [ sitemap, footer ]

# Routes with parameters
french_site:
    title: French site
    route: home_page
    params:
        _locale: 'fr'
    sections: [ footer ]
hello_world:
    title: Hello, world!
    route: articles
    params:
        slug: hello-world
    updated: 2014-07-21 15:00:00
    sections: [ sitemap ]

# External links
sponsor_1:
    title: Pearson Specter & Litt
    url: "http://pearsonspecterlitt.com?_ref=%affiliate_code"
    sections: [ footer ]
Such files will be located under the Resources/config
directory across different bundles in your app.
(For example, your TheHunt\UserBundle\Resources\config\sitemap.yml
would register the user pages to the sitemap,
while a similar file in TheHunt\BlogBundle
would register blog posts, etc.)
Services
This bundle doesn’t do that many things, and all of its responsibilities can be broken down across only a few services. The service of primary interest for us would be this, though:
TheHunt\SitemapBundle\Sitemap\LinkCollector
- This class is responsible for collecting links that are defined across different bundles:
<?php

namespace TheHunt\SitemapBundle\Sitemap;

use Symfony\Component\Routing\Generator\UrlGeneratorInterface;
use Symfony\Component\Yaml\Yaml;

class LinkCollector
{
    protected $files;
    protected $links;
    protected $generator;

    public function __construct(array $files, UrlGeneratorInterface $generator)
    {
        $this->files = $files;
        $this->generator = $generator;
    }

    public function getLinks()
    {
        // Parse the files only once per instance.
        if (null === $this->links) {
            $this->links = array();
            foreach ($this->files as $file) {
                $linkDefs = Yaml::parse(file_get_contents($file));
                foreach ($linkDefs as $name => $linkDef) {
                    $this->links[$name] = $this->generateLink($linkDef);
                }
            }
        }

        return $this->links;
    }

    private function generateLink($linkDef)
    {
        if (!empty($linkDef['route'])) {
            // Internal link: let the router build an absolute URL.
            $href = $this->generator->generate(
                $linkDef['route'],
                !empty($linkDef['params']) ? $linkDef['params'] : array(),
                UrlGeneratorInterface::ABSOLUTE_URL
            );
        } elseif (!empty($linkDef['url'])) {
            // External link: use the URL as given.
            $href = $linkDef['url'];
        } else {
            throw new \DomainException('Not enough information provided to generate a link.');
        }

        if (empty($linkDef['sections'])) {
            throw new \InvalidArgumentException(sprintf('No section specified for %s', $href));
        }

        return array(
            'title' => $linkDef['title'],
            'href' => $href,
            'sections' => $linkDef['sections'],
            // Fall back to the current date and time when no "updated" value is given.
            'updated' => !empty($linkDef['updated']) ? $linkDef['updated'] : date('Y-m-d H:i:s'),
        );
    }
}
And we will register this as a service:
# TheHunt\SitemapBundle\Resources\config\services.yml
services:
    thehunt_sitemap.link_collector:
        class: TheHunt\SitemapBundle\Sitemap\LinkCollector
        arguments:
            - []
            - "@router"
Notice that the first constructor argument is an empty array. We have to fill in the actual values when the container is built, using the technique described in the previous post:
<?php

namespace TheHunt\SitemapBundle\DependencyInjection;

use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\DependencyInjection\Extension;

class TheHuntSitemapExtension extends Extension
{
    public function load(array $configs, ContainerBuilder $container)
    {
        /** Some boilerplate stuff **/

        $bundles = $container->getParameter('kernel.bundles');

        // Let's gather the paths of all sitemap.yml files that exist in bundles
        $files = array();
        foreach ($bundles as $bundleName => $bundleClass) {
            $refClass = new \ReflectionClass($bundleClass);
            $file = dirname($refClass->getFileName()) . '/Resources/config/sitemap.yml';
            if (file_exists($file) === true) {
                $files[] = $file;
            }
        }

        // Let's replace the placeholder blank array with the actual list.
        $collector = $container->getDefinition('thehunt_sitemap.link_collector');
        $collector->replaceArgument(0, $files);
    }
}
As far as collecting and generating sitemap links goes, we are done. Consumers of the thehunt_sitemap.link_collector
service will simply have to call its getLinks
method and do with the results
however they wish (to generate an XML file, or to use them in a Twig template, etc.)
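As a small illustration of what a consumer might do with the result, here is a minimal sketch that filters the array by section. The helper name filterLinksBySection and the sample data are assumptions made up for this example; they are not part of the bundle. The sample array only mimics the shape that getLinks returns.

```php
<?php

/**
 * Hypothetical helper (not part of the bundle): keeps only the links
 * that belong to the given section, e.g. "footer" or "sitemap".
 */
function filterLinksBySection(array $links, $section)
{
    return array_filter($links, function ($link) use ($section) {
        return in_array($section, $link['sections'], true);
    });
}

// Sample data in the same shape that LinkCollector::getLinks() returns.
$links = array(
    'about_us' => array(
        'title' => 'About Us',
        'href' => 'http://example.com/about',
        'sections' => array('sitemap', 'footer'),
        'updated' => '2014-07-21 15:00:00',
    ),
    'french_site' => array(
        'title' => 'French site',
        'href' => 'http://example.com/fr/',
        'sections' => array('footer'),
        'updated' => '2014-07-21 15:00:00',
    ),
);

// Only the links meant for sitemap.xml; the array keys are preserved.
$sitemapLinks = filterLinksBySection($links, 'sitemap');

foreach ($sitemapLinks as $name => $link) {
    echo $name . ' => ' . $link['href'] . PHP_EOL;
}
```

A footer template would call the same helper with 'footer' instead, so the collector itself never needs to know how its links are going to be displayed.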
Caching
Everything works the way it should. However, if you study the whole mechanism, we are doing some potentially expensive operations:
- Reading the contents of multiple files.
- Parsing YAML contents
- Generating URLs from route names and parameters.
The first item may not be expensive, but the last two can potentially have a negative effect on performance, especially if we are dealing with huge YAML files or a list of 100+ routes. Imagine if we
decide to support configuration by annotations. That additional complexity would take a toll, too. We could really use some caching mechanism. Fortunately, Symfony ships with Symfony\Component\Config\ConfigCache
which is perfect for our needs.
To make it more interesting, let us exercise some OOP chops: Instead of refactoring the existing LinkCollector,
let’s create another class that extends it and simply adds a thin layer of caching:
<?php

namespace TheHunt\SitemapBundle\Sitemap;

use Symfony\Component\Routing\Generator\UrlGeneratorInterface;
use Symfony\Component\Config\ConfigCache;
use Symfony\Component\Config\Resource\FileResource;

class CachingLinkCollector extends LinkCollector
{
    protected $cacheFile;
    protected $cache;

    public function __construct(array $files, UrlGeneratorInterface $generator, $cacheDir, $debug)
    {
        parent::__construct($files, $generator);

        $this->cacheFile = $cacheDir . '/sitemap-links.php.cache';
        // ConfigCache expects the path of the cache file, not of the cache directory.
        $this->cache = new ConfigCache($this->cacheFile, $debug);
    }

    public function getLinks()
    {
        if ($this->cache->isFresh() === false) {
            $resources = array();
            foreach ($this->files as $file) {
                // Register files that we are interested in as a FileResource instance.
                $resources[] = new FileResource($file);
            }

            $links = parent::getLinks();
            // The trailing semicolon keeps the generated file valid PHP.
            $this->cache->write(sprintf('<?php return %s;', var_export($links, true)), $resources);

            return $links;
        }

        // Since the cache is fresh, just use the data stored in our cache file.
        return require $this->cacheFile;
    }
}
Now, when $this->get('thehunt_sitemap.link_collector')->getLinks()
is run for the first time, a file named sitemap-links.php.cache
will
be created under the cache directory which contains something like:
<?php
return array(
    'registration_page' => array(
        'title' => 'Join Us!',
        'href' => 'http://yoursite.com/user/register',
        'updated' => '2014-07-21 01:30:00',
        'sections' => array('sitemap', 'footer'),
    ),
    'french_site' => ...
);
(Since the data we are caching is just an array that contains only scalar values and nested arrays, storing it as an executable PHP file with the help of var_export
can do the job very well.
If, however, you are dealing with objects, this won’t really work. In such a case, using serialize
to store it and using unserialize
during retrieval is the way to go.)
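To make the difference between the two storage styles concrete, here is a minimal plain-PHP sketch that does not involve ConfigCache at all. The temporary files and the use of ArrayObject as the example object are assumptions chosen purely for illustration.

```php
<?php

// Plain arrays of scalars survive a var_export() round trip.
$links = array(
    'about_us' => array(
        'title' => 'About Us',
        'sections' => array('sitemap', 'footer'),
    ),
);

$arrayCache = tempnam(sys_get_temp_dir(), 'links');
file_put_contents($arrayCache, sprintf('<?php return %s;', var_export($links, true)));

// Reading the cache back is just a matter of executing the file.
$restoredLinks = require $arrayCache;

// Objects are stored with serialize() and restored with unserialize().
$object = new ArrayObject(array('title' => 'About Us'));

$objectCache = tempnam(sys_get_temp_dir(), 'object');
file_put_contents($objectCache, serialize($object));

$restoredObject = unserialize(file_get_contents($objectCache));

var_dump($restoredLinks === $links);
var_dump($restoredObject instanceof ArrayObject);

// Clean up the temporary files used by this demonstration.
unlink($arrayCache);
unlink($objectCache);
```

The var_export variant has the nice property that the cache file is ordinary PHP, so reading it back is a single require.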
Now, as long as all of the sitemap.yml
files remain unmodified, this file will stay there and will not be regenerated.
The ConfigCache
class takes care of most of the complicated logic of implementing a caching layer which is smart enough to know when it should be busted or invalidated.
Compared to TTL (time-to-live) based caching, wherein the cache is busted after a set period of time, ConfigCache
is way smarter because it actually checks whether or not
any of the resources we registered have changed since the cache was written (using filemtime
). If any of them has been modified, it is only logical
to bust the cache, as there is a chance that the underlying data has changed. Note that ConfigCache performs this check only when its debug flag is on; with debug off, an existing cache file is always considered fresh. This means that your cache is only built once in production, as the files would remain unchanged until your next deploy (unless
you change stuff directly on production, you evil you).
Symfony’s service container and router generator/matcher use this very mechanism.
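The idea of comparing modification times can be sketched in plain PHP. This is a simplified illustration of the concept only, not Symfony’s actual implementation; the function name isCacheFresh and the temporary files standing in for sitemap.yml and the cache file are assumptions made for this example.

```php
<?php

/**
 * Simplified sketch of a modification-time freshness check.
 * This is NOT Symfony's implementation; the function name is hypothetical.
 */
function isCacheFresh($cacheFile, array $resourceFiles)
{
    if (!is_file($cacheFile)) {
        return false; // Nothing has been cached yet.
    }

    $cachedAt = filemtime($cacheFile);

    foreach ($resourceFiles as $resourceFile) {
        // A missing or more recently modified resource invalidates the cache.
        if (!is_file($resourceFile) || filemtime($resourceFile) > $cachedAt) {
            return false;
        }
    }

    return true;
}

// Temporary files standing in for a sitemap.yml file and the cache file.
$resource = tempnam(sys_get_temp_dir(), 'sitemap');
$cache = tempnam(sys_get_temp_dir(), 'cache');

touch($resource, time() - 60); // The resource last changed a minute ago.
touch($cache, time());         // The cache was written just now.
clearstatcache();
$freshBefore = isCacheFresh($cache, array($resource));

touch($resource, time() + 60); // Simulate a later edit to the resource.
clearstatcache();
$freshAfter = isCacheFresh($cache, array($resource));

var_dump($freshBefore, $freshAfter);

// Clean up the temporary files used by this demonstration.
unlink($resource);
unlink($cache);
```

Unlike a TTL, nothing here expires on its own: the cache stays valid for as long as every registered file is older than the cache file.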
This will be the rest of the changes to finally put our caching layer to use:
<?php

namespace TheHunt\SitemapBundle\DependencyInjection;

use Symfony\Component\Config\Definition\Builder\TreeBuilder;
use Symfony\Component\Config\Definition\ConfigurationInterface;

class Configuration implements ConfigurationInterface
{
    public function getConfigTreeBuilder()
    {
        /** Some boilerplate stuff **/

        $rootNode
            ->children()
                ->booleanNode('cache')->defaultValue(true)->end()
            ->end();

        return $treeBuilder;
    }
}
The above changes to our bundle’s configuration class expose the ability to turn the caching layer on or off within our config file:
# app/config/config.yml
thehunt_sitemap:
    cache: true # We can turn the caching layer on/off from here.
And finally, the switching between the non-caching and caching LinkCollectors is done via:
<?php

namespace TheHunt\SitemapBundle\DependencyInjection;

use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\DependencyInjection\Extension;

class TheHuntSitemapExtension extends Extension
{
    public function load(array $configs, ContainerBuilder $container)
    {
        // Process the bundle's configuration so that $config['cache'] is available.
        $configuration = new Configuration();
        $config = $this->processConfiguration($configuration, $configs);

        /** Gather files... **/

        // Let's replace the placeholder blank array with the actual list.
        $collector = $container->getDefinition('thehunt_sitemap.link_collector');
        $collector->replaceArgument(0, $files);

        // If caching is turned on in the bundle's configuration
        if ($config['cache'] === true) {
            // Replace the link collector class to use the one with caching awareness
            $collector->setClass('TheHunt\SitemapBundle\Sitemap\CachingLinkCollector');
            // Add the additional required values needed by the cache-aware counterpart
            $collector->addArgument($container->getParameter('kernel.cache_dir'));
            $collector->addArgument($container->getParameter('kernel.debug'));
        }
    }
}
Now our mechanism for gathering sitemap links from YAML files is complete! We wrote two classes, each with its own responsibility.
We also managed to inject just the right amount of information that these classes need, and no more.
We could have injected the container into LinkCollector
and had it pull the router
service from there to generate
the URLs, or even used the kernel’s locateResource(...)
method for finding the sitemap.yml
files.
But giving the kernel
or the service_container
to our LinkCollector
is not optimal as it introduces coupling and can be considered an anti-pattern in
our particular use-case.
After all, all it needs is the list of files to read from and the router
service to generate fully-qualified URLs, and that is all that we are giving it. It’s easy to determine
what this class needs to do its job, which makes it easier to understand, manage, and test with mocks or otherwise.
Perhaps for the next blog post, let’s improve this bundle even more by giving it the ability to read configuration from annotations. That task, too, will really benefit from our caching mechanism…
Resource Loaders
There is actually a more powerful way of reading resource files in Symfony: file locators and resource loaders. However, these components are a bit complex for our current discussion. When we start supporting annotations and XML configurations, though, we will explore this option.