Day 5 - Validating Performance Optimizations

In the previous chapter, we learned how to profile an HTTP request directly from a browser. We learned how to find bottlenecks very easily and how to read a call graph to find the code consuming most of the resources. We then wrote a patch hoping that it would improve the situation.

Now we need to validate that the fix has a positive and significant enough impact on performance. To do this, we are going to use profile comparisons, a unique Blackfire feature.

Step 4: Comparing Profiles (Code Changes)

We, as humans, are very bad at understanding how a computer works and how code is executed. It is impossible to know which implementation of an algorithm is going to be the fastest without a deep understanding of the operating system and of your programming language's implementation. Moreover, a patch improving one part of the code could have a negative impact on another part of the application, making the overall speed actually worse.

Instead of guessing, we need hard numbers. Take a profile using the Google Chrome extension again on http://fix1-ijtxpsladv67o.eu.platform.sh/Twig, where we applied the patch from the previous chapter:

diff --git a/lib/Gitter/Client.php b/lib/Gitter/Client.php
index c806c1a..97a7aef 100644
--- a/lib/Gitter/Client.php
+++ b/lib/Gitter/Client.php
@@ -79,6 +79,12 @@ class Client

     public function getVersion()
     {
+        static $version;
+
+        if (null !== $version) {
+            return trim($version);
+        }
+
         $process = new Process($this->getPath() . ' --version');
         $process->run();

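For reference, here is roughly what the whole patched method looks like once the hunk is applied. This is only a sketch: everything after $process->run() (the error check and the way $version is extracted from the command output) is an assumption based on the diff above, not the exact GitList code.

public function getVersion()
{
    // Memoize the result in a static local variable so that the
    // `git --version` process is spawned at most once per request,
    // no matter how many times this method is called.
    static $version;

    if (null !== $version) {
        return trim($version);
    }

    $process = new Process($this->getPath() . ' --version');
    $process->run();

    if (!$process->isSuccessful()) {
        throw new \RuntimeException($process->getErrorOutput());
    }

    // Assumed: strip the leading "git version " prefix from the output.
    $version = substr($process->getOutput(), strlen('git version '));

    return trim($version);
}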
Without the patch, the profile summary looked like this:

/docs/gitlist-profile-withoutfix1-summary.png

And here is the new one with the patch:

/docs/gitlist-profile-withfix1-summary.png

The wall time went down from 118ms to 82.6ms, which was expected: the call graph tells us that even though Gitter\Client::getVersion() is still called 10 times, it only calls Process::run() once, which means that our cache works well:

/docs/gitlist-profile-fix1-call-graph.png

Running the GitList test suite shows that we didn't break any functionality.
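
If you want an explicit guard for this behavior, a tiny regression test could look like the following sketch. It is hypothetical (it is not part of GitList's actual test suite) and assumes PHPUnit plus a Gitter\Client that can be constructed with its defaults:

<?php

use Gitter\Client;
use PHPUnit\Framework\TestCase;

class ClientVersionCacheTest extends TestCase
{
    public function testVersionIsStableAcrossCalls()
    {
        $client = new Client();

        // The first call spawns `git --version`; the second one is served
        // from the static cache added by the patch.
        $first  = $client->getVersion();
        $second = $client->getVersion();

        $this->assertNotEmpty($first);
        $this->assertSame($first, $second);
    }
}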

We have just done a "manual comparison": we compared numbers and looked for expected changes in the call graph. But there is a better way with Blackfire: profile comparisons.

Click on the Blackfire logo to view your Blackfire "Dashboard" (or just go to https://blackfire.io when connected):

/docs/gitlist-blackfire-dashboard.png

Click first on the "Compare" button on the second listed profile (the Twig profile without the patch) and then click on "Compare" again on the first listed profile (the Twig profile after applying the patch). Blackfire now displays the comparison of the two profiles. The web interface looks very similar to the profile one but the call graph is now a visual representation of the comparison: faster nodes in blue, slower nodes in red:

The summary indicates that there is a 30% time improvement between the two profiles, with just our four-line patch, which is not too bad. Having a look at the numbers on the left and the colors on the comparison graph, we can conclude that there is no performance regression elsewhere:

/docs/gitlist-blackfire-comparison-fix1.png

Step 4 (alternative): Comparing Profiles (Different Pages)

In the previous chapter, we used the homepage and the project page profile summaries to select the page we should optimize first. Then we looked at a profile of the project page to find optimizations.

Instead of doing this manually, we could have used Blackfire's comparison feature to highlight the performance differences.

Not surprisingly, the comparison shows that the biggest slowdown comes from the stream_select() call.

Tip

You can also compare two random profiles, but if there are too many differences in the code, the results will not be useful or easy to understand. Comparing requests that use the same PHP libraries and/or the same framework gives better results.

Profiling Other Dimensions

Blackfire provides detailed data on many dimensions: wall time, CPU time, I/O time, memory, and network. Focusing on the wall time only, like we did on GitList, is just a first step.

A function consuming a lot of memory or getting a lot of data from the network has a direct impact on performance as well. The wall time already includes the impact of memory consumption and of the amount of data retrieved from the network, but depending on the machine specifications or the network topology, that impact may vary widely.

Data retrieved over a network is an excellent example. On your local development machine, all project components are probably installed locally, so loading big payloads probably won't have a significant impact on performance as latency is very low in this situation. On production servers, where data can be hosted on a different server or even on a different network, latency and network speed can have a much bigger impact on the overall performance of your application. That's one of the reasons why you should optimize all dimensions, not just the wall time.

Always check all dimensions when looking for performance issues.

Profiling Again

A 30% improvement on the GitList project page is impressive, but can we do better? Profiling is a never-ending process. Whenever you fix a bug or add a new feature, you need to check the performance impact of that change.

Take 10 minutes now to look at the project page profile again and try to find some more function calls that could be optimized.

Done? Did you find something? I did!

The 10 calls to Gitter\Client::run() still represent 60% of the wall time. Optimizing such a method can be done in two ways: optimizing some of the functions it calls, or reducing the number of times it is called.

Use the search field on the left to easily find the Gitter\Client::run() call, then click on the function name to reveal the detailed panel:

/docs/gitlist-client-run-panel.png

Besides the main dimension numbers, we learn that this method has "10 callers" and calls "3 callees".

A callee is a function called by our method, a child node in the call graph. Click on each callee arrow to review them all:

/docs/gitlist-client-run-panel-callees.png

The two main callees come from the Symfony Process component, which we probably cannot optimize.

What about the callers? A caller is a function that called our method, a parent node in the call graph. Again, click on each caller:

/docs/gitlist-client-run-panel-callers.png

Are you thinking what I'm thinking? Gitter\Repository::getBranches() calls run() 4 times and Gitter\Repository::getTags() calls it 3 times. The branches and tags of a repository do not change that fast! I would even say that getting different values within the same request could lead to hard-to-find bugs.

Using the same caching technique as before should help:

diff --git a/lib/Gitter/Repository.php b/lib/Gitter/Repository.php
index 42836f9..cd0aeba 100644
--- a/lib/Gitter/Repository.php
+++ b/lib/Gitter/Repository.php
@@ -233,12 +233,18 @@ class Repository
      */
     public function getBranches()
     {
+        static $cache = array();
+
+        if (array_key_exists($this->path, $cache)) {
+            return $cache[$this->path];
+        }
+
         $branches = $this->getClient()->run($this, "branch");
         $branches = explode("\n", $branches);
         $branches = array_filter(preg_replace('/[\*\s]/', '', $branches));

         if (empty($branches)) {
-            return $branches;
+            return $cache[$this->path] = $branches;
         }

         // Since we've stripped whitespace, the result "* (detached from "
@@ -248,7 +254,7 @@ class Repository
             $branches = array_slice($branches, 1);
         }

-        return $branches;
+        return $cache[$this->path] = $branches;
     }

     /**
@@ -322,15 +328,21 @@ class Repository
      */
     public function getTags()
     {
+        static $cache = array();
+
+        if (array_key_exists($this->path, $cache)) {
+            return $cache[$this->path];
+        }
+
         $tags = $this->getClient()->run($this, "tag");
         $tags = explode("\n", $tags);
         array_pop($tags);

         if (empty($tags[0])) {
-            return NULL;
+            return $cache[$this->path] = NULL;
         }

-        return $tags;
+        return $cache[$this->path] = $tags;
     }

     /**

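As a quick sanity check of what the patch changes at runtime, repeated calls within the same request should now trigger a single git process per repository and per command. The snippet below is only a sketch; it assumes Gitter's Client::getRepository() factory, as used by GitList, and a local repository path:

<?php

use Gitter\Client;

$client = new Client();
$repository = $client->getRepository('/path/to/some/repository');

// First calls go through Client::run(), spawning `git branch` and `git tag`.
$branches = $repository->getBranches();
$tags     = $repository->getTags();

// Subsequent calls are served from the static cache, keyed by the
// repository path, so no additional processes are spawned and the
// values stay consistent for the whole request.
$branchesAgain = $repository->getBranches();
$tagsAgain     = $repository->getTags();

Keying the cache by $this->path, rather than using a single static value, matters because one request may touch several repositories: each one keeps its own branches and tags.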
By now, you know the story by heart. Go to http://fix2-ijtxpsladv67o.eu.platform.sh/Twig/, generate a profile with the Google Chrome extension, and compare it with the previous profile we made after the first fix:

Tip

To easily find profiles on Blackfire's dashboard, look for the URLs starting with fix1 and fix2. Blackfire also lets you name your profiles to find them more easily. You can name a profile directly from the summary toolbar, the dashboard, or the profile page.

And we did it again! A 28% performance improvement on top of the 30% we had already gained. We can confirm the overall speedup by comparing the profile from before any fixes with the latest one:

These two patches improve the performance of the GitList project page by 49% (from 118ms to 60ms).

Note

As Gitter, the project we optimized, is open-source, we submitted a pull request with our findings, and it has been merged.

References

We have just finished a typical profiling session: analyze a profile, find improvements, fix the code, check that performance is better, rinse and repeat. During this process, we generated many profiles, but not all profiles are equivalent. The profile of the first fix, for instance, is an "intermediate" profile that we can get rid of quickly. The project page profile from before we made any fixes is more important, as it represents our "baseline": we used it several times to measure the specific performance improvement of each patch, but also to compute the total speedup. Baseline profiles need to be kept around longer than intermediate profiles.

Baseline profiles are called references on Blackfire. A reference profile represents the baseline performance of some code (the version deployed on production servers, the master branch of a repository, version 1 of a project, whatever makes sense for you).

Note

Blackfire's retention of regular profiles varies depending on your plan (from one day to three months). Unlike regular profiles, reference profiles never expire.

The project page profile for the master branch of GitList is a reference profile. You can mark a profile as being a reference when generating it. In the Google Chrome extension dialog box, select "Create New Reference" in the "Compare with" select box:

/docs/gitlist-reference-from-browser.png

Note

Reference profiles have a star in front of them on the dashboard:

/docs/gitlist-reference-dashboard.png

Having reference profiles also makes the workflow more fluid. Before clicking on "Profile" for a fix, select the reference in the dialog box. Blackfire will then automatically display the comparison summary in the browser, and the "View Comparison" link will take you straight to the profile comparison.

Day-to-Day Profiling Methodology

To sum up, finding and fixing performance bottlenecks always follows the same profiling methodology:

  • Create references for key pages;
  • Select the slowest ones;
  • Compare and analyze profiles to spot differences and bottlenecks (on all dimensions);
  • Find the biggest bottlenecks;
  • Try to fix the issue or improve the overall performance;
  • Check that tests are not broken;
  • Generate a profile of the updated version of the code;
  • Compare the new profile with the reference;
  • Take a moment to celebrate your achievement;
  • Rinse and repeat.

Conclusion

You might think that optimizing an application as we've just done is enough and that Blackfire is not needed anymore. And you would be wrong. This is just the beginning of our journey. Performance management is a day-to-day activity. Whenever you fix a bug or add a new feature to your application, you should check that you have not introduced a performance regression.

The next chapter will guide you through the installation of Blackfire on your environment to let you profile your very own projects.