[ Index ]

PHP Cross Reference of Phabricator

title

Body

[close]

/src/docs/contributor/ -> n_plus_one.diviner (source)

   1  @title Performance: N+1 Query Problem
   2  @group developer
   3  
   4  How to avoid a common performance pitfall.
   5  
   6  = Overview =
   7  
   8  The N+1 query problem is a common performance antipattern. It looks like this:
   9  
  10    COUNTEREXAMPLE
  11    $cats = load_cats();
  12    foreach ($cats as $cat) {
  13      $cats_hats = load_hats_for_cat($cat);
  14      // ...
  15    }
  16  
  17  Assuming ##load_cats()## has an implementation that boils down to:
  18  
  19    SELECT * FROM cat WHERE ...
  20  
  21  ..and ##load_hats_for_cat($cat)## has an implementation something like this:
  22  
  23    SELECT * FROM hat WHERE catID = ...
  24  
  25  ..you will issue "N+1" queries when the code executes, where N is the number of
  26  cats:
  27  
  28    SELECT * FROM cat WHERE ...
  29    SELECT * FROM hat WHERE catID = 1
  30    SELECT * FROM hat WHERE catID = 2
  31    SELECT * FROM hat WHERE catID = 3
  32    SELECT * FROM hat WHERE catID = 4
  33    SELECT * FROM hat WHERE catID = 5
  34    ...
  35  
  36  The problem with this is that each query has quite a bit of overhead. **It is
  37  //much faster// to issue 1 query which returns 100 results than to issue 100
  38  queries which each return 1 result.** This is particularly true if your database
  39  is on a different machine which is, say, 1-2ms away on the network. In this
  40  case, issuing 100 queries serially has a minimum cost of 100-200ms, even if they
  41  can be satisfied instantly by MySQL. This is far higher than the entire
  42  server-side generation cost for most Phabricator pages should be.
  43  
  44  = Batching Queries =
  45  
  46  Fix the N+1 query problem by batching queries. Load all your data before
  47  iterating through it (this is oversimplified and omits error checking):
  48  
  49    $cats = load_cats();
  50    $hats = load_all_hats_for_these_cats($cats);
  51    foreach ($cats as $cat) {
  52      $cats_hats = $hats[$cat->getID()];
  53    }
  54  
  55  That is, issue these queries:
  56  
  57    SELECT * FROM cat WHERE ...
  58    SELECT * FROM hat WHERE catID IN (1, 2, 3, 4, 5, ...)
  59  
  60  In this case, the total number of queries issued is always 2, no matter how many
  61  objects there are. You've removed the "N" part from the page's query plan, and
  62  are no longer paying the overhead of issuing hundreds of extra queries. This
  63  will perform much better (although, as with all performance changes, you should
  64  verify this claim by measuring it).
  65  
  66  See also @{method:LiskDAO::loadRelatives} method which provides an abstraction
  67  to prevent this problem.
  68  
  69  = Detecting the Problem =
  70  
  71  Beyond reasoning about it while figuring out how to load the data you need, the
  72  easiest way to detect this issue is to check the "Services" tab in DarkConsole
  73  (see @{article:Using DarkConsole}), which lists all the service calls made on a
  74  page. If you see a bunch of similar queries, this often indicates an N+1 query
  75  issue (or a similar kind of query batching problem). Restructuring code so you
  76  can run a single query to fetch all the data at once will always improve the
  77  performance of the page.


Generated: Sun Nov 30 09:20:46 2014 Cross-referenced by PHPXref 0.7.1