/src/docs/flavor/ -> php_pitfalls.diviner (source)

   1  @title PHP Pitfalls
   2  @group php
   3  
   4  This document discusses difficult traps and pitfalls in PHP, and how to avoid,
   5  work around, or at least understand them.
   6  
   7  = array_merge() is Incredibly Slow When Merging A List of Arrays =
   8  
   9  If you merge a list of arrays like this:
  10  
  11    COUNTEREXAMPLE
  12    $result = array();
  13    foreach ($list_of_lists as $one_list) {
  14      $result = array_merge($result, $one_list);
  15    }
  16  
  17  ...your program now has a huge runtime because it generates a large number of
  18  intermediate arrays and copies every element it has previously seen each time
  19  you iterate.
  20  
  21  In a libphutil environment, you can use @{function@libphutil:array_mergev}
  22  instead.
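
Outside a libphutil environment, one way to avoid the repeated copying is to pass every list to a single array_merge() call. This is a minimal sketch, assuming a hypothetical helper named merge_lists() that is not part of PHP or libphutil:

```php
<?php

// Hypothetical helper: merge a list of arrays with one array_merge() call
// instead of one call per list.
function merge_lists(array $list_of_lists) {
  if (!$list_of_lists) {
    // Older PHP versions reject array_merge() with no arguments.
    return array();
  }
  // array_values() drops any string keys so that they are not treated as
  // named arguments on newer PHP versions.
  return call_user_func_array('array_merge', array_values($list_of_lists));
}

$result = merge_lists(array(array(1, 2), array(3), array(4, 5)));
```

Each element is copied once into the final array, rather than once per remaining iteration.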
  23  
  24  = var_export() Hates Baby Animals =
  25  
  26  If you try to var_export() an object that contains recursive references, your
  27  program will terminate. You have no chance to intercept or react to this or
  28  otherwise stop it from happening. Avoid var_export() unless you are certain
  29  you have only simple data. You can use print_r() or var_dump() to display
  30  complex variables safely.
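
As a small illustration of the safer alternative, print_r() marks a recursive reference in its output instead of terminating. The object below is a made-up example:

```php
<?php

$node = new stdClass();
$node->self = $node; // Recursive reference.

// Passing true as the second argument returns the dump as a string instead
// of printing it. The recursive property is rendered as "*RECURSION*".
$dump = print_r($node, true);
```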
  31  
  32  = isset(), empty() and Truthiness =
  33  
  34  A value is "truthy" if it evaluates to true in an ##if## clause:
  35  
  36    $value = something();
  37    if ($value) {
  38      // Value is truthy.
  39    }
  40  
  41  If a value is not truthy, it is "falsey". These values are falsey in PHP:
  42  
  43    null      // null
  44    0         // integer
  45    0.0       // float
  46    "0"       // string
  47    ""        // empty string
  48    false     // boolean
  49    array()   // empty array
  50  
  51  Disregarding some bizarre edge cases, all other values are truthy. Note that
  52  because "0" is falsey, this sort of thing (intended to prevent users from making
  53  empty comments) is wrong in PHP:
  54  
  55    COUNTEREXAMPLE
  56    if ($comment_text) {
  57      make_comment($comment_text);
  58    }
  59  
  60  This is wrong because it prevents users from making the comment "0". //THIS
  61  COMMENT IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD BETTER NOT
  62  BREAK IT!!!// A better test is probably strlen().
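
A sketch of the strlen() test, assuming a hypothetical helper named is_nonempty_comment():

```php
<?php

// Hypothetical helper: true for any non-empty comment text, including "0".
function is_nonempty_comment($comment_text) {
  // The cast keeps strlen() happy if a caller passes null.
  return strlen((string)$comment_text) > 0;
}

$accepts_zero = is_nonempty_comment('0');  // true
$accepts_empty = is_nonempty_comment('');  // false
```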
  63  
  64  In addition to truth tests with ##if##, PHP has two special truthiness operators
  65  which look like functions but aren't: empty() and isset(). These operators help
  66  deal with undeclared variables.
  67  
  68  In PHP, there are two major cases where you get undeclared variables -- either
  69  you directly use a variable without declaring it:
  70  
  71    COUNTEREXAMPLE
  72    function f() {
  73      if ($not_declared) {
  74        // ...
  75      }
  76    }
  77  
  78  ...or you index into an array with an index which may not exist:
  79  
  80    COUNTEREXAMPLE
  81    function f(array $mystery) {
  82      if ($mystery['stuff']) {
  83        // ...
  84      }
  85    }
  86  
  87  When you do either of these, PHP issues a warning. Avoid these warnings by using
  88  empty() and isset() to do tests that are safe to apply to undeclared variables.
  89  
  90  empty() evaluates truthiness exactly opposite of if(). isset() returns true for
  91  everything except null. This is the truth table:
  92  
  93    VALUE             if()        empty()     isset()
  94  
  95    null              false       true        false
  96    0                 false       true        true
  97    0.0               false       true        true
  98    "0"               false       true        true
  99    ""                false       true        true
 100    false             false       true        true
 101    array()           false       true        true
 102    EVERYTHING ELSE   true        false       true
 103  
 104  The value of these operators is that they accept undeclared variables and do not
 105  issue a warning. Specifically, if you try to do this you get a warning:
 106  
 107    COUNTEREXAMPLE
 108    if ($not_previously_declared) {         // PHP Notice:  Undefined variable!
 109      // ...
 110    }
 111  
 112  But these are fine:
 113  
 114    if (empty($not_previously_declared)) {  // No notice, returns true.
 115      // ...
 116    }
 117    if (isset($not_previously_declared)) {  // No notice, returns false.
 118      // ...
 119    }
 120  
 121  So, isset() really means is_declared_and_is_set_to_something_other_than_null().
 122  empty() really means is_falsey_or_is_not_declared(). Thus:
 123  
 124    - If a variable is known to exist, test falsiness with if (!$v), not empty().
 125      In particular, test for empty arrays with if (!$array). There is no reason
 126      to ever use empty() on a declared variable.
 127    - When you use isset() on an array key, like isset($array['key']), it will
 128      evaluate to "false" if the key exists but has the value null! Test for index
 129      existence with array_key_exists().
 130  
 131  Put another way, use isset() if you want to type "if ($value !== null)" but are
 132  testing something that may not be declared. Use empty() if you want to type
 133  "if (!$value)" but you are testing something that may not be declared.
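
The difference between isset() and array_key_exists() on a key that holds null can be seen in a short example. The array and its keys are made up for illustration:

```php
<?php

$config = array('timeout' => null);

// false: the key exists, but its value is null.
$by_isset = isset($config['timeout']);

// true: the key exists, regardless of its value.
$by_key = array_key_exists('timeout', $config);

// false: there is no such key.
$missing = array_key_exists('retries', $config);
```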
 134  
 135  = usort(), uksort(), and uasort() are Slow =
 136  
 137  This family of functions is often extremely slow for large datasets. You should
 138  avoid them if at all possible. Instead, build an array which contains surrogate
 139  keys that are naturally sortable with a function that uses native comparison
 140  (e.g., sort(), asort(), ksort(), or natcasesort()). Sort this array instead, and
 141  use it to reorder the original array.
 142  
 143  In a libphutil environment, you can often do this easily with
 144  @{function@libphutil:isort} or @{function@libphutil:msort}.
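
Without libphutil, the surrogate key approach might look roughly like this. The helper name sort_by_field() and the sample rows are assumptions for illustration, and this is not the implementation of isort() or msort():

```php
<?php

// Hypothetical helper: sort rows by one field without a user comparator.
function sort_by_field(array $rows, $field) {
  // Build a map from each original key to a naturally sortable value.
  $surrogate = array();
  foreach ($rows as $key => $row) {
    $surrogate[$key] = $row[$field];
  }

  // asort() uses native comparison and preserves keys.
  asort($surrogate);

  // Reorder the original rows to follow the sorted surrogate map.
  $sorted = array();
  foreach ($surrogate as $key => $ignored) {
    $sorted[$key] = $rows[$key];
  }
  return $sorted;
}

$rows = array(
  'a' => array('name' => 'pear'),
  'b' => array('name' => 'apple'),
  'c' => array('name' => 'mango'),
);
$sorted = sort_by_field($rows, 'name');
```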
 145  
 146  = array_intersect() and array_diff() are Also Slow =
 147  
 148  These functions are much slower for even moderately large inputs than
 149  array_intersect_key() and array_diff_key(), because they cannot make the
 150  assumption that their inputs are unique scalars as the ##key## varieties can.
 151  Strongly prefer the ##key## varieties.
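
When the values are unique strings or integers, one way to use the ##key## varieties is to move the values into array keys first. A sketch under that assumption, with made-up data:

```php
<?php

$wanted = array('alice', 'carol');
$users = array('alice', 'bob', 'carol', 'dave');

// Move the values into keys. This assumes they are unique strings or
// integers, since those are the only valid array keys.
$wanted_map = array_fill_keys($wanted, true);
$users_map = array_fill_keys($users, true);

$both = array_keys(array_intersect_key($users_map, $wanted_map));
$only_users = array_keys(array_diff_key($users_map, $wanted_map));
```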
 152  
 153  = array_uintersect() and array_udiff() are Definitely Slow Too =
 154  
 155  These functions have the problems of both the ##usort()## family and the
 156  `array_diff()` family. Avoid them.
 157  
 158  = foreach() Does Not Create Scope =
 159  
 160  Variables survive outside of the scope of foreach(). More problematically,
 161  references survive outside of the scope of foreach(). This code mutates
 162  `$array` because the reference leaks from the first loop to the second:
 163  
 164    COUNTEREXAMPLE
 165    $array = range(1, 3);
 166    echo implode(',', $array); // Outputs '1,2,3'
 167    foreach ($array as &$value) {}
 168    echo implode(',', $array); // Outputs '1,2,3'
 169    foreach ($array as $value) {}
 170    echo implode(',', $array); // Outputs '1,2,2'
 171  
 172  The easiest way to avoid this is to avoid using foreach-by-reference. If you do
 173  use it, unset the reference after the loop:
 174  
 175    foreach ($array as &$value) {
 176      // ...
 177    }
 178    unset($value);
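
If the goal is only to modify elements in place, writing back through the key avoids the reference entirely. A minimal sketch:

```php
<?php

$array = range(1, 3);

// Write back through the key instead of holding a reference.
foreach ($array as $key => $value) {
  $array[$key] = $value * 2;
}

// A later loop can reuse $value without touching $array.
foreach ($array as $value) {}
```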
 179  
 180  = unserialize() is Incredibly Slow on Large Datasets =
 181  
 182  The performance of unserialize() is nonlinear in the number of zvals you
 183  unserialize, roughly O(N^2).
 184  
 185    zvals       approximate time
 186    10000       5ms
 187    100000      85ms
 188    1000000     8,000ms
 189    10000000    72 billion years
 190  
 191  
 192  = call_user_func() Breaks References =
 193  
 194  If you use call_user_func() to invoke a function which takes parameters by
 195  reference, the variables you pass in will have their references broken and will
 196  emerge unmodified. That is, if you have a function that takes references:
 197  
 198    function add_one(&$v) {
 199      $v++;
 200    }
 201  
 202  ...and you call it with call_user_func():
 203  
 204    COUNTEREXAMPLE
 205    $x = 41;
 206    call_user_func('add_one', $x);
 207  
 208  ...##$x## will not be modified. The solution is to use call_user_func_array()
 209  and wrap the reference in an array:
 210  
 211    $x = 41;
 212    call_user_func_array(
 213      'add_one',
 214      array(&$x)); // Note '&$x'!
 215  
 216  This will work as expected.
 217  
 218  = You Can't Throw From __toString() =
 219  
 220  Before PHP 7.4, if you throw from __toString(), your program will terminate
 221  uselessly and you won't get the exception.
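
One defensive pattern is to catch exceptions inside __toString() and return a placeholder string. The class below is hypothetical and only sketches the idea:

```php
<?php

// Hypothetical class for illustration.
final class Temperature {
  private $celsius;

  public function __construct($celsius) {
    $this->celsius = $celsius;
  }

  // The real rendering logic lives in a method that is allowed to throw.
  private function render() {
    if (!is_numeric($this->celsius)) {
      throw new Exception('Temperature is not numeric.');
    }
    return $this->celsius.' C';
  }

  public function __toString() {
    try {
      return $this->render();
    } catch (Exception $ex) {
      // Never let the exception escape; return a placeholder instead.
      return '<invalid temperature>';
    }
  }
}

$ok = (string)new Temperature(21);
$bad = (string)new Temperature('warm');
```

Callers that need the exception can call a method like render() directly, if it is made public.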
 222  
 223  = An Object Can Have Any Scalar as a Property =
 224  
 225  Object properties are not limited to legal variable names:
 226  
 227    $property = '!@#$%^&*()';
 228    $obj->$property = 'zebra';
 229    echo $obj->$property;       // Outputs 'zebra'.
 230  
 231  So, don't make assumptions about property names.
 232  
 233  = There is an (object) Cast =
 234  
 235  You can cast a dictionary into an object.
 236  
 237    $obj = (object)array('flavor' => 'coconut');
 238    echo $obj->flavor;      // Outputs 'coconut'.
 239    echo get_class($obj);   // Outputs 'stdClass'.
 240  
 241  This is occasionally useful, mostly to force an object to become a JavaScript
 242  dictionary (vs a list) when passed to json_encode().
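
The empty array is the case where this matters most, because json_encode() cannot tell an empty list from an empty dictionary:

```php
<?php

// An empty PHP array encodes as a JSON list.
$as_list = json_encode(array());

// Casting to an object forces a JSON dictionary.
$as_dict = json_encode((object)array());
```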
 243  
 244  = Invoking "new" With an Argument Vector is Really Hard =
 245  
 246  If you have some ##$class_name## and some ##$argv## of constructor
 247  arguments and you want to do this:
 248  
 249    new $class_name($argv[0], $argv[1], ...);
 250  
 251  ...you'll probably invent a very interesting, very novel solution that is very
 252  wrong. In a libphutil environment, solve this problem with
 253  @{function@libphutil:newv}. Elsewhere, copy newv()'s implementation.
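
For orientation, PHP's reflection API can construct an object from an argument vector. The helper below, new_with_args(), is a hypothetical sketch built on ReflectionClass and is not newv()'s actual implementation:

```php
<?php

// Hypothetical helper; a sketch, not a copy of newv().
function new_with_args($class_name, array $argv) {
  $reflector = new ReflectionClass($class_name);
  if (!$argv) {
    // With no arguments, a plain newInstance() call is sufficient.
    return $reflector->newInstance();
  }
  // array_values() ensures the arguments are passed positionally.
  return $reflector->newInstanceArgs(array_values($argv));
}

$list = new_with_args('ArrayObject', array(array(1, 2, 3)));
$empty = new_with_args('stdClass', array());
```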
 254  
 255  = Equality is not Transitive =
 256  
 257  This isn't terribly surprising since equality isn't transitive in a lot of
 258  languages, but the == operator is not transitive:
 259  
 260    $a = ''; $b = 0; $c = '0a';
 261    $a == $b; // true
 262    $b == $c; // true
 263    $c == $a; // false!
 264  
 265  In PHP versions before 8, when either operand is an integer, the other
 266  operand is cast to an integer before comparison. Avoid this and similar
 267  pitfalls by using the === operator, which is transitive.
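
A related case involves two strings that both look like numbers. The values here are chosen only for illustration:

```php
<?php

$a = '1e3';
$b = '1000';

// true: both strings look numeric, so == compares them as numbers.
$loose = ($a == $b);

// false: === compares the strings themselves.
$strict = ($a === $b);
```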
 268  
 269  = All 676 Letters in the Alphabet =
 270  
 271  This doesn't do what you'd expect it to do in C:
 272  
 273    for ($c = 'a'; $c <= 'z'; $c++) {
 274      // ...
 275    }
 276  
 277  This is because the successor to 'z' is 'aa', which is "less than" 'z'. The
 278  loop will run for 676 iterations until it reaches 'za' and terminates. That is,
 279  `$c` will take on these values:
 280  
 281    a
 282    b
 283    ...
 284    y
 285    z
 286    aa // loop continues because 'aa' <= 'z'
 287    ab
 288    ...
 289    mf
 290    mg
 291    ...
 292    yx
 293    yy
 294    yz
 295    za // loop now terminates because 'za' > 'z'
 296  
 297  Instead, use this loop:
 298  
 299    foreach (range('a', 'z') as $c) {
 300      // ...
 301    }
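
The behavior of both loops can be checked by counting iterations. This sketch relies only on PHP's string increment and string comparison rules:

```php
<?php

// Count how many times the C-style loop body runs and record the last value.
$count = 0;
$last = null;
for ($c = 'a'; $c <= 'z'; $c++) {
  $count++;
  $last = $c;
}

// range() yields exactly the 26 letters.
$letters = range('a', 'z');
```

Here `$count` ends at 676 and `$last` at 'yz', while `$letters` has 26 elements.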