@title Arcanist User Guide: Script and Regex Linter
@group userguide

Explains how to use the Script and Regex linter to invoke an existing
lint engine that is not integrated with Arcanist.

The Script and Regex linter is a simple glue linter which runs some
script on each path, and then uses a regex to parse lint messages from
the script's output. (This linter uses a script and a regex to
interpret the results of some real linter; it does not itself lint
scripts or regexes.)

Configure this linter by setting these keys in your configuration:

  - `script-and-regex.script` Script command to run. This can be
    the path to a linter script, but may also include flags or use shell
    features (see below for examples).
  - `script-and-regex.regex` The regex to process output with. This
    regex uses named capturing groups (detailed below) to interpret output.

The script will be invoked from the project root, so you can specify a
relative path like `scripts/lint.sh` or an absolute path like
`/opt/lint/lint.sh`.

This linter is necessarily more limited in its capabilities than a normal
linter which can perform custom processing, but may be somewhat simpler to
configure.

== Script... ==

The script will be invoked once for each file that is to be linted, with
the file passed as the first argument. The file name may begin with a "-";
ensure your script will not interpret such names as flags (perhaps by ending
your script configuration with "--", if its argument parser supports that).
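
As an illustration of the "--" convention, here is a minimal sketch of argument parsing for a hypothetical lint script written in Python. The `--flag` option and the function name are made up for this example; the only behavior relied on is that Python's `argparse` treats everything after "--" as positional.

```python
import argparse


def parse_lint_args(argv):
    """Parse arguments for a hypothetical lint script.

    `--flag` is a made-up option used only for illustration.
    """
    parser = argparse.ArgumentParser(description="Hypothetical lint script")
    parser.add_argument("--flag", default=None)
    parser.add_argument("path", help="File to lint")
    return parser.parse_args(argv)


# With "--" before the path, a leading "-" is not mistaken for a flag.
args = parse_lint_args(["--flag", "value", "--", "-odd-name.txt"])
print(args.path)
```

Without the "--" separator, a parser like this one would try to read `-odd-name.txt` as an option and fail, which is why the script configuration should end with "--" when the parser supports it.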

Note that when run via `arc diff`, the list of files to be linted includes
deleted files and files that were moved away by the change. The linter should
not assume the path it is given exists, and it is not an error for the
linter to be invoked with paths which are no longer there. (Every affected
path is subject to lint because some linters may raise errors in other files
when a file is removed, or raise an error about its removal.)
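
A lint script can tolerate missing paths by checking for the file before reading it. The following is only a sketch of that idea: the line-length rule, the `warning:LINE message` output format, and the names `lint_path` and `main` are all assumptions chosen for the example, not part of Arcanist.

```python
import os


def lint_path(path, max_length=80):
    """Return lint message lines for `path`, or none if it no longer exists."""
    if not os.path.isfile(path):
        # Deleted or moved-away paths are expected; report nothing.
        return []
    messages = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if len(line.rstrip("\n")) > max_length:
                messages.append("warning:%d Line is too long." % number)
    return messages


def main(argv):
    # Assumes the linted path is the final argument of the command line.
    for message in lint_path(argv[-1]):
        print(message)
    return 0  # findings are reported on stdout, not through the exit code


# In a real script, the entry point would be: sys.exit(main(sys.argv))
```

Returning an empty list for a missing path keeps the exit code at 0, so a deleted file does not look like a linter failure.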

The script should emit lint messages to stdout, which will be parsed with
the provided regex.

For example, you might use a configuration like this:

  "script-and-regex.script": "/opt/lint/lint.sh --flag value --other-flag --"

stderr is ignored. If you have a script which writes messages to stderr,
you can redirect stderr to stdout by using a configuration like this:

  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" 2>&1'"

The return code of the script must be 0, or an exception will be raised
reporting that the linter failed. If you have a script which exits nonzero
under normal circumstances, you can force it to always exit 0 by using a
configuration like this:

  "script-and-regex.script": "sh -c '/opt/lint/lint.sh \"$0\" || true'"
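
If the shell one-liners become hard to maintain, the same two adjustments can be made in a small wrapper script. This is a sketch under assumptions: `run_merged` is a hypothetical helper, and the "linter" here is a stand-in Python one-liner that writes to stderr and exits nonzero.

```python
import subprocess
import sys


def run_merged(command):
    """Run `command`, returning stdout and stderr as one string.

    The child's exit status is deliberately ignored.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as 2>&1
        universal_newlines=True,
        check=False,  # same effect as "|| true"
    )
    return result.stdout


# Stand-in for a real linter: writes to stderr and exits with status 3.
fake_linter = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('error:1 Oops.\\n'); sys.exit(3)",
]
print(run_merged(fake_linter), end="")
```

A wrapper built this way would print the merged output and then exit 0 itself, so the linter sees the messages on stdout and a successful return code.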

Multiple instances of the script will be run in parallel if there are
multiple files to be linted, so they should not use any unique resources.
For instance, this configuration would not work properly, because several
processes may attempt to write to the file at the same time:

  COUNTEREXAMPLE
  "script-and-regex.script": "sh -c '/opt/lint/lint.sh --output /tmp/lint.out \"$0\" && cat /tmp/lint.out'"
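
One way around the shared-file problem is to give every invocation its own temporary file. The sketch below assumes a linter that accepts `--output FILE` (as in the counterexample) and understands "--"; the helper name and the stand-in linter are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def lint_with_private_output(command, path):
    """Run `command --output <unique file> -- <path>`; return the file's text."""
    fd, out_path = tempfile.mkstemp(prefix="lint-", suffix=".out")
    os.close(fd)
    try:
        subprocess.run(command + ["--output", out_path, "--", path], check=False)
        with open(out_path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    finally:
        os.remove(out_path)  # clean up even if the linter fails


# Stand-in for a linter that accepts "--output FILE"; it writes one message.
fake_linter = [
    sys.executable,
    "-c",
    "import sys; open(sys.argv[2], 'w').write('error:1 Example.\\n')",
]
print(lint_with_private_output(fake_linter, "example.txt"), end="")
```

Because `tempfile.mkstemp` creates a distinct file for each call, parallel runs no longer write to the same path.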

There are necessary limits to how gracefully this linter can deal with
edge cases, because it is just a script and a regex. If you need to do
things that this linter can't handle, you can write a phutil linter and move
the logic to handle those cases into PHP. PHP is a better general-purpose
programming language than regular expressions are, if only by a small margin.

== ...and Regex ==

The regex must be a valid PHP PCRE regex, including delimiters and flags.

The regex will be matched against the entire output of the script, so it
should generally be in this form if messages are one-per-line:

  /^...$/m
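
To see why the `m` flag matters, the effect can be reproduced outside Arcanist. The sketch below uses Python's `re` module purely as a stand-in for PHP's PCRE: Python patterns take no delimiters, and `re.MULTILINE` plays the role of the trailing `m`. The sample output text is invented.

```python
import re

output = "first problem\nsecond problem"
pattern = r"^(?P<message>.*)$"

# Without the multiline flag, ^ and $ anchor only to the whole output,
# so two-line output does not match at all.
print(re.findall(pattern, output))  # prints []

# With the multiline flag, ^ and $ anchor to each line.
print(re.findall(pattern, output, re.MULTILINE))
# prints ['first problem', 'second problem']
```

The two engines differ in some edge cases (for example, how `^` behaves after a trailing newline), so a regex should still be checked against real output in PHP before it is relied on.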

The regex should capture these named patterns with `(?P<name>...)`:

  - `message` (required) Text describing the lint message. For example,
    "This is a syntax error.".
  - `name` (optional) Text summarizing the lint message. For example,
    "Syntax Error".
  - `severity` (optional) The word "error", "warning", "autofix", "advice",
    or "disabled", in any combination of upper and lower case. Alternatively,
    you may match groups called `error`, `warning`, `advice`, `autofix`, or
    `disabled`. These allow you to match output formats like "E123" and
    "W123" to indicate errors and warnings, even though the word "error" is
    not present in the output. If no severity capturing group is present,
    messages are raised with "error" severity. If multiple severity capturing
    groups are present, messages are raised with the highest captured
    severity. Capturing groups like `error` supersede the `severity`
    capturing group.
  - `error` (optional) Match some nonempty substring to indicate that this
    message has "error" severity.
  - `warning` (optional) Match some nonempty substring to indicate that this
    message has "warning" severity.
  - `advice` (optional) Match some nonempty substring to indicate that this
    message has "advice" severity.
  - `autofix` (optional) Match some nonempty substring to indicate that this
    message has "autofix" severity.
  - `disabled` (optional) Match some nonempty substring to indicate that this
    message has "disabled" severity.
  - `file` (optional) The name of the file to raise the lint message in. If
    not specified, defaults to the linted file. It is generally not necessary
    to capture this unless the linter can raise messages in files other than
    the one it is linting.
  - `line` (optional) The line number of the message.
  - `char` (optional) The character offset of the message.
  - `offset` (optional) The byte offset of the message. If captured, this
    supersedes `line` and `char`.
  - `original` (optional) The text the message affects.
  - `replacement` (optional) The text that should automatically replace the
    range captured by `original` to resolve the message.
  - `code` (optional) A short error type identifier which can be used
    elsewhere to configure handling of specific types of messages. For
    example, "EXAMPLE1", "EXAMPLE2", etc., where each code identifies a
    class of message like "syntax error", "missing whitespace", etc. This
    allows configuration to later change the severity of all whitespace
    messages, for example.
  - `ignore` (optional) Match some nonempty substring to ignore the match.
    You can use this if your linter sometimes emits text like "No lint
    errors".
  - `stop` (optional) Match some nonempty substring to stop processing input.
    Remaining matches for this file will be discarded, but linting will
    continue with other linters and other files.
  - `halt` (optional) Match some nonempty substring to halt all linting of
    this file by any linter. Linting will continue with other files.
  - `throw` (optional) Match some nonempty substring to throw an error, which
    will stop `arc` completely. You can use this to fail abruptly if you
    encounter unexpected output. All processing will abort.
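
The `error` and `warning` groups can be illustrated with output that marks severity by a letter prefix. This is a rough sketch in Python's `re` (a stand-in for PHP's PCRE), not Arcanist's implementation: the output format is invented, `severity_of` is a hypothetical helper that mimics the rules described above, and the order in which it checks the groups is an assumption about what "highest severity" means.

```python
import re

# Hypothetical output: a letter-prefixed code, a line number, a message.
output = "E123:4 Missing semicolon.\nW456:9 Unused variable.\n"

pattern = re.compile(
    r"^(?P<code>(?:(?P<error>E)|(?P<warning>W))\d+)"
    r":(?P<line>\d+) (?P<message>.*)$",
    re.MULTILINE,
)

# Assumed ordering from highest to lowest severity.
SEVERITY_GROUPS = ("error", "warning", "autofix", "advice", "disabled")


def severity_of(match):
    """Resolve a severity name from one match, following the rules above."""
    groups = match.groupdict()
    for name in SEVERITY_GROUPS:
        if groups.get(name):
            return name  # a nonempty named group wins over `severity`
    word = (groups.get("severity") or "").lower()
    return word if word in SEVERITY_GROUPS else "error"


results = [
    (severity_of(m), m.group("code"), m.group("line"), m.group("message"))
    for m in pattern.finditer(output)
]
for result in results:
    print(result)
```

Here the `error` group captures the "E" and the `warning` group captures the "W", so neither line needs to contain the word "error" or "warning".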

Numbered capturing groups are ignored.

For example, if your lint script's output looks like this:

  error:13 Too many goats!
  warning:22 Not enough boats.

...you could use this regex to parse it:

  /^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$/m
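
Before putting a regex like this into your configuration, it can help to try it against sample output. The sketch below does so with Python's `re`, which accepts the same `(?P<name>...)` syntax; it is only an approximation of how PHP's PCRE will behave, with the delimiters dropped and `re.MULTILINE` standing in for the `m` flag.

```python
import re

output = "error:13 Too many goats!\nwarning:22 Not enough boats.\n"

# The pattern from the example, without PHP delimiters.
pattern = re.compile(
    r"^(?P<severity>warning|error):(?P<line>\d+) (?P<message>.*)$",
    re.MULTILINE,
)

# Each match yields one dictionary keyed by capturing group name.
messages = [match.groupdict() for match in pattern.finditer(output)]
for message in messages:
    print(message)
```

Each line of output yields one match, with `severity`, `line`, and `message` available by name.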

The simplest valid regex for line-oriented output is something like this:

  /^(?P<message>.*)$/m

