MediaWiki
REL1_22
Class used internally by Diff to actually compute the diffs.
Public Member Functions
_compareseq ($xoff, $xlim, $yoff, $ylim)
    Find LCS of two sequences.
_diag ($xoff, $xlim, $yoff, $ylim, $nchunks)
    Divide the Longest Common Subsequence (LCS) of the sequences [XOFF, XLIM) and [YOFF, YLIM) into NCHUNKS approximately equally sized segments.
_lcs_pos ($ypos)
_line_hash ($line)
    Returns the whole line if it's small enough, or the MD5 hash otherwise.
_shift_boundaries ($lines, &$changed, $other_changed)
    Adjust inserts/deletes of identical lines to join changes as much as possible.
diff ($from_lines, $to_lines)
diff_local ($from_lines, $to_lines)
Public Attributes
$in_seq = array()
$ychanged
$yind = array()
$yv = array()
const MAX_XREF_LENGTH = 10000
Protected Attributes
$lcs = 0
$seq = array()
$xchanged
$xind = array()
$xv = array()
Class used internally by Diff to actually compute the diffs.
The algorithm used here is mostly lifted from the Perl module Algorithm::Diff (version 1.06) by Ned Konz, which is available at: http://www.perl.com/CPAN/authors/id/N/NE/NEDKONZ/Algorithm-Diff-1.06.zip
More ideas are taken from: http://www.ics.uci.edu/~eppstein/161/960229.html
Some ideas (and a bit of code) are from analyze.c, from GNU diffutils-2.7, which can be found at: ftp://gnudist.gnu.org/pub/gnu/diffutils/diffutils-2.7.tar.gz
Finally, some ideas (subdivision by NCHUNKS > 2, and some optimizations) are my own.
Line length limits for robustness added by Tim Starling, 2005-08-31.
Alternative implementation added by Guy Van den Broeck, 2008-07-30.
Definition at line 168 of file DairikiDiff.php.
_DiffEngine::_compareseq ( $xoff, $xlim, $yoff, $ylim )
Find LCS of two sequences.
The results are recorded in the vectors $this->{x,y}changed[], by storing a 1 in the element for each line that is an insertion or deletion (i.e., is not in the LCS).
The subsequence of file 0 is [XOFF, XLIM) and likewise for file 1.
Note that XLIM, YLIM are exclusive bounds. All line numbers are origin-0 and discarded lines are not counted.
Parameters: $xoff, $xlim, $yoff, $ylim
Definition at line 473 of file DairikiDiff.php.
References $lcs, _diag(), and list.
Referenced by diff_local().
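To illustrate the output format described above, the following Python sketch stores a 1 in a per-file vector for every line that is not part of an LCS. It is not the MediaWiki implementation: it substitutes a plain quadratic dynamic-programming LCS for the divide-and-conquer search that _compareseq() performs, and the name mark_changes is hypothetical.

```python
def mark_changes(xs, ys):
    """Return (xchanged, ychanged): 1 marks a line outside the LCS, 0 a kept line."""
    n, m = len(xs), len(ys)
    # table[i][j] holds the LCS length of the suffixes xs[i:] and ys[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if xs[i] == ys[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Start with everything marked as changed, then clear the LCS lines.
    xchanged = [1] * n
    ychanged = [1] * m
    i = j = 0
    while i < n and j < m:
        if xs[i] == ys[j]:
            xchanged[i] = ychanged[j] = 0
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # xs[i] is a deletion
        else:
            j += 1  # ys[j] is an insertion
    return xchanged, ychanged


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
xchanged, ychanged = mark_changes(old, new)
print(xchanged)  # [0, 1, 0, 0]
print(ychanged)  # [0, 0, 0, 1]
```

When several LCSs of equal length exist, a different but equally valid marking is possible; the sketch simply takes the first one its walk finds.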
_DiffEngine::_diag ( $xoff, $xlim, $yoff, $ylim, $nchunks )
Divide the Longest Common Subsequence (LCS) of the sequences [XOFF, XLIM) and [YOFF, YLIM) into NCHUNKS approximately equally sized segments.
Returns (LCS, PTS). LCS is the length of the LCS. PTS is an array of NCHUNKS+1 (X, Y) indexes giving the dividing points between subsequences. The first subsequence is contained in [X0, X1), [Y0, Y1), the second in [X1, X2), [Y1, Y2) and so on. Note that (X0, Y0) == (XOFF, YOFF) and (X[NCHUNKS], Y[NCHUNKS]) == (XLIM, YLIM).
This function assumes that the first lines of the specified portions of the two files do not match, and likewise that the last lines do not match. The caller must trim matching lines from the beginning and end of the portions it is going to specify.
Parameters: $xoff, $xlim, $yoff, $ylim, $nchunks
Definition at line 348 of file DairikiDiff.php.
References $matches, $n, _lcs_pos(), array(), empty, and list.
Referenced by _compareseq().
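The precondition above (no matching first or last lines in the ranges passed in) means the caller has to narrow the ranges first. A minimal Python sketch of that trimming step follows; the function name trim_common_ends is hypothetical, and the actual caller in DairikiDiff.php may structure this differently.

```python
def trim_common_ends(xs, ys):
    """Return (xoff, xlim, yoff, ylim) with matching head and tail lines skipped.

    The limits are exclusive, following the [XOFF, XLIM) convention above.
    """
    xoff = yoff = 0
    xlim, ylim = len(xs), len(ys)

    # Skip matching lines at the beginning of both ranges.
    while xoff < xlim and yoff < ylim and xs[xoff] == ys[yoff]:
        xoff += 1
        yoff += 1

    # Skip matching lines at the end, never crossing the trimmed head.
    while xlim > xoff and ylim > yoff and xs[xlim - 1] == ys[ylim - 1]:
        xlim -= 1
        ylim -= 1

    return xoff, xlim, yoff, ylim


print(trim_common_ends(["a", "b", "c", "d"], ["a", "x", "d"]))  # (1, 3, 1, 2)
```

After trimming, either one range is empty (a pure insertion or deletion) or the first and last lines of the two ranges differ, which is the state the description above assumes.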
_DiffEngine::_lcs_pos ( $ypos )
Parameters: $ypos
Definition at line 431 of file DairikiDiff.php.
Referenced by _diag().
_DiffEngine::_line_hash ( $line )
Returns the whole line if it's small enough, or the MD5 hash otherwise.
Parameters: $line (string)
Definition at line 317 of file DairikiDiff.php.
Referenced by diff_local().
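The idea behind _line_hash() can be sketched in Python as below. Two assumptions are made that this page does not state: that "small enough" means no longer than the class constant MAX_XREF_LENGTH (10000), and that length is measured in bytes. The name line_hash is hypothetical.

```python
import hashlib

# Assumed threshold; mirrors the MAX_XREF_LENGTH constant listed for this class.
MAX_XREF_LENGTH = 10000


def line_hash(line: str) -> str:
    """Return the line itself if it is short, otherwise its MD5 hex digest."""
    data = line.encode("utf-8")
    if len(data) > MAX_XREF_LENGTH:
        # A fixed-size 32-character key stands in for the very long line.
        return hashlib.md5(data).hexdigest()
    return line


print(line_hash("short line"))          # short line
print(len(line_hash("x" * 20000)))      # 32
```

Capping the key length keeps the cost of using lines as lookup keys bounded, at the price of a negligible chance of hash collision between two different long lines.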
_DiffEngine::_shift_boundaries ( $lines, &$changed, $other_changed )
Adjust inserts/deletes of identical lines to join changes as much as possible.
We act when a run of changed lines includes a line at one end and has an excluded, identical line at the other. We are free to choose which of the identical lines is included. compareseq usually chooses the one at the beginning, but it is usually cleaner to consider the following identical line to be the "change".
This is extracted verbatim from analyze.c (GNU diffutils-2.7).
Definition at line 530 of file DairikiDiff.php.
References $changed, $lines, wfProfileIn(), and wfProfileOut().
Referenced by diff().
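The sliding described above can be shown with a much simplified Python sketch. It works on one file only and ignores the $other_changed vector that the real routine consults to keep both files in step, so it demonstrates the idea rather than the algorithm taken from analyze.c. The name shift_down is hypothetical.

```python
def shift_down(lines, changed):
    """Slide each run of changed lines as far down as identical lines allow."""
    changed = list(changed)  # work on a copy
    n = len(lines)
    i = 0
    while i < n:
        if not changed[i]:
            i += 1
            continue
        # Locate the run [start, end) of changed lines.
        start = end = i
        while end < n and changed[end]:
            end += 1
        # While the first line of the run equals the unchanged line just
        # after it, drop the former from the run and take the latter.
        while end < n and lines[start] == lines[end]:
            changed[start] = False
            changed[end] = True
            start += 1
            end += 1
            # Absorb a following run that the shift has now reached.
            while end < n and changed[end]:
                end += 1
        i = end
    return changed


lines = ["a", "b", "a", "b", "c"]
before = [False, True, True, False, False]
print(shift_down(lines, before))  # [False, False, True, True, False]
```

In both markings the unchanged lines read "a", "b", "c", so the diff stays valid; only the position of the reported change moves.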
_DiffEngine::diff ( $from_lines, $to_lines )
Parameters: $from_lines, $to_lines
Definition at line 186 of file DairikiDiff.php.
References _shift_boundaries(), array(), diff_local(), wfProfileIn(), and wfProfileOut().
_DiffEngine::diff_local ( $from_lines, $to_lines )
Parameters: $from_lines, $to_lines
Definition at line 244 of file DairikiDiff.php.
References _compareseq(), _line_hash(), array(), empty, global, wfProfileIn(), and wfProfileOut().
Referenced by diff().
_DiffEngine::$in_seq = array()
Definition at line 177 of file DairikiDiff.php.
_DiffEngine::$lcs = 0 [protected]
Definition at line 179 of file DairikiDiff.php.
Referenced by _compareseq().
_DiffEngine::$seq = array() [protected]
Definition at line 177 of file DairikiDiff.php.
_DiffEngine::$xchanged [protected]
Definition at line 172 of file DairikiDiff.php.
_DiffEngine::$xind = array() [protected]
Definition at line 175 of file DairikiDiff.php.
_DiffEngine::$xv = array() [protected]
Definition at line 174 of file DairikiDiff.php.
_DiffEngine::$ychanged
Definition at line 172 of file DairikiDiff.php.
_DiffEngine::$yind = array()
Definition at line 175 of file DairikiDiff.php.
_DiffEngine::$yv = array()
Definition at line 174 of file DairikiDiff.php.
const _DiffEngine::MAX_XREF_LENGTH = 10000
Definition at line 170 of file DairikiDiff.php.