This project contains a collection of AWK scripts, Python scripts, and autotest (part of autoconf) m4 macros.
pyrediff is a Python script to perform pattern-aware comparison of PATTERN and OUTPUT files to remove blocks that don't differ if a given Python regular expression (pyre) line in PATTERN matches the equivalent line in OUTPUT.
Named groups (?P<name>...) can be used in subsequent patterns with \g<name>; see example 3.
pyrediff supports three different modes of operation:
pyrediff PATTERN OUTPUT: compare PATTERN and OUTPUT, writing mismatches to stdout.
pyrediff -e INPUT: escape the pattern characters in INPUT, writing the result to stdout.
pyrediff -f: filter the output of diff PATTERN OUTPUT.
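The idea behind the escape mode can be sketched in Python. This is an illustration only: it assumes escaping behaves like the standard library's re.escape, which may differ in detail from what pyrediff -e emits, and the helper name escape_line is hypothetical.

```python
import re

def escape_line(line: str) -> str:
    """Escape regex metacharacters so the line matches only itself.

    Assumption: re.escape stands in for whatever pyrediff -e does.
    """
    return re.escape(line)

literal = "Total (approx.): 3+4 = 7"
pattern = escape_line(literal)

# The escaped pattern matches the literal text...
assert re.fullmatch(pattern, literal) is not None
# ...but "." no longer matches an arbitrary character.
assert re.fullmatch(pattern, "Total (approxX): 3+4 = 7") is None
```

Escaping a known-good OUTPUT file this way gives a starting PATTERN file in which only the variable parts need to be replaced with regular expressions by hand.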
Named groups are supported.
Strings captured in a named group using (?P<name>...) can be used in the current pattern line with the backreference (?P=name) and in subsequent pattern lines with \g<name>.
In the latter, occurrences of \g<name> in the pattern line will be replaced with a previously captured value before the pattern is applied.
See example 3.
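The substitution described above can be sketched as follows. This is a minimal illustration, not pyrediff's actual code: it assumes each pattern line must match the whole output line (re.fullmatch) and that captured text is substituted literally; the names match_lines and GROUP_REF are hypothetical.

```python
import re

# Matches the literal text \g<name> inside a pattern line.
GROUP_REF = re.compile(r"\\g<(\w+)>")

def match_lines(pattern_lines, output_lines):
    """Return True if every pattern line matches its output line."""
    captured = {}  # named-group values captured by earlier lines
    for pat, out in zip(pattern_lines, output_lines):
        # Replace \g<name> with the (escaped) text captured earlier.
        # Raises KeyError if the name has not been captured yet.
        pat = GROUP_REF.sub(lambda m: re.escape(captured[m.group(1)]), pat)
        m = re.fullmatch(pat, out)
        if m is None:
            return False
        # Remember named groups for use in subsequent lines.
        captured.update(
            {k: v for k, v in m.groupdict().items() if v is not None}
        )
    return len(pattern_lines) == len(output_lines)

pattern = [
    r"pid (?P<Pid>\d+) again=(?P=Pid)",  # (?P=Pid): same-line backreference
    r"second",
    r"third,\g<Pid>\g<Pid>",             # \g<Pid>: value from an earlier line
]
output = ["pid 2211 again=2211", "second", "third,22112211"]
print(match_lines(pattern, output))  # True
```

The same-line backreference (?P=Pid) is handled by the re module itself; only the cross-line \g<name> form needs the explicit substitution step.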
Note: pyrediff PATTERN OUTPUT post-processes the output of diff. Complex regular expressions, and/or lots of non-alphanumeric escaping in PATTERN, may cause diff to generate output as added and removed lines at different line offsets (instead of changed lines), preventing pyrediff from applying the PATTERN correctly.
The pyrediff usage is:
Usage: pyrediff PATTERN OUTPUT
       pyrediff -e INPUT
       pyrediff -f

Pattern-aware comparison of PATTERN and OUTPUT. Similar to diff(1), except
that PATTERN may contain python regular expressions. Strings captured in a
named group using (?P<name>...) can be used in subsequent pattern lines with
\g<name>; occurrences of \g<name> in the pattern line will be replaced with a
previously captured value before the pattern is applied.

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -e INPUT, --escape=INPUT
                        escape INPUT to stdout instead of diffing
  -f, --filter          filter stdin, which is the output of `diff PATTERN
                        OUTPUT`
check_pattern.awk is an AWK script to post-process the output of diff PATTERN OUTPUT to remove blocks that don't differ if a given AWK regular expression line in PATTERN matches the equivalent line in OUTPUT.
The script is usually invoked as:
% diff pattern output | awk -f check_pattern.awk
Various autotest (autoconf) m4 macros are provided with pattern (AWK regular expression) and pyre (Python regular expression) support.
Note: As autoconf uses [] for quoting, the use of [brackets] in the macro arguments stdout-re and stderr-re can be awkward and require careful extra quoting, or quadrigraphs @<:@ (for [) and @:>@ (for ]).
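As a small illustration of the quadrigraph substitution, the following Python helper rewrites a regular expression so that it contains no literal brackets. The helper name to_quadrigraphs is hypothetical and is not part of this project; it only shows the textual mapping described in the note.

```python
def to_quadrigraphs(regex: str) -> str:
    """Replace [ and ] with the autoconf quadrigraphs @<:@ and @:>@."""
    # Neither quadrigraph contains a bracket, so the two replacements
    # cannot interfere with each other.
    return regex.replace("[", "@<:@").replace("]", "@:>@")

print(to_quadrigraphs(r"line 1 [0-9]+\.[0-9]+s"))
# line 1 @<:@0-9@:>@+\.@<:@0-9@:>@+s
```

The converted text can then be pasted into a stdout-re or stderr-re macro argument without disturbing m4 quoting.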
Macros that support AWK regular expressions in the pattern:
Macro: AX_AT_CHECK_PATTERN(commands, [status=0], [stdout-re], [stderr-re], [run-if-fail], [run-if-pass])
Similar to AT_CHECK(), except that stdout-re and stderr-re are AWK regular expressions (REs).
Runs commands in a subshell using AT_CHECK(). The commands are expected to have an exit status of status, to generate stdout that matches the pattern stdout-re, and to generate stderr that matches the pattern stderr-re.
The special values that AT_CHECK() supports for stdout-re and stderr-re (ignore, stdout, stderr, etc.) are available.
Macro: AX_AT_DIFF_PATTERN(pattern-file, test-file, [status=0], [differences], [run-if-fail], [run-if-pass])
Checks that an AWK pattern file pattern-file applies to a test file test-file, using AT_CHECK(), with expected diff differences in differences.
Create the file filename with the contents of the AWK script used by AX_AT_CHECK_PATTERN() and AX_AT_DIFF_PATTERN().
This is the same as the check_pattern.awk script.
Macros that support Python regular expressions:
Macro: AX_AT_CHECK_PYREDIFF(commands, [status=0], [stdout-re], [stderr-re], [run-if-fail], [run-if-pass])
Similar to AT_CHECK(), except that stdout-re and stderr-re are Python regular expressions (pyre).
Runs commands in a subshell using AT_CHECK(). The commands are expected to have an exit status of status, to generate stdout that matches the pyre stdout-re, and to generate stderr that matches the pyre stderr-re.
The special values that AT_CHECK() supports for stdout-re and stderr-re (ignore, stdout, stderr, etc.) are available.
Macro: AX_AT_DIFF_PYRE(pyre-file, test-file, [status=0], [differences], [run-if-fail], [run-if-pass])
Checks that a pyre file pyre-file applies to a test file test-file, using AT_CHECK(), with expected diff differences in differences.
Create a file filename with the contents of the Python script used by AX_AT_CHECK_PYREDIFF() and AX_AT_DIFF_PYRE().
This is the same as the pyrediff script.
Given pattern file 1.pattern:
First line
Second line with a date .*\.
and output file 1.output:
First line
Second line with a date 2014-11-22T16:41:00.
the output of diff 1.pattern 1.output is:
% diff 1.pattern 1.output
2c2
< Second line with a date .*\.
---
> Second line with a date 2014-11-22T16:41:00.
and filtered with awk -f check_pattern.awk:
% diff 1.pattern 1.output | awk -f check_pattern.awk
or filtered with pyrediff:
% diff 1.pattern 1.output | pyrediff -f
or processed with pyrediff:
% pyrediff 1.pattern 1.output
There is no output because the regex on the second line of 1.pattern matches the second line of 1.output.
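The filtering step can be approximated in Python as below. This is a rough sketch under stated assumptions, not the logic of check_pattern.awk or pyrediff: it assumes normal-format diff output, treats each "<" line of a change hunk as a regular expression that must match the whole paired ">" line, and keeps every other hunk untouched. The names filter_diff and HUNK_HEADER are hypothetical.

```python
import re

# Normal diff hunk headers look like "2c2", "3a4", "4,5c4" or "9d7".
HUNK_HEADER = re.compile(r"^\d+(?:,\d+)?([acd])\d+(?:,\d+)?$")

def filter_diff(diff_lines):
    """Return the diff lines with fully matching change hunks removed."""
    kept = []
    hunk = []

    def flush():
        # Decide whether the hunk collected so far is kept or dropped.
        if not hunk:
            return
        header = HUNK_HEADER.match(hunk[0])
        old = [line[2:] for line in hunk[1:] if line.startswith("< ")]
        new = [line[2:] for line in hunk[1:] if line.startswith("> ")]
        matched = (
            header is not None
            and header.group(1) == "c"          # only "changed" hunks
            and len(old) == len(new)            # lines must pair up 1:1
            and all(re.fullmatch(p, o) for p, o in zip(old, new))
        )
        if not matched:
            kept.extend(hunk)
        hunk.clear()

    for line in diff_lines:
        if HUNK_HEADER.match(line):
            flush()  # a new header ends the previous hunk
        hunk.append(line)
    flush()
    return kept

example_1 = [
    "2c2",
    r"< Second line with a date .*\.",
    "---",
    "> Second line with a date 2014-11-22T16:41:00.",
]
print(filter_diff(example_1))  # []
```

An empty result corresponds to the "no output" case above; any hunk that remains would be printed and reported as a difference.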
Given pattern file 2.pattern:
line 1 [0-9]+\.[0-9]+s
line 2
line 3
line 4
and output file 2.output:
line 1 25.63s
line 2
line 3
line 3b extra
line 4
the output of diff 2.pattern 2.output is:
% diff 2.pattern 2.output
1c1
< line 1 [0-9]+\.[0-9]+s
---
> line 1 25.63s
3a4
> line 3b extra
and filtered with awk -f check_pattern.awk the only output is the extra line "line 3b extra":
% diff 2.pattern 2.output | awk -f check_pattern.awk
3a4
> line 3b extra
(with an exit status of 1),
or filtered with pyrediff:
% diff 2.pattern 2.output | pyrediff -f
3a4
> line 3b extra
(with an exit status of 1),
or processed with pyrediff:
% pyrediff 2.pattern 2.output
3a4
> line 3b extra
(with an exit status of 1).
Given pattern file 3.pattern:
pid (?P<Pid>\d+) again=(?P=Pid)
second
third,\g<Pid>\g<Pid>
and output file 3.output created with:
% ( echo "pid $$ again=$$"; echo "second"; echo "third,$$$$" ) > 3.output
% cat 3.output
pid 2211 again=2211
second
third,22112211
and filtered with pyrediff:
% diff 3.pattern 3.output | pyrediff -f
or processed with pyrediff:
% pyrediff 3.pattern 3.output
There is no output because the occurrences of \g<Pid> in the pattern line third,\g<Pid>\g<Pid> are replaced by the value of named group Pid captured from the (?P<Pid>\d+) in the first pattern line.
Given pattern file 4.pattern:
line 1 [0-9]+\.[0-9]+s
line 2
line 3
l..e 4
line 5 extra
line 6
line 7.*
line 8
line 9
and output file 4.output:
line 1 25.63s
line 2
line 3
line 4
line 6
line 7 match any
line 8
the output of diff 4.pattern 4.output is:
% diff 4.pattern 4.output
1c1
< line 1 [0-9]+\.[0-9]+s
---
> line 1 25.63s
4,5c4
< l..e 4
< line 5 extra
---
> line 4
7c6
< line 7.*
---
> line 7 match any
9d7
< line 9
and filtered with awk -f check_pattern.awk the remaining output shows that "line 5 extra" and "line 9" are missing from 4.output:
% diff 4.pattern 4.output | awk -f check_pattern.awk
4,5c4
< l..e 4
< line 5 extra
---
> line 4
9d7
< line 9
(with an exit status of 1),
or filtered with pyrediff:
% diff 4.pattern 4.output | pyrediff -f
4,5c4
< l..e 4
< line 5 extra
---
> line 4
9d7
< line 9
(with an exit status of 1),
or processed with pyrediff:
% pyrediff 4.pattern 4.output
4,5c4
< l..e 4
< line 5 extra
---
> line 4
9d7
< line 9
(with an exit status of 1).
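One plausible reading of why the 4,5c4 hunk survives even though l..e 4 matches line 4 is that a changed hunk is only dropped when its pattern lines and output lines pair up one-to-one. The Python sketch below illustrates that rule; it is an assumption inferred from the output shown above, not a statement about the scripts' internals, and hunk_matches is a hypothetical name.

```python
import re

def hunk_matches(pattern_lines, output_lines):
    """Return True only if the lines pair up 1:1 and every pair matches."""
    if len(pattern_lines) != len(output_lines):
        return False  # e.g. 4,5c4: two pattern lines, one output line
    return all(
        re.fullmatch(pat, out) is not None
        for pat, out in zip(pattern_lines, output_lines)
    )

# Hunk 7c6: one pattern line, one output line, and the regex matches.
print(hunk_matches(["line 7.*"], ["line 7 match any"]))        # True
# Hunk 4,5c4: unequal line counts, so the whole hunk is reported.
print(hunk_matches(["l..e 4", "line 5 extra"], ["line 4"]))    # False
```

Under this reading, the whole 4,5c4 hunk is reported as-is, and it is up to the reader to spot that only "line 5 extra" is really missing.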
Copyright (c) 2013-2024, Luke Mewburn luke@mewburn.net
Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved. This file is offered as-is, without any warranty.