gh-133722: Add Difflib theme to `_colorize` and 'color' option to `difflib.unified_diff` by dougthor42 · Pull Request #133725 · python/cpython · GitHub

gh-133722: Add Difflib theme to _colorize and 'color' option to difflib.unified_diff #133725

Open · dougthor42 wants to merge 25 commits into main
Showing changes from 10 commits
Commits (25):
5c5b248
Add test case
dougthor42 May 9, 2025
fdc0fa0
Add 'color' arg to difflib.unified_diff.
dougthor42 May 9, 2025
fcdd7ab
Update docs and ACKs
dougthor42 May 9, 2025
0e9b070
blurb
dougthor42 May 9, 2025
7c31749
fixup to follow convention
dougthor42 May 9, 2025
66475a2
Add 'Difflib' theme
dougthor42 May 11, 2025
dbf0547
fixup tests
dougthor42 May 11, 2025
a72012e
Switch to using themes. So easy!
dougthor42 May 11, 2025
2a3d818
use 'next' in versionchanged docs
dougthor42 May 11, 2025
252982e
turns out 'git diff' adds reset to the start and end of context lines
dougthor42 May 11, 2025
3422fa7
Use GNU unified diff terms
dougthor42 May 14, 2025
bffdd71
move class
dougthor42 May 14, 2025
3255866
kw-only the 'color' arg
dougthor42 May 14, 2025
c48a6ac
Doc formatting updates
dougthor42 May 14, 2025
8ca50fa
Sort the things that are safe to sort without kw_only=True
dougthor42 May 14, 2025
eb0e81e
Update what's new
dougthor42 May 14, 2025
f7b34c3
Merge remote-tracking branch 'upstream/main' into difflib-color-gh133722
dougthor42 May 14, 2025
fb092b0
fixup docs
dougthor42 May 14, 2025
734b0bc
fixup docs
dougthor42 May 20, 2025
8a10e40
Code review: docs, whatsnew, f-strings, news
dougthor42 May 20, 2025
57b80d1
force_colorized
dougthor42 May 20, 2025
c235425
kw_only
dougthor42 May 29, 2025
387cfe6
Merge branch 'main' into difflib-color-gh133722
dougthor42 May 29, 2025
833d86a
documentation updates per code review
dougthor42 Jun 11, 2025
25f9d80
Merge branch 'main' into difflib-color-gh133722
dougthor42 Jul 18, 2025
9 changes: 8 additions & 1 deletion Doc/library/difflib.rst
@@ -278,7 +278,7 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module.
 emu


-.. function:: unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n')
+.. function:: unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n', color=False)

    Compare *a* and *b* (lists of strings); return a delta (a :term:`generator`
    generating the delta lines) in unified diff format.
@@ -297,6 +297,9 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module.
    For inputs that do not have trailing newlines, set the *lineterm* argument to
    ``""`` so that the output will be uniformly newline free.

+   Set ``color`` to ``True`` to inject ANSI color codes and make the output look
+   like what ``git diff --color`` shows.
+
    The unified diff format normally has a header for filenames and modification
    times. Any or all of these may be specified using strings for *fromfile*,
    *tofile*, *fromfiledate*, and *tofiledate*. The modification times are normally
@@ -319,6 +322,10 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module.

    See :ref:`difflib-interface` for a more detailed example.

+   .. versionchanged:: next
+      Added the *color* parameter.
+
+
 .. function:: diff_bytes(dfunc, a, b, fromfile=b'', tofile=b'', fromfiledate=b'', tofiledate=b'', n=3, lineterm=b'\n')

    Compare *a* and *b* (lists of bytes objects) using *dfunc*; yield a
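
A minimal usage sketch, assuming the *color* parameter is spelled as proposed in this diff. On an interpreter without this change the unknown keyword is rejected with a `TypeError` when the function is called, so the hypothetical helper `unified_diff_maybe_color` (not part of the PR) falls back to the plain call:

```python
import difflib


def unified_diff_maybe_color(a, b, **kwargs):
    """Hypothetical helper: ask for colored output if the interpreter supports it."""
    try:
        # `color` is the parameter proposed in this PR; interpreters without
        # it raise TypeError here because the keyword is unknown.
        lines = difflib.unified_diff(a, b, color=True, **kwargs)
    except TypeError:
        lines = difflib.unified_diff(a, b, **kwargs)
    return list(lines)


result = unified_diff_maybe_color(["one", "three"], ["two", "three"], lineterm="")
print("\n".join(result))
```

Per the implementation in this PR, colors are only emitted when `can_colorize()` agrees, so the output may still be plain text when it is not written to a terminal.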
15 changes: 15 additions & 0 deletions Lib/_colorize.py
@@ -207,6 +207,17 @@ class Unittest(ThemeSection):
     reset: str = ANSIColors.RESET


+@dataclass(frozen=True)
+class Difflib(ThemeSection):
+    """A 'git diff'-like theme for `difflib.unified_diff`."""
+    header: str = ANSIColors.BOLD  # eg "---" and "+++" lines
+    hunk: str = ANSIColors.CYAN  # the "@@" lines
+    equal: str = ANSIColors.RESET  # context lines
+    insert: str = ANSIColors.GREEN
+    delete: str = ANSIColors.RED
Contributor Author:

I opted to use the difflib.context_diff internal terminology here. Should I use the more git-like terms context, added, and removed instead?

Member:

Hmm, maybe the Git-like terms if they're going to be more familiar with people and if the difflib ones are all internal.

Where does Git mention them?

Contributor Author:

Overall they are defined in the GNU unified diff detailed description.

The git diff man page mentions them in a variety of places (emphasis mine):

-U<n>
--unified=<n>

Generate diffs with <n> lines of context instead of the usual three. Implies --patch.

Generating patch text with -p

  1. Hunk headers mention the name of the function to which the hunk applies.

plain

Any line that is added in one location and was removed in another location will be colored with color.diff.newMoved.

Though sometimes it uses "deleted" instead of "removed":

--numstat

Similar to --stat, but shows number of added and deleted lines


I've gone ahead and switched to the GNU unified diff terms prior to the requested sorting. Also, to keep consistent with the other dataclass attributes (which are not currently sorted), the reset value was left at the end.

+    reset: str = ANSIColors.RESET
+
+
 @dataclass(frozen=True)
 class Theme:
     """A suite of themes for all sections of Python.
@@ -218,6 +229,7 @@ class Theme:
     syntax: Syntax = field(default_factory=Syntax)
     traceback: Traceback = field(default_factory=Traceback)
     unittest: Unittest = field(default_factory=Unittest)
+    difflib: Difflib = field(default_factory=Difflib)
Member:

And here:

Suggested change
-    syntax: Syntax = field(default_factory=Syntax)
-    traceback: Traceback = field(default_factory=Traceback)
-    unittest: Unittest = field(default_factory=Unittest)
-    difflib: Difflib = field(default_factory=Difflib)
+    difflib: Difflib = field(default_factory=Difflib)
+    syntax: Syntax = field(default_factory=Syntax)
+    traceback: Traceback = field(default_factory=Traceback)
+    unittest: Unittest = field(default_factory=Unittest)

Contributor Author:

I'm very slightly concerned about sorting when the dataclasses aren't kw_only.

Should I update the dataclass decorator to include kw_only=True and then sort? Or is that out of scope of this PR?

Member:

Let's ask @ambv about this.

Contributor:

I like this suggestion.

Contributor Author:

👍 Done


     def copy_with(
         self,
@@ -226,6 +238,7 @@ def copy_with(
         syntax: Syntax | None = None,
         traceback: Traceback | None = None,
         unittest: Unittest | None = None,
+        difflib: Difflib | None = None,
     ) -> Self:
         """Return a new Theme based on this instance with some sections replaced.

@@ -237,6 +250,7 @@ def copy_with(
             syntax=syntax or self.syntax,
             traceback=traceback or self.traceback,
             unittest=unittest or self.unittest,
+            difflib=difflib or self.difflib,
         )

     @classmethod
@@ -252,6 +266,7 @@ def no_colors(cls) -> Self:
             syntax=Syntax.no_colors(),
             traceback=Traceback.no_colors(),
             unittest=Unittest.no_colors(),
+            difflib=Difflib.no_colors(),
         )
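
The three touch points above (a new field, `copy_with`, and `no_colors`) follow one pattern, sketched here in a self-contained form. `SectionSketch` and `ThemeSketch` are hypothetical stand-ins written for illustration; they assume nothing about `_colorize` beyond what the diff shows, and the way `no_colors` blanks the fields is an assumption:

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class SectionSketch:
    """Hypothetical stand-in for a theme section such as Difflib."""
    insert: str = "\033[32m"
    delete: str = "\033[31m"
    reset: str = "\033[0m"

    @classmethod
    def no_colors(cls) -> SectionSketch:
        # Assumption: "no colors" means every field is the empty string,
        # so formatting with the section adds nothing to the text.
        return cls(**{f.name: "" for f in fields(cls)})


@dataclass(frozen=True)
class ThemeSketch:
    """Hypothetical stand-in for Theme with a single section."""
    difflib: SectionSketch = field(default_factory=SectionSketch)

    def copy_with(self, *, difflib: SectionSketch | None = None) -> ThemeSketch:
        # Keep the current section unless a replacement is given.
        return ThemeSketch(difflib=difflib or self.difflib)

    @classmethod
    def no_colors(cls) -> ThemeSketch:
        return cls(difflib=SectionSketch.no_colors())


t = ThemeSketch.no_colors().difflib
plain_line = f"{t.insert}+two{t.reset}"
custom = ThemeSketch().copy_with(difflib=SectionSketch(insert="\033[1;32m"))
```

With the colorless section, `plain_line` is just `"+two"`, which is why the same f-strings can serve both the colored and the plain code path.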


23 changes: 16 additions & 7 deletions Lib/difflib.py
@@ -30,6 +30,7 @@
            'Differ','IS_CHARACTER_JUNK', 'IS_LINE_JUNK', 'context_diff',
            'unified_diff', 'diff_bytes', 'HtmlDiff', 'Match']

+from _colorize import can_colorize, get_theme
 from heapq import nlargest as _nlargest
 from collections import namedtuple as _namedtuple
 from types import GenericAlias
@@ -1094,7 +1095,7 @@ def _format_range_unified(start, stop):
     return '{},{}'.format(beginning, length)

 def unified_diff(a, b, fromfile='', tofile='', fromfiledate='',
-                 tofiledate='', n=3, lineterm='\n'):
+                 tofiledate='', n=3, lineterm='\n', color=False):
     r"""
     Compare two sequences of lines; generate the delta as a unified diff.

@@ -1111,6 +1112,9 @@ def unified_diff(a, b, fromfile='', tofile='', fromfiledate='',
     For inputs that do not have trailing newlines, set the lineterm
     argument to "" so that the output will be uniformly newline free.

+    Set `color` to True to inject ANSI color codes and make the output look
+    like what `git diff --color` shows.
+
     The unidiff format normally has a header for filenames and modification
     times. Any or all of these may be specified using strings for
     'fromfile', 'tofile', 'fromfiledate', and 'tofiledate'.
@@ -1134,32 +1138,37 @@ def unified_diff(a, b, fromfile='', tofile='', fromfiledate='',
     four
     """

+    if color and can_colorize():
+        t = get_theme(force_color=True).difflib
+    else:
+        t = get_theme(force_no_color=True).difflib
+
     _check_types(a, b, fromfile, tofile, fromfiledate, tofiledate, lineterm)
     started = False
     for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n):
         if not started:
             started = True
             fromdate = '\t{}'.format(fromfiledate) if fromfiledate else ''
             todate = '\t{}'.format(tofiledate) if tofiledate else ''
-            yield '--- {}{}{}'.format(fromfile, fromdate, lineterm)
-            yield '+++ {}{}{}'.format(tofile, todate, lineterm)
+            yield '{}--- {}{}{}{}'.format(t.header, fromfile, fromdate, lineterm, t.reset)
+            yield '{}+++ {}{}{}{}'.format(t.header, tofile, todate, lineterm, t.reset)

         first, last = group[0], group[-1]
         file1_range = _format_range_unified(first[1], last[2])
         file2_range = _format_range_unified(first[3], last[4])
-        yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm)
+        yield '{}@@ -{} +{} @@{}{}'.format(t.hunk, file1_range, file2_range, lineterm, t.reset)

         for tag, i1, i2, j1, j2 in group:
             if tag == 'equal':
                 for line in a[i1:i2]:
-                    yield ' ' + line
+                    yield f'{t.equal} {line}{t.reset}'
                 continue
             if tag in {'replace', 'delete'}:
                 for line in a[i1:i2]:
-                    yield '-' + line
+                    yield f'{t.delete}-{line}{t.reset}'
             if tag in {'replace', 'insert'}:
                 for line in b[j1:j2]:
-                    yield '+' + line
+                    yield f'{t.insert}+{line}{t.reset}'


########################################################################
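
For readers who want to try the effect on an interpreter without this change, the wrapping done above can be approximated by post-processing the plain output. This is a rough sketch, not the PR's implementation: `colorize_unified_diff` is a hypothetical helper, the escape codes are assumed to match the theme defaults shown in this diff, and lines are classified by prefix rather than by opcode tag.

```python
import difflib

# ANSI codes assumed to match the Difflib theme defaults shown in this diff.
BOLD, CYAN, GREEN, RED, RESET = "\033[1m", "\033[36m", "\033[32m", "\033[31m", "\033[0m"


def colorize_unified_diff(lines):
    """Hypothetical post-processor: wrap plain unified-diff lines in ANSI colors."""
    seen_hunk = False
    for line in lines:
        if line.startswith("@@"):
            seen_hunk = True
            color = CYAN
        elif not seen_hunk:
            color = BOLD   # only the "---"/"+++" headers precede the first hunk
        elif line.startswith("+"):
            color = GREEN
        elif line.startswith("-"):
            color = RED
        else:
            color = RESET  # context line
        yield f"{color}{line}{RESET}"


plain = difflib.unified_diff(["one", "three"], ["two", "three"],
                             "Original", "Current", lineterm="")
colored = list(colorize_unified_diff(plain))
```

Unlike the change in the diff, this sketch never consults `can_colorize()`, so it colors unconditionally; a caller would have to decide whether the destination is a terminal.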
15 changes: 15 additions & 0 deletions Lib/test/test_difflib.py
@@ -355,6 +355,21 @@ def test_range_format_context(self):
         self.assertEqual(fmt(3,6), '4,6')
         self.assertEqual(fmt(0,0), '0')

+    def test_unified_diff_colored_output(self):
+        args = [['one', 'three'], ['two', 'three'], 'Original', 'Current',
+                '2005-01-26 23:30:50', '2010-04-02 10:20:52']
+        actual = list(difflib.unified_diff(*args, lineterm='', color=True))
+
+        expect = [
+            "\033[1m--- Original\t2005-01-26 23:30:50\033[0m",
+            "\033[1m+++ Current\t2010-04-02 10:20:52\033[0m",
+            "\033[36m@@ -1,2 +1,2 @@\033[0m",
+            "\033[31m-one\033[0m",
+            "\033[32m+two\033[0m",
+            "\033[0m three\033[0m",
+        ]
+        self.assertEqual(expect, actual)
+

 class TestBytes(unittest.TestCase):
     # don't really care about the content of the output, just the fact
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1889,6 +1889,7 @@ Nicolas M. Thiéry
 James Thomas
 Reuben Thomas
 Robin Thomas
+Douglas Thor
 Brian Thorne
 Christopher Thorne
 Stephen Thorne
@@ -0,0 +1,2 @@
+Added a ``color`` option to :func:`difflib.unified_diff` that injects ANSI color
Member:

Suggested change
-Added a ``color`` option to :func:`difflib.unified_diff` that injects ANSI color
+Added a *color* option to :func:`difflib.unified_diff` that injects ANSI color

Contributor Author:

Done

Member:

Still todo :)

Contributor Author:

Oops!

+codes to mimic ``git diff`` colors.