8000 gh-66449: configparser: Add support for unnamed sections by pslacerda · Pull Request #117273 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

gh-66449: configparser: Add support for unnamed sections #117273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions Doc/library/configparser.rst
Original file line number Diff line number Diff line change
Expand Up @@ -274,6 +274,11 @@ may be treated as parts of multiline values or ignored.
By default, a valid section name can be any string that does not contain '\\n'.
To change this, see :attr:`ConfigParser.SECTCRE`.

The first section name may be omitted if the parser is configured to allow an
unnamed top level section with ``allow_unnamed_section=True``. In this case,
the keys/values may be retrieved by :const:`UNNAMED_SECTION` as in
``config[UNNAMED_SECTION]``.

Configuration files may include comments, prefixed by specific
characters (``#`` and ``;`` by default [1]_). Comments may appear on
their own on an otherwise empty line, possibly indented. [1]_
Expand Down Expand Up @@ -325,6 +330,27 @@ For example:
# Did I mention we can indent comments, too?


.. _unnamed-sections:

Unnamed Sections
----------------

The name of the first section (or unique) may be omitted and values
retrieved by the :const:`UNNAMED_SECTION` attribute.

.. doctest::

>>> config = """
... option = value
...
... [ Section 2 ]
... another = val
... """
>>> unnamed = configparser.ConfigParser(allow_unnamed_section=True)
>>> unnamed.read_string(config)
>>> unnamed.get(configparser.UNNAMED_SECTION, 'option')
'value'

Interpolation of values
-----------------------

Expand Down Expand Up @@ -1216,6 +1242,11 @@ ConfigParser Objects
names is stripped before :meth:`optionxform` is called.


.. data:: UNNAMED_SECTION

A special object representing a section name used to reference the unnamed section (see :ref:`unnamed-sections`).


.. data:: MAX_INTERPOLATION_DEPTH

The maximum depth for recursive interpolation for :meth:`~configparser.ConfigParser.get` when the *raw*
Expand Down
6 changes: 6 additions & 0 deletions Doc/whatsnew/3.13.rst
Original file line number Diff line number Diff line change
Expand Up @@ -214,6 +214,12 @@ Other Language Changes

(Contributed by William Woodruff in :gh:`112389`.)

* The :class:`configparser.ConfigParser` now accepts unnamed sections before named
ones if configured to do so.

(Contributed by Pedro Sousa Lacerda in :gh:`66449`)


New Modules
===========

Expand Down
116 changes: 85 additions & 31 deletions Lib/configparser.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,8 @@
delimiters=('=', ':'), comment_prefixes=('#', ';'),
inline_comment_prefixes=None, strict=True,
empty_lines_in_values=True, default_section='DEFAULT',
interpolation=<unset>, converters=<unset>):

interpolation=<unset>, converters=<unset>,
allow_unnamed_section=False):
Create the parser. When `defaults` is given, it is initialized into the
dictionary or intrinsic defaults. The keys must be strings, the values
must be appropriate for %()s string interpolation.
Expand Down Expand Up @@ -68,6 +68,10 @@
converter gets its corresponding get*() method on the parser object and
section proxies.

When `allow_unnamed_section` is True (default: False), options
without section are accepted: the section for these is
``configparser.UNNAMED_SECTION``.

sections()
8000 Return all the configuration section names, sans DEFAULT.

Expand Down Expand Up @@ -156,7 +160,7 @@
"ConfigParser", "RawConfigParser",
"Interpolation", "BasicInterpolation", "ExtendedInterpolation",
"SectionProxy", "ConverterMapping",
"DEFAULTSECT", "MAX_INTERPOLATION_DEPTH")
"DEFAULTSECT", "MAX_INTERPOLATION_DEPTH", "UNNAMED_SECTION")

_default_dict = dict
DEFAULTSECT = "DEFAULT"
Expand Down Expand Up @@ -336,6 +340,15 @@ def __init__(self, filename, lineno, line):
self.line = line
self.args = (filename, lineno, line)

class _UnnamedSection:

def __repr__(self):
return "<UNNAMED_SECTION>"


UNNAMED_SECTION = _UnnamedSection()


# Used in parser getters to indicate the default behaviour when a specific
# option is not found it to raise an exception. Created to enable `None` as
# a valid fallback value.
Expand Down Expand Up @@ -550,7 +563,8 @@ def __init__(self, defaults=None, dict_type=_default_dict,
comment_prefixes=('#', ';'), inline_comment_prefixes=None,
strict=True, empty_lines_in_values=True,
default_section=DEFAULTSECT,
interpolation=_UNSET, converters=_UNSET):
interpolation=_UNSET, converters=_UNSET,
allow_unnamed_section=False,):

self._dict = dict_type
self._sections = self._dict()
Expand Down Expand Up @@ -589,6 +603,7 @@ def __init__(self, defaults=None, dict_type=_default_dict,
self._converters.update(converters)
if defaults:
self._read_defaults(defaults)
self._allow_unnamed_section = allow_unnamed_section

def defaults(self):
return self._defaults
Expand Down Expand Up @@ -862,13 +877,19 @@ def write(self, fp, space_around_delimiters=True):
if self._defaults:
self._write_section(fp, self.default_section,
self._defaults.items(), d)
if UNNAMED_SECTION in self._sections:
self._write_section(fp, UNNAMED_SECTION, self._sections[UNNAMED_SECTION].items(), d, unnamed=True)

for section in self._sections:
if section is UNNAMED_SECTION:
continue
self._write_section(fp, section,
self._sections[section].items(), d)

def _write_section(self, fp, section_name, section_items, delimiter):
"""Write a single section to the specified `fp`."""
fp.write("[{}]\n".format(section_name))
def _write_section(self, fp, section_name, section_items, delimiter, unnamed=False):
"""Write a single section to the specified `fp'."""
if not unnamed:
fp.write("[{}]\n".format(section_name))
for key, value in section_items:
value = self._interpolation.before_write(self, section_name, key,
value)
Expand Down Expand Up @@ -961,6 +982,7 @@ def _read(self, fp, fpname):
lineno = 0
indent_level = 0
e = None # None, or an exception

try:
for lineno, line in enumerate(fp, start=1):
comment_start = sys.ma 6D40 xsize
Expand Down Expand Up @@ -1007,6 +1029,13 @@ def _read(self, fp, fpname):
cursect[optname].append(value)
# a section header or option header?
else:
if self._allow_unnamed_section and cursect is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is already unmanageably complex (and complexity checks disabled in the backport). Adding this single boolean parameter has expanded this function from ~120 lines to almost 160 and increases the mccabe cyclometric complexity of this function from 25 to 31 (where a target complexity is <10).

 cpython main @ pip-run mccabe -- -m mccabe --min 14 Lib/configparser.py
940:4: 'RawConfigParser._read' 25
 cpython main @ gh pr checkout 117273
Switched to branch 'unnamed-section'
 cpython unnamed-section @ pip-run mccabe -- -m mccabe --min 14 Lib/configparser.py
961:4: 'RawConfigParser._read' 31

Moreover, I see lots of logic copied and subtly tweaked.

I'm tempted to say this shouldn't be accepted without first refactoring the implementation to allow for the introduction of this feature to be less invasive.

I can see you've put a lot of work into this, and I don't want to let the perfect be the enemy of the good. Since this change has clean unit tests, I'm okay to say let's accept it but flag an issue to refactor this implementation after.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it has a large complexity to achieve a relatively simple task. A refactor would be wellcome.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started that work in https://github.com/jaraco/cpython/tree/feature/refactor-configparser-read-complexity, based on this PR, assuming that it's going to go in.

sectname = UNNAMED_SECTION
cursect = self._dict()
self._sections[sectname] = cursect
self._proxies[sectname] = SectionProxy(self, sectname)
elements_added.add(sectname)

indent_level = cur_indent_level
# is it a section header?
mo = self.SECTCRE.match(value)
Expand All @@ -1027,36 +1056,61 @@ def _read(self, fp, fpname):
elements_added.add(sectname)
# So sections can't start with a continuation line
optname = None
# no section header in the file?
# no section header?
elif cursect is None:
raise MissingSectionHeaderError(fpname, lineno, line)
# an option line?
# an option line?
else:
mo = self._optcre.match(value)
indent_level = cur_indent_level
# is it a section header?
mo = self.SECTCRE.match(value)
if mo:
optname, vi, optval = mo.group('option', 'vi', 'value')
if not optname:
e = self._handle_error(e, fpname, lineno, line)
optname = self.optionxform(optname.rstrip())
if (self._strict and
(sectname, optname) in elements_added):
raise DuplicateOptionError(sectname, optname,
fpname, lineno)
elements_added.add((sectname, optname))
# This check is fine because the OPTCRE cannot
# match if it would set optval to None
if optval is not None:
optval = optval.strip()
cursect[optname] = [optval]
sectname = mo.group('header')
if sectname in self._sections:
if self._strict and sectname in elements_added:
raise DuplicateSectionError(sectname, fpname,
lineno)
cursect = self._sections[sectname]
elements_added.add(sectname)
elif sectname == self.default_section:
cursect = self._defaults
else:
# valueless option handling
cursect[optname] = None
cursect = self._dict()
self._sections[sectname] = cursect
self._proxies[sectname] = SectionProxy(self, sectname)
elements_added.add(sectname)
# So sections can't start with a continuation line
optname = None
# no section header in the file?
elif cursect is None:
raise MissingSectionHeaderError(fpname, lineno, line)
# an option line?
Comment on lines 1067 to +1087
Copy link
Member
@jaraco jaraco Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is unreachable and duplicates code above. See a2ea34d. This finding reinforces my understanding that the code is too complex to reason about to make changes safely. If we remove this change now, it'll break all of the refactoring work I've done in that branch, so let's leave it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited the link above. I'd previously mentioned the wrong commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplication will be addressed in #117348.

else:
# a non-fatal parsing error occurred. set up the
# exception but keep going. the exception will be
# raised at the end of the file and will contain a
# list of all bogus lines
e = self._handle_error(e, fpname, lineno, line)
mo = self._optcre.match(value)
if mo:
optname, vi, optval = mo.group('option', 'vi', 'value')
if not optname:
e = self._handle_error(e, fpname, lineno, line)
optname = self.optionxform(optname.rstrip())
if (self._strict and
(sectname, optname) in elements_added):
raise DuplicateOptionError(sectname, optname,
fpname, lineno)
elements_added.add((sectname, optname))
# This check is fine because the OPTCRE cannot
# match if it would set optval to None
if optval is not None:
optval = optval.strip()
cursect[optname] = [optval]
else:
# valueless option handling
cursect[optname] = None
else:
# a non-fatal parsing error occurred. set up the
# exception but keep going. the exception will be
# raised at the end of the file and will contain a
# list of all bogus lines
e = self._handle_error(e, fpname, lineno, line)
finally:
self._join_multiline_values()
# if any parsing errors occurred, raise an exception
Expand Down
48 changes: 48 additions & 0 deletions Lib/test/test_configparser.py
Original file line number Diff line number Diff line change
Expand Up @@ -2115,6 +2115,54 @@ def test_instance_assignment(self):
self.assertEqual(cfg['two'].getlen('one'), 5)


class SectionlessTestCase(unittest.TestCase):

def fromstring(self, string):
cfg = configparser.ConfigParser(allow_unnamed_section=True)
cfg.read_string(string)
return cfg

def test_no_first_section(self):
cfg1 = self.fromstring("""
a = 1
b = 2
[sect1]
c = 3
""")

self.assertEqual(set([configparser.UNNAMED_SECTION, 'sect1']), set(cfg1.sections()))
self.assertEqual('1', cfg1[configparser.UNNAMED_SECTION]['a'])
self.assertEqual('2', cfg1[configparser.UNNAMED_SECTION]['b'])
self.assertEqual('3', cfg1['sect1']['c'])

output = io.StringIO()
cfg1.write(output)
cfg2 = self.fromstring(output.getvalue())

#self.assertEqual(set([configparser.UNNAMED_SECTION, 'sect1']), set(cfg2.sections()))
self.assertEqual('1', cfg2[configparser.UNNAMED_SECTION]['a'])
self.assertEqual('2', cfg2[configparser.UNNAMED_SECTION]['b'])
self.assertEqual('3', cfg2['sect1']['c'])

def test_no_section(self):
cfg1 = self.fromstring("""
a = 1
b = 2
""")

self.assertEqual([configparser.UNNAMED_SECTION], cfg1.sections())
self.assertEqual('1', cfg1[configparser.UNNAMED_SECTION]['a'])
self.assertEqual('2', cfg1[configparser.UNNAMED_SECTION]['b'])

output = io.StringIO()
cfg1.write(output)
cfg2 = self.fromstring(output.getvalue())

self.assertEqual([configparser.UNNAMED_SECTION], cfg2.sections())
self.assertEqual('1', cfg2[configparser.UNNAMED_SECTION]['a'])
self.assertEqual('2', cfg2[configparser.UNNAMED_SECTION]['b'])


class MiscTestCase(unittest.TestCase):
def test__all__(self):
support.check__all__(self, configparser, not_exported={"Error"})
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
:class:`configparser.ConfigParser` now accepts unnamed sections before named
ones, if configured to do so.
0