8000 GH-91415: Mention alphabetical sort ordering in the Sorting HOWTO by rhettinger · Pull Request #98336 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

GH-91415: Mention alphabetical sort ordering in the Sorting HOWTO 8000 #98336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 16, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
GH-91415: Mention alphabetical sort ordering in the Sorting HOWTO
  • Loading branch information
rhettinger committed Oct 16, 2022
commit 7830a517fb0ceb01d90d18760bca10bfcb6e0471
98 changes: 22 additions & 76 deletions Doc/howto/sorting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,8 @@ The `Timsort <https://en.wikipedia.org/wiki/Timsort>`_ algorithm used in Python
does multiple sorts efficiently because it can take advantage of any ordering
already present in a dataset.

The Old Way Using Decorate-Sort-Undecorate
==========================================
Decorate-Sort-Undecorate
========================

This idiom is called Decorate-Sort-Undecorate after its three steps:

Expand Down Expand Up @@ -226,90 +226,36 @@ after Randal L. Schwartz, who popularized it among Perl programmers.

Now that Python sorting provides key-functions, this technique is not often needed.

Comparison Functions
====================

The Old Way Using the *cmp* Parameter
=====================================

Many constructs given in this HOWTO assume Python 2.4 or later. Before that,
there was no :func:`sorted` builtin and :meth:`list.sort` took no keyword
arguments. Instead, all of the Py2.x versions supported a *cmp* parameter to
handle user specified comparison functions.

In Py3.0, the *cmp* parameter was removed entirely (as part of a larger effort to
simplify and unify the language, eliminating the conflict between rich
comparisons and the :meth:`__cmp__` magic method).

In Py2.x, sort allowed an optional function which can be called for doing the
comparisons. That function should take two arguments to be compared and then
return a negative value for less-than, return zero if they are equal, or return
a positive value for greater-than. For example, we can do:

.. doctest::

>>> def numeric_compare(x, y):
... return x - y
>>> sorted([5, 2, 4, 1, 3], cmp=numeric_compare) # doctest: +SKIP
[1, 2, 3, 4, 5]

Or you can reverse the order of comparison with:

.. doctest::

>>> def reverse_numeric(x, y):
... return y - x
>>> sorted([5, 2, 4, 1, 3], cmp=reverse_numeric) # doctest: +SKIP
[5, 4, 3, 2, 1]

When porting code from Python 2.x to 3.x, the situation can arise when you have
the user supplying a comparison function and you need to convert that to a key
function. The following wrapper makes that easy to do:

.. testcode::

def cmp_to_key(mycmp):
'Convert a cmp= function into a key= function'
class K:
def __init__(self, obj, *args):
self.obj = obj
def __lt__(self, other):
return mycmp(self.obj, other.obj) < 0
def __gt__(self, other):
return mycmp(self.obj, other.obj) > 0
def __eq__(self, other):
return mycmp(self.obj, other.obj) == 0
def __le__(self, other):
return mycmp(self.obj, other.obj) <= 0
def __ge__(self, other):
return mycmp(self.obj, other.obj) >= 0
def __ne__(self, other):
return mycmp(self.obj, other.obj) != 0
return K

.. doctest::
:hide:
Unlike key functions that return an absolute values for sorting,
a comparison function computes the relative ordering for two inputs.

>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]
For example, a `balance scale
<https://upload.wikimedia.org/wikipedia/commons/1/17/Balance_à_tabac_1850.JPG>`_
compares two samples giving a relative ordering of lighter, heavier, or
equal. Likewise, comparison function `cmp(a, b)` returns a negative
value for less-than, a positive value for greater-than, or zero if the
inputs are equal.

To convert to a key function, just wrap the old comparison function:

.. testsetup::

from functools import cmp_to_key

.. doctest::
It is common to find comparison functions when translating algorithms
from other languages. And sometimes, libraries provide comparison
functions. For example, :func:`locale.strcoll` is a comparison function.

>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]
To accommodate those situations, Python provides
:class:`functools.cmp_to_key` to wrap the comparison function
making it usable as a key function::

In Python 3.2, the :func:`functools.cmp_to_key` function was added to the
:mod:`functools` module in the standard library.
sorted(words, key=cmp_to_key(strcoll)

Odds and Ends
=============

* For locale aware sorting, use :func:`locale.strxfrm` for a key function or
:func:`locale.strcoll` for a comparison function.
:func:`locale.strcoll` for a comparison function. This is necessary
because the "alphabetical" sort ordering can vary across cultures even
if the underlying alphabet is the same.

* The *reverse* parameter still maintains sort stability (so that records with
equal keys retain the original order). Interestingly, that effect can be
Expand Down
0