ENH: get_rename_function: add support for __getitem__ override · Issue #39921 · pandas-dev/pandas · GitHub
ENH: get_rename_function: add support for __getitem__ override #39921
@philippefutureboy

Is your feature request related to a problem?

I would like to use a subclass of UserDict with a custom __getitem__ implementation as a mapper for my dataframe columns. My use case is a mapper that applies default remapping rules when the key is not present.
Unfortunately, because the implementation checks for membership of the key before actually calling __getitem__, the only way I can get this behaviour is to wrap the UserDict in a lambda.
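
A minimal sketch of the lambda workaround described above. The class name DefaultingMapper and its lower-casing fallback rule are assumptions made for illustration only; they do not come from pandas or from my actual code.

```python
from collections import UserDict


class DefaultingMapper(UserDict):
    """Hypothetical mapper: explicit renames first, else a default rule."""

    def __getitem__(self, key):
        if key in self.data:
            return self.data[key]
        # Default rule (an assumption for illustration): lower-case the label.
        return key.lower()


mapper = DefaultingMapper({"ID": "identifier"})

# The membership test is False for unlisted keys, so a check-then-index
# implementation never reaches the custom __getitem__ for them.
listed = "ID" in mapper
unlisted = "Name" in mapper

# Wrapping the mapper in a lambda turns it into a plain callable, so
# every label is routed through __getitem__.
rename = lambda label: mapper[label]
renamed = [rename(label) for label in ["ID", "Name"]]
```

With pandas, the same callable could be passed as `df.rename(columns=rename)`, since `rename` accepts a function as well as a mapping.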

Current implementation of get_rename_function in pandas:

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(x):
            if x in mapper:
                return mapper[x]
            else:
                return x

    else:
        f = mapper

    return f

My proposition is the following:

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(k):
            return mapper.get(k, k)
    else:
        f = mapper

    return f

which should be functionally equivalent for existing mappers such as plain dicts and Series.
The advantage for me is that __getitem__ is called and I can process k to return my default mapping.
I don't know, however, whether this would have a significant performance impact. If it would, I'd alternatively propose the following:
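
To illustrate the difference, here is a sketch comparing the two variants outside pandas. It makes several assumptions: the ABCSeries branch is left out, the helper names current_rename and proposed_rename are hypothetical, and the mapper is a collections.abc.Mapping subclass whose inherited get calls self[key]. Whether get reaches __getitem__ depends on the mapper's class, so other base classes may behave differently.

```python
from collections import abc


class DefaultingMapping(abc.Mapping):
    """Hypothetical read-only mapper with a fallback rule in __getitem__."""

    def __init__(self, explicit):
        self._explicit = dict(explicit)

    def __getitem__(self, key):
        if key in self._explicit:
            return self._explicit[key]
        return key.lower()  # fallback rule, assumed for illustration

    def __iter__(self):
        return iter(self._explicit)

    def __len__(self):
        return len(self._explicit)

    def __contains__(self, key):
        # Only explicitly listed keys count as members, as with UserDict.
        return key in self._explicit


def current_rename(mapper):
    # Mirrors the check-then-index logic of the current implementation.
    def f(x):
        if x in mapper:
            return mapper[x]
        return x
    return f


def proposed_rename(mapper):
    # Mirrors the proposed get-based logic.
    def f(k):
        return mapper.get(k, k)
    return f


labels = ["ID", "Name"]
mapper = DefaultingMapping({"ID": "identifier"})
current = [current_rename(mapper)(label) for label in labels]
proposed = [proposed_rename(mapper)(label) for label in labels]

# For a plain dict both variants agree.
plain = {"ID": "identifier"}
plain_current = [current_rename(plain)(label) for label in labels]
plain_proposed = [proposed_rename(plain)(label) for label in labels]
```

Here the current logic leaves "Name" untouched, while the get-based logic applies the fallback. Note that a subclass of the built-in dict would not benefit, because dict.get does not go through an overridden __getitem__.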

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if not callable(mapper) and isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(x):
            if x in mapper:
                return mapper[x]
            else:
                return x
    else:
        f = mapper

    return f

Then I could implement the __call__ method on my mapper. The downside, however, is that this could be a breaking change for anyone who already passes a Mapping subclass that also happens to be callable.
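
A sketch of how the callable variant would behave, again outside pandas. The names rename_function and CallableMapping are hypothetical, the ABCSeries check is omitted, and the lower-casing fallback is only an illustrative rule.

```python
from collections import abc


class CallableMapping(abc.Mapping):
    """Hypothetical mapper that is both a Mapping and a callable."""

    def __init__(self, explicit):
        self._explicit = dict(explicit)

    def __getitem__(self, key):
        return self._explicit[key]  # raises KeyError for unlisted keys

    def __iter__(self):
        return iter(self._explicit)

    def __len__(self):
        return len(self._explicit)

    def __call__(self, key):
        # Fallback rule lives in __call__ (assumed for illustration).
        return self._explicit.get(key, key.lower())


def rename_function(mapper):
    # Same branching as the alternative proposal, without ABCSeries.
    if not callable(mapper) and isinstance(mapper, abc.Mapping):
        def f(x):
            if x in mapper:
                return mapper[x]
            return x
    else:
        f = mapper
    return f


labels = ["ID", "Name"]

# A callable mapping is now used as a function, not as a mapping.
mapper = CallableMapping({"ID": "identifier"})
f = rename_function(mapper)
uses_call = f is mapper
renamed = [f(label) for label in labels]

# A plain dict is not callable, so it keeps the mapping behaviour.
plain_f = rename_function({"ID": "identifier"})
plain_renamed = [plain_f(label) for label in labels]
```

The breaking-change risk is visible in the first case: any existing Mapping subclass that defines __call__ for an unrelated purpose would suddenly have that method used for renaming.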

What do you think?
