8000 Support CPython standard library by notro · Pull Request #1167 · adafruit/circuitpython · GitHub
[go: up one dir, main page]

Skip to content

Support CPython standard library #1167

New issue

Have a question about this pr 8000 oject? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 24, 2018
Merged

Support CPython standard library #1167

merged 6 commits into from
Oct 24, 2018

Conversation

notro
Copy link
@notro notro commented Sep 7, 2018

This is a follow up on #1155 with the CircuitPython changes.

This shows the flash use in bytes on a Metro M4 build:

Patch Flash
samd: Make os and time weak modules 144
samd: Enable collections.OrderedDict 444
samd: Enable MICROPY_CPYTHON_COMPAT 752
Make closure and bound method hashable 0
modsys: exc_info: Add traceback 0
samd51: Enable sys.exc_info() 1104

Here's a grep of the stdlib use of what's provided by MICROPY_CPYTHON_COMPAT: https://gist.github.com/notro/63d716efef0fb12fcc9edcd151c684b1

See commit messages for more info.

I suppose this will overflow on some boards.

Copy link
Member
@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets only do this for the M4 since the stdlib won't fit otherwise.


#define MICROPY_PORT_BUILTIN_MODULE_WEAK_LINKS \
{ MP_ROM_QSTR(MP_QSTR_os), MP_ROM_PTR(&os_module) }, \
{ MP_OBJ_NEW_QSTR(MP_QSTR_time), (mp_obj_t)&time_module }, \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you adding this back?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that import os and import time continues to work if there's no python module with that name. It's a weak alias.

@notro
Copy link
Author
8000 notro commented Sep 8, 2018

Version 2

  • Only enable on M4
  • Increase stack size to 8k on M4. If forgot about this one, I was setting it with set_next_stack_limit in my script.

@notro
Copy link
Author
notro commented Sep 8, 2018

Please wait with this, I'm having problems with the workaround for setting properties on functions.
Making bound method hashable isn't working like I thought it would.

Ref: http://docs.micropython.org/en/latest/unix/genrst/core_language.html#user-defined-attributes-for-functions-are-not-supported

notro added 3 commits October 7, 2018 19:36
This sets the __file__ property on MPY modules like how it's done on pure python modules.
This gives access to the function underlying the bound method.

Used in the converted CPython stdlib logging.Formatter class to handle
overrriding a default converter method bound to a class variable.
The method becomes bound when accessed from an instance of that class.

I didn't investigate why CircuitPython turns it into a bound method.
Add traceback chain to sys.exec_info()[2].
No actual frame info is added, but just enough to recreate the printed
exception traceback.

Used by the unittest module which collects errors and failures and prints
them at the end.
@notro
Copy link
Author
notro commented Oct 7, 2018

Version 3

The problem with the function attribute workaround has been solve 8000 d using the name as key instead of the hash (#1179).

Note: The first patch Set __file__ on MPY modules applies to all boards not just samd51.

The reason it has taken so long to spin a new version, is that I've spent time converting some more of the CPython stdlib modules. There are subtle differences between MP and CPython so it has been slow work, and memory is a challenge squeezing the tests in. Hopefully the conversion speed will go up as I get to know the differences.

@tannewt
Copy link
Member
tannewt commented Oct 8, 2018

What differences are you running into? I'd like to be as similar as we can so I wonder if its worth changing CircuitPython.

@notro
Copy link
Author
notro commented Oct 9, 2018

I haven't written them down since I've been focused on seeing if it was at all possible to convert stdlib.
But here's the last two that I remember:

From lib/logging/init.py:

import time

class LogRecord(object):
    def __init__(self, name, level, pathname, lineno,
                 msg, args, exc_info, func=None, sinfo=None, **kwargs):
        ct = time.time()
        self.created = ct

class Formatter(object):
    converter = time.localtime

    def __init__(self, fmt=None, datefmt=None, style='%'):
        pass

    def formatTime(self, record, datefmt=None):
        print('converter', self.converter)
        ct = self.converter(record.created)

record = LogRecord('', 0, '', 0, '', None, None)
fmt = Formatter()
fmt.formatTime(record)

CircuitPython turns Formatter.converter into a bound method:

converter <bound_method>
Traceback (most recent call last):
  File "<stdin>", line 21, in <module>
  File "<stdin>", line 17, in formatTime
TypeError: function expected at most 1 arguments, got 2

>>> Formatter.converter
<function>

CPython:

converter <built-in function localtime>

CircuitPython workaround (relies on the py/objboundmeth patch):

    def formatTime(self, record, datefmt=None):
#        ct = self.converter(record.created)
        try:
            ct = self.converter.__func__(record.created)
        except AttributeError:
            ct = self.converter(record.created)
        # <snip>

From lib/datetime.py:

def _cmp(x, y):
    return 0 if x == y else 1 if x > y else -1

class date:
    def __new__(cls, year, month, day):                                         ###
        self = object.__new__(cls)
        self._year = year
        self._month = month
        self._day = day
        return self

    def __eq__(self, other):
        print('date.__eq__({!r})'.format(other))             ###
        if isinstance(other, date):
            return self._cmp(other) == 0
        return NotImplemented

    def __ne__(self, other):
        print('date.__ne__({!r})'.format(other))             ###
        if isinstance(other, date):
            return self._cmp(other) != 0
        return NotImplemented

    def _cmp(self, other):
        assert isinstance(other, date)
        y, m, d = self._year, self._month, self._day
        y2, m2, d2 = other._year, other._month, other._day
        return _cmp((y, m, d), (y2, m2, d2))


class datetime(date):
    def __new__(cls, year, month, day, hour=0, minute=0, second=0,              ###
                microsecond=0, tzinfo=None):
        self = date.__new__(cls, year, month, day)
        self._hour = hour
        self._minute = minute
        self._second = second
        self._microsecond = microsecond
        self._tzinfo = tzinfo
        return self

    def __eq__(self, other):
        print('datetime.__eq__({!r})'.format(other))             ###
        if isinstance(other, datetime):
            return self._cmp(other, allow_mixed=True) == 0<
8000
/span>
        elif not isinstance(other, date):
            return NotImplemented
        else:
            return False

    def __ne__(self, other):
        print('datetime.__ne__({!r})'.format(other))             ###
        if isinstance(other, datetime):
            return self._cmp(other, allow_mixed=True) != 0
        elif not isinstance(other, date):
            return NotImplemented
        else:
            return True

as_date = date(2018, 10, 6)
as_datetime = datetime(2018, 10, 6)
as_date == as_datetime

CircuitPython calls __eq__ on the date class:

date.__eq__(<datetime object at 20002b90>)

CPython calls __eq__ on the datetime class because it's a subclass of date per docs:

datetime.__eq__(<__main__.date object at 0xb6a3f7d0>)

I haven't got a workaround for this one.

@tannewt
Copy link
Member
tannewt commented Oct 9, 2018

Ah interesting! I'd happily merge PRs that align our behavior with C python. Its not on the top of my list of work to do though.

@notro
Copy link
Author
notro commented Oct 24, 2018

ping

@tannewt
Copy link
Member
tannewt commented Oct 24, 2018

Thanks for the ping! Sorry I didn't do any additional review. I didn't know the state of this PR.

I don't want to re-use the u* versions of modules because MicroPython uses them. How about weak linking to the top level and adding a circuitpython module to wrap them? That way to access CircuitPython's time would be from circuitpython import time.

@notro
Copy link
Author
notro commented Oct 24, 2018

I think that would be strange.
The reason I used u is because it's the MicroPython way of naming C modules.
Here's how the MP stdlib extends utime for instance: time.py

If you don't like it, can we do what CPython does with its C modules and prepend it with an underscore _time?

@tannewt
Copy link
Member
tannewt commented Oct 24, 2018

_time works for me. 👍

notro added 3 commits October 24, 2018 19:31
Add alias for uerrno so the user doesn't have to know about the
CircuitPython special names for the module.

Make os and time weak modules (aliases) making it possible to add
functionality to those modules written in python.

Example:
'import os' will now look in the path for an os module and if not found
it will import the builtin module. An os module written in python will
import the builtin module through its name prefixed with an underscore
(_os) following the C module naming practice in CPython.

Also right align the macro values to increase readability making it
easier to compare the values for samd21 and samd51. Even the longest
macro from py/mpconfig.h will fit with this alignment.
This enables various things in order to support the CPython standard library.

MICROPY_PY_BUILTINS_NOTIMPLEMENTED:
Support NotImplemented for easy conversion of stdlib.
It doesn't do fallbacks though, only raises TypeError.

MICROPY_PY_COLLECTIONS_ORDEREDDICT:
collections.OrderedDict

MICROPY_PY_FUNCTION_ATTRS:
Support function.__name__ for use as key in the function attribute workaround.

MICROPY_PY_IO:
uio module: BytesIO, FileIO, StringIO, TextIOWrapper
Also add 'io' alias.

MICROPY_PY_REVERSE_SPECIAL_METHODS:
Support the __r*__ special methods.

MICROPY_PY_SYS_EXC_INFO:
sys.exc_info() used by unittest when collecting exceptions.

MICROPY_CPYTHON_COMPAT:
Some of the things it adds:

>>> object.__init__
<function>
>>> object.__new__
<function>
>>> object.__class__
<class 'type'>
>>> object().__class__
<class 'object'>
>>> object.__name__
'object'

>>> 'Hello'.encode()
b'Hello'
>>> b'Hello'.decode()
'Hello'

Named tuple field names from string:
namedtuple('Point', 'x y')
This is necessary in order to run unittest.
Heavy tests like those in the stdlib need 12-14k.
@notro
Copy link
Author
notro commented Oct 24, 2018

Version 4

  • Prepend C modules with an underscore _ as CPython instead of u as MicroPython. Thus the change is: utime -> _time and uos -> _os

Copy link
Member
@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@tannewt tannewt merged commit dc9d338 into adafruit:master Oct 24, 2018
@notro
Copy link
Author
notro commented Oct 24, 2018

Thanks @tannewt.
I have ported some more of the stdlib tests and I've discovered that several are dedicated to the builtin types and functions. They serve as a nice window into the differences between CircuitPython and CPython: https://github.com/notro/tmp_CircuitPython_stdlib/tree/master/lib/test

@notro notro deleted the cpython_stdlib branch February 14, 2019 21:22
tannewt added a commit that referenced this pull request May 12, 2025
Both #1167 and #5072 add traceback types. #1167 added a bunch of
named tuples in order to reproduce the traceback string. Since #5072
added traceback printing, we don't need the old way. So, rollback
PR #1167 in favor of the newer traceback type.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0