10000 [WIP] Newspec stage 2. by methane · Pull Request #79 · msgpack/msgpack-python · GitHub
[go: up one dir, main page]

Skip to content

[WIP] Newspec stage 2. #79

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Oct 20, 2013
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
d610975
add support for extended types: you can now pack/unpack custom python…
antocuni Oct 15, 2013
5529dfe
kill some duplicate code from unpack/unpackb and move the logic to Un…
antocuni Oct 18, 2013
522c4bf
slightly change to API
antocuni Oct 18, 2013
c727440
automatically find the best format to encode extended types
antocuni Oct 18, 2013
afa28fb
add support to unpack all ext formats
antocuni Oct 18, 2013
5467515
implement Packer.pack_extended_type also in the cython version of the…
antocuni Oct 18, 2013
a7485ec
add the hook for unknown types also to the cython Packer
antocuni Oct 18, 2013
ff85838
implement unpack_one also for the cython version, and add a test for it
antocuni Oct 18, 2013
985d4c1
add a test for unpacking extended types
antocuni Oct 19, 2013
56dd165
implement unpacking for all the fixtext formats
antocuni Oct 19, 2013
c9b97f0
implement unpacking of ext 8,16,32
antocuni Oct 19, 2013
6386481
add a note in the README
antocuni Oct 19, 2013
27f0cba
Merge branch 'master' of https://github.com/antocuni/msgpack-python i…
methane Oct 20, 2013
aa68c9b
fallback: Support pack_ext_type.
methane Oct 20, 2013
96bcd76
Packing ExtType and some cleanup
methane Oct 20, 2013
822cce8
Support unpacking new types.
methane Oct 20, 2013
0d5c58b
cleanup
methane Oct 20, 2013
84dc99c
Add ext_type example to README.
methane Oct 20, 2013
cb78959
Update README.
methane Oct 20, 2013
37c2ad6
Add tests and bugfix.
methane Oct 20, 2013
e3fee4d
fallback: support packing ExtType
methane Oct 20, 2013
d84a403
fix bugs.
methane Oct 20, 2013
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 60 additions & 3 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@ MessagePack for Python
=======================

:author: INADA Naoki
:version: 0.3.0
:date: 2012-12-07
:version: 0.4.0
:date: 2013-10-21

.. image:: https://secure.travis-ci.org/msgpack/msgpack-python.png
:target: https://travis-ci.org/#!/msgpack/msgpack-python
Expand Down Expand Up @@ -39,8 +39,40 @@ amd64. Windows SDK is recommanded way to build amd64 msgpack without any fee.)

Without extension, using pure python implementation on CPython runs slowly.

Notes
-----

Note for msgpack 2.0 support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

msgpack 2.0 adds two types: *bin* and *ext*.

*raw* was bytes or string type like Python 2's ``str``.
To distinguish string and bytes, msgpack 2.0 adds *bin*.
It is non-string binary like Python 3's ``bytes``.

To use *bin* type for packing ``bytes``, pass ``use_bin_type=True`` to
packer argument.

>>> import msgpack
>>> packed = msgpack.packb([b'spam', u'egg'], use_bin_type=True)
>>> msgpack.unpackb(packed, encoding='utf-8')
['spam', u'egg']

You shoud use it carefully. When you use ``use_bin_type=True``, packed
binary can be unpacked by unpackers supporting msgpack-2.0.

To use *ext* type, pass ``msgpack.ExtType`` object to packer.

>>> import msgpack
>>> packed = msgpack.packb(msgpack.ExtType(42, b'xyzzy'))
>>> msgpack.unpackb(packed)
ExtType(code=42, data='xyzzy')

You can use it with ``default`` and ``ext_hook``. See below.

Note for msgpack 0.2.x users
----------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The msgpack 0.3 have some incompatible changes.

Expand Down Expand Up @@ -140,6 +172,31 @@ It is also possible to pack/unpack custom data types. Here is an example for
``object_pairs_hook`` callback may instead be used to receive a list of
key-value pairs.

Extended types
^^^^^^^^^^^^^^^

It is also possible to pack/unpack custom data types using the msgpack 2.0 feature.

>>> import msgpack
>>> import array
>>> def default(obj):
... if isinstance(obj, array.array) and obj.typecode == 'd':
... return msgpack.ExtType(42, obj.tostring())
... raise TypeError("Unknown type: %r" % (obj,))
...
>>> def ext_hook(code, data):
... if code == 42:
... a = array.array('d')
... a.fromstring(data)
... return a
... return ExtType(code, data)
...
>>> data = array.array('d', [1.2, 3.4])
>>> packed = msgpack.packb(data, default=default)
>>> unpacked = msgpack.unpackb(packed, ext_hook=ext_hook)
>>> data == unpacked
True


Advanced unpacking control
^^^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down
14 changes: 12 additions & 2 deletions msgpack/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,17 @@

from collections import namedtuple

ExtType = namedtuple('ExtType', 'code data')

class ExtType(namedtuple('ExtType', 'code data')):
def __new__(cls, code, data):
if not isinstance(code, int):
raise TypeError("code must be int")
if not isinstance(data, bytes):
raise TypeError("data must be bytes")
if not 0 <= code <= 127:
raise ValueError("code must be 0~127")
return super(ExtType, cls).__new__(cls, code, data)


import os
if os.environ.get('MSGPACK_PUREPYTHON'):
Expand All @@ -26,6 +36,7 @@ def pack(o, stream, **kwargs):
packer = Packer(**kwargs)
stream.write(packer.pack(o))


def packb(o, **kwargs):
"""
Pack object `o` and return packed bytes
Expand All @@ -40,4 +51,3 @@ def packb(o, **kwargs):

dump = pack
dumps = packb

151 changes: 84 additions & 67 deletions msgpack/_packer.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,11 @@ from cpython cimport *
from libc.stdlib cimport *
from libc.string cimport *
from libc.limits cimport *
from libc.stdint cimport int8_t

from msgpack.exceptions import PackValueError
from msgpack import ExtType


cdef extern from "pack.h":
struct msgpack_packer:
Expand All @@ -29,11 +32,11 @@ cdef extern from "pack.h":
int msgpack_pack_raw(msgpack_packer* pk, size_t l)
int msgpack_pack_bin(msgpack_packer* pk, size_t l)
int msgpack_pack_raw_body(msgpack_packer* pk, char* body, size_t l)
int msgpack_pack_ext(msgpack_packer* pk, int8_t typecode, size_t l)

cdef int DEFAULT_RECURSE_LIMIT=511



cdef class Packer(object):
"""
MessagePack Packer
Expand Down Expand Up @@ -118,77 +121,87 @@ cdef class Packer(object):
cdef int ret
cdef dict d
cdef size_t L
cdef int default_used = 0

if nest_limit < 0:
raise PackValueError("recursion limit exceeded.")

if o is None:
ret = msgpack_pack_nil(&self.pk)
elif isinstance(o, bool):
if o:
ret = msgpack_pack_true(&self.pk)
else:
ret = msgpack_pack_false(&self.pk)
elif PyLong_Check(o):
if o > 0:
ullval = o
ret = msgpack_pack_unsigned_long_long(&self.pk, ullval)
else:
llval = o
ret = msgpack_pack_long_long(&self.pk, llval)
elif PyInt_Check(o):
longval = o
ret = msgpack_pack_long(&self.pk, longval)
elif PyFloat_Check(o):
if self.use_float:
fval = o
ret = msgpack_pack_float(&self.pk, fval)
else:
dval = o
ret = msgpack_pack_double(&self.pk, dval)
elif PyBytes_Check(o):
rawval = o
L = len(o)
ret = msgpack_pack_bin(&self.pk, L)
if ret == 0:
while True:
if o is None:
ret = msgpack_pack_nil(&self.pk)
elif isinstance(o, bool):
if o:
ret = msgpack_pack_true(&self.pk)
else:
ret = msgpack_pack_false(&self.pk)
elif PyLong_Check(o):
if o > 0:
ullval = o
ret = msgpack_pack_unsigned_long_long(&self.pk, ullval)
else:
llval = o
ret = msgpack_pack_long_long(&self.pk, llval)
elif PyInt_Check(o):
longval = o
ret = msgpack_pack_long(&self.pk, longval)
elif PyFloat_Check(o):
if self.use_float:
fval = o
ret = msgpack_pack_float(&self.pk, fval)
else:
dval = o
ret = msgpack_pack_double(&self.pk, dval)
elif PyBytes_Check(o):
rawval = o
L = len(o)
ret = msgpack_pack_bin(&self.pk, L)
if ret == 0:
ret = msgpack_pack_raw_body(&self.pk, rawval, L)
elif PyUnicode_Check(o):
if not self.encoding:
raise TypeError("Can't encode unicode string: no encoding is specified")
o = PyUnicode_AsEncodedString(o, self.encoding, self.unicode_errors)
rawval = o
ret = msgpack_pack_raw(&self.pk, len(o))
if ret == 0:
ret = msgpack_pack_raw_body(&self.pk, rawval, len(o))
elif PyDict_CheckExact(o):
d = <dict>o
ret = msgpack_pack_map(&self.pk, len(d))
if ret == 0:
for k, v in d.iteritems():
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif PyDict_Check(o):
ret = msgpack_pack_map(&self.pk, len(o))
if ret == 0:
for k, v in o.items():
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif isinstance(o, ExtType):
# This should be before Tuple because ExtType is namedtuple.
longval = o.code
rawval = o.data
L = len(o.data)
ret = msgpack_pack_ext(&self.pk, longval, L)
ret = msgpack_pack_raw_body(&self.pk, rawval, L)
elif PyUnicode_Check(o):
if not self.encoding:
raise TypeError("Can't encode unicode string: no encoding is specified")
o = PyUnicode_AsEncodedString(o, self.encoding, self.unicode_errors)
rawval = o
ret = msgpack_pack_raw(&self.pk, len(o))
if ret == 0:
ret = msgpack_pack_raw_body(&self.pk, rawval, len(o))
elif PyDict_CheckExact(o):
d = <dict>o
ret = msgpack_pack_map(&self.pk, len(d))
if ret == 0:
for k, v in d.iteritems():
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif PyDict_Check(o):
ret = msgpack_pack_map(&self.pk, len(o))
if ret == 0:
for k, v in o.items():
ret = self._pack(k, nest_limit-1)
if ret != 0: break
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif PyTuple_Check(o) or PyList_Check(o):
ret = msgpack_pack_array(&self.pk, len(o))
if ret == 0:
for v in o:
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif self._default:
o = self._default(o)
ret = self._pack(o, nest_limit-1)
else:
raise TypeError("can't serialize %r" % (o,))
return ret
elif PyTuple_Check(o) or PyList_Check(o):
ret = msgpack_pack_array(&self.pk, len(o))
if ret == 0:
for v in o:
ret = self._pack(v, nest_limit-1)
if ret != 0: break
elif not default_used and self._default:
o = self._default(o)
default_used = 1
continue
else:
raise TypeError("can't serialize %r" % (o,))
return ret

cpdef pack(self, object obj):
cdef int ret
Expand All @@ -202,6 +215,10 @@ cdef class Packer(object):
self.pk.length = 0
return buf

def pack_ext_type(self, typecode, data):
msgpack_pack_ext(&self.pk, typecode, len(data))
msgpack_pack_raw_body(&self.pk, data, len(data))

def pack_array_header(self, size_t size):
cdef int ret = msgpack_pack_array(&self.pk, size)
if ret == -1:
Expand Down
26 changes: 18 additions & 8 deletions msgpack/_unpacker.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ from msgpack.exceptions import (
UnpackValueError,
ExtraData,
)
from msgpack import ExtType


cdef extern from "unpack.h":
Expand All @@ -24,15 +25,14 @@ cdef extern from "unpack.h":
PyObject* object_hook
bint has_pairs_hook # call object_hook with k-v pairs
PyObject* list_hook
PyObject* ext_hook
char *encoding
char *unicode_errors

ctypedef struct unpack_context:
msgpack_user user
PyObject* obj
size_t count
unsigned int ct
PyObject* key

ctypedef int (*execute_fn)(unpack_context* ctx, const char* data,
size_t len, size_t* off) except? -1
Expand All @@ -44,7 +44,8 @@ cdef extern from "unpack.h":
object unpack_data(unpack_context* ctx)

cdef inline init_ctx(unpack_context *ctx,
object object_hook, object object_pairs_hook, object list_hook,
object object_hook, object object_pairs_hook,
object list_hook, object ext_hook,
bint use_list, char* encoding, char* unicode_errors):
unpack_init(ctx)
ctx.user.use_list = use_list
Expand All @@ -71,13 +72,20 @@ cdef inline init_ctx(unpack_context *ctx,
raise TypeError("list_hook must be a callable.")
ctx.user.list_hook = <PyObject*>list_hook

if ext_hook is not None:
if not PyCallable_Check(ext_hook):
raise TypeError("ext_hook must be a callable.")
ctx.user.ext_hook = <PyObject*>ext_hook

ctx.user.encoding = encoding
ctx.user.unicode_errors = unicode_errors

def default_read_extended_type(typecode, data):
raise NotImplementedError("Cannot decode extended type with typecode=%d" % typecode)

def unpackb(object packed, object object_hook=None, object list_hook=None,
bint use_list=1, encoding=None, unicode_errors="strict",
object_pairs_hook=None,
):
object_pairs_hook=None, ext_hook=ExtType):
"""
Unpack packed_bytes to object. Returns an unpacked object.

Expand Down Expand Up @@ -106,7 +114,8 @@ def unpackb(object packed, object object_hook=None, object list_hook=None,
unicode_errors = unicode_errors.encode('ascii')
cerr = PyBytes_AsString(unicode_errors)

init_ctx(&ctx, object_hook, object_pairs_hook, list_hook, use_list, cenc, cerr)
init_ctx(&ctx, object_hook, object_pairs_hook, list_hook, ext_hook,
use_list, cenc, cerr)
ret = unpack_construct(&ctx, buf, buf_len, &off)
if ret == 1:
obj = unpack_data(&ctx)
Expand Down Expand Up @@ -211,7 +220,7 @@ cdef class Unpacker(object):
def __init__(self, file_like=None, Py_ssize_t read_size=0, bint use_list=1,
object object_hook=None, object object_pairs_hook=None, object list_hook=None,
str encoding=None, str unicode_errors='strict', int max_buffer_size=0,
):
object ext_hook=ExtType):
cdef char *cenc=NULL, *cerr=NULL

self.file_like = file_like
Expand Down Expand Up @@ -248,7 +257,8 @@ cdef class Unpacker(object):
self.unicode_errors = unicode_errors
cerr = PyBytes_AsString(self.unicode_errors)

init_ctx(&self.ctx, object_hook, object_pairs_hook, list_hook, use_list, cenc, cerr)
init_ctx(&self.ctx, object_hook, object_pairs_hook, list_hook,
ext_hook, use_list, cenc, cerr)

def feed(self, object next_bytes):
"""Append `next_bytes` to internal buffer."""
Expand Down
Loading
0