Use new msgpack spec by default. (#386) · ossdev07/msgpack-python@7e9905b

Commit 7e9905b

Use new msgpack spec by default. (msgpack#386)
1 parent de32048 commit 7e9905b

11 files changed

+75
-126
lines changed

README.rst

Lines changed: 20 additions & 50 deletions
Original file line number | Diff line number | Diff line change
@@ -37,36 +37,16 @@ Sadly, this doesn't work for upgrade install. After `pip install -U msgpack-pyt
3737
msgpack is removed and `import msgpack` fail.
3838

3939

40-
Deprecating encoding option
41-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
40+
Compatibility with old format
41+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4242

43-
encoding and unicode_errors options are deprecated.
43+
You can use ``use_bin_type=False`` option to pack ``bytes``
44+
object into raw type in old msgpack spec, instead of bin type in new msgpack spec.
4445

45-
In case of packer, use UTF-8 always. Storing other than UTF-8 is not recommended.
46+
You can unpack old msgpack format using ``raw=True`` option.
47+
It unpacks str (raw) type in msgpack into Python bytes.
4648

47-
For backward compatibility, you can use ``use_bin_type=False`` and pack ``bytes``
48-
object into msgpack raw type.
49-
50-
In case of unpacker, there is new ``raw`` option. It is ``True`` by default
51-
for backward compatibility, but it is changed to ``False`` in near future.
52-
You can use ``raw=False`` instead of ``encoding='utf-8'``.
53-
54-
Planned backward incompatible changes
55-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
56-
57-
When msgpack 1.0, I planning these breaking changes:
58-
59-
* packer and unpacker: Remove ``encoding`` and ``unicode_errors`` option.
60-
* packer: Change default of ``use_bin_type`` option from False to True.
61-
* unpacker: Change default of ``raw`` option from True to False.
62-
* unpacker: Reduce all ``max_xxx_len`` options for typical usage.
63-
* unpacker: Remove ``write_bytes`` option from all methods.
64-
65-
To avoid these breaking changes breaks your application, please:
66-
67-
* Don't use deprecated options.
68-
* Pass ``use_bin_type`` and ``raw`` options explicitly.
69-
* If your application handle large (>1MB) data, specify ``max_xxx_len`` options too.
49+
See note below for detail.
7050

7151

7252
Install
@@ -76,6 +56,7 @@ Install
7656

7757
$ pip install msgpack
7858

59+
7960
Pure Python implementation
8061
^^^^^^^^^^^^^^^^^^^^^^^^^^
8162

@@ -100,6 +81,13 @@ Without extension, using pure Python implementation on CPython runs slowly.
10081
How to use
10182
----------
10283

84+
.. note::
85+
86+
In examples below, I use ``raw=False`` and ``use_bin_type=True`` for users
87+
using msgpack < 1.0.
88+
These options are default from msgpack 1.0 so you can omit them.
89+
90+
10391
One-shot pack & unpack
10492
^^^^^^^^^^^^^^^^^^^^^^
10593

@@ -252,36 +240,18 @@ Notes
252240
string and binary type
253241
^^^^^^^^^^^^^^^^^^^^^^
254242

255-
Early versions of msgpack didn't distinguish string and binary types (like Python 1).
243+
Early versions of msgpack didn't distinguish string and binary types.
256244
The type for representing both string and binary types was named **raw**.
257245

258-
For backward compatibility reasons, msgpack-python will still default all
259-
strings to byte strings, unless you specify the ``use_bin_type=True`` option in
260-
the packer. If you do so, it will use a non-standard type called **bin** to
261-
serialize byte arrays, and **raw** becomes to mean **str**. If you want to
262-
distinguish **bin** and **raw** in the unpacker, specify ``raw=False``.
263-
264-
Note that Python 2 defaults to byte-arrays over Unicode strings:
265-
266-
.. code-block:: pycon
267-
268-
>>> import msgpack
269-
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs']))
270-
['spam', 'eggs']
271-
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True),
272-
raw=False)
273-
['spam', u'eggs']
274-
275-
This is the same code in Python 3 (same behaviour, but Python 3 has a
276-
different default):
246+
You can pack into and unpack from this old spec using ``use_bin_type=False``
247+
and ``raw=True`` options.
277248

278249
.. code-block:: pycon
279250
280251
>>> import msgpack
281-
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs']))
252+
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=False), raw=True)
282253
[b'spam', b'eggs']
283-
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True),
284-
raw=False)
254+
>>> msgpack.unpackb(msgpack.packb([b'spam', u'eggs'], use_bin_type=True), raw=False)
285255
[b'spam', 'eggs']
286256
287257
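
Reviewer note (illustrative, not part of the commit): the README change above is easier to follow with the wire format in view. The sketch below hand-encodes a single ``bytes`` or ``str`` value using the header bytes defined by the msgpack spec. ``pack_one_sketch`` is a hypothetical helper written for this note, not msgpack-python's API, and it assumes payloads shorter than 2**32 bytes.

```python
import struct


def pack_one_sketch(obj, use_bin_type=True):
    """Hypothetical helper: encode one str or bytes value as msgpack bytes."""
    if isinstance(obj, str):
        payload, as_bin = obj.encode("utf-8"), False
    elif isinstance(obj, bytes):
        # Old spec (use_bin_type=False): bytes share the raw/str family.
        payload, as_bin = obj, use_bin_type
    else:
        raise TypeError("sketch handles only str and bytes")

    n = len(payload)
    if as_bin:
        # bin 8 / bin 16 / bin 32, introduced in the new spec.
        if n < 2**8:
            header = b"\xc4" + struct.pack(">B", n)
        elif n < 2**16:
            header = b"\xc5" + struct.pack(">H", n)
        else:
            header = b"\xc6" + struct.pack(">I", n)
    elif n < 32:
        header = bytes([0xA0 | n])  # fixstr (fixraw in the old spec)
    elif use_bin_type and n < 2**8:
        header = b"\xd9" + struct.pack(">B", n)  # str 8 exists only in the new spec
    elif n < 2**16:
        header = b"\xda" + struct.pack(">H", n)  # str 16 (raw 16)
    else:
        header = b"\xdb" + struct.pack(">I", n)  # str 32 (raw 32)
    return header + payload


new_spec = pack_one_sketch(b"spam", use_bin_type=True)
old_spec = pack_one_sketch(b"spam", use_bin_type=False)
text = pack_one_sketch("spam")
```

Under the old spec, ``b"spam"`` and ``"spam"`` produce identical bytes, which is why an unpacker reading old data cannot tell them apart and needs ``raw=True`` to get ``bytes`` back.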

msgpack/_packer.pyx

Lines changed: 2 additions & 4 deletions
@@ -80,9 +80,7 @@ cdef class Packer(object):
8080
8181
:param bool use_bin_type:
8282
Use bin type introduced in msgpack spec 2.0 for bytes.
83-
It also enables str8 type for unicode.
84-
Current default value is false, but it will be changed to true
85-
in future version. You should specify it explicitly.
83+
It also enables str8 type for unicode. (default: True)
8684
8785
:param bool strict_types:
8886
If set to true, types will be checked to be exact. Derived classes
@@ -113,7 +111,7 @@ cdef class Packer(object):
113111
self.pk.length = 0
114112

115113
def __init__(self, *, default=None, unicode_errors=None,
116-
bint use_single_float=False, bint autoreset=True, bint use_bin_type=False,
114+
bint use_single_float=False, bint autoreset=True, bint use_bin_type=True,
117115
bint strict_types=False):
118116
self.use_float = use_single_float
119117
self.strict_types = strict_types

msgpack/_unpacker.pyx

Lines changed: 6 additions & 10 deletions
@@ -131,7 +131,7 @@ cdef inline int get_data_from_buffer(object obj,
131131

132132

133133
def unpackb(object packed, *, object object_hook=None, object list_hook=None,
134-
bint use_list=True, bint raw=True, bint strict_map_key=False,
134+
bint use_list=True, bint raw=False, bint strict_map_key=False,
135135
unicode_errors=None,
136136
object_pairs_hook=None, ext_hook=ExtType,
137137
Py_ssize_t max_str_len=-1,
@@ -217,12 +217,8 @@ cdef class Unpacker(object):
217217
Otherwise, unpack to Python tuple. (default: True)
218218
219219
:param bool raw:
220-
If true, unpack msgpack raw to Python bytes (default).
221-
Otherwise, unpack to Python str (or unicode on Python 2) by decoding
222-
with UTF-8 encoding (recommended).
223-
Currently, the default is true, but it will be changed to false in
224-
near future. So you must specify it explicitly for keeping backward
225-
compatibility.
220+
If true, unpack msgpack raw to Python bytes.
221+
Otherwise, unpack to Python str by decoding with UTF-8 encoding (default).
226222
227223
:param bool strict_map_key:
228224
If true, only str or bytes are accepted for map (dict) keys.
@@ -268,13 +264,13 @@ cdef class Unpacker(object):
268264
269265
Example of streaming deserialize from file-like object::
270266
271-
unpacker = Unpacker(file_like, raw=False, max_buffer_size=10*1024*1024)
267+
unpacker = Unpacker(file_like, max_buffer_size=10*1024*1024)
272268
for o in unpacker:
273269
process(o)
274270
275271
Example of streaming deserialize from socket::
276272
277-
unpacker = Unpacker(raw=False, max_buffer_size=10*1024*1024)
273+
unpacker = Unpacker(max_buffer_size=10*1024*1024)
278274
while True:
279275
buf = sock.recv(1024**2)
280276
if not buf:
@@ -309,7 +305,7 @@ cdef class Unpacker(object):
309305
self.buf = NULL
310306

311307
def __init__(self, file_like=None, *, Py_ssize_t read_size=0,
312-
bint use_list=True, bint raw=True, bint strict_map_key=False,
308+
bint use_list=True, bint raw=False, bint strict_map_key=False,
313309
object object_hook=None, object object_pairs_hook=None, object list_hook=None,
314310
unicode_errors=None, Py_ssize_t max_buffer_size=0,
315311
object ext_hook=ExtType,
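
Reviewer note (illustrative, not part of the commit): the new ``raw`` default changes what the str/raw family decodes to, while bin values always come back as ``bytes``. A minimal sketch of that decision, assuming a single well-formed value; ``unpack_one_sketch`` is a hypothetical helper written for this note, not the library's implementation.

```python
import struct

# Header byte -> struct format of the length field that follows it.
_BIN_CODES = {0xC4: ">B", 0xC5: ">H", 0xC6: ">I"}
_STR_CODES = {0xD9: ">B", 0xDA: ">H", 0xDB: ">I"}


def unpack_one_sketch(data, raw=False):
    """Hypothetical helper: decode one str/raw or bin value from msgpack bytes."""
    first = data[0]
    if 0xA0 <= first <= 0xBF:
        # fixstr: length lives in the low 5 bits of the header byte.
        is_bin, n, offset = False, first & 0x1F, 1
    elif first in _BIN_CODES or first in _STR_CODES:
        is_bin = first in _BIN_CODES
        fmt = _BIN_CODES[first] if is_bin else _STR_CODES[first]
        (n,) = struct.unpack_from(fmt, data, 1)
        offset = 1 + struct.calcsize(fmt)
    else:
        raise ValueError("sketch handles only str/raw and bin headers")

    payload = bytes(data[offset:offset + n])
    if is_bin or raw:
        return payload  # bin is always bytes; raw=True keeps str/raw as bytes
    return payload.decode("utf-8")  # raw=False: decode str/raw as UTF-8


as_text = unpack_one_sketch(b"\xa4spam")             # new default behaviour
as_bytes = unpack_one_sketch(b"\xa4spam", raw=True)  # old-format compatibility
```

This also shows why the updated tests pair ``raw=not use_bin_type``: data packed without the bin type only round-trips to ``bytes`` when the unpacker is told to keep raw values undecoded.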

msgpack/fallback.py

Lines changed: 8 additions & 12 deletions
@@ -158,7 +158,7 @@ def _unpack_from(f, b, o=0):
158158
class Unpacker(object):
159159
"""Streaming unpacker.
160160
161-
arguments:
161+
Arguments:
162162
163163
:param file_like:
164164
File-like object having `.read(n)` method.
@@ -172,12 +172,8 @@ class Unpacker(object):
172172
Otherwise, unpack to Python tuple. (default: True)
173173
174174
:param bool raw:
175-
If true, unpack msgpack raw to Python bytes (default).
176-
Otherwise, unpack to Python str (or unicode on Python 2) by decoding
177-
with UTF-8 encoding (recommended).
178-
Currently, the default is true, but it will be changed to false in
179-
near future. So you must specify it explicitly for keeping backward
180-
compatibility.
175+
If true, unpack msgpack raw to Python bytes.
176+
Otherwise, unpack to Python str by decoding with UTF-8 encoding (default).
181177
182178
:param bool strict_map_key:
183179
If true, only str or bytes are accepted for map (dict) keys.
@@ -226,13 +222,13 @@ class Unpacker(object):
226222
227223
Example of streaming deserialize from file-like object::
228224
229-
unpacker = Unpacker(file_like, raw=False, max_buffer_size=10*1024*1024)
225+
unpacker = Unpacker(file_like, max_buffer_size=10*1024*1024)
230226
for o in unpacker:
231227
process(o)
232228
233229
Example of streaming deserialize from socket::
234230
235-
unpacker = Unpacker(raw=False, max_buffer_size=10*1024*1024)
231+
unpacker = Unpacker(max_buffer_size=10*1024*1024)
236232
while True:
237233
buf = sock.recv(1024**2)
238234
if not buf:
@@ -253,7 +249,7 @@ def __init__(
253249
file_like=None,
254250
read_size=0,
255251
use_list=True,
256-
raw=True,
252+
raw=False,
257253
strict_map_key=False,
258254
object_hook=None,
259255
object_pairs_hook=None,
@@ -748,7 +744,7 @@ class Packer(object):
748744
749745
:param bool use_bin_type:
750746
Use bin type introduced in msgpack spec 2.0 for bytes.
751-
It also enables str8 type for unicode.
747+
It also enables str8 type for unicode. (default: True)
752748
753749
:param bool strict_types:
754750
If set to true, types will be checked to be exact. Derived classes
@@ -769,7 +765,7 @@ def __init__(
769765
unicode_errors=None,
770766
use_single_float=False,
771767
autoreset=True,
772-
use_bin_type=False,
768+
use_bin_type=True,
773769
strict_types=False,
774770
):
775771
self._strict_types = strict_types

test/test_buffer.py

Lines changed: 2 additions & 2 deletions
@@ -17,15 +17,15 @@ def test_unpack_buffer():
1717

1818

1919
def test_unpack_bytearray():
20-
buf = bytearray(packb(("foo", "bar")))
20+
buf = bytearray(packb((b"foo", b"bar")))
2121
obj = unpackb(buf, use_list=1)
2222
assert [b"foo", b"bar"] == obj
2323
expected_type = bytes
2424
assert all(type(s) == expected_type for s in obj)
2525

2626

2727
def test_unpack_memoryview():
28-
buf = bytearray(packb(("foo", "bar")))
28+
buf = bytearray(packb((b"foo", b"bar")))
2929
view = memoryview(buf)
3030
obj = unpackb(view, use_list=1)
3131
assert [b"foo", b"bar"] == obj

test/test_case.py

Lines changed: 5 additions & 6 deletions
@@ -1,13 +1,12 @@
11
#!/usr/bin/env python
22
# coding: utf-8
3-
43
from msgpack import packb, unpackb
54

65

7-
def check(length, obj):
8-
v = packb(obj)
6+
def check(length, obj, use_bin_type=True):
7+
v = packb(obj, use_bin_type=use_bin_type)
98
assert len(v) == length, "%r length should be %r but get %r" % (obj, length, len(v))
10-
assert unpackb(v, use_list=0) == obj
9+
assert unpackb(v, use_list=0, raw=not use_bin_type) == obj
1110

1211

1312
def test_1():
@@ -56,7 +55,7 @@ def test_9():
5655

5756

5857
def check_raw(overhead, num):
59-
check(num + overhead, b" " * num)
58+
check(num + overhead, b" " * num, use_bin_type=False)
6059

6160

6261
def test_fixraw():
@@ -135,4 +134,4 @@ def test_match():
135134

136135

137136
def test_unicode():
138-
assert unpackb(packb("foobar"), use_list=1) == b"foobar"
137+
assert unpackb(packb(u"foobar"), use_list=1) == u"foobar"

test/test_format.py

Lines changed: 8 additions & 2 deletions
@@ -4,8 +4,8 @@
44
from msgpack import unpackb
55

66

7-
def check(src, should, use_list=0):
8-
assert unpackb(src, use_list=use_list) == should
7+
def check(src, should, use_list=0, raw=True):
8+
assert unpackb(src, use_list=use_list, raw=raw) == should
99

1010

1111
def testSimpleValue():
@@ -59,6 +59,12 @@ def testRaw():
5959
b"\x00\x00\xdb\x00\x00\x00\x01a\xdb\x00\x00\x00\x02ab",
6060
(b"", b"a", b"ab", b"", b"a", b"ab"),
6161
)
62+
check(
63+
b"\x96\xda\x00\x00\xda\x00\x01a\xda\x00\x02ab\xdb\x00\x00"
64+
b"\x00\x00\xdb\x00\x00\x00\x01a\xdb\x00\x00\x00\x02ab",
65+
("", "a", "ab", "", "a", "ab"),
66+
raw=False,
67+
)
6268

6369

6470
def testArray():

test/test_memoryview.py

Lines changed: 11 additions & 28 deletions
@@ -1,50 +1,33 @@
11
#!/usr/bin/env python
22
# coding: utf-8
33

4+
import pytest
45
from array import array
56
from msgpack import packb, unpackb
67
import sys
78

89

9-
# For Python < 3:
10-
# - array type only supports old buffer interface
11-
# - array.frombytes is not available, must use deprecated array.fromstring
12-
if sys.version_info[0] < 3:
10+
pytestmark = pytest.mark.skipif(
11+
sys.version_info[0] < 3, reason="Only Python 3 supports buffer protocol"
12+
)
1313

14-
def make_memoryview(obj):
15-
return memoryview(buffer(obj))
1614

17-
def make_array(f, data):
18-
a = array(f)
19-
a.fromstring(data)
20-
return a
21-
22-
def get_data(a):
23-
return a.tostring()
24-
25-
26-
else:
27-
make_memoryview = memoryview
28-
29-
def make_array(f, data):
30-
a = array(f)
31-
a.frombytes(data)
32-
return a
33-
34-
def get_data(a):
35-
return a.tobytes()
15+
def make_array(f, data):
16+
a = array(f)
17+
a.frombytes(data)
18+
return a
3619

3720

3821
def _runtest(format, nbytes, expected_header, expected_prefix, use_bin_type):
3922
# create a new array
4023
original_array = array(format)
4124
original_array.fromlist([255] * (nbytes // original_array.itemsize))
42-
original_data = get_data(original_array)
43-
view = make_memoryview(original_array)
25+
original_data = original_array.tobytes()
26+
view = memoryview(original_array)
4427

4528
# pack, unpack, and reconstruct array
4629
packed = packb(view, use_bin_type=use_bin_type)
47-
unpacked = unpackb(packed)
30+
unpacked = unpackb(packed, raw=(not use_bin_type))
4831
reconstructed_array = make_array(format, unpacked)
4932

5033
# check that we got the right amount of data
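
Reviewer note (illustrative, not part of the commit): with the Python 2 branch removed, the test relies only on ``array.frombytes``, ``array.tobytes`` and ``memoryview``. The sketch below exercises that round trip with the standard library alone; the pack/unpack step is replaced by a plain byte copy, so msgpack itself is not involved here.

```python
from array import array


def make_array(typecode, data):
    """Rebuild an array from raw bytes, as the simplified test helper does."""
    a = array(typecode)
    a.frombytes(data)
    return a


# Build a 16-byte array and expose it through the buffer protocol.
original = array("B", [255] * 16)
view = memoryview(original)
original_data = original.tobytes()

# Stand-in for pack + unpack: copy the bytes out of the view and rebuild.
rebuilt = make_array("B", view.tobytes())
```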

test/test_newspec.py

Lines changed: 4 additions & 2 deletions
@@ -10,14 +10,16 @@ def test_str8():
1010
assert len(b) == len(data) + 2
1111
assert b[0:2] == header + b"\x20"
1212
assert b[2:] == data
13-
assert unpackb(b) == data
13+
assert unpackb(b, raw=True) == data
14+
assert unpackb(b, raw=False) == data.decode()
1415

1516
data = b"x" * 255
1617
b = packb(data.decode(), use_bin_type=True)
1718
assert len(b) == len(data) + 2
1819
assert b[0:2] == header + b"\xff"
1920
assert b[2:] == data
20-
assert unpackb(b) == data
21+
assert unpackb(b, raw=True) == data
22+
assert unpackb(b, raw=False) == data.decode()
2123

2224

2325
def test_bin8():
