Quick start
This is a quick overview of the module.
Installation
First install python-rapidjson:
$ pip install python-rapidjson
If possible, this installs a binary wheel containing the latest version of the package, already compiled for your system. Otherwise it downloads a source distribution and tries to compile it: since the module is written in C++, in that case you will most probably need to install a minimal C++ compiler toolchain on your system.
Alternatively, it can also be installed using Conda.
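After either installation route, it can be useful to confirm that the package is actually visible to the interpreter in use. The following is a minimal sketch using only the standard library; installed_version is a hypothetical helper name introduced here for illustration, not part of python-rapidjson.

```python
from importlib import metadata, util


def installed_version(dist_name="python-rapidjson", module_name="rapidjson"):
    """Return the installed distribution's version string, or None if absent."""
    if util.find_spec(module_name) is None:
        return None  # the module cannot be imported at all
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None  # importable, but not installed under this distribution name


# Prints a version string, or None when the package is not installed.
print(installed_version())
```

A None result suggests that pip installed the package into a different environment than the one running the check.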
Basic examples
python-rapidjson tries to be compatible with the standard library json.dumps() and json.loads() functions (but see the incompatibilities).
Basic usage looks like this:
>>> from pprint import pprint
>>> from rapidjson import dumps, loads
>>> data = {'foo': 100, 'bar': 'baz'}
>>> dumps(data, sort_keys=True) # for doctest
'{"bar":"baz","foo":100}'
>>> pprint(loads('{"bar":"baz","foo":100}'))
{'bar': 'baz', 'foo': 100}
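Because the basic dumps() and loads() calls mirror the standard library ones, one common pattern is to import whichever implementation is available under a single name. The sketch below assumes only the arguments already shown above; json_impl is an arbitrary alias chosen for this example.

```python
try:
    import rapidjson as json_impl  # used when python-rapidjson is installed
except ImportError:
    import json as json_impl  # standard library fallback

data = {'foo': 100, 'bar': 'baz'}

# Both implementations accept sort_keys and return a str.
encoded = json_impl.dumps(data, sort_keys=True)
decoded = json_impl.loads(encoded)

# The round trip gives back an equal dictionary with either implementation.
print(decoded == data)
```

The exact text produced differs between the two: as the output above shows, python-rapidjson emits no whitespace after separators, while the standard json module adds a space after each comma and colon by default. Code that compares serialized strings rather than decoded values should not rely on this pattern.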
All JSON data types are supported using their native Python counterparts:
>>> int_number = 42
>>> float_number = 1.4142
>>> string = "√2 ≅ 1.4142"
>>> false = False
>>> true = True
>>> null = None
>>> array = [int_number, float_number, string, false, true, null]
>>> an_object = {'int': int_number, 'float': float_number,
...              'string': string,
...              'true': true, 'false': false,
...              'array': array }
>>> pprint(loads(dumps({'object': an_object})))
{'object': {'array': [42, 1.4142, '√2 ≅ 1.4142', False, True, None],
            'false': False,
            'float': 1.4142,
            'int': 42,
            'string': '√2 ≅ 1.4142',
            'true': True}}
Python’s lists, tuples and iterators get serialized as JSON arrays:
>>> names_t = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat')
>>> names_l = list(names_t)
>>> names_i = iter(names_l)
>>> def names_g():
...     for name in names_t:
...         yield name
>>> dumps(names_t) == dumps(names_l) == dumps(names_i) == dumps(names_g())
True
Values can also be bytes or bytearray instances, which are assumed to contain proper UTF-8-encoded strings:
>>> clef = "\N{MUSICAL SYMBOL G CLEF}"
>>> bytes_utf8 = clef.encode('utf-8')
>>> byte_array = bytearray(bytes_utf8)  # avoid shadowing the bytearray builtin
>>> dumps(clef) == dumps(bytes_utf8) == dumps(byte_array) == '"\\uD834\\uDD1E"'
True
python-rapidjson can optionally also handle a few other commonly used data types:
>>> import datetime, decimal, uuid
>>> from rapidjson import DM_ISO8601, UM_CANONICAL, NM_DECIMAL
>>> some_day = datetime.date(2016, 8, 28)
>>> some_timestamp = datetime.datetime(2016, 8, 28, 13, 14, 15)
>>> dumps({'a date': some_day, 'a timestamp': some_timestamp})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: datetime.datetime(…) is not JSON serializable
>>> dumps({'a date': some_day, 'a timestamp': some_timestamp},
...       datetime_mode=DM_ISO8601,
...       sort_keys=True) # for doctests
'{"a date":"2016-08-28","a timestamp":"2016-08-28T13:14:15"}'
>>> as_json = _
>>> pprint(loads(as_json))
{'a date': '2016-08-28', 'a timestamp': '2016-08-28T13:14:15'}
>>> pprint(loads(as_json, datetime_mode=DM_ISO8601))
{'a date': datetime.date(2016, 8, 28),
 'a timestamp': datetime.datetime(2016, 8, 28, 13, 14, 15)}
>>> some_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
>>> dumps(some_uuid)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: UUID(…) is not JSON serializable
>>> dumps(some_uuid, uuid_mode=UM_CANONICAL)
'"886313e1-3b8a-5372-9b90-0c9aee199e5d"'
>>> as_json = _
>>> loads(as_json)
'886313e1-3b8a-5372-9b90-0c9aee199e5d'
>>> loads(as_json, uuid_mode=UM_CANONICAL)
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
>>> pi = decimal.Decimal('3.1415926535897932384626433832795028841971')
>>> dumps(pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Decimal(…) is not JSON serializable
>>> dumps(pi, number_mode=NM_DECIMAL)
'3.1415926535897932384626433832795028841971'
>>> as_json = _
>>> loads(as_json)
3.141592653589793
>>> type(loads(as_json))
<class 'float'>
>>> loads(as_json, number_mode=NM_DECIMAL)
Decimal('3.1415926535897932384626433832795028841971')
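For comparison, here is a minimal sketch of how part of this could be approximated with the standard library json module alone, using its default and parse_float hooks. encode_extra is a hypothetical helper written for this illustration; the sketch does not describe how python-rapidjson implements its modes.

```python
import datetime
import decimal
import json


def encode_extra(obj):
    # Hypothetical helper: json calls it only for objects it cannot serialize.
    if isinstance(obj, (datetime.date, datetime.datetime)):
        return obj.isoformat()
    raise TypeError(f'{obj!r} is not JSON serializable')


some_day = datetime.date(2016, 8, 28)
date_json = json.dumps({'a date': some_day}, default=encode_extra)

# parse_float receives the literal text of each JSON float.
pi_text = '3.1415926535897932384626433832795028841971'
pi = json.loads(pi_text, parse_float=decimal.Decimal)
```

Unlike loads() with datetime_mode=DM_ISO8601 shown above, this sketch does not turn date strings back into date objects when loading; that would need additional code on the decoding side.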
The module also exposes a stream interface:
>>> from io import StringIO
>>> from rapidjson import dump, load
>>> stream = StringIO()
>>> dump(data, stream)
>>> stream.seek(0)
0
>>> load(stream) == data
True
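The same dump() and load() calls can be pointed at a file on disk rather than an in-memory buffer. The sketch below assumes that a file opened in text mode behaves like the StringIO object used above; the file name and the json_impl alias are arbitrary choices for this example, and the standard json module is used as a fallback when python-rapidjson is not installed.

```python
import os
import tempfile

try:
    import rapidjson as json_impl  # used when python-rapidjson is installed
except ImportError:
    import json as json_impl  # standard library fallback

data = {'foo': 100, 'bar': 'baz'}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.json')
    # Write the document to a text-mode file...
    with open(path, 'w', encoding='utf-8') as stream:
        json_impl.dump(data, stream)
    # ...and read it back from the start of a fresh handle.
    with open(path, 'r', encoding='utf-8') as stream:
        restored = json_impl.load(stream)

print(restored == data)
```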
Incompatibilities
Here are things in the standard json library that we have decided not to support:
separators argument
This is mostly used for pretty printing and is not supported by RapidJSON, so it isn’t a high priority. We do support the indent kwarg, which gets you nice-looking JSON anyway.
Coercing keys when dumping
json will stringify a True dictionary key as "true" when you dump it, but when you load it back in it will still be a string. We want dump and load to return exactly the same objects, so we have decided not to do this coercion by default; you can however use MM_COERCE_KEYS_TO_STRINGS or a default function to mimic it.
Arbitrary encodings
json.loads() accepts an encoding kwarg determining the encoding of its input, when that is a bytes or bytearray instance. Although RapidJSON is able to cope with several different encodings, we currently support only the recommended one, UTF-8.
cls argument to loads() and dumps()
The json top-level functions accept a cls parameter that allows specifying a custom encoder/decoder class. If you must use that approach, that is, you have to call the standard json top-level functions but want to use RapidJSON functionality, the following snippet shows a reasonably simple way to do that:
>>> import datetime
>>> import json
>>> import rapidjson
>>>
>>> class Encoder:
...     def __init__(self, *args, **kwargs):
...         # Filter/adapt JSON arguments to RapidJSON ones
...         rjkwargs = {'datetime_mode': rapidjson.DM_ISO8601}
...         encoder = rapidjson.Encoder(**rjkwargs)
...         self.encode = encoder.__call__
>>>
>>> json.dumps([1,2,datetime.date(2020, 12, 8)], cls=Encoder)
'[1,2,"2020-12-08"]'
>>>
>>> class Decoder:
...     def __init__(self, *args, **kwargs):
...         # Filter/adapt JSON arguments to RapidJSON ones
...         rjkwargs = {'datetime_mode': rapidjson.DM_ISO8601}
...         decoder = rapidjson.Decoder(**rjkwargs)
...         self.decode = decoder.__call__
>>>
>>> json.loads('[1,2,"2020-12-08"]', cls=Decoder)
[1, 2, datetime.date(2020, 12, 8)]
object_pairs_hook argument
json decoding functions accept an object_pairs_hook kwarg, a variant of object_hook that selects a different way to translate JSON objects into Python dictionaries by first collecting their content into a sequence of key-value pairs and eventually passing that sequence to the hook function. That behaviour may be easily simulated:
>>> def loads(s, object_pairs_hook=None):
...     if object_pairs_hook is None:
...         d = rapidjson.Decoder()
...     else:
...         class KWPairsDecoder(rapidjson.Decoder):
...             def start_object(self):
...                 return []
...             def end_object(self, pairs):
...                 return object_pairs_hook(pairs)
...         d = KWPairsDecoder()
...     return d(s)
>>>
>>> loads('{"foo": "bar"}', lambda pairs: ','.join(f'{k}={v}' for k, v in pairs))
'foo=bar'
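The key coercion item in the list above concerns the behaviour of the standard json module, which can be observed directly. The short sketch below uses only the standard library; the sample dictionary is made up for illustration.

```python
import json

original = {True: 'yes', 2: 'two'}

# json silently converts the non-string keys to strings when dumping...
encoded = json.dumps(original)

# ...so loading the result does not give back the original dictionary.
restored = json.loads(encoded)

print(restored)              # {'true': 'yes', '2': 'two'}
print(restored == original)  # False
```

This asymmetry is the reason given above for not coercing keys by default.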