loads() function
- rapidjson.loads(string, *, object_hook=None, number_mode=None, datetime_mode=None, uuid_mode=None, parse_mode=None, allow_nan=True)
Decode the given JSON formatted value into a Python object.
- Parameters:
  string – the JSON string to parse, either a Unicode str instance or a bytes or bytearray instance containing a UTF-8 encoded value
  object_hook (callable) – an optional function that will be called with the result of any object literal decoded (a dict) and should return the value to use instead of the dict
  number_mode (int) – enable particular behaviors in handling numbers
  datetime_mode (int) – how datetime and date instances should be handled
  uuid_mode (int) – how UUID instances should be handled
  parse_mode (int) – whether the parser should allow non-standard JSON extensions
  allow_nan (bool) – compatibility flag equivalent to number_mode=NM_NAN
- Returns:
  An equivalent Python object.
- Raises:
  ValueError – if an invalid argument is given
  JSONDecodeError – if string is not a valid JSON value
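Malformed input can be handled separately from invalid arguments by catching only JSONDecodeError. The following is a minimal sketch, not part of the library: `parse_or_default` is a hypothetical helper, and the import falls back to the standard library `json` module (which also exposes `loads` and `JSONDecodeError`) only so that the snippet still runs where python-rapidjson is not installed.

```python
try:
    import rapidjson as json_impl
except ImportError:
    # Assumption: fall back to the stdlib module, whose loads() and
    # JSONDecodeError behave alike for this simple case.
    import json as json_impl


def parse_or_default(payload, default=None):
    """Return the decoded value, or `default` when payload is not valid JSON.

    Hypothetical helper: only malformed input is swallowed; errors caused
    by invalid arguments are left to propagate.
    """
    try:
        return json_impl.loads(payload)
    except json_impl.JSONDecodeError:
        return default


# Both a str and UTF-8 encoded bytes are accepted as input.
from_text = parse_or_default('{"answer": 42}')
from_bytes = parse_or_default('{"answer": 42}'.encode('utf-8'))
broken = parse_or_default('{"answer": ', default={})
```

Catching the narrower exception keeps programming mistakes, such as an unsupported mode flag, visible instead of silently turning them into the default value.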
object_hook
object_hook may be used to inject a custom deserializer that can replace any dict instance found in the JSON structure with a derived object instance:

>>> class Point(object):
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...     def __repr__(self):
...         return 'Point(%s, %s)' % (self.x, self.y)
...
>>> def point_dejsonifier(d):
...     if 'x' in d and 'y' in d:
...         return Point(d['x'], d['y'])
...     else:
...         return d
...
>>> loads('{"x":1,"y":2}', object_hook=point_dejsonifier)
Point(1, 2)
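Since the hook is called for every object literal, objects nested inside other objects or arrays are converted as well. The sketch below illustrates this under a few assumptions: the `Point` named tuple and the `to_point` hook are illustrative names, and the import falls back to the standard library `json` module, which accepts the same object_hook argument, so that the snippet also runs without python-rapidjson installed.

```python
from collections import namedtuple

try:
    import rapidjson as json_impl
except ImportError:
    # Assumption: the stdlib module takes the same object_hook argument.
    import json as json_impl

Point = namedtuple('Point', ['x', 'y'])


def to_point(d):
    """Replace any dict holding exactly the keys x and y with a Point."""
    if set(d) == {'x', 'y'}:
        return Point(d['x'], d['y'])
    # Any other object, including the outermost one, is returned unchanged.
    return d


doc = '{"origin": {"x": 0, "y": 0}, "path": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]}'
shape = json_impl.loads(doc, object_hook=to_point)
```

Matching on the exact key set, rather than on the mere presence of x and y, is a design choice of this sketch to avoid converting larger objects that happen to contain those keys.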
number_mode
The number_mode argument selects different behaviors in handling numeric values.
By default non-numbers (nan, inf, -inf) are recognized, because NM_NAN is on by default:

>>> loads('[NaN, Infinity]')
[nan, inf]
>>> loads('[NaN, Infinity]', number_mode=NM_NAN)
[nan, inf]
By explicitly setting number_mode, or by using the compatibility option allow_nan, you can avoid that and obtain a JSONDecodeError exception instead:

>>> loads('[NaN, Infinity]', number_mode=NM_NATIVE)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
rapidjson.JSONDecodeError: … Out of range float values are not JSON compliant
>>> loads('[NaN, Infinity]', allow_nan=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
rapidjson.JSONDecodeError: … Out of range float values are not JSON compliant
Normally all floating point literals present in the JSON structure will be loaded as Python float instances; with NM_DECIMAL they will be returned as Decimal instances instead:

>>> loads('1.2345')
1.2345
>>> loads('1.2345', number_mode=NM_DECIMAL)
Decimal('1.2345')
When you can be sure that all the numeric values are constrained within the architecture’s hardware limits, you can get an appreciable speed gain with the NM_NATIVE flag. While this is considerably faster, integer literals that do not fit into the underlying C library long long limits will be converted (truncated) to double numbers:

>>> loads('123456789012345678901234567890')
123456789012345678901234567890
>>> loads('123456789012345678901234567890', number_mode=NM_NATIVE)
1.2345678901234566e+29
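To judge whether a data set is safe for NM_NATIVE, it may help to check its integers against the long long range beforehand. The sketch below is plain Python and does not call the library; it assumes a 64-bit long long (common, but platform dependent), and `fits_long_long` is a hypothetical helper.

```python
# Assumption: long long is 64 bits wide, as on most current platforms.
LLONG_MIN = -2**63
LLONG_MAX = 2**63 - 1


def fits_long_long(value):
    """Tell whether an integer lies within the assumed long long range."""
    return LLONG_MIN <= value <= LLONG_MAX


big = 123456789012345678901234567890
small = 1234567890

# A double carries 53 bits of precision, so converting the big integer
# to float and back does not reproduce the original value.
lossy = int(float(big)) != big
```

Here `fits_long_long(small)` holds while `fits_long_long(big)` does not, and `lossy` is true, which is the kind of precision loss the NM_NATIVE example above shows.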
These flags can be combined together:
>>> loads('[-1, NaN, 3.1415926535897932384626433832795028841971]',
...       number_mode=NM_DECIMAL | NM_NAN)
[-1, Decimal('NaN'), Decimal('3.1415926535897932384626433832795028841971')]

with the exception of NM_NATIVE and NM_DECIMAL, a combination that does not make sense since there’s little point in creating Decimal instances out of possibly truncated float literals.
datetime_mode
With datetime_mode you can enable recognition of string literals containing an ISO 8601 representation as either date, datetime or time instances:

>>> loads('"2016-01-02T01:02:03+01:00"')
'2016-01-02T01:02:03+01:00'
>>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=DM_ISO8601)
datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...3600)))
>>> loads('"2016-01-02T01:02:03-01:00"', datetime_mode=DM_ISO8601)
datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...82800)))
>>> loads('"2016-01-02"', datetime_mode=DM_ISO8601)
datetime.date(2016, 1, 2)
>>> loads('"01:02:03+01:00"', datetime_mode=DM_ISO8601)
datetime.time(1, 2, 3, tzinfo=...delta(...3600)))
It can be combined with DM_SHIFT_TO_UTC to always obtain values in the UTC timezone:

>>> mode = DM_ISO8601 | DM_SHIFT_TO_UTC
>>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
datetime.datetime(2016, 1, 2, 0, 2, 3, tzinfo=...utc)
Note
This option is somewhat limited when the value is a non-naïve time literal because negative values cannot be represented by the underlying Python type, so it cannot adapt such values reliably:
>>> mode = DM_ISO8601 | DM_SHIFT_TO_UTC
>>> loads('"00:01:02+00:00"', datetime_mode=mode)
datetime.time(0, 1, 2, tzinfo=...utc)
>>> loads('"00:01:02+01:00"', datetime_mode=mode)
Traceback (most recent call last):
  ...
ValueError: ... Time literal cannot be shifted to UTC: 00:01:02+01:00
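The limitation can be reproduced with the standard datetime module alone. In this sketch, which does not use the library, an arbitrary date is attached to the time value so that the shift becomes possible; the result lands on the previous day, which a bare time value cannot express.

```python
from datetime import date, datetime, time, timedelta, timezone

# The time literal "00:01:02+01:00" as a Python object.
local = time(0, 1, 2, tzinfo=timezone(timedelta(hours=1)))

# Assumption: 2016-01-02 is an arbitrary anchor date chosen for illustration.
anchored = datetime.combine(date(2016, 1, 2), local)
shifted = anchored.astimezone(timezone.utc)

# The UTC wall-clock time is 23:01:02, but on 2016-01-01: the day changed.
crossed_midnight = shifted.date() != anchored.date()
```

When the calendar date is known from elsewhere in the payload, combining it with the time in this way is one possible workaround; otherwise the time literal is better left unshifted.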
If you combine it with DM_NAIVE_IS_UTC then all values without a timezone will be assumed to be relative to UTC:

>>> mode = DM_ISO8601 | DM_NAIVE_IS_UTC
>>> loads('"2016-01-02T01:02:03"', datetime_mode=mode)
datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...utc)
>>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...3600)))
>>> loads('"01:02:03"', datetime_mode=mode)
datetime.time(1, 2, 3, tzinfo=...utc)
Yet another combination is with DM_IGNORE_TZ to ignore the timezone and obtain naïve values:

>>> mode = DM_ISO8601 | DM_IGNORE_TZ
>>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
datetime.datetime(2016, 1, 2, 1, 2, 3)
>>> loads('"01:02:03+01:00"', datetime_mode=mode)
datetime.time(1, 2, 3)
The DM_UNIX_TIME mode cannot be used here, because there isn’t a reasonable heuristic to disambiguate between plain numbers and timestamps:

>>> loads('[1,2,3]', datetime_mode=DM_UNIX_TIME)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid datetime_mode, can deserialize only from ISO8601
uuid_mode
With uuid_mode you can enable recognition of string literals containing two different representations of UUID values:

>>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"')
'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'
>>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"',
...       uuid_mode=UM_CANONICAL)
UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')
>>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"',
...       uuid_mode=UM_HEX)
UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')
>>> loads('"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"',
...       uuid_mode=UM_CANONICAL)
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
>>> loads('"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"',
...       uuid_mode=UM_HEX)
UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')
parse_mode
With parse_mode you can tell the parser to be relaxed, allowing either C++/JavaScript-like comments (PM_COMMENTS):

>>> loads('"foo" // one line of explanation')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
rapidjson.JSONDecodeError: Parse error at offset 6: The document root must not be followed by other values.
>>> loads('"bar" /* detailed explanation */')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
rapidjson.JSONDecodeError: Parse error at offset 6: The document root must not be followed by other values.
>>> loads('"foo" // one line of explanation', parse_mode=PM_COMMENTS)
'foo'
>>> loads('"bar" /* detailed explanation */', parse_mode=PM_COMMENTS)
'bar'
or trailing commas at the end of arrays and objects (PM_TRAILING_COMMAS):

>>> loads('[1,]')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
rapidjson.JSONDecodeError: Parse error at offset 3: Invalid value.
>>> loads('[1,]', parse_mode=PM_TRAILING_COMMAS)
[1]
>>> loads('{"one": 1,}', parse_mode=PM_TRAILING_COMMAS)
{'one': 1}
or both:
>>> loads('[1, /* 2, */ 3,]')
Traceback (most recent call last):
  ...
rapidjson.JSONDecodeError: Parse error at offset 4: Invalid value.
>>> loads('[1, /* 2, */ 3,]', parse_mode=PM_COMMENTS | PM_TRAILING_COMMAS)
[1, 3]