<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<rfc xmlns:xi="http://www.w3.org/2001/XInclude"
submissionType="independent"
category="info"
tocDepth="4"
docName="draft-rundgren-json-canonicalization-scheme-17"
number="8785"
ipr="trust200902"
obsoletes=""
updates=""
xml:lang="en"
tocInclude="true"
symRefs="true"
sortRefs="true"
version="3">
<front>
<title abbrev="JSON Canonicalization Scheme">
JSON Canonicalization Scheme (JCS)
</title>
<seriesInfo name="RFC" value="8785"/>
<author fullname="Anders Rundgren" initials="A." surname="Rundgren">
<organization>Independent</organization>
<address>
<postal>
<city>Montpellier</city>
<country>France</country>
</postal>
<email>anders.rundgren.net@gmail.com</email>
<uri>https://www.linkedin.com/in/andersrundgren/</uri>
</address>
</author>
<author fullname="Bret Jordan" initials="B." surname="Jordan">
<organization>Broadcom</organization>
<address>
<postal>
<street>1320 Ridder Park Drive</street>
<region>CA</region>
<code>95131</code>
<city>San Jose</city>
<country>United States of America</country>
</postal>
<email>bret.jordan@broadcom.com</email>
</address>
</author>
<author fullname="Samuel Erdtman" initials="S." surname="Erdtman">
<organization>Spotify AB</organization>
<address>
<postal>
<street>Birger Jarlsgatan 61, 4tr</street>
<code>113 56</code>
<city>Stockholm</city>
<country>Sweden</country>
</postal>
<email>erdtman@spotify.com</email>
</address>
</author>
<date year="2020" month="June" />
<area>Security</area>
<workgroup/>
<keyword>JSON</keyword>
<keyword>ECMAScript</keyword>
<keyword>Signatures</keyword>
<keyword>Cryptography</keyword>
<keyword>Canonicalization</keyword>
<abstract>
<t>
Cryptographic operations like hashing and signing need the data to be
expressed in an invariant format so that the operations are reliably
repeatable.
One way to address this is to create a canonical representation of
the data. Canonicalization also permits data to be exchanged in its
original form on the "wire" while cryptographic operations
performed on the canonicalized counterpart of the data in the
producer and consumer endpoints generate consistent results.
</t>
<t>
This document describes the JSON Canonicalization Scheme (JCS).
This specification defines how to create a canonical representation
of JSON data by building on the strict serialization methods for
JSON primitives defined by ECMAScript, constraining JSON data to
the Internet JSON (I-JSON) subset, and by using deterministic property
sorting.
</t>
</abstract>
</front>
<middle>
<section anchor="Introduction" numbered="true" toc="default">
<name>Introduction</name>
<t>
This document describes the JSON Canonicalization Scheme (JCS).
This specification defines how to create a canonical representation
of JSON <xref target="RFC8259" format="default"/> data by building
on the strict serialization methods for
JSON primitives defined by ECMAScript <xref target="ECMA-262"
format="default"/>,
constraining JSON data to the I-JSON <xref target="RFC7493"
format="default"/>
subset, and by using deterministic property sorting. The output from
JCS is a
"hashable" representation of JSON data that can be used by
cryptographic methods.
The subsequent paragraphs outline the primary design considerations.
</t>
<t>
Cryptographic operations like hashing and signing need the data to be
expressed in an invariant format so that the operations are reliably
repeatable.
One way to accomplish this is to convert the data into
a format that has a simple and fixed representation,
like base64url <xref target="RFC4648" format="default"/>.
This is how JSON Web Signature (JWS) <xref target="RFC7515"
format="default"/> addressed this issue.
Another solution is to create a canonical version of the data,
similar to what was done for the XML signature <xref target="XMLDSIG"
format="default"/> standard.
</t>
<t>
The primary advantage of a canonicalizing scheme is that data
can be kept in its original form. This is the core rationale behind
JCS.
Put another way, using canonicalization enables a JSON object to
remain a JSON object
even after being signed. This can simplify system design,
documentation, and logging.
</t>
<t>
To avoid "reinventing the wheel", JCS relies on the serialization of
JSON primitives
(strings, numbers, and literals), as defined by ECMAScript (aka
JavaScript)
<xref target="ECMA-262" format="default"/> beginning with version 6.
</t>
<t>
Seasoned XML developers may recall difficulties getting XML signatures
to validate. This was usually due to different interpretations of the
quite intricate
XML canonicalization rules as well as of the equally complex
Web Services security standards.
The reasons why JCS should not suffer from similar issues are:
</t>
<ul spacing="normal">
<li>
JSON has neither a namespace concept nor default values.
</li>
<li>
Data is constrained to the I&nbhy;JSON <xref target="RFC7493"
format="default"/> subset.
This eliminates the need for specific parsers for dealing with
canonicalization.
</li>
<li>
JCS-compatible serialization of JSON primitives is currently
supported
by most web browsers as well as by Node.js <xref target="NODEJS"
format="default"/>.
</li>
<li>
The full JCS specification is currently supported by multiple
open-source implementations (see <xref target="open.source"
format="default"/>).
See also <xref target="impl.guidelines" format="default"/> for
implementation
guidelines.
</li>
</ul>
<t>
JCS is compatible with some existing systems relying on JSON
canonicalization such as JSON Web Key (JWK) Thumbprint <xref
target="RFC7638" format="default"/> and Keybase <xref target="KEYBASE"
format="default"/>.
</t>
<t>
For potential uses outside of cryptography, see <xref
target="I-D.rundgren-comparable-json" format="default"/>.
</t>
<t>
The intended audiences of this document are JSON tool vendors as
well as designers of JSON-based cryptographic solutions.
The reader is assumed to be knowledgeable in ECMAScript, including the
"JSON" object.
</t>
</section>
<section anchor="Terminology" numbered="true" toc="default">
<name>Terminology</name>
<t>
Note that this document is not on the IETF standards track. However, a
conformant
implementation is supposed to adhere to the specified behavior for
security and interoperability reasons. This text uses BCP 14 to
describe that necessary behavior.
</t>
<t>
The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>",
"<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>",
"<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are
to be interpreted as described in BCP&nbsp;14 <xref target="RFC2119"/>
<xref target="RFC8174"/> when, and only when, they appear in all capitals,
as shown here.
</t>
</section>
<section anchor="Operation" numbered="true" toc="default">
<name>Detailed Operation</name>
<t>
This section describes the details related to creating
a canonical JSON representation and how they are addressed by JCS.
</t>
<t>
<xref target="impl.guidelines" format="default"/> describes
the <bcp14>RECOMMENDED</bcp14> way of adding JCS support to existing
JSON tools.
</t>
<section anchor="json.datacreation" numbered="true" toc="default">
<name>Creation of Input Data</name>
<t>
Data to be canonically serialized is usually created by:
</t>
<ul spacing="normal">
<li>
Parsing previously generated JSON data.
</li>
<li>
Programmatically creating data.
</li>
</ul>
<t>
Irrespective of the method used, the data to be serialized
<bcp14>MUST</bcp14> be adapted
for I&nbhy;JSON <xref target="RFC7493" format="default"/>
formatting, which implies the following:
</t>
<ul spacing="normal">
<li>
JSON objects <bcp14>MUST NOT</bcp14> exhibit duplicate property
names.
</li>
<li>
JSON string data <bcp14>MUST</bcp14> be expressible
as Unicode <xref target="UNICODE" format="default"/>.
</li>
<li>
JSON number data <bcp14>MUST</bcp14> be expressible
as IEEE 754 <xref target="IEEE754" format="default"/>
double-precision values.
For applications needing higher precision or longer integers than
offered by IEEE 754 double precision, it is
<bcp14>RECOMMENDED</bcp14> to represent such
numbers as JSON strings; see <xref target="json.bignumbers"
format="default"/> for
details on how this can be performed in an interoperable and
extensible way.
</li>
</ul>
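<t>
As a non-normative illustration of the number constraint above, the
following sketch assumes an ECMAScript runtime such as Node.js and
uses only the built-in "JSON" object and BigInt. The property name
"big" is invented for the example.
</t>
<sourcecode type="javascript"><![CDATA[
// 2**53 + 1 is not exactly representable as an IEEE 754 double,
// so the parsed value silently changes.
const asNumber = JSON.parse('{"big": 9007199254740993}');
const reserialized = JSON.stringify(asNumber);
// reserialized is '{"big":9007199254740992}'

// Carrying the same integer as a JSON string keeps every digit.
const asString = JSON.parse('{"big": "9007199254740993"}');
const exact = BigInt(asString.big);
]]></sourcecode>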
<t>
An additional constraint is that parsed JSON string data <bcp14>MUST
NOT</bcp14> be altered during subsequent serializations. For more
information, see <xref target="string.subtypes" format="default"/>.
</t>
<t>
Note: Although the Unicode standard offers the possibility of
rearranging certain character sequences, referred to as "Unicode
Normalization" <xref target="UCNORM" format="default"/>,
JCS-compliant string processing does not take this into
consideration. That is, all components involved in a scheme
depending on JCS <bcp14>MUST</bcp14> preserve Unicode string data
"as is".
</t>
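<t>
The effect of the note above can be sketched as follows. This is an
illustration only, assuming an ECMAScript runtime; the property name
"name" and the variable names are invented for the example.
</t>
<sourcecode type="javascript"><![CDATA[
// The same visible character in two different Unicode forms.
const composed = JSON.parse('{"name":"\\u00e5"}');     // U+00E5
const decomposed = JSON.parse('{"name":"a\\u030a"}');  // U+0061 U+030A

// Serialization keeps each string "as is", so the outputs differ
// and would hash to different values.
const composedText = JSON.stringify(composed);
const decomposedText = JSON.stringify(decomposed);
const outputsDiffer = composedText !== decomposedText;

// Normalizing would hide the difference, which is exactly what a
// JCS-based scheme must not do behind the application's back.
const equalAfterNfc =
  composed.name.normalize('NFC') === decomposed.name.normalize('NFC');
]]></sourcecode>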
</section>
<section anchor="json.canonicalization" numbered="true" toc="default">
<name>Generation of Canonical JSON Data</name>
<t>
The following subsections describe the steps required to create a
canonical
JSON representation of the data elaborated on in the previous
section.
</t>
<t>
<xref target="canonicalize.js" format="default"/> shows sample code
for an ECMAScript-based canonicalizer, matching the JCS
specification.
</t>
<section anchor="json.whitespace" numbered="true" toc="default">
<name>Whitespace</name>
<t>
Whitespace between JSON tokens <bcp14>MUST NOT</bcp14> be emitted.
</t>
</section>
<section anchor="json.serialization.data" numbered="true"
toc="default">
<name>Serialization of Primitive Data Types</name>
<t>
Assume the following JSON object is parsed:
</t>
<sourcecode type="json">
{
  "numbers": [333333333.33333329, 1E30, 4.50,
              2e-3, 0.000000000000000000000000001],
  "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
  "literals": [null, true, false]
}
</sourcecode>
<t>
If the parsed data is subsequently serialized using a serializer
compliant with ECMAScript's "JSON.stringify()", the result would
(with a line wrap added for display purposes only) be rather
divergent with respect to the original data:
</t>
<sourcecode type="json">
{"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":
"€$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}
</sourcecode>
<t>
The difference between the parsed data and its
serialized counterpart is due to the wide tolerance on input data
(as defined
by JSON <xref target="RFC8259" format="default"/>), whereas output
data (as defined by ECMAScript)
has a fixed representation. As can be seen in the example,
numbers are subject to rounding as well.
</t>
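<t>
The round trip described above can be reproduced with the following
sketch, assuming an ECMAScript runtime such as Node.js. String.raw is
used only so that the JSON escape sequences reach JSON.parse()
unchanged; the variable names are invented for the example.
</t>
<sourcecode type="javascript"><![CDATA[
// The sample JSON text, kept byte for byte as shown above.
const sampleText = String.raw`{
  "numbers": [333333333.33333329, 1E30, 4.50,
              2e-3, 0.000000000000000000000000001],
  "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
  "literals": [null, true, false]
}`;

// Parsing accepts many spellings of the same value...
const parsedSample = JSON.parse(sampleText);

// ...while serialization emits exactly one spelling per value.
const serializedSample = JSON.stringify(parsedSample);
]]></sourcecode>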
<t>
The following subsections describe the serialization of primitive
JSON data types
according to JCS. This part is identical to that of ECMAScript.
In the (unlikely) event that a future version of ECMAScript would
invalidate any of the following serialization methods, it will be
up to the developer community to
either stick to this specification or create a new specification.
</t>
<section anchor="json.ser.literals" numbered="true" toc="default">
<name>Serialization of Literals</name>
<t>
In accordance with JSON <xref target="RFC8259"
format="default"/>,
the literals "null", "true", and
"false" <bcp14>MUST</bcp14> be serialized as null, true, and
false, respectively.
</t>
</section>
<section anchor="json.ser.string" numbered="true" toc="default">
<name>Serialization of Strings</name>
<t>
For JSON string data (which includes JSON object property names
as well), each Unicode code point <bcp14>MUST</bcp14> be
serialized as described below (see Section 24.3.2.2 of <xref
target="ECMA-262" format="default"/>):
</t>
<ul spacing="normal">
<li>
If the Unicode value falls within the traditional ASCII
control character range (U+0000 through U+001F), it
<bcp14>MUST</bcp14> be serialized using lowercase hexadecimal
Unicode notation (\uhhhh) unless it is in the set of
predefined JSON control characters U+0008, U+0009, U+000A,
U+000C, or U+000D, which <bcp14>MUST</bcp14> be serialized as
\b, \t, \n, \f, and \r, respectively.
</li>
<li>
If the Unicode value is outside of the ASCII control character
range, it <bcp14>MUST</bcp14> be serialized "as&nbsp;is"
unless it is equivalent to U+005C&nbsp;(\) or U+0022&nbsp;("),
which <bcp14>MUST</bcp14> be serialized as \\ and \",
respectively.
</li>
</ul>
<t>
Finally, the resulting sequence of Unicode code points
<bcp14>MUST</bcp14> be enclosed in double quotes (").
</t>
<t>
Note: Since invalid Unicode data like "lone surrogates" (e.g.,
U+DEAD)
may lead to interoperability issues including broken signatures,
occurrences of such data <bcp14>MUST</bcp14> cause a compliant
JCS implementation to terminate
with an appropriate error.
</t>
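<t>
The rules above can be sketched as a small function. This is a
non-normative illustration, assuming an ECMAScript runtime; the
function name serializeString is invented here, and in practice
JSON.stringify() already produces the same output for well-formed
strings.
</t>
<sourcecode type="javascript"><![CDATA[
function serializeString(value) {
  let out = '"';
  for (let i = 0; i < value.length; i++) {
    const unit = value.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = value.charCodeAt(i + 1); // NaN past the end
      if (!(next >= 0xdc00 && next <= 0xdfff)) {
        throw new Error('Lone surrogate in string data');
      }
      out += value[i] + value[i + 1];
      i++;
      continue;
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) {
      throw new Error('Lone surrogate in string data');
    }
    switch (unit) {
      case 0x08: out += '\\b'; break;
      case 0x09: out += '\\t'; break;
      case 0x0a: out += '\\n'; break;
      case 0x0c: out += '\\f'; break;
      case 0x0d: out += '\\r'; break;
      case 0x22: out += '\\"'; break;
      case 0x5c: out += '\\\\'; break;
      default:
        // Remaining control characters use lowercase \uhhhh;
        // everything else is emitted "as is".
        out += unit < 0x20
          ? '\\u' + unit.toString(16).padStart(4, '0')
          : value[i];
    }
  }
  return out + '"';
}
]]></sourcecode>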
</section>
<section anchor="json.ser.number" numbered="true" toc="default">
<name>Serialization of Numbers</name>
<t>
ECMAScript builds on the IEEE 754 <xref target="IEEE754"
format="default"/> double-precision standard for representing
JSON number data. Such data <bcp14>MUST</bcp14> be serialized
according to Section 7.1.12.1 of <xref target="ECMA-262"
format="default"/>, including the "Note 2" enhancement.
</t>
<t>
Due to the relative complexity of this part, the algorithm
itself is not included in this document.
For implementers of JCS-compliant number serialization,
Google's implementation in V8 <xref target="V8"
format="default"/> may serve as a reference.
Another compatible number serialization reference implementation
is Ryu <xref target="RYU" format="default"/>,
which is used by the JCS open-source Java implementation
mentioned in <xref target="open.source" format="default"/>.
<xref target="json.ieee754.test" format="default"/> holds a set
of IEEE 754 sample values and their
corresponding JSON serialization.
</t>
<t>
Note: Since Not a Number (NaN) and Infinity
are not permitted in JSON, occurrences of NaN or
Infinity <bcp14>MUST</bcp14> cause a compliant JCS
implementation to terminate with an appropriate error.
</t>
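<t>
On an ECMAScript platform, the requirements above can be met with a
thin wrapper around the built-in serializer. The following is a
minimal sketch, not a reference implementation; the function name
serializeNumber is invented here.
</t>
<sourcecode type="javascript"><![CDATA[
function serializeNumber(value) {
  // JSON.stringify() would turn NaN and Infinity into "null",
  // so they have to be rejected explicitly.
  if (typeof value !== 'number' || !Number.isFinite(value)) {
    throw new Error('Not a serializable number: ' + String(value));
  }
  // For finite numbers, JSON.stringify() applies the ECMAScript
  // number-to-string algorithm (minus zero becomes "0").
  return JSON.stringify(value);
}
]]></sourcecode>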
</section>
</section>
<section anchor="json.sorting.properties" numbered="true"
toc="default">
<name>Sorting of Object Properties</name>
<t>
Although the previous step normalized the representation of
primitive JSON data types, the result would not yet qualify as
"canonical" since JSON object properties are not in lexicographic
(alphabetical) order.
</t>
<t>
Applied to the sample in <xref target="json.serialization.data"
format="default"/>,
a properly canonicalized version should (with a
line wrap added for display purposes only) read as:
</t>
<sourcecode type="json">
{"literals":[null,true,false],"numbers":[333333333.3333333,
1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}
</sourcecode>
<t>
The rules for lexicographic sorting of JSON object
properties according to JCS are as follows:
</t>
<ul spacing="normal">
<li>
JSON object properties <bcp14>MUST</bcp14> be sorted
recursively,
which means that JSON child Objects
<bcp14>MUST</bcp14> have their properties sorted as well.
</li>
<li>
JSON array data <bcp14>MUST</bcp14> also be scanned for the
presence of JSON objects (if an object is found, then its
properties <bcp14>MUST</bcp14> be sorted),
but array element order <bcp14>MUST NOT</bcp14> be changed.
</li>
</ul>
<t>
When a JSON object is about to have its properties
sorted, the following measures <bcp14>MUST</bcp14> be adhered to:
</t>
<ul spacing="normal">
<li>
The sorting process is applied to property name strings in their
"raw" (unescaped) form.
That is, a newline character is treated as U+000A.
</li>
<li>
Property name strings to be sorted are formatted
as arrays of UTF-16 <xref target="UNICODE" format="default"/>
code units.
The sorting is based on pure value comparisons, where code units
are treated as
unsigned integers, independent of locale settings.
</li>
<li>
<t>
Property name strings either have different values at some
index that is
a valid index for both strings, or their lengths are
different, or both.
If they have different values at one or more index
positions, let k be the smallest such index; then, the string
whose
value at position k has the smaller value, as determined by
using
the "&lt;" operator, lexicographically precedes the other
string.
If there is no index position at which they differ,
then the shorter string lexicographically precedes the longer
string.
</t>
<t>
In plain English, this means that property names are sorted in
ascending order like the following:
</t>
<artwork type="ascii-art" align="left" alt=""><![CDATA[
""
"a"
"aa"
"ab"
]]></artwork>
</li>
</ul>
<t>
The rationale for basing the sorting algorithm on UTF-16 code
units is that
it maps directly to the string type in ECMAScript (featured in web
browsers
and Node.js), Java, and .NET. In addition, JSON only supports
escape sequences
expressed as UTF-16 code units, making knowledge and handling of
such data
a necessity anyway.
Systems using another internal representation of string data will
need to convert
JSON property name strings into arrays of UTF-16 code units before
sorting.
The conversion from UTF-8 or UTF-32 to UTF-16 is defined by the
Unicode <xref target="UNICODE" format="default"/> standard.
</t>
<t>
The following JSON test data can be used for verifying the correctness of
the sorting scheme in a JCS implementation:
</t>
<sourcecode type="json">
{
  "\u20ac": "Euro Sign",
  "\r": "Carriage Return",
  "\ufb33": "Hebrew Letter Dalet With Dagesh",
  "1": "One",
  "\ud83d\ude00": "Emoji: Grinning Face",
  "\u0080": "Control",
  "\u00f6": "Latin Small Letter O With Diaeresis"
}
</sourcecode>
<t>
Expected order of the property values after sorting the property name strings:
</t>
<artwork type="ascii-art" align="left" alt=""><![CDATA[
"Carriage Return"
"One"
"Control"
"Latin Small Letter O With Diaeresis"
"Euro Sign"
"Emoji: Grinning Face"
"Hebrew Letter Dalet With Dagesh"
]]></artwork>
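<t>
The sorting rules and the test data above can be exercised with the
following sketch, assuming an ECMAScript runtime. The comparator
name compareCodeUnits is invented here; it spells out the comparison
that the default Array.prototype.sort() already performs on strings.
</t>
<sourcecode type="javascript"><![CDATA[
// Compare two strings as arrays of unsigned UTF-16 code units.
function compareCodeUnits(a, b) {
  const shared = Math.min(a.length, b.length);
  for (let k = 0; k < shared; k++) {
    const diff = a.charCodeAt(k) - b.charCodeAt(k);
    if (diff !== 0) {
      return diff;            // first differing index decides
    }
  }
  return a.length - b.length; // otherwise the shorter string wins
}

const sortingTestData = JSON.parse(String.raw`{
  "\u20ac": "Euro Sign",
  "\r": "Carriage Return",
  "\ufb33": "Hebrew Letter Dalet With Dagesh",
  "1": "One",
  "\ud83d\ude00": "Emoji: Grinning Face",
  "\u0080": "Control",
  "\u00f6": "Latin Small Letter O With Diaeresis"
}`);

// Sort the unescaped property names, then list their values.
const sortedValues = Object.keys(sortingTestData)
  .sort(compareCodeUnits)
  .map((name) => sortingTestData[name]);
]]></sourcecode>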
<t>
Note: For the purpose of obtaining a deterministic property order,
sorting of data encoded in UTF-8 or UTF-32 would also work, but
the outcome for JSON data like above would differ and thus be
incompatible with this specification.
However, in practice, property names are rarely defined outside of
7-bit ASCII, making it possible to sort string data in UTF-8 or
UTF-32 format without conversion to UTF-16 and still be compatible
with JCS. Whether or not this is a viable option depends on the
environment JCS is used in.
</t>
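<t>
The incompatibility mentioned in the note can be made concrete with
two of the property names from the test data. The sketch below is an
illustration only, assuming an ECMAScript runtime; code point order
is used as a stand-in for the order obtained when comparing UTF-8
bytes or UTF-32 units, and the variable names are invented.
</t>
<sourcecode type="javascript"><![CDATA[
const hebrewDalet = '\ufb33';         // one code unit: 0xFB33
const grinningFace = '\ud83d\ude00';  // U+1F600: 0xD83D 0xDE00

// UTF-16 code unit order, as required by JCS: 0xD83D < 0xFB33.
const utf16Order = [hebrewDalet, grinningFace].sort();

// Code point order: 0xFB33 < 0x1F600, so the result is reversed.
const codePointOrder = [hebrewDalet, grinningFace].sort(
  (a, b) => a.codePointAt(0) - b.codePointAt(0));

// Names restricted to 7-bit ASCII sort identically either way.
const asciiUtf16 = ['b', 'a', 'B'].sort();
const asciiCodePoint = ['b', 'a', 'B'].sort(
  (a, b) => a.codePointAt(0) - b.codePointAt(0));
]]></sourcecode>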
</section>
<section anchor="json.utf8" numbered="true" toc="default">
<name>UTF-8 Generation</name>
<t>
Finally, in order to create a platform-independent representation,
the result of the preceding step <bcp14>MUST</bcp14> be encoded in
UTF-8.
</t>
<t>
Applied to the sample in <xref target="json.sorting.properties"
format="default"/>, this
should yield the following bytes, here shown in hexadecimal
notation:
</t>
<artwork type="ascii-art" align="left" alt=""><![CDATA[
7b 22 6c 69 74 65 72 61 6c 73 22 3a 5b 6e 75 6c 6c 2c 74 72
75 65 2c 66 61 6c 73 65 5d 2c 22 6e 75 6d 62 65 72 73 22 3a
5b 33 33 33 33 33 33 33 33 33 2e 33 33 33 33 33 33 33 2c 31
65 2b 33 30 2c 34 2e 35 2c 30 2e 30 30 32 2c 31 65 2d 32 37
5d 2c 22 73 74 72 69 6e 67 22 3a 22 e2 82 ac 24 5c 75 30 30
30 66 5c 6e 41 27 42 5c 22 5c 5c 5c 5c 5c 22 2f 22 7d
]]></artwork>
<t>
This data is intended to be usable as input to cryptographic
methods.
</t>
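<t>
The encoding step can be sketched as follows, assuming a runtime
that provides the standard TextEncoder class (current web browsers
and Node.js do). The variable names are invented for the example,
and String.raw is used only to keep the backslashes of the canonical
text readable.
</t>
<sourcecode type="javascript"><![CDATA[
// The canonical text from the sorting step, without the line wrap.
const canonicalText =
  String.raw`{"literals":[null,true,false],"numbers":[333333333.` +
  String.raw`3333333,1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA` +
  String.raw`'B\"\\\\\"/"}`;

// TextEncoder always produces UTF-8.
const utf8Bytes = new TextEncoder().encode(canonicalText);

// Hexadecimal dump for comparison with the listing above.
const hexDump = Array.from(
  utf8Bytes, (byte) => byte.toString(16).padStart(2, '0')).join(' ');
]]></sourcecode>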
</section>
</section>
</section>
<section anchor="IANA" numbered="true" toc="default">
<name>IANA Considerations</name>
<t>
This document has no IANA actions.
</t>
</section>
<section anchor="Security" numbered="true" toc="default">
<name>Security Considerations</name>
<t>
It is crucial to perform sanity checks on input data to avoid
overflowing buffers and similar things that could affect the
integrity of the system.
</t>
<t>
When JCS is applied to signature schemes like the one described
in <xref target="impl.guidelines" format="default"/>,
applications <bcp14>MUST</bcp14> perform the following operations
before acting
upon received data:
</t>
<ol spacing="normal" type="1">
<li>
Parse the JSON data and verify that it adheres to I-JSON.
</li>
<li>
Verify the data for correctness according to the conventions defined
by the
ecosystem where it is to be used. This also includes locating the
property holding the signature data.
</li>
<li>
Verify the signature.
</li>
</ol>
<t>
If any of these steps fail, the operation in progress
<bcp14>MUST</bcp14> be aborted.
</t>
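<t>
The order of the operations above can be sketched as follows. This
is a hypothetical outline only: the function acceptSignedJson and
the three application-supplied checks (adheresToIJson,
followsConventions, and signatureIsValid) are invented for the
example and are not defined by JCS.
</t>
<sourcecode type="javascript"><![CDATA[
function acceptSignedJson(jsonText, checks) {
  // Step 1: parse and verify I-JSON adherence. JSON.parse() alone
  // does not report duplicate property names, so the raw text is
  // handed to the check as well.
  const data = JSON.parse(jsonText);
  if (!checks.adheresToIJson(jsonText, data)) {
    throw new Error('Input is not I-JSON');
  }
  // Step 2: verify ecosystem conventions, including the location
  // of the property holding the signature data.
  if (!checks.followsConventions(data)) {
    throw new Error('Input violates ecosystem conventions');
  }
  // Step 3: verify the signature.
  if (!checks.signatureIsValid(data)) {
    throw new Error('Signature verification failed');
  }
  // Only now may the application act upon the data.
  return data;
}
]]></sourcecode>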
</section>
</middle>
<back>
<displayreference target="I-D.rundgren-comparable-json" to="JSONCOMP"/>
<references>
<name>References</name>
<references>
<name>Normative References</name>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8259.xml"/>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7493.xml"/>
<reference anchor="ECMA-262"
target="https://www.ecma-international.org/ecma-262/10.0/index.html">
<front>
<title>ECMAScript 2019 Language Specification</title>
<author>
<organization>ECMA International</organization>
</author>
<date year="2019" month="June"/>
</front>
<refcontent>Standard ECMA-262 10th Edition</refcontent>
</reference>
<reference anchor="IEEE754"
target="https://ieeexplore.ieee.org/document/8766229">
<front>
<title>IEEE Standard for Floating-Point Arithmetic</title>
<seriesInfo name="DOI" value="10.1109/IEEESTD.2019.8766229"/>
<author>
<organization>IEEE</organization>
</author>
</front>
<refcontent>IEEE 754-2019</refcontent>
</reference>
<reference anchor="UNICODE"
target="https://www.unicode.org/versions/latest/">
<front>
<title>The Unicode Standard</title>
<author>
<organization>The Unicode Consortium</organization>
</author>
</front>
</reference>
<reference anchor="UCNORM"
target="https://www.unicode.org/reports/tr15/">
<front>
<title>Unicode Normalization Forms</title>
<author>
<organization>The Unicode Consortium</organization>
</author>
</front>
</reference>
</references>
<references>
<name>Informative References</name>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7638.xml"/>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml"/>
<xi:include
href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7515.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.rundgren-comparable-json.xml"/>
<reference anchor="V8" target="https://v8.dev/">
<front>
<title>What is V8?</title>
<author>
<organization>Google LLC</organization>
</author>
</front>
</reference>
<reference anchor="RYU" target="https://github.com/ulfjack/ryu">
<front>
<title>Ryu floating point number serializing algorithm</title>
<author></author>
<date month="May" year="2020"/>
</front>
<refcontent>commit 27d3c55</refcontent>
</reference>
<reference anchor="NODEJS" target="https://nodejs.org">
<front>
<title>Node.js</title>
<author>
<organization>OpenJS Foundation
</organization>
</author>
</front>
</reference>
<reference anchor="KEYBASE"
target="https://keybase.io/docs/api/1.0/canonical_packings">
<front>
<title>Canonical Packings for JSON and Msgpack</title>
<author>
<organization>Keybase
</organization>
</author>
</front>
</reference>
<reference anchor="OPENAPI" target="https://www.openapis.org/">
<front>
<title>The OpenAPI Specification: a broadly adopted industry
standard for describing modern APIs</title>
<author>
<organization>OpenAPI Initiative
</organization>
</author>
</front>
</reference>
<reference anchor="XMLDSIG"
target="https://www.w3.org/TR/xmldsig-core1/">
<front>
<title>XML Signature Syntax and Processing Version 1.1</title>
<author>
<organization>W3C</organization>
</author>
<date month="April" year="2013"/>
</front>
<refcontent>W3C Recommendation</refcontent>
</reference>
</references>
</references>
<section anchor="canonicalize.js" numbered="true" toc="default">
<name>ECMAScript Sample Canonicalizer</name>
<t>
Below is an example of a JCS canonicalizer for usage with
ECMAScript-based systems:
</t>
<sourcecode type="javascript">
////////////////////////////////////////////////////////////
// Since the primary purpose of this code is highlighting //
// the core of the JCS algorithm, error handling and      //
// UTF-8 generation were not implemented.                 //
////////////////////////////////////////////////////////////
var canonicalize = function(object) {

    var buffer = '';
    serialize(object);
    return buffer;

    function serialize(object) {
        if (object === null || typeof object !== 'object' ||
            object.toJSON != null) {
            /////////////////////////////////////////////////
            // Primitive type or toJSON, use "JSON"        //
            /////////////////////////////////////////////////
            buffer += JSON.stringify(object);

        } else if (Array.isArray(object)) {
            /////////////////////////////////////////////////
            // Array - Maintain element order              //
            /////////////////////////////////////////////////
            buffer += '[';
            let next = false;
            object.forEach((element) => {
                if (next) {
                    buffer += ',';
                }
                next = true;
                /////////////////////////////////////////
                // Array element - Recursive expansion //
                /////////////////////////////////////////
                serialize(element);
            });
            buffer += ']';

        } else {
            /////////////////////////////////////////////////
            // Object - Sort properties before serializing //
            /////////////////////////////////////////////////
            buffer += '{';
            let next = false;
            Object.keys(object).sort().forEach((property) => {
                if (next) {
                    buffer += ',';
                }
                next = true;
                /////////////////////////////////////////////
                // Property names are strings, use "JSON"  //
                /////////////////////////////////////////////
                buffer += JSON.stringify(property);
                buffer += ':';
                //////////////////////////////////////////
                // Property value - Recursive expansion //
                //////////////////////////////////////////
                serialize(object[property]);
            });
            buffer += '}';
        }
    }
};
</sourcecode>
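<t>
The sample above deliberately omits error handling and UTF-8
generation. One possible way of adding both is sketched below. It is
not part of the sample: the names LONE_SURROGATE, serializeChecked,
and canonicalizeToUtf8 are invented here, the input is assumed to
come from JSON.parse() (so toJSON methods, functions, and undefined
values are not handled), and the standard TextEncoder class is
assumed to be available.
</t>
<sourcecode type="javascript"><![CDATA[
// Matches a high surrogate without a following low surrogate, or
// a low surrogate without a preceding high surrogate.
const LONE_SURROGATE =
  /[\ud800-\udbff](?![\udc00-\udfff])|(?<![\ud800-\udbff])[\udc00-\udfff]/;

function serializeChecked(value) {
  if (typeof value === 'number' && !Number.isFinite(value)) {
    throw new Error('NaN and Infinity are not permitted');
  }
  if (typeof value === 'string' && LONE_SURROGATE.test(value)) {
    throw new Error('Lone surrogate in string data');
  }
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);          // primitive
  }
  if (Array.isArray(value)) {              // keep element order
    return '[' + value.map((item) => serializeChecked(item))
      .join(',') + ']';
  }
  // Object: sort names by UTF-16 code units, then recurse.
  const members = Object.keys(value).sort().map(
    (name) => serializeChecked(name) + ':' +
              serializeChecked(value[name]));
  return '{' + members.join(',') + '}';
}

function canonicalizeToUtf8(value) {
  return new TextEncoder().encode(serializeChecked(value));
}
]]></sourcecode>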
</section>
<section anchor="json.ieee754.test" numbered="true" toc="default">
<name>Number Serialization Samples</name>
<t>
The following table holds a set of ECMAScript-compatible number
serialization samples,
including some edge cases. The column
"IEEE 754" refers to the internal
ECMAScript representation of the "Number" data type, which is based on
the
IEEE 754 <xref target="IEEE754" format="default"/> standard using
64-bit (double-precision) values,
here expressed in hexadecimal.
</t>
<table anchor="ECMAScript-TABLE">
<name>ECMAScript-Compatible JSON Number Serialization Samples</name>
<thead>
<tr>
<th align="center">IEEE 754</th>
<th align="center">JSON Representation</th>
<th align="center">Comment</th>
</tr> </thead> <tbody>
<tr> <td><!--remove for
TXT/PDF--><tt>0000000000000000</tt><!--remove for TXT/PDF--></td>
<td><!--remove for TXT/PDF--><tt>0<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Zero<!--remove for TXT/PDF--></tt></td> </tr>
<tr>
<td><!--remove for TXT/PDF--><tt>8000000000000000<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>0<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Minus zero<!--remove for
TXT/PDF--></tt></td> </tr>
<tr> <td><!--remove for
TXT/PDF--><tt>0000000000000001<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>5e-324<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Min pos number<!--remove for
TXT/PDF--></tt></td> </tr>
<tr> <td><!--remove for
TXT/PDF--><tt>8000000000000001<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>-5e-324<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Min neg number<!--remove for
TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>7fefffffffffffff<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>1.7976931348623157e+308<!--remove for
TXT/PDF--></tt></td> <td><!--remove for TXT/PDF--><tt>Max pos number<!--remove
for TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>ffefffffffffffff<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>-1.7976931348623157e+308<!--remove for
TXT/PDF--></tt></td> <td><!--remove for TXT/PDF--><tt>Max neg number<!--remove
for TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>4340000000000000<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>9007199254740992<!--remove for
TXT/PDF--></tt></td> <td><!--remove for TXT/PDF--><tt>Max pos
int&#xa0;&#xa0;&#xa0;&#xa0;(1)<!--remove for TXT/PDF--></tt></td> </tr> <tr>
<td><!--remove for TXT/PDF--><tt>c340000000000000<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>-9007199254740992<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Max neg
int&#xa0;&#xa0;&#xa0;&#xa0;(1)<!--remove for TXT/PDF--></tt></td> </tr> <tr>
<td><!--remove for TXT/PDF--><tt>4430000000000000<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>295147905179352830000<!--remove for TXT/PDF--></tt></td>
<td><!--remove for
TXT/PDF--><tt>~2**68&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;(2)<!--remove
for TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>7fffffffffffffff<!--remove for TXT/PDF--></tt></td>
<td></td>
<td><!--remove for
TXT/PDF--><tt>NaN&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;(3)<!--remove
for TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>7ff0000000000000<!--remove for TXT/PDF--></tt></td>
<td></td>
<td><!--remove for
TXT/PDF--><tt>Infinity&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;(3)<!--remove
for TXT/PDF--></tt></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>44b52d02c7e14af5<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>9.999999999999997e+22<!--remove for
TXT/PDF--></tt></td> <td></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>44b52d02c7e14af6<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>1e+23<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>44b52d02c7e14af7<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>1.0000000000000001e+23<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>444b1ae4d6e2ef4e<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>999999999999999700000<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>444b1ae4d6e2ef4f<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>999999999999999900000<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>444b1ae4d6e2ef50<!--remove for
TXT/PDF--></tt></td> <td><!--remove for TXT/PDF--><tt>1e+21<!--remove for
TXT/PDF--></tt></td> <td></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>3eb0c6f7a0b5ed8c<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>9.999999999999997e-7<!--remove for
TXT/PDF--></tt></td> <td></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>3eb0c6f7a0b5ed8d<!--remove for TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>0.000001<!--remove for TXT/PDF--></tt></td>
<td>
</td>
</tr>
<tr>
<td><!--remove for TXT/PDF--><tt>41b3de4355555553<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>333333333.3333332<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>41b3de4355555554<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>333333333.33333325<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>41b3de4355555555<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>333333333.3333333<!--remove for TXT/PDF--></tt></td>
<td></td> </tr>
<tr> <td><!--remove for TXT/PDF--><tt>41b3de4355555556<!--remove for
TXT/PDF--></tt></td> <td><!--remove for
TXT/PDF--><tt>333333333.3333334<!--remove for
TXT/PDF--></tt></td><td></td> </tr> <tr> <td><!--remove for
TXT/PDF--><tt>41b3de4355555557<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>333333333.33333343<!--remove for
TXT/PDF--></tt></td>
<td></td>
</tr>
<tr>
<td><!--remove for TXT/PDF--><tt>becbf647612f3696<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>-0.0000033333333333333333<!--remove for
TXT/PDF--></tt></td>
<td></td>
</tr>
<tr>
<td><!--remove for TXT/PDF--><tt>43143ff3c1cb0959<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>1424953923781206.2<!--remove for
TXT/PDF--></tt></td>
<td><!--remove for TXT/PDF--><tt>Round to even &#xa0;(4)<!--remove for
TXT/PDF--></tt></td>
</tr>
</tbody>
</table>
<t>
Notes:
</t>
<ol spacing="normal" type="(%d)">
<li>
For maximum compliance with the ECMAScript "JSON" object,
values that are to be interpreted as true integers
<bcp14>SHOULD</bcp14> be in the range -9007199254740991 to
9007199254740991.
However, how numbers are used in applications does not affect the
JCS algorithm.
</li>
<li>
Although a set of specific integers like 2**68 could be regarded as
having
extended precision, the JCS/ECMAScript number serialization
algorithm does not take this into consideration.
</li>
<li>
Values out of range are not permitted in JSON.
See <xref target="json.ser.number" format="default"/>.
</li>
<li>
This number is exactly 1424953923781206.25 but will, after applying
the "Note 2" rule mentioned in <xref target="json.ser.number"
format="default"/>, be truncated and rounded to the closest even value.
</li>
</ol>
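<t>
As a hedged illustration of Notes 1 and 2, the sketch below uses only
standard ECMAScript built-ins ("Number.isSafeInteger()", "JSON.parse()",
and "String()"). The helper name "isInteroperableInteger" is
hypothetical and not part of JCS:
</t>
<sourcecode type="javascript">
// Hypothetical helper: true if v lies in the integer range of Note 1
function isInteroperableInteger(v) {
  return Number.isSafeInteger(v);
}

const inRange = isInteroperableInteger(9007199254740991);

// 2**53 + 1 has no exact IEEE 754 double representation,
// so the parser silently rounds it to a neighboring value
const rounded = JSON.parse('9007199254740993');
const outOfRange = isInteroperableInteger(rounded);

// Note 2: 2**68 is an exact double, but serialization still
// emits only the shortest digits that round-trip
const bigPower = String(2 ** 68);

console.log(inRange, outOfRange, String(rounded), bigPower);
</sourcecode>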
<t>
For a more exhaustive validation of a JCS number serializer, you may
test against a file (currently) available in the development portal
(see <xref target="json.development" format="default"/>) containing a
large set of sample values. Another option is running V8 <xref
target="V8" format="default"/> as a live reference together with a
program generating a substantial amount of random IEEE 754 values.
</t>
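<t>
A minimal sketch of such a validation step is shown below. It assumes
an ECMAScript platform with "DataView" and "BigInt" support and checks
a few rows copied from the table above. The names "hexToDouble" and
"samples" are illustrative only:
</t>
<sourcecode type="javascript">
// Turn a 16-digit hex string (IEEE 754 bit pattern) into a Number
function hexToDouble(hex) {
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, BigInt('0x' + hex)); // big-endian by default
  return view.getFloat64(0);
}

// [IEEE 754 hex, expected JCS serialization] pairs from the table
const samples = [
  ['44b52d02c7e14af7', '1.0000000000000001e+23'],
  ['444b1ae4d6e2ef4e', '999999999999999700000'],
  ['444b1ae4d6e2ef50', '1e+21'],
  ['3eb0c6f7a0b5ed8c', '9.999999999999997e-7'],
  ['3eb0c6f7a0b5ed8d', '0.000001'],
  ['41b3de4355555555', '333333333.3333333'],
  ['43143ff3c1cb0959', '1424953923781206.2'],
];

// Collect every row whose serialization deviates from the table
const mismatches = samples.filter(
  ([hex, expected]) => String(hexToDouble(hex)) !== expected
);

console.log(mismatches.length === 0 ? 'all samples match' : mismatches);
</sourcecode>
<t>
On a non-ECMAScript platform, the same table rows can serve as input
to the serializer under test, replacing "String()" with that
serializer.
</t>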
</section>
<section anchor="json.wireformat" numbered="true" toc="default">
<name>Canonicalized JSON as "Wire Format"</name>
<t>
Since the result from the canonicalization process (see <xref
target="json.utf8" format="default"/>) is fully valid JSON, it can
also be used as "Wire Format". However, this is just an option since
cryptographic schemes based on JCS, in most cases, would not depend on
that externally supplied JSON data already being canonicalized.
</t>
<t>
In fact, the ECMAScript standard way of serializing objects using
"JSON.stringify()" produces a
more "logical" format, where properties are
kept in the order they were created or received. The
example below shows an address record that could benefit from
ECMAScript standard serialization:
</t>
<sourcecode type="json">
{
"name": "John Doe",
"address": "2000 Sunset Boulevard",
"city": "Los Angeles",
"zip": "90001",
"state": "CA"
}
</sourcecode>
<t>
Using canonicalization, the properties above would be output in the
order
"address", "city", "name", "state", and "zip", which adds fuzziness
to the data from a human (developer or technical support) perspective.
Canonicalization also converts JSON data into a single line of text,
which may
be less than ideal for debugging and logging.
</t>
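<t>
The difference in property order can be reproduced with the following
sketch, which assumes nothing beyond standard ECMAScript. The default
"Array.prototype.sort()" compares strings by UTF-16 code units, which
is the ordering JCS uses for property names:
</t>
<sourcecode type="javascript">
const record = JSON.parse(
  '{"name":"John Doe","address":"2000 Sunset Boulevard",' +
  '"city":"Los Angeles","zip":"90001","state":"CA"}'
);

// Order in which the properties were received
const receivedOrder = Object.keys(record);

// Order in which a canonicalizer would emit them
const canonicalOrder = [...receivedOrder].sort();

console.log(receivedOrder.join(','));
console.log(canonicalOrder.join(','));
</sourcecode>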
</section>
<section anchor="json.bignumbers" numbered="true" toc="default">
<name>Dealing with Big Numbers</name>
<t>
There are several issues associated with the
JSON number type, here illustrated by the following
sample object:
</t>
<sourcecode type="json">
{
"giantNumber": 1.4e+9999,
"payMeThis": 26000.33,
"int64Max": 9223372036854775807
}
</sourcecode>
<t>
Although the sample above conforms to JSON <xref target="RFC8259"
format="default"/>,
applications would normally use different native data types for
storing
"giantNumber" and "int64Max". In addition, monetary data like
"payMeThis" would
presumably not rely on floating-point data types due to rounding
issues with respect
to decimal arithmetic.
</t>
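<t>
To make the issues concrete, the following sketch feeds the sample
object to the ECMAScript "JSON" object. It assumes a standard
ECMAScript platform where all JSON numbers are parsed as IEEE 754
double-precision values:
</t>
<sourcecode type="javascript">
const sample = JSON.parse(
  '{"giantNumber":1.4e+9999,"payMeThis":26000.33,' +
  '"int64Max":9223372036854775807}'
);

// Out of range for a double: parsed as Infinity
const giant = sample.giantNumber;

// Not exactly representable: the low-order digits are lost
const int64Text = String(sample.int64Max);

// Infinity has no JSON representation and is written as null
const reserialized = JSON.stringify(sample);

// Binary floating point cannot represent most decimal fractions
const sumIsExact = (0.1 + 0.2 === 0.3);

console.log(giant, int64Text, reserialized, sumIsExact);
</sourcecode>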
<t>
The established way of handling this kind of "overloading" of the
JSON number type (at least in an extensible manner) is through
mapping mechanisms, instructing parsers what to do with different
properties
based on their name. However, this greatly limits the value of using
the
JSON number type outside of its original, somewhat constrained
JavaScript context.
The ECMAScript "JSON" object does not support mappings to the JSON
number type either.
</t>
<t>
Due to the above, numbers that do not have a natural place in the
current
JSON ecosystem <bcp14>MUST</bcp14> be wrapped using the JSON string
type. This is close to
a de facto standard for open systems. This is also applicable for
other data types that do not have direct support in JSON, like
"DateTime"
objects as described in <xref target="string.subtypes"
format="default"/>.
</t>
<t>
Aided by a system using the JSON string type, be it programmatic like
</t>
<sourcecode type="">
var obj = JSON.parse('{"giantNumber": "1.4e+9999"}');
var biggie = new BigNumber(obj.giantNumber);
</sourcecode>
<t>
or declarative schemes like OpenAPI <xref target="OPENAPI"
format="default"/>,
JCS imposes no limits on applications, including when using
ECMAScript.
</t>
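<t>
As a further sketch, and assuming a platform with native "BigInt"
support, a 64-bit integer wrapped in a JSON string survives a full
round trip without loss. The property name "int64Max" is taken from the
sample object above; everything else is illustrative:
</t>
<sourcecode type="javascript">
const wire = '{"int64Max":"9223372036854775807"}';

// Parsing leaves the string untouched
const obj = JSON.parse(wire);

// Conversion to the native type happens in a separate step
const int64Max = BigInt(obj.int64Max);
const next = int64Max + 1n;

// Converting back to a string restores the original JSON text
const reserialized = JSON.stringify({ int64Max: int64Max.toString() });

console.log(reserialized === wire, next.toString());
</sourcecode>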
</section>
<section anchor="string.subtypes" numbered="true" toc="default">
<name>String Subtype Handling</name>
<t>
Due to the limited set of data types featured in JSON, the JSON string
type is commonly used for holding subtypes. This can, depending on
JSON parsing method, lead to interoperability problems, which
<bcp14>MUST</bcp14> be dealt with by JCS-compliant applications
targeting a wider audience.
</t>
<t>
Assume you want to parse a JSON object where the schema
designer assigned the property "big" for holding a "BigInt" subtype
and
"time" for holding a "DateTime" subtype, while "val" is supposed to be
a JSON number
compliant with JCS. The following example shows such an object:
</t>
<sourcecode type="json">
{
"time": "2019-01-28T07:45:10Z",
"big": "055",
"val": 3.5
}
</sourcecode>
<t>Parsing of this object can be accomplished by the following
ECMAScript statement:</t>
<sourcecode type="">
var object = JSON.parse(JSON_object_featured_as_a_string);
</sourcecode>
<t>After parsing, the actual data can be extracted, which for subtypes,
also involves a conversion step using the result of the parsing process
(an ECMAScript object) as input:</t>
<sourcecode type="">
... = new Date(object.time); // Date object
... = BigInt(object.big); // Big integer
... = object.val; // JSON/JS number
</sourcecode>
<t>
Note that the "BigInt" data type is currently only natively supported
by V8 <xref target="V8" format="default"/>.
</t>
<t>
Canonicalization of "object" using the sample code in <xref
target="canonicalize.js" format="default"/> would return the
following string:
</t>
<sourcecode type="json">
{"big":"055","time":"2019-01-28T07:45:10Z","val":3.5}
</sourcecode>
<t>
Although this is (with respect to JCS) technically correct, there is
another way of parsing JSON data, which also can be used with
ECMAScript as shown below:
</t>
<sourcecode type="">
// "BigInt" requires the following code to become JSON serializable
BigInt.prototype.toJSON = function() {
return this.toString();
};
// JSON parsing using a "stream"-based method
var object = JSON.parse(JSON_object_featured_as_a_string,
(k,v) => k == 'time' ? new Date(v) : k == 'big' ? BigInt(v) : v
);
</sourcecode>
<t>
If you now apply the canonicalizer in <xref target="canonicalize.js"
format="default"/> to "object", the following string would be
generated:
</t>
<sourcecode type="json">
{"big":"55","time":"2019-01-28T07:45:10.000Z","val":3.5}
</sourcecode>
<t>
In this case, the string arguments for "big" and "time" have changed
with respect to the original,
presumably making an application depending on JCS fail.
</t>
<t>
The reason for the deviation is that in stream- and schema-based JSON
parsers,
the original string argument is typically replaced on the fly
by the native subtype that, when serialized, may exhibit a different
and platform-dependent pattern.
</t>
<t>
That is, stream- and schema-based parsing <bcp14>MUST</bcp14> treat
subtypes as "pure" (immutable) JSON string types and perform the
actual conversion to the designated native type in a subsequent step.
In modern programming platforms like Go, Java, and C#, this can be
achieved with moderate effort by combining annotations, getters, and
setters. Below is an example in C#/Json.NET showing a part of a class
that is serializable as a JSON object:
</t>
<sourcecode type="">
// The "pure" string solution uses a local
// string variable for JSON serialization while
// exposing another type to the application
[JsonProperty("amount")]
private string _amount;
[JsonIgnore]
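public decimal Amount {
// The invariant culture keeps "." as the decimal separator
// regardless of the locale the application runs under
get { return decimal.Parse(_amount,
    System.Globalization.CultureInfo.InvariantCulture); }
set { _amount = value.ToString(
    System.Globalization.CultureInfo.InvariantCulture); }
}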
</sourcecode>
<t>
In an application, "Amount" can be accessed as any other property
while it is actually represented by a quoted string in JSON contexts.
</t>
<t>
Note: The example above also addresses the constraints on numeric data
implied by I-JSON (the C# "decimal" data type has quite different
characteristics compared to IEEE 754 double precision).
</t>
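<t>
The same "pure" string approach can be sketched in ECMAScript. The
class name "TypedRecord" and its getters are hypothetical; the
property names come from the sample object earlier in this section.
The strings are stored exactly as received, and native types are only
produced on demand:
</t>
<sourcecode type="javascript">
class TypedRecord {
  constructor(json) {
    const parsed = JSON.parse(json);
    this.time = parsed.time; // kept as the original string
    this.big = parsed.big;   // kept as the original string
    this.val = parsed.val;   // ordinary JSON number
  }

  // Conversions are performed in a subsequent step, on request
  get timeAsDate() { return new Date(this.time); }
  get bigAsBigInt() { return BigInt(this.big); }
}

const original =
  '{"time":"2019-01-28T07:45:10Z","big":"055","val":3.5}';
const record = new TypedRecord(original);

// Getters live on the prototype, so only the three string/number
// properties are serialized and the subtype strings stay intact
const reserialized = JSON.stringify(record);

console.log(reserialized === original);
</sourcecode>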
<section anchor="string.subtypes.arrays" numbered="true" toc="default">
<name>Subtypes in Arrays</name>
<t>
Since the JSON array construct permits mixing arbitrary JSON data
types,
custom parsing and serialization code may be required
to cope with subtypes anyway.
</t>
</section>
</section>
<section anchor="impl.guidelines" numbered="true" toc="default">
<name>Implementation Guidelines</name>
<t>
The optimal solution is integrating support for JCS directly
in JSON serializers (parsers need no changes).
That is, canonicalization would just be an additional "mode"
for a JSON serializer. However, this is currently not the case.
Fortunately, JCS support can be introduced through externally supplied
canonicalizer software acting as a post processor to existing
JSON serializers. This arrangement also relieves the JCS implementer
from
having to deal with how underlying data is to be represented in JSON.
</t>
<t>
The post processor concept enables signature creation schemes like the
following:
</t>
<ol spacing="normal" type="1">
<li>
Create the data to be signed.
</li>
<li>
Serialize the data using existing JSON tools.
</li>
<li>
Let the external canonicalizer process the serialized data and
return canonicalized result data.
</li>
<li>
Sign the canonicalized data.
</li>
<li>
Add the resulting signature value to the original JSON data
through a designated signature property.
</li>
<li>
Serialize the completed (now signed) JSON object using existing
JSON tools.
</li>
</ol>
<t>
A compatible signature verification scheme would then be as follows:
</t>
<ol spacing="normal" type="1">
<li>
Parse the signed JSON data using existing JSON tools.
</li>
<li>
Read and save the signature value from the designated signature
property.
</li>
<li>
Remove the signature property from the parsed JSON object.
</li>
<li>
Serialize the remaining JSON data using existing JSON tools.
</li>
<li>
Let the external canonicalizer process the serialized data and
return canonicalized result data.
</li>
<li>
Verify that the canonicalized data matches the saved signature
value
using the algorithm and key used for creating the signature.
</li>
</ol>
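<t>
The two schemes above can be sketched as follows. This is not a
definitive implementation: it assumes Node.js and its built-in
"crypto" module, uses HMAC-SHA256 as a stand-in for whatever signature
algorithm an application selects, and uses a property named
"signature" purely for illustration. The "canonicalize" function is a
minimal stand-in for an external canonicalizer and assumes its input
originates from "JSON.parse()":
</t>
<sourcecode type="javascript">
const crypto = require('crypto');

// Minimal canonicalizer sketch: sorted property names, no whitespace,
// ECMAScript serialization of primitives
function canonicalize(value) {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return '[' + value.map((v) => canonicalize(v)).join(',') + ']';
  }
  return '{' + Object.keys(value).sort().map(
    (k) => JSON.stringify(k) + ':' + canonicalize(value[k])
  ).join(',') + '}';
}

// Stand-in signature primitive (an assumption, not mandated by JCS)
function hmac(text, key) {
  return crypto.createHmac('sha256', key)
    .update(text, 'utf8').digest('base64');
}

// Signature creation, steps 2 to 6
function sign(data, key) {
  const parsed = JSON.parse(JSON.stringify(data));
  const signature = hmac(canonicalize(parsed), key);
  return JSON.stringify({ ...parsed, signature });
}

// Signature verification, steps 1 to 6
function verify(signedJson, key) {
  const { signature, ...rest } = JSON.parse(signedJson);
  const canonical = canonicalize(JSON.parse(JSON.stringify(rest)));
  const expected = Buffer.from(hmac(canonical, key), 'base64');
  const actual = Buffer.from(String(signature), 'base64');
  if (actual.length !== expected.length) return false;
  return crypto.timingSafeEqual(actual, expected);
}

const signed = sign({ val: 3.5, big: '055' }, 'demo-key');
console.log(verify(signed, 'demo-key'));
</sourcecode>
<t>
Because verification canonicalizes the received data again, the
property order and whitespace of the transmitted JSON do not affect
the outcome.
</t>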
<t>
A canonicalizer like the one above is effectively only a "filter", potentially
usable with
a multitude of quite different cryptographic schemes.
</t>
<t>
Using a JSON serializer with integrated JCS support, the serialization
performed
before the canonicalization step could be eliminated for both
processes.
</t>
</section>
<section anchor="open.source" numbered="true" toc="default">
<name>Open-Source Implementations</name>
<t>
The following open-source implementations have been verified to be
compatible with JCS:
</t>
<ul spacing="normal">
<li>
JavaScript: <eref
target="https://www.npmjs.com/package/canonicalize"
brackets="angle"/>
</li>
<li>
Java: <eref
target="https://github.com/erdtman/java-json-canonicalization"
brackets="angle"/>
</li>
<li>
Go: <eref
target="https://github.com/cyberphone/json-canonicalization/tree/master/go"
brackets="angle"/>
</li>
<li>
.NET/C#: <eref
target="https://github.com/cyberphone/json-canonicalization/tree/master/dotnet"
brackets="angle"/>
</li>
<li>
Python: <eref
target="https://github.com/cyberphone/json-canonicalization/tree/master/python3"
brackets="angle"/>
</li>
</ul>
</section>
<section anchor="json.otherefforts" numbered="true" toc="default">
<name>Other JSON Canonicalization Efforts</name>
<t>
There are (and have been) other efforts to create "Canonical JSON".
Below is a list of URLs to some of them:
</t>
<ul spacing="normal">
<li>
<eref
target="https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00"
brackets="angle"/>
</li>
<li>
<eref target="https://gibson042.github.io/canonicaljson-spec/"
brackets="angle"/>
</li>
<li>
<eref target="http://wiki.laptop.org/go/Canonical_JSON"
brackets="angle"/>
</li>
</ul>
<t>
The listed efforts all build on text-level JSON-to-JSON
transformations. The primary feature of text-level canonicalization is
that it can be made neutral to the flavor of JSON used. However, such
schemes also imply major changes to the JSON parsing process, which is
a likely hurdle for adoption. Albeit at the expense of certain JSON
and application constraints, JCS was designed to be compatible with
existing JSON tools.
</t>
</section>
<section anchor="json.development" numbered="true" toc="default">
<name>Development Portal</name>
<t>
The JCS specification is currently developed at:
<eref target="https://github.com/cyberphone/ietf-json-canon"
brackets="angle"/>.
</t>
<t>
JCS source code and extensive test data are available at:
<eref target="https://github.com/cyberphone/json-canonicalization"
brackets="angle"/>.
</t>
</section>
<section anchor="Acknowledgements" numbered="false" toc="default">
<name>Acknowledgements</name>
<t>
Building on ECMAScript number serialization was
originally proposed by <contact fullname="James Manger"/>. This
ultimately led to the
adoption of the entire ECMAScript serialization scheme for JSON
primitives.
</t>
<t>
Other people who have contributed valuable input to this
specification include
<contact fullname="Scott Ananian"/>,
<contact fullname="Tim Bray"/>,
<contact fullname="Ben Campbell"/>,
<contact fullname="Adrian Farell"/>,
<contact fullname="Richard Gibson"/>,
<contact fullname="Bron Gondwana"/>,
<contact fullname="John-Mark Gurney"/>,
<contact fullname="Mike Jones"/>,
<contact fullname="John Levine"/>,
<contact fullname="Mark Miller"/>,
<contact fullname="Matthew Miller"/>,
<contact fullname="Mark Nottingham"/>,
<contact fullname="Mike Samuel"/>,
<contact fullname="Jim Schaad"/>,
<contact fullname="Robert Tupelo-Schneck"/>,
and <contact fullname="Michal Wadas"/>.
</t>
<t>
For carrying out real-world concept verification, the software and
support for number serialization provided by
<contact fullname="Ulf Adams"/>,
<contact fullname="Tanner Gooding"/>,
and <contact fullname="Remy Oudompheng"/>
was very helpful.
</t>
</section>
</back>
</rfc>