RDF 1.2 N-Quads

A line-based syntax for RDF datasets

W3C First Public Working Draft

This version:
https://www.w3.org/TR/2023/WD-rdf12-n-quads-20230504/
Latest published version:
https://www.w3.org/TR/rdf12-n-quads/
Latest editor's draft:
https://w3c.github.io/rdf-n-quads/spec/
History:
https://www.w3.org/standards/history/rdf12-n-quads
Commit history
Test suite:
https://w3c.github.io/rdf-tests/nquads/
Latest Recommendation:
https://www.w3.org/TR/n-quads
Editors:
Gregg Kellogg
Dominik Tomaszuk
Former editor:
Gavin Carothers
Feedback:
GitHub w3c/rdf-n-quads (pull requests, new issue, open issues)
public-rdf-star-wg@w3.org with subject line [rdf12-n-quads] … message topic … (archives)

Abstract

N-Quads is a line-based, plain text format for encoding an RDF dataset.

Status of This Document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document is part of the RDF 1.2 document suite. The N-Quads format is a line-based RDF syntax, which is an extension of N-Triples [RDF12-N-TRIPLES]. The main distinction is that N-Quads allows the encoding of multiple graphs in a single document representing an RDF Dataset.

This document was published by the RDF-star Working Group as a First Public Working Draft using the Recommendation track.

Publication as a First Public Working Draft does not imply endorsement by W3C and its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

This section is non-normative.

This document defines N-Quads, a concrete syntax for RDF [RDF12-CONCEPTS], and an extension of N-Triples [RDF12-N-TRIPLES]. N-Quads is an easy to parse, line-based, concrete syntax for RDF Datasets [RDF12-CONCEPTS].

As with N-Triples, an N-Quads document contains no parsing directives.

An N-Quads statement, also known as a quad, is a sequence of RDF terms representing the subject, predicate, and object of an RDF triple, followed by an optional graph name identifying the named graph associated with the triple within an RDF dataset. The terms may be separated by white space (spaces U+0020 or tabs U+0009). The sequence is terminated by a '.' (optionally followed by white space and/or a comment), and a new line (optional at the end of a document).

Example 1: Use of comments in N-Quads

<http://one.example/subject1> <http://one.example/predicate1> <http://one.example/object1> <http://example.org/graph3> . # comments here
# or on a line by themselves
_:subject1 <http://an.example/predicate1> "object1" <http://example.org/graph1> .
_:subject2 <http://an.example/predicate2> "object2" <http://example.org/graph5> .

The RDF dataset represented by an N-Quads document contains exactly each quad matching the N-Quads statement production.

2. N-Quads Language

This section is non-normative.

An N-Quads document allows writing down an RDF dataset in a textual form. It consists of simple statements, each made up of a subject, predicate, object, and an optional graph name, along with optional blank lines. Comments may be given after a '#' that is not part of another lexical token and continue to the end of the line.

2.1 Simple Statements

A simple statement extends the definition of simple triple in [RDF12-N-TRIPLES] with an optional named graph.

The simplest statement is a sequence of (subject, predicate, object) terms forming an RDF triple and an optional graph name (a blank node identifier or IRI) labeling what named graph in a dataset the triple belongs to. White space (spaces U+0020 or tabs U+0009) may surround terms, except where significant as noted in the grammar.

Comments are treated as white space, and may be given after a '#' that is not part of another lexical token and continue to the end of the line.

The graph name can be omitted, in which case the triples are considered part of the default graph of the RDF dataset.

Example 2: Simple Statement

<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> <http://example.org/graphs/spiderman> .
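
The structure of a simple statement can be sketched in Python as follows. This is a minimal illustration, not a conforming parser: the `TERM` pattern and the `split_statement` function are hypothetical names, the pattern is deliberately looser than the grammar in section 5, and the input is assumed to have had any comment removed already.

```python
import re

# Illustrative token pattern; deliberately looser than the normative grammar.
TERM = re.compile(
    r'<[^<>\s]*>'                          # IRI reference
    r'|_:\S+'                              # blank node label (approximate)
    r'|"(?:[^"\\]|\\.)*"'                  # quoted string ...
    r'(?:\^\^<[^<>\s]*>|@[A-Za-z0-9-]+)?'  # ... with optional datatype or language tag
)

def split_statement(line):
    """Split one statement into (subject, predicate, object, graph).

    graph is None when the graph name is omitted, meaning the triple
    belongs to the default graph.
    """
    body = line.strip(" \t")
    if not body.endswith("."):
        raise ValueError("statement must be terminated by '.'")
    terms = TERM.findall(body[:-1])
    if len(terms) not in (3, 4):
        raise ValueError("expected three terms plus an optional graph name")
    subject, predicate, obj = terms[:3]
    graph = terms[3] if len(terms) == 4 else None
    return subject, predicate, obj, graph
```

Applied to the statement in Example 2, the fourth element of the result is `<http://example.org/graphs/spiderman>`; for a statement without a graph name it is `None`.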

2.2 IRIs

As in N-Triples, IRIs may be written only as absolute IRIs. IRIs are enclosed in '<' and '>' and may contain numeric escape sequences (described below). For example <http://example.org/#green-goblin>.

2.3 RDF Literals

As in N-Triples, literals are used to identify values such as strings, numbers, dates.

The representation of the lexical form consists of an initial delimiter " (U+0022), a sequence of permitted characters, numeric escape sequences, or string escape sequences, and a final delimiter.

Literals may not contain the characters ", LF (U+000A), or CR (U+000D) except in their escaped forms. In addition, '\' (U+005C) may not appear in any quoted literal except as part of an escape sequence, and a " (U+0022) character can only be included in a quoted literal using an escape sequence.

The corresponding RDF lexical form is the characters between the delimiters, after processing any escape sequences. If present, the language tag is preceded by a '@' (U+0040). If there is no language tag, there may be a datatype IRI, preceded by '^^' (U+005E U+005E). If there is no datatype IRI and no language tag it is a simple literal and the datatype is http://www.w3.org/2001/XMLSchema#string.
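
The rules for language tags and datatype IRIs might be sketched as below. The `LITERAL` pattern and `parse_literal` function are hypothetical names chosen for illustration, and the sketch makes two simplifying assumptions: the IRI part is matched loosely, and the string between the delimiters is returned as written, with escape sequences not yet processed.

```python
import re

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

# Quoted string followed by an optional '^^<datatype>' or '@langtag'.
LITERAL = re.compile(
    r'"(?P<raw>(?:[^"\\\n\r]|\\.)*)"'
    r'(?:\^\^<(?P<datatype>[^<>\s]*)>|@(?P<lang>[a-zA-Z]+(?:-[a-zA-Z0-9]+)*))?'
)

def parse_literal(token):
    """Return (raw string, datatype IRI, language tag or None)."""
    m = LITERAL.fullmatch(token)
    if m is None:
        raise ValueError(f"not a literal: {token!r}")
    raw, datatype, lang = m.group("raw", "datatype", "lang")
    if lang is not None:
        datatype = RDF_LANG_STRING   # language-tagged string
    elif datatype is None:
        datatype = XSD_STRING        # simple literal
    return raw, datatype, lang
```

For instance, `parse_literal('"object1"')` yields the datatype `http://www.w3.org/2001/XMLSchema#string` and no language tag, while a token containing an unescaped line feed is rejected.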

2.4 RDF Blank Nodes

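As in N-Triples, RDF blank nodes are expressed as _: followed by a blank node label which is a series of name characters. The characters in the label are built upon PN_CHARS_BASE, liberalized as follows: the characters _ and [0-9] may appear anywhere in a blank node label; the character . may appear anywhere except the first or last character; the characters -, U+00B7, U+0300 to U+036F and U+203F to U+2040 are permitted anywhere except the first character.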

A fresh RDF blank node is allocated for each unique blank node label in a document. Repeated use of the same blank node label identifies the same RDF blank node.

Example 3: Blank nodes in N-Quads

_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .
_:bob <http://xmlns.com/foaf/0.1/knows> _:alice .
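
The allocation rule above can be illustrated with a small Python sketch. The `BlankNode` and `BlankNodeScope` classes are hypothetical; the `bnode_labels` dictionary plays the role of the bnodeLabels map used in section 6, with one scope assumed per document.

```python
class BlankNode:
    """An opaque blank node; its identity is the Python object identity."""

class BlankNodeScope:
    """Maps blank node labels to blank nodes for a single document."""

    def __init__(self):
        self.bnode_labels = {}

    def get(self, label):
        # Allocate a fresh blank node the first time a label is seen,
        # and return the same node on every later use of that label.
        if label not in self.bnode_labels:
            self.bnode_labels[label] = BlankNode()
        return self.bnode_labels[label]
```

In Example 3, both occurrences of `_:alice` would resolve to the same node within one scope, while the same label read from a different document (a different scope) would resolve to a different node.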

3. A Canonical form of N-Quads

This section defines a canonical form of N-Quads which has a completely specified layout. The grammar for the language is unchanged.

Canonical N-Quads extends Canonical N-Triples in [RDF12-N-TRIPLES] to include graphLabel.

While the N-Quads syntax allows choices for the representation and layout of RDF data, the canonical form of N-Quads provides a unique syntactic representation of any quad. Each code point can be represented by only one of UCHAR, ECHAR, or unencoded character, where the relevant production allows for a choice in representation. Each quad is represented entirely on a single line with specified white space.

Canonical N-Quads has the following additional constraints on layout:

4. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MUST, MUST NOT, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This specification defines conformance criteria for N-Quads documents, Canonical N-Quads documents, and N-Quads parsers.

A conforming N-Quads document is a Unicode string that conforms to the grammar and additional constraints defined in 5. N-Quads Grammar, starting with the nquadsDoc production. An N-Quads document serializes an RDF dataset.

Note

N-Quads documents do not provide a way of serializing empty graphs that may be part of an RDF dataset.

A conforming Canonical N-Quads document is an N-Quads document that follows the additional constraints of Canonical N-Quads.

A conforming N-Quads parser is a system capable of reading N-Quads documents on behalf of an application. It makes the serialized RDF dataset, as defined in 6. Parsing, available to the application, usually through some form of API.

The IRI that identifies the N-Quads language is: http://www.w3.org/ns/formats/N-Quads

4.1 Media Type and Content Encoding

The media type of N-Quads is application/n-quads. The content encoding of N-Quads is always UTF-8. See N-Quads Media Type for the media type registration form.

4.1.1 Other Media Types

The original specification, N-Quads: Extending N-Triples with Context, proposed the use of media type text/x-nquads with an encoding using 7-bit US-ASCII.

5. N-Quads Grammar

An N-Quads document is a Unicode [UNICODE] character string encoded in UTF-8.
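
As a rough sketch of what this implies for a reader, the Python fragment below decodes a byte string as UTF-8 and splits it on the EOL production from section 5.3. The function name `statement_lines` is hypothetical, and comment-only lines are not filtered here.

```python
import re

# EOL ::= [#xD#xA]+
EOL = re.compile(r"[\r\n]+")

def statement_lines(data: bytes):
    """Decode an N-Quads document and return its non-blank lines."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid input
    # Drop lines that contain only horizontal white space (space or tab).
    return [line for line in EOL.split(text) if line.strip(" \t")]
```

Because EOL matches any run of CR and LF characters, documents with either line-ending convention are split the same way.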

5.1 White Space

White space (tab U+0009 or space U+0020) is allowed outside of terminals. Rule names below in capitals indicate where white space is significant.

White space is significant in the production STRING_LITERAL_QUOTE.

A blank line, consisting of only white space and/or a comment, may appear wherever a statement production is allowed, and is treated as white space.

Note

As with N-Triples [RDF12-N-TRIPLES], N-Quads allows only horizontal white space (tab U+0009 or space U+0020).

5.2 Comments

Comments in N-Quads start at '#' outside an IRIREF or STRING_LITERAL_QUOTE, and continue to the end of line (marked by CR (U+000D) or LF (U+000A)) or end of file if there is no end of line after the comment marker. Comments are treated as white space.
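
One way to locate the start of a comment is a single left-to-right scan that tracks whether the current position is inside an IRIREF or a STRING_LITERAL_QUOTE. The sketch below assumes the input is a single line with no line terminator; `strip_comment` is a hypothetical helper name.

```python
def strip_comment(line):
    """Return the line with any trailing comment removed."""
    in_iri = False
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == "\\":
                i += 1              # skip the escaped character
            elif ch == '"':
                in_string = False
        elif in_iri:
            if ch == ">":
                in_iri = False
        elif ch == '"':
            in_string = True
        elif ch == "<":
            in_iri = True
        elif ch == "#":
            return line[:i]         # comment runs to the end of the line
        i += 1
    return line
```

A '#' inside an IRI such as `<http://example.org/#spiderman>` or inside a quoted string is left untouched, whereas a line consisting only of a comment is reduced to the empty string.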

5.3 Grammar

The EBNF used here is defined in XML 1.0 [EBNF-NOTATION].

Escape sequence rules are the same as N-Triples [RDF12-N-TRIPLES] and Turtle [RDF12-TURTLE]. However, as only the STRING_LITERAL_QUOTE production is allowed, new lines in literals MUST be escaped.

[1] nquadsDoc ::= statement? (EOL statement)* EOL?
[2] statement ::= subject predicate object graphLabel? '.'
[3] subject ::= IRIREF | BLANK_NODE_LABEL
[4] predicate ::= IRIREF
[5] object ::= IRIREF | BLANK_NODE_LABEL | literal
[6] graphLabel ::= IRIREF | BLANK_NODE_LABEL
[7] literal ::= STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG)?

Productions for terminals

[144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
[8] EOL ::= [#xD#xA]+
[10] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
[11] STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
[141s] BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
[12] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
[153s] ECHAR ::= '\' [tbnrf"'\]
[157s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[158s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
[160s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[162s] HEX ::= [0-9] | [A-F] | [a-f]
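
Some of the terminal productions translate fairly directly into regular expressions. The following Python sketch transcribes PN_CHARS_BASE, PN_CHARS_U, PN_CHARS, BLANK_NODE_LABEL and LANGTAG; the variable names are chosen for illustration, and the transcription is offered as an aid to reading the grammar rather than as a normative artifact.

```python
import re

# Character-class bodies, meant to be placed inside [...] in a pattern.
PN_CHARS_BASE = (
    "A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
PN_CHARS_U = PN_CHARS_BASE + "_"
PN_CHARS = PN_CHARS_U + "0-9\u00B7\u0300-\u036F\u203F-\u2040" + "\\-"

# '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
BLANK_NODE_LABEL = re.compile(
    f"_:[{PN_CHARS_U}0-9](?:[{PN_CHARS}.]*[{PN_CHARS}])?"
)

# '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")
```

With these patterns, `_:alice` and `_:a.b` match BLANK_NODE_LABEL in full, while `_:a.` and `_:-a` do not, since '.' cannot be the last character and '-' cannot be the first.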

6. Parsing

Parsing N-Quads requires a state of one item: bnodeLabels, a map from blank node label strings to blank nodes.

6.1 RDF Term Constructors

This table maps productions and lexical tokens to RDF terms or components of RDF terms listed in 6. Parsing:

production | type | procedure
IRIREF | IRI | The characters between "<" and ">" are taken, with the escape sequences unescaped, to form the Unicode string of the IRI.
STRING_LITERAL_QUOTE | lexical form | The characters between the outermost '"'s are taken, with escape sequences unescaped, to form the Unicode string of a lexical form.
LANGTAG | language tag | The characters following the @ form the Unicode string of the language tag.
literal | literal | The literal has a lexical form of the first rule argument, STRING_LITERAL_QUOTE, and either a language tag of LANGTAG or a datatype IRI of IRIREF, depending on which rule matched the input. If the LANGTAG rule matched, the datatype is rdf:langString and the language tag is LANGTAG. If neither a language tag nor a datatype IRI is provided, the literal has a datatype of xsd:string.
BLANK_NODE_LABEL | blank node | The string following "_:" is a key in bnodeLabels. If there is no corresponding blank node in the map, one is allocated.
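
The unescaping step used for IRIREF and STRING_LITERAL_QUOTE might look like the following Python sketch, based on the UCHAR and ECHAR productions in section 5.3. The `unescape` function is a hypothetical helper; it accepts both kinds of escape, so a caller handling an IRIREF would need to reject ECHAR separately, since the IRIREF production admits only UCHAR.

```python
import re

# ECHAR ::= '\' [tbnrf"'\]
ECHAR_MAP = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# UCHAR (\uXXXX or \UXXXXXXXX) or ECHAR, tried left to right.
ESCAPE = re.compile(
    r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})|\\([tbnrf"\'\\])'
)

def unescape(text):
    """Replace UCHAR and ECHAR escape sequences with their characters."""
    def replace(match):
        hex4, hex8, echar = match.groups()
        if hex4 is not None:
            return chr(int(hex4, 16))
        if hex8 is not None:
            return chr(int(hex8, 16))
        return ECHAR_MAP[echar]
    return ESCAPE.sub(replace, text)
```

Scanning from the left means an escaped backslash is consumed before what follows it, so the six characters `\\u0041` with a doubled backslash become a single backslash followed by `u0041` rather than the letter A.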

6.2 RDF Dataset Construction

An N-Quads document defines an RDF dataset composed of RDF graphs, each of which is a set of RDF triples. The statement production produces a triple defined by the terms constructed for subject, predicate, and object. This RDF triple is added to the graph labeled by the production graphLabel; if no graphLabel is present, the triple is added to the RDF dataset's default graph.
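
The construction described here can be sketched in Python with plain sets and a dictionary. The representation is an assumption made for illustration: terms are arbitrary hashable values, a quad is a 4-tuple whose last element is `None` when no graphLabel is present, and `build_dataset` is a hypothetical function name.

```python
from collections import defaultdict

def build_dataset(quads):
    """Group quads into a default graph and a map of named graphs."""
    default_graph = set()
    named_graphs = defaultdict(set)
    for subject, predicate, obj, graph in quads:
        triple = (subject, predicate, obj)
        if graph is None:
            default_graph.add(triple)        # no graphLabel: default graph
        else:
            named_graphs[graph].add(triple)  # graph labeled by graphLabel
    return default_graph, dict(named_graphs)
```

Since each graph is a set, a statement that appears more than once in a document contributes a single triple to its graph.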

A. Privacy Considerations

This section is non-normative.

Editor's note

TODO

B. Security Considerations

This section is non-normative.

The STRING_LITERAL_QUOTE production allows the use of unescaped control characters. Although this specification does not directly expose this content to an end user, it might be presented through a user agent, which may cause the presented text to be obfuscated due to presentation of such characters.

N-Quads is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [RFC3023] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning.

The N-Quads language is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (for example, PGP encryption, checksum validation, password-protected compression) may also be used on N-Quads documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.

N-Quads can express data which is presented to the user, such as RDF Schema labels. Applications rendering strings retrieved from untrusted N-Quads documents, or using unescaped characters, SHOULD use warnings and other appropriate means to limit the possibility that malignant strings might be used to mislead the reader. The security considerations in the media type registration for XML ([RFC3023] section 10) provide additional guidance around the expression of arbitrary data and markup.

N-Quads uses IRIs as term identifiers. Applications interpreting data expressed in N-Quads SHOULD address the security issues of Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7.

Multiple IRIs may have the same appearance. Characters in different scripts may look similar (for instance, a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (for example, LATIN SMALL LETTER "E" followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER "E" WITH ACUTE). Any person or application that is writing or interpreting data in N-Quads must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching visually similar characters can be found in Unicode Security Considerations [UNICODE-SECURITY] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
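
The two cases described above behave differently under Unicode normalization, as the following Python sketch suggests. It uses the standard `unicodedata` module; the helper names are hypothetical, and nothing here implies that N-Quads processors normalize IRIs. The sketch only shows how an application might inspect suspicious strings.

```python
import unicodedata

def same_after_nfc(a, b):
    """True if a and b differ as code points but agree after NFC."""
    nfc_a = unicodedata.normalize("NFC", a)
    nfc_b = unicodedata.normalize("NFC", b)
    return a != b and nfc_a == nfc_b

def non_ascii_names(iri):
    """List each non-ASCII character in an IRI with its Unicode name."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in iri if ord(ch) > 0x7F]
```

An "e" followed by COMBINING ACUTE ACCENT and the precomposed "é" are reported as equal after NFC, but a Cyrillic "о" and a Latin "o" are not, so normalization alone does not detect cross-script look-alikes. Listing character names, as `non_ascii_names` does, is one simple way to surface them for review.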

C. N-Quads Internet Media Type, File Extension and Macintosh File Type

Contact:
Eric Prud'hommeaux
See also:
How to Register a Media Type for a W3C Specification
Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

The Internet Media Type / MIME Type for N-Quads is "application/n-quads".

It is recommended that N-Quads files have the extension ".nq" (all lowercase) on all platforms.

It is recommended that N-Quads files stored on Macintosh HFS file systems be given a file type of "TEXT".

The information that follows will be submitted to the IESG for review, approval, and registration with IANA.

Type name:
application
Subtype name:
n-quads
Required parameters:
None
Optional parameters:
None
Encoding considerations:
The syntax of N-Quads is expressed over code points in Unicode [UNICODE]. The encoding is always UTF-8 [UTF-8].
Unicode code points may also be expressed using an \uXXXX (U+0 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-F].
Security considerations:
See B. Security Considerations.
Interoperability considerations:
There are no known interoperability issues.
Published specification:
This specification.
Applications which use this media type:
No widely deployed applications are known to use this media type. It may be used by some web services and clients consuming their data.
Additional information:
Magic number(s):
None.
File extension(s):
".nq"
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
The N-Quads specification is the product of the RDF WG. The W3C reserves change control over this specification.

D. Acknowledgments

This section is non-normative.

D.1 Acknowledgments for RDF 1.1

This section is non-normative.

The editor of the RDF 1.1 edition acknowledges valuable contributions from Gregg Kellogg, Andy Seaborne, Eric Prud'hommeaux, Dave Beckett, David Robillard, Gregory Williams, Antoine Zimmermann, Sandro Hawke, Richard Cyganiak, Pat Hayes, Henry S. Thompson, Bob Ferris, Henry Story, Andreas Harth, Lee Feigenbaum, Peter Ansell, Evan Patton and David Booth.

This specification is a product of extensive deliberations by the members of the RDF Working Group chaired by Guus Schreiber and David Wood. It draws upon the earlier specification in N-Quads: Extending N-Triples with Context, edited by Richard Cyganiak, Andreas Harth, and Aidan Hogan.

D.2 Acknowledgments for RDF 1.2

This section is non-normative.

The editors of the RDF 1.2 edition acknowledge valuable contributions from Andy Seaborne.

In addition to the editors, the following people have contributed to this specification: Pierre-Antoine Champin.

Members of the RDF-star Working Group included Adrian Gschwend, Andy Seaborne, Antoine Zimmermann, Dan Brickley, David Chaves-Fraga, Dominik Tomaszuk, Dörthe Arndt, Enrico Franconi, Fabien Gandon, Gregg Kellogg, Gregory Williams, Jesse Wright, Julián Arenas-Guerrero, Olaf Hartig, Ora Lassila, Pasquale Lisena, Peter Patel-Schneider, Pierre-Antoine Champin, Raphaël Troncy, Ruben Taelman, Rémi Ceres, Sarven Capadisli, Souripriya Das, Ted Thibodeau, and Timothée Haudebourg.

Editor's note

Recognize members of the Task Force? Not an easy to find list of contributors.

E. Changes between RDF 1.1 and RDF 1.2

This section is non-normative.

F. Index

This section is non-normative.

F.1 Terms defined by this specification

F.2 Terms defined by reference

G. Issue summary

This section is non-normative.

There are no issues listed in this specification.

H. References

H.1 Normative references

[EBNF-NOTATION]
EBNF Notation. Tim Bray; Jean Paoli; Michael Sperberg-McQueen; Eve Maler; François Yergeau et al. W3C. W3C Recommendation. URL: https://www.w3.org/TR/xml/#sec-notation
[RDF12-CONCEPTS]
RDF 1.2 Concepts and Abstract Syntax. Richard Cyganiak; David Wood; Markus Lanthaler. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-concepts/spec/
[RDF12-N-TRIPLES]
RDF 1.2 N-Triples. Gavin Carothers; Andy Seaborne. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-n-triples/spec/
[RDF12-TURTLE]
RDF 1.2 Turtle. Eric Prud'hommeaux; Gavin Carothers. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-turtle/spec/
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[UNICODE]
The Unicode Standard. Unicode Consortium. URL: https://www.unicode.org/versions/latest/
[UTF-8]
UTF-8, a transformation format of ISO 10646. F. Yergeau. IETF. November 2003. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3629

H.2 Informative references

[RDF12-NEW]
What’s New in RDF 1.2. David Wood. W3C. DNOTE. URL: https://w3c.github.io/rdf-new/spec/
[RDF12-PRIMER]
RDF 1.2 Primer. Guus Schreiber; Yves Raimond. W3C. DNOTE. URL: https://w3c.github.io/rdf-primer/spec/
[RDF12-SCHEMA]
RDF 1.2 Schema. Dan Brickley; Ramanathan Guha. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-schema/spec/
[RDF12-SEMANTICS]
RDF 1.2 Semantics. Patrick Hayes; Peter Patel-Schneider. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-semantics/spec/
[RDF12-TRIG]
RDF 1.2 TriG. Gavin Carothers; Andy Seaborne. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-trig/spec/
[RDF12-XML]
RDF 1.2 XML Syntax. Fabien Gandon; Guus Schreiber. W3C. W3C Working Draft. URL: https://w3c.github.io/rdf-xml/spec/
[RFC3023]
XML Media Types. M. Murata; S. St. Laurent; D. Kohn. IETF. January 2001. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3023
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[RFC3987]
Internationalized Resource Identifiers (IRIs). M. Duerst; M. Suignard. IETF. January 2005. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3987
[SPARQL12-CONCEPTS]
SPARQL 1.2 Concepts. The W3C RDF-star Working Group. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-concepts/spec/
[SPARQL12-ENTAILMENT]
SPARQL 1.2 Entailment Regimes. Birte Glimm; Chimezie Ogbuji. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-entailment/spec/
[SPARQL12-FEDERATED-QUERY]
SPARQL 1.2 Federated Query. Eric Prud'hommeaux; Carlos Buil Aranda. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-federated-query/spec/
[SPARQL12-GRAPH-STORE-PROTOCOL]
SPARQL 1.2 Graph Store HTTP Protocol. Chimezie Ogbuji. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-graph-store-protocol/spec/
[SPARQL12-NEW]
What’s New in SPARQL 1.2. The W3C RDF-star Working Group. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-new/spec/
[SPARQL12-PROTOCOL]
SPARQL 1.2 Protocol. Lee Feigenbaum; Gregory Williams; Kendall Clark; Elias Torres. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-protocol/spec/
[SPARQL12-QUERY]
SPARQL 1.2 Query Language. Steven Harris; Andy Seaborne. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-query/spec/
[SPARQL12-RESULTS-CSV-TSV]
SPARQL 1.2 Query Results CSV and TSV Formats. Andy Seaborne. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-results-csv-tsv/spec/
[SPARQL12-RESULTS-JSON]
SPARQL 1.2 Query Results JSON Format. Andy Seaborne. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-results-json/spec/
[SPARQL12-RESULTS-XML]
SPARQL 1.2 Query Results XML Formats. Sandro Hawke; Dave Beckett; Jeen Broekstra. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-results-xml/spec/
[SPARQL12-SERVICE-DESCRIPTION]
SPARQL 1.2 Service Description. Gregory Williams. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-service-description/spec/
[SPARQL12-UPDATE]
SPARQL 1.2 Update. Paula Gearon; Alexandre Passant; Axel Polleres. W3C. W3C Working Draft. URL: https://w3c.github.io/sparql-update/spec/
[UNICODE-SECURITY]
Unicode Security Considerations. Mark Davis; Michel Suignard. Unicode Consortium. 19 September 2014. Unicode Technical Report #36. URL: https://www.unicode.org/reports/tr36/tr36-15.html