QuiXDM

QuiXDM is a ubiquitous open-source implementation of a streaming data model to process XML, JSON, YAML, RDF, CSV and HTML

This project is maintained by innovimax

QuiXDM is a ubiquitous open-source data model for processing, in a streaming fashion: XML, JSON, YAML, RDF, CSV and HTML.

Getting Started

To install it

Why QuiXDM?

SAX, StAX, DOM, Jackson, Jena, CSVParser and HTMLParser already exist for processing data. The table below compares some of them with QuiXDM.

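Feature \ API       | SAX           | StAX          | DOM           | Jackson        | QuiXDM
--------------------|---------------|---------------|---------------|----------------|-----------------
in memory/streaming | streaming     | streaming     | in memory     | streaming      | streaming
push/pull           | push          | pull          | --            | pull           | pull
data model          | low level XML | low level XML | low level XML | low level JSON | XPath Data Model
handle sequence     | no            | no            | no            | no             | yes
handle json/yaml    | no            | no            | no            | yes            | yes
handle rdf          | no            | no            | no            | no             | yes
handle csv          | no            | no            | no            | no             | yes
handle html         | no            | no            | no            | no             | yes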
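
To make the push/pull row concrete, here is a minimal sketch that uses only the standard JDK parsers, not QuiXDM. It collects element names from the same document twice: once with SAX, where the parser calls back into a handler (push), and once with StAX, where the caller asks for the next event (pull). The class and method names are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: PushVsPull is a hypothetical name, not part of QuiXDM.
final class PushVsPull {

    // Push (SAX): the parser drives the loop and calls back into our handler.
    static List<String> elementNamesWithSax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    // Pull (StAX): the caller drives the loop and asks for the next event.
    static List<String> elementNamesWithStax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><c>text</c></a>";
        System.out.println(elementNamesWithSax(xml));
        System.out.println(elementNamesWithStax(xml));
    }
}
```

With a pull API the consumer can stop, skip or interleave several inputs at will, which is the property the table attributes to StAX, Jackson and QuiXDM.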

How does it work?

It uses a single, consistent data model to represent all of these content types as a stream of events.

// Here is the grammar of events
sequence       := START_SEQUENCE, (document|json_yaml|table|semantic)*, END_SEQUENCE
document       := START_DOCUMENT, (PROCESSING-INSTRUCTION|COMMENT)*, element, (PROCESSING-INSTRUCTION|COMMENT)*, END_DOCUMENT
json_yaml      := START_JSON, object, END_JSON
table          := START_TABLE, header*, array_of_array, END_TABLE
semantic       := START_RDF, statement*, END_RDF
element        := START_ELEMENT, (NAMESPACE|ATTRIBUTE)*, (TEXT|element|PROCESSING-INSTRUCTION|COMMENT)*, END_ELEMENT
object         := START_OBJECT, (KEY_NAME, value)*, END_OBJECT
value          := object|array|flat_value
flat_value     := VALUE_FALSE|VALUE_TRUE|VALUE_NUMBER|VALUE_NULL|VALUE_STRING
array          := START_ARRAY, value*, END_ARRAY
array_of_array := START_ARRAY, flat_array+, END_ARRAY
flat_array     := START_ARRAY, flat_value*, END_ARRAY
statement      := START_PREDICATE, SUBJECT, OBJECT, GRAPH?, END_PREDICATE

For the details, look mainly at QuiXToken.java
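
As an illustration of how the grammar reads, the sketch below checks a flat list of tokens against the json_yaml, object, array, value and flat_value productions with a small recursive-descent recognizer. The Token enum is a local stand-in that only copies the token names from the grammar; it is not the real QuiXToken, and the class name JsonEventGrammar is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: a stand-alone recognizer for the JSON part of the event grammar.
final class JsonEventGrammar {

    // Local stand-in mirroring the grammar's token names (NOT the real QuiXToken).
    enum Token {
        START_JSON, END_JSON, START_OBJECT, END_OBJECT, KEY_NAME, START_ARRAY, END_ARRAY,
        VALUE_FALSE, VALUE_TRUE, VALUE_NUMBER, VALUE_NULL, VALUE_STRING
    }

    private final List<Token> tokens;
    private int pos = 0;

    private JsonEventGrammar(List<Token> tokens) {
        this.tokens = tokens;
    }

    // json_yaml := START_JSON, object, END_JSON (and nothing may follow)
    static boolean isValidJson(List<Token> tokens) {
        JsonEventGrammar g = new JsonEventGrammar(tokens);
        return g.accept(Token.START_JSON) && g.object() && g.accept(Token.END_JSON)
                && g.pos == tokens.size();
    }

    private boolean peekIs(Token t) {
        return pos < tokens.size() && tokens.get(pos) == t;
    }

    // Consume the next token only if it is the expected one.
    private boolean accept(Token expected) {
        if (peekIs(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    // object := START_OBJECT, (KEY_NAME, value)*, END_OBJECT
    private boolean object() {
        if (!accept(Token.START_OBJECT)) return false;
        while (accept(Token.KEY_NAME)) {
            if (!value()) return false;
        }
        return accept(Token.END_OBJECT);
    }

    // array := START_ARRAY, value*, END_ARRAY
    private boolean array() {
        if (!accept(Token.START_ARRAY)) return false;
        while (!peekIs(Token.END_ARRAY)) {
            if (!value()) return false;
        }
        return accept(Token.END_ARRAY);
    }

    // value := object|array|flat_value
    private boolean value() {
        if (peekIs(Token.START_OBJECT)) return object();
        if (peekIs(Token.START_ARRAY)) return array();
        return flatValue();
    }

    // flat_value := VALUE_FALSE|VALUE_TRUE|VALUE_NUMBER|VALUE_NULL|VALUE_STRING
    private boolean flatValue() {
        return accept(Token.VALUE_FALSE) || accept(Token.VALUE_TRUE) || accept(Token.VALUE_NUMBER)
                || accept(Token.VALUE_NULL) || accept(Token.VALUE_STRING);
    }

    public static void main(String[] args) {
        // Token shape of {"name": "QuiXDM", "tags": ["xml", 1]}
        List<Token> events = Arrays.asList(
                Token.START_JSON, Token.START_OBJECT,
                Token.KEY_NAME, Token.VALUE_STRING,
                Token.KEY_NAME, Token.START_ARRAY, Token.VALUE_STRING, Token.VALUE_NUMBER, Token.END_ARRAY,
                Token.END_OBJECT, Token.END_JSON);
        System.out.println(isValidJson(events));
    }
}
```

In a real stream, events such as KEY_NAME and VALUE_STRING also carry data; the sketch only tracks their order.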

Use

With Object creation (à la javax.xml.stream.XMLEventReader)

The simplest way to use it is to instantiate innovimax.quixproc.datamodel.in.QuiXEventStreamReader

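// Assumption: Source is javax.xml.transform.Source and each file is wrapped in a StreamSource.
import java.util.Arrays;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import innovimax.quixproc.datamodel.in.QuiXEventStreamReader;

Iterable<Source> sources = Arrays.<Source>asList(
        new StreamSource("/tmp/file/file_aaa.xml"),
        new StreamSource("/tmp/file/file_aab.json"),
        new StreamSource("/tmp/file/file_aac.csv"),
        new StreamSource("/tmp/file/file_aad.yml"),
        new StreamSource("/tmp/file/file_aae.n3")
);
QuiXEventStreamReader qesr = new QuiXEventStreamReader(sources);
// Pull the events one by one and print them
while (qesr.hasNext()) {
    System.out.println(qesr.next());
}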

Lightweight iterator without Object creation (à la javax.xml.stream.XMLStreamReader)

TODO
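
The distinction between the two styles named in these headings can be seen in StAX itself. The sketch below uses only javax.xml.stream, not QuiXDM, and counts text characters twice: once with XMLEventReader, which hands back an event object per step, and once with XMLStreamReader, which returns an int code and exposes the current data on the reader. The class and method names are made up for this illustration and say nothing about what the QuiXDM API will look like.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Illustration only: EventVsCursor is a hypothetical name, not part of QuiXDM.
final class EventVsCursor {

    // Event style: each step returns an XMLEvent object.
    static int countTextCharsWithEvents(String xml) throws Exception {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        int total = 0;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isCharacters()) {
                total += event.asCharacters().getData().length();
            }
        }
        reader.close();
        return total;
    }

    // Cursor style: each step returns an int code; data is read from the reader itself.
    static int countTextCharsWithCursor(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int total = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                total += reader.getTextLength();
            }
        }
        reader.close();
        return total;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a>hello<b>hi</b></a>";
        System.out.println(countTextCharsWithEvents(xml));
        System.out.println(countTextCharsWithCursor(xml));
    }
}
```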

Why QuiXCharStream and QuiXQName?

They come from the idea that a streaming interface for XML should really be streaming, all the way down to the characters. The truth is that there is no such character-streaming interface in Java.

Given that context, QuiXCharStream and QuiXQName were created in order to:

Contributors

Innovimax is contributing to this work

Related Projects

QuiXDM can be used standalone.

It is also the data model of two bigger projects, QuiXPath and QuiXProc, of which it is a part.