Googling for such a converter didn't bring me to an undisputedly stable convertor, and my fingers itching to do some Clojure programming, the choice was easily made: I had to try this with Clojure. Hers is the result: https://github.com/kolov/x2j.
The implementation is shockingly compact. I am new to Clojure and writing this compact code was by no means fast, and the code is by no means optimal. Still, it covers the tests. I am impressed wit the compactness of the result. I produce a jar with class Converter, exposing the following 3 methods:
public static String x2j(String xml); public static String j2x(String jsonContainingOneElement); public static String j2x(String jsonContainingManyElements, String elementName);
Let's try it with MongoDB. Define two XMLs:
Joe Main Street Atlanta 34234234324 books tv John Jarvis Street Atlanta NC 679898 books film
In Java:
String xml1 = "" + " Joe " + ... ; String xml2 = ...
Let's start. Connect to MongoDB:
Mongo m = new Mongo(SERVER, PORT); DB db = m.getDB(DB); boolean auth = db.authenticate(USER, PASSWORD); DBCollection coll = db.getCollection(COLLECTION_NAME);
Write the two XMLs:
String json1 = Converter.x2j(xml1); System.out.println("XML: " + xml1); System.out.println("XML -> Json: " + json1); coll.insert((DBObject) JSON.parse(json1)); String json2 = Converter.x2j(xml2); System.out.println("XML: " + xml2); System.out.println("XML -> Json: " + json2); coll.insert((DBObject) JSON.parse(json2));
The log shows:
XML:XML -> Json: {"person":{"hobbies":{"hobby":["books","tv"]},"id":{"#text":"34234234324","@type":"passport"},"address":{"street":"Main Street","city":"Atlanta"},"name":"Joe"}} XML: Joe Main Street Atlanta 34234234324 books tv XML -> Json: {"person":{"hobbies":{"hobby":["books","film"]},"id":{"#text":"NC 679898","@type":"licence"},"address":{"street":"Jarvis Street","city":"Atlanta"},"name":"John"}} John Jarvis Street Atlanta NC 679898 books film
That looks OK. Let's now query some data:
DBCursor cursorDoc = coll.find(new BasicDBObject("person.hobbies.hobby", "tv")) while (cursorDoc.hasNext()) { DBObject value = cursorDoc.next(); System.out.println("Read Json: " + JSON.serialize(value)); . . }Oops, the log:
Read Json: { "_id" : { "$oid" : "4f36d8c103648a32fb075bc0"} , "person" : { "hobbies" : { "hobby" : [ "books" , "tv"]} , "id" : { "#text" : "34234234324" , "@type" : "passport"} , "address" : { "street" : "Main Street" , "city" : "Atlanta"} , "name" : "Joe"}}
The returned Json has to elements: the person entity and some ID data I don't care about at the moment. That's the reason for the Convertor method:
public static String j2x(String jsonContainingManyElements, String elementName);
Let's filter out _id and get the non-id-element only:
private String convertDbToXml(DBObject value) { String json = JSON.serialize(value); for (String key : value.keySet()) { if (!key.equals("_id")) { return Converter.j2x(json, key); } } return null; }
With this, I run two searches:
findMatches(coll, "person.hobbies.hobby", "tv"); findMatches(coll, "person.hobbies.hobby", "books");
Here's the result:
Searching: person.hobbies.hobby=tv Read Json: { "_id" : { "$oid" : "4f36d8c103648a32fb075bc0"} , "person" : { "hobbies" : { "hobby" : [ "books" , "tv"]} , "id" : { "#text" : "34234234324" , "@type" : "passport"} , "address" : { "street" : "Main Street" , "city" : "Atlanta"} , "name" : "Joe"}} Json -> XML:Searching: person.hobbies.hobby=books Read Json: { "_id" : { "$oid" : "4f36d8c103648a32fb075bc0"} , "person" : { "hobbies" : { "hobby" : [ "books" , "tv"]} , "id" : { "#text" : "34234234324" , "@type" : "passport"} , "address" : { "street" : "Main Street" , "city" : "Atlanta"} , "name" : "Joe"}} Json -> XML: books tv Main Street Atlanta Joe Read Json: { "_id" : { "$oid" : "4f36d8c103648a32fb075bc1"} , "person" : { "hobbies" : { "hobby" : [ "books" , "film"]} , "id" : { "#text" : "NC 679898" , "@type" : "licence"} , "address" : { "street" : "Jarvis Street" , "city" : "Atlanta"} , "name" : "John"}} Json -> XML: books tv Main Street Atlanta Joe books film Jarvis Street Atlanta John
That looks fine, I get back the XML I wrote earlier and the search on XML content worked.
How good is this approach? I don't know, an XML database should be a better choice for extensive XML storage. The conversion leaves several issues open:
- absolutely no namespace support
- Querying XML data in Json terms and not in the 'native' XPath.
Still, it is an easy way to add basic XML functionality to a JSON datastore.