Tuesday, March 17, 2009

Erlware Erlang Software Review: JSON and on and on with Ktuo

There is a ton of interesting software created in the Erlang community. Erlware is about making that software more visible. Periodically we will release tutorials, writeups, and reviews of software that has been published to the Erlware repositories for free community use. This is the first such post.

From json.org
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.

That seems to be more or less true, and JSON is getting tons of use these days in all kinds of places: writing config files, passing messages from web applications to JavaScript front ends, and even serving as the communications backbone of full-scale, enterprise-grade distributed systems. How do you parse and deal with it in Erlang?

One library for doing so is called Ktuo. What the heck kind of name is Ktuo? This is something so nerdy that I even hesitate to mention it, but since I am so nerdy that I got a kick out of it, I will go ahead. It is Json right-shifted one letter: J -> K, s -> t, o -> u (u is the next vowel), n -> o. OK, now that we have gotten that out of the way, how do we use this library?

The first place to turn is the Erlware application documentation. It is a great reference for more detail on what we are about to walk through here.

Before we go through how to encode and decode JSON, let's touch a bit on JSON itself and how it is structured.

The Structure of JSON


JSON itself has a spec so simple it makes me smile. A JSON value is a string, a number, true, false, or null. These can sit by themselves or they can be combined using two structures for aggregation: the array and the object.

JSON arrays are denoted with [] square brackets. The following are valid JSON arrays:


["hello", "world"]
["hello", 2.33]
[true, false, "json"]
[true, [1, 2]]
[]


JSON objects are denoted with {} curly brackets and consist of pairs. Each pair is a string key separated by a colon from a value, which can be any of the JSON types we have just mentioned, including arrays and objects themselves. The following are valid JSON objects:


{"key":"value"}
{"key":10}
{"key":{"key2":"value"}}
{"key":[1, 2]}
{}


That is it, pretty much the entire spec explained. Nice and simple. For more info you can check out json.org. So, how do we parse and deal with these constructions in Erlang?

Let's move on to Ktuo itself and start with decoding JSON.

Decoding Json with Ktuo's ktj_decode


Let's say we have the object we described above, {"key":"value"}. To decode this we would use ktj_decode:decode("{\"key\":\"value\"}"). Notice that the quotes inside the string have been escaped. This is not required, however: if the quotes are left out, as in ktj_decode:decode("{key:value}"), Ktuo will still make sense of the input and the result will be the same. true and false are exceptions to this rule, being special JSON types themselves. The result of the above invocation of ktj_decode is as follows:


ktj_decode:decode("{key:value}").
{{obj, [{"key", "value"}]}, [], {0,11}}


So what does that all mean? We have a three-tuple of the form {JSONValue, UnparsedRemainder, LineInfo}. I am going to start at the back and explain the LineInfo bit first. Ktuo was designed to parse JSON streams, which means that you can feed the decode function more than one JSON term. "[1,2,3]" is one term, for example, but "[1,2,3][4,5,6]" is two. In the second case, LineInfo shows that the first term was parsed from position 0 to position 7. Since the string is longer than that, we know there are other terms left to parse. Those remaining terms are placed into UnparsedRemainder. The example below demonstrates this.


ktj_decode:decode("[1,2,3][4,5,6]").
{[1,2,3],"[4,5,6]",{0,7}}
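
The three-tuple shape lends itself to a simple loop that keeps decoding until nothing is left. What follows is only a minimal sketch: the module name ktuo_stream_sketch and the function decode_all/2 are hypothetical and not part of Ktuo. The decoder is passed in as a fun, and the sketch assumes it returns {Value, Remainder, LineInfo} with an empty list as the remainder once the input is used up, as in the output shown earlier.

```erlang
-module(ktuo_stream_sketch).
-export([decode_all/2]).

%% Decode every term in Input and return the values in order.
%% DecodeFun is any fun of one argument that returns
%% {Value, Remainder, LineInfo}; this shape is an assumption taken
%% from the ktj_decode:decode/1 output shown in this post.
decode_all(DecodeFun, Input) ->
    decode_all(DecodeFun, Input, []).

%% Nothing left to parse: return the accumulated values.
decode_all(_DecodeFun, [], Acc) ->
    lists:reverse(Acc);
%% Decode one term, then carry on with the unparsed remainder.
%% A decoder that never consumes input would make this loop forever.
decode_all(DecodeFun, Input, Acc) ->
    {Value, Rest, _LineInfo} = DecodeFun(Input),
    decode_all(DecodeFun, Rest, [Value | Acc]).
```

With Ktuo on the code path, this would be called as ktuo_stream_sketch:decode_all(fun ktj_decode:decode/1, "[1,2,3][4,5,6]"). Taking the decoder as an argument keeps the loop independent of Ktuo, so it can be tried out with any fun that returns the same three-tuple.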


In the "[1,2,3][4,5,6]" example, decode tells us that it has parsed up to character 7 and that we have "[4,5,6]" left to go. With UnparsedRemainder and LineInfo explained, we can dive into JSONValue. JSONValue is an Erlang representation of a JSON term. What follows is the formal breakdown of that representation.


value() = object() | json_number() | array() | json_string() | json_bool() | null()
object() = {obj, [{key(), value()}]}
array() = [value()]
key() = string()
json_string() = binary()
json_number() = int() | float()
json_bool() = true | false


That spec is actually fairly easy to read. It is written in Erlang edoc style, in which types are atoms followed by (). Most of these are very straightforward and almost literal. A JSON array of numbers converts to an Erlang list of numbers. A boolean value of true converts to the Erlang atom true. The only one requiring a bit of extra identification is the JSON object type: a JSON object {"Key":"Value"} converts to the Erlang term specified by object(), which is {obj, [{"Key", "Value"}]}. Basically, it is an enclosing tuple with the identifying atom obj followed by a list of tuples, each of which contains a key and a value.
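
As a small illustration of working with that representation, here is a minimal sketch of looking up keys in a decoded object. The module name ktuo_obj_sketch and the functions lookup/2 and lookup_path/2 are hypothetical helpers, not part of Ktuo. The sketch assumes only the {obj, [{Key, Value}]} shape described above.

```erlang
-module(ktuo_obj_sketch).
-export([lookup/2, lookup_path/2]).

%% Look up Key in a decoded JSON object of the form {obj, Pairs}.
%% Returns {ok, Value} if the key is present and error otherwise.
lookup(Key, {obj, Pairs}) when is_list(Pairs) ->
    case lists:keyfind(Key, 1, Pairs) of
        {_, Value} -> {ok, Value};
        false -> error
    end.

%% Follow a list of keys down through nested objects.
lookup_path([], Value) ->
    {ok, Value};
lookup_path([Key | Rest], {obj, _} = Obj) ->
    case lookup(Key, Obj) of
        {ok, Value} -> lookup_path(Rest, Value);
        error -> error
    end;
%% Keys remain but the current value is not an object.
lookup_path(_Keys, _NotAnObject) ->
    error.
```

For the nested object {"key":{"key2":"value"}} from earlier, written as {obj, [{"key", {obj, [{"key2", "value"}]}}]}, calling lookup_path(["key", "key2"], Obj) walks both levels and returns {ok, "value"}.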

Now that we understand that, encoding becomes very easy; in fact, so easy that we really don't need a section for it... Here is the section.


Encoding Json with Ktuo's ktj_encode


The ktj_encode:encode/1 function takes a single argument, an Erlang term in the JSON representation specified above, and converts it into valid JSON. If you run the function, however, you may notice that the result does not look much like the JSON string you expect.


ktj_encode:encode({obj, [{"key", "value"}]}).
[123,
[34,"key",34],
58,
[91,"118",44,"97",44,"108",44,"117",44,"101",93],
125]


What is that? It does not look much like what we are after, does it? Well, that is because the result is a deep list. We simply have to flatten the list if we wish to print it out at the shell, as such:


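lists:flatten(ktj_encode:encode({obj, [{"key", "value"}]})).
"{\"key\":[118,97,108,117,101]}"

Notice that the value came out as an array of character codes rather than as the string "value". An Erlang string is just a list of integers, so the encoder sees "value" as an array(). Going by the spec above, where json_string() is binary(), a string value should presumably be passed as a binary such as <<"value">> instead.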

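The deep list is also usable as it stands: it is a valid iolist, so standard functions such as io:put_chars/1 and file:write_file/2 accept it without flattening. The sketch below copies the encoder output shown above in as a literal, so it runs without Ktuo. The module name ktuo_iolist_sketch and its function names are hypothetical.

```erlang
-module(ktuo_iolist_sketch).
-export([deep/0, as_string/1, as_binary/1]).

%% The deep list printed by ktj_encode:encode/1 in this post,
%% copied here as a literal so the example does not need Ktuo.
deep() ->
    [123,
     [34, "key", 34],
     58,
     [91, "118", 44, "97", 44, "108", 44, "117", 44, "101", 93],
     125].

%% Flatten to an ordinary string (a flat list of character codes).
as_string(DeepList) ->
    lists:flatten(DeepList).

%% Convert straight to a binary without building a flat list first.
as_binary(DeepList) ->
    iolist_to_binary(DeepList).
```

Flattening is mostly useful for reading the result at the shell. When the JSON is headed for a socket or a file, handing over the deep list or the binary avoids the extra copy.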



Summary and Info


The Ktuo application can be pulled via the Erlang package management tool Faxien by using the command faxien install-app ktuo. If you just want to put the app in your local directory, pull it with faxien fetch-app ktuo ./ instead. It can also be pulled directly from the Erlware repositories via your browser. Instructions for the direct download are here.

That's all she wrote. If you have any questions, post them as comments or email the Erlware questions mailing list at erlware-questions@googlegroups.com. Patches and extra features should go to erlware-dev@googlegroups.com.
