Wednesday, May 30, 2012

Some JSON please

One thing still missing from the basic MySQL stuff is some more JSON oriented features. After some use of JSON (here at Recorded Future, JSON is THE data format), I have begun to accept this format, and have even begun to appreciate it. JSON really is a useful, flexible format that even it reasonably well standardized and has suport in most languages.

CSV used to the the obvious data exchange format for RDBMSes, but in the case of NoSQL, I sense that JSON is beginning to take that role, not only in the obvious JavaScript focued NoSQL databases such as Mongo. JSON is more object focused then CSV, is more standardized, has some better features and is in general more flexible.

Of course, a format with nested objects and arrays such as JSON would be hard to import into MySQL in the general case, at least if you want to expand arrays into lookup tables, and foreign keys constructs for keys etc. But export should be reasonable simple, and import of simple JSON structures should not be too hard.

While thinking about this, I realized that maybe I could be the one to create some JSON / MySQL import and export tools. Many of the solutions I have seen so far were based on some scripting language or MySQL Stored Routines (which doesn't mean other tools aren't out there, I haven't researched it that much), so I wanted something else. Why? Because NoSQL often means a lot of JSON structured data, so good performance is really important, which in turn means loading data using INSERT arrays and multple threads. Export means some means of multi-threading and possible some additional performance enhancing methods.

I would really like some input here. I have slowly started on a simple JSON loading tool for MySQL. Right now, this tool can only load flat structures, and embedded array or objects gets loaded as text. But it is multithreaded and it does load using INSERT arrays. But what other functions would you like? Data mapping? Breaking up an array into foreign keys? In one level or even in multiple levels?

In conclusion, I like JSON, I think it's a useful data format with good standardization and some modern features, and that it is well beyond, say, CSV (which isn't standardized at all and has quite a few format variatioons, for no good reason).



hingo said...

Hi Anders

I agree, JSON is the new XML. I've been exploring this area last year in the form of a Drizzle plugin. You've probably noticed it in my blog The plugin will get some serious love this Summer in the form of a GSoC project.

Anyway, we've been focused mostly on actually providing JSON based access over HTTP. Stewart implemented and API where you use SQL and get the result set as a JSON array. Me and Mohit are working on a key-value kind of API which somewhat resembles CouchDB and MongoDB.

I've often thought that this plugin could be backported to MySQL too. I'm happy to help if that would be interesting to you.

Otoh, if you're mainly interested in JSON as an export/import format, then it is maybe not what you were looking for. In that case you might be interested in lib_mysqludf_json (...or not, I've always found them a bit lacking in terms of actual usefulness / ease of use)

Anders Karlsson said...

Yepp, I've seen that. Stewarts work is great, curl to access MySQL is a real cool idea. As for JS in the server, again, this is brilliant.

What I am aiming for is slightly different though, so we can all play. My idea was to allow JSON data to be imported and exported to / from MySQL. In the process, I want to massage the data so as to, as far as possible, break up a JSON Object structure into traditional SQL TABLE format.

My think is more for those that have a JSON data source or target and want to exchange data with MySQL.

All this said, I'll sure have a closer look at what Drizzle does in this area and give it a shot. And I'll do some more JSON and JavaScript blog posts.