Friday, April 10, 2009

Don't forget the spatial extensions

MySQL does have spatial extensions, and I plan to do some more writing on these in the near future. This is the first blogpost on the subject. The spatial extensions in MySQL looks like what they do in most other databases in terms of overall functionality, but the current implementation leaves a fair bit to be desired when it comes to features and implementation. Spatial extensions include extensions in a bunch of areas:
  • Spatial databases. These are datatypes such as point, polygon and geometry etc.
  • Spatinal functions. These are functions like distance, size and also functions that check for relations between spatial objects, such as contains and overlaps.
  • Spatial indexing. This is typically R-Tree indexes (R for Rectangle) that can be applied when using the Spatial functions above.
Now, all that said, and having said that this support is similar in other RDBMS systems, where is MySQL lacking? For one thing, the overlap, size and other functions actually aren't applied the the geometric object itself, but to it's Minimum Bounding Rectangle (MBR). The MBR for an object, say a circle or a polygon, is the smallest rectangle that can contain the whole object. What this means in the case of Spatial Operators is that there is an error with these functions, if you think that they will return the value as when applied to the object(s) themselves. This is why these functions in MySQL actually aren't called contains but MBRContains for example.

And as if this wasn't enough, only MyISAM supports R-Tree indexing, and not all Storage Engines supports the Spatial (GIS) datatypes.

Thirdly, MySQL can only work with a flat or euclidic coordinate system. This is a disadvantage and will introduce an error, in particular when used with larger geometries, as the world really isn't flat. At least that wasn't the case last time I looked.

And the drawbacks I mention above are just the most basic ones, there are a few more minor omissions and issues.

So are we doomed now when it comes to GIS functions with MySQL? Nope, there has been some work done to fix this. On MySQL Forge you can find a GIS Functions document, about a fix to these issues. This has also been implemented in a variation of MySQL 5.1.23 that is available for download. In addition, the is a talk at the MySQL User Conference on this subject.

So there is hope, and I have a plea for you: Support me in having the 5.1.23 implementation I mention above become part of standard MySQL. There is no reason not to do this, in my mind, but for some reason, it's not happening. Also, download this code, I'd be really happy if someone would want to use this code and develop it further. So yes, there is hope!

And I'll do some more blogging on this subject, if nothing else, I will discuss ways around the current limitations of Spatial Support in MySQL.

/Karlsson
Still opinionated

2 comments:

arjen said...

There is a reason for not bothering, and that is that PostgreSQL has a very functional GIS implementation.

MySQL should not try to be everything to everybody, as the result would be that won't be particularly good at anything.

Note that the RTREE indexing in MySQL only works in MyISAM, whereas most deployments use InnoDB.

But the key factor is this: my clients often ponder using the GIS functionality, when basic math on regular tables and indexes is more suitable. That's easily sorted.

Where advanced GIS capabilities are required, I recommend PostgreSQL. It doesn't happen very often, but it does happen - and the requirements then are such that MySQL is not even close.
I really don't think it's worthwhile to spend time on this, there are sufficient other important issues (where there are no alternatives)

Karlsson said...

Most deåöoyments don't use InnoDB, that isn't true, there are probably more using MyISAM still.

As for what you say about MySQL should not try to do evreything, well, I sort of agree. But now we DO have GIS in MySQL already, so why don't we just make it work right!

And why it is in PostGIS makes it not worthwile to have in MySQL I do not understand? PostGIS is also a SQL based database, should we abandon SQL? And Oracle/DB2/Informix has fine spatial implementations, so why bother with PostGIS?

As for use of GIS, well this is a request i hear A LOT. And when I hear it, I usually return with a question: "Will what is in the 5.1.23 GIS Preview be good enough?", and the answer has so far been a pretty resounding yes, so, I tend to disagree with you.

I do understand where you are coming from though. Advanced GIS applications, and in those cases, even the 5.1.23 perviw stuff will not be good enough. The reason we need GIS now is a different one though, which is the growth of location based information on the web. More and more web sites are adding location based services, one way or the other. This is due to make factors, such as cheap GPS's, Google maps etc. And those web-sites are already using MySQL in many case. And adding Postgres just for the GIS stuff, and having the rest in MySQL is just really impractical and unnecessary. And I'd rather see them stick to MySQL and move to Postgres, just because we lack these features (and this happening right now).

/Karlsson