Friday, 5 February 2010

HatBox for Derby and H2

I just saw the HatBox spatial extension to Derby and H2. (Cute name - Derby, HatBOX - get it?)

And of course HatBox uses JTS!

This is great - H2 is a fantastic pure Java database, which has been crying out for spatial support. (Perhaps if this extension proves its worth it will be merged into the main H2 codebase at some point?)

However, HatBox is still a "user-space extension" - which means that it has not enhanced the SQL evaluation engine itself with knowledge about spatial indexes and when to use them. So to utilize spatial indexing you have to explicitly join to the spatial index table, which results in ugly SQL like this:

select ID, GEOM from T1 as t
inner join
HATBOX_MBR_INTERSECTS_ENV('PUBLIC','T1',145.05,145.25,-37.25,-37.05) as i

This is the same approach used in Sqlite and ESRI SDE and other spatial extensions which operate in user space rather than DB engine space (Mike Butler of SDBE fame used to call this "staying above the blood-brain barrier" 8^). Basically you are adding the index filter condition which for scalar indexes the SQL engine adds automatically.

In contrast, the "big boys" like PostGIS, Oracle, SQL Server, DB2, Informix, etc have actually extended their database engine to handle spatial datatypes and indexes. Most of these systems actually have provided a general extensibility mechanism which allows a clean separation between the engine core and the new datatypes. PostgreSQL is probably the one which takes this to the ultimate extent.

User-space spatial extensions are for a first approach, but it would be really nice to be able to play with the big boys and incorporate knowledge of spatial indexes and functions directly into the database engine. This should be easierto do in Java than in C - are you listening, H2?


Dave said...

I took a quick look at this and it does appear that there is some type of spatial indexing going on (I didn't look too deeply).

Several years ago I did the "Spatial DB in a Box" that used JTS as an engine for Derby, Hsql, and Postgresql - but it was very rudimentary (i.e. calling it "working" would be too kind!). Its good to see that folks who know what they're doing are adding support to the the DBs. I'm impressed!

Kudos to them!

Dr JTS said...

Hi, Dave!

Of course I remember Spatial-DB-in-a-box well... An idea ahead of its time! (Perhaps the name HatBox is a tribute?)

As I said in the blog post, I think the spatial indexing in HatBox is of the "outside-the-engine" kind. Easier to implement, but not as slick (or performant, probably) than native indexing. Hopefully someone will get keen and build a native index for H2...