Saturday 18 August 2012

Battle of the SSLs - Round 3

After the warmup of rounds 1 & 2 of the Battle of the Spatial Scripting Languages, it's time to pick up the pace!  The last two rounds were relatively simple tasks which any ETL system worth its keep should be able to tackle.  So here's a more interesting problem:
Given a KML file of volcano locations, and a Shapefile of country boundaries, determine the countries with the greatest density of volcanoes.

It's slightly contrived, but contains a nice variety of spatial and data manipulation tasks, including reading spatial data formats, spatial joins, and grouping and sorting.

Here's the JEQL script:

KMLReader tvol
  file: "http://www.volcano.si.edu/ge/GVPWorldVolcanoes-List.kmz";
ShapefileReader tworld file: "geocommons-world.shp";
Mem tworld;

tc = select NAME, first(AREA) area, count(*) num 
    from tvol join tworld 
      on GeomPrep.contains(tworld.GEOMETRY, tvol.geometry) 
    group by NAME;

td = select NAME, num, area, Val.toDouble(num)/area density 
    from tc where area > 0 
    order by density desc 
    limit 20;

HtmlWriter td file: "vol_density.html";

A few points to notice:
  • JEQL can read KML/KMZ files from the Web
  • JEQL's way of  chaining select statements together is more readable and maintainable than the regular SQL nested syntax
  • The first() aggregate function in JEQL allows selecting a value from a non-aggregated column.  This is often very awkward and inefficient to do in regular SQL.
  • The GeomPrep function set uses the JTS PreparedGeometry API to optimize repeated geometry predicate evaluation
And here's the places where you might not want to invest in hillside property:

Dominica              4    75 0.0533333333333333
Grenada               1    34 0.0294117647058824
Tonga                 2    72 0.0277777777777778
St. Vincent and the   1    39 0.0256410256410256
St. Lucia             1    61 0.0163934426229508
Martinique            1   106 0.0094339622641509
Comoros               2   223 0.0089686098654709
El Salvador          17  2072 0.0082046332046332
Western Samoa         2   283 0.0070671378091873
Guadeloupe            1   169 0.0059171597633136
Reunion               1   250 0.004
Vanuatu               4  1219 0.0032813781788351
Cape Verde            1   403 0.0024813895781638
Iceland              24 10025 0.0023940149625935
Solomon Islands       6  2799 0.0021436227224009
Guatemala            21 10843 0.0019367333763719
Costa Rica            9  5106 0.0017626321974148
Japan                64 36450 0.0017558299039781
Armenia               4  2820 0.0014184397163121
Nicaragua            16 12140 0.0013179571663921

No comments: