Monday, 13 March 2023

Simplifying Polygonal Coverages with JTS

A new capability for the JTS Topology Suite is operations to process Simple Polygonal Coverages.  A Simple Polygon Coverage is a set of edge-matched, non-overlapping polygonal geometries (which may be non-contiguous, and have holes).  Typically this is used to model an area in which every point has a value from some domain.  A classic example of a polygonal coverage is a set of administrative boundaries, such as those available from GADM or Natural Earth.

GADM polygonal coverage for France Level 1   ( 198,350 vertices)

The first coverage operations provided are:

  • Coverage Validation, to check if a set of polygons forms a topologically-valid coverage
  • Coverage Union, which takes advantage of coverage topology to provide a very fast union operation

Another operation on polygonal coverages is simplification.  Simplifying a coverage reduces the number of vertices it contains, while preserving the coverage topology.  Preserving topology means that the simplified polygons still form a valid coverage, and that polygons which had a common edge in the input coverage (i.e. which were adjacent) still share an edge in the simplified result.  Reducing dataset size via simplification can provide more efficient storage and download, and faster visualization at smaller scales.

France coverage simplified with tolerance 0.01   (7,918 vertices)
The decrease in resolution is hardly noticeable at this scale

Closeup of simplified coverage (tolerance = 0.01)

Simplification is perhaps the most requested algorithm for polygonal coverages (for example, see GIS StackExchange questions here, here and here.)  My colleague Paul Ramsey sometimes calls it the "killer app" for polygonal coverages.  Earlier this century there was no easily-available software implementing this capability.  Users often had to resort to the complex approach of extracting the linework from the dataset, dissolving it, simplifying the lines (with a tool which would not cause further overlaps), re-polygonizing, and re-attaching feature attribution to the result geometries. 

More recently tooling has emerged to provide this functionality.  Simplification is the raison-d'etre of the widely-used and cited MapShaper tool.  GRASS has the v.generalize module.  And good old OpenJUMP added Simplify Polygon Coverage a while back.  

JTS has provided the TopologyPreservingSimplifier algorithm for many years, but this only operates on Polygons and MultiPolygons, not on polygonal coverages.  (Attempting to use it on polygonal coverages can introduce gaps and overlaps, resulting in complaints like this.)  But coverage simplification has been lacking in JTS/GEOS - until now.

JTS Coverage Simplification

Recent work on Polygon Hulls provided ideas for an implementation of coverage simplification. The CoverageSimplifier class uses an area-based simplification approach similar to the well-known Visvalingam-Whyatt algorithm. This provides good results for simplifying areal features (as opposed to linear ones).  It's possible to use a Douglas-Peucker based approach as well, so this may be a future option.

The degree of simplification is determined by a tolerance value.  The value is equivalent roughly to the maximum distance a simplified edge can change (technically speaking, it is the square root of the area tolerance for the Visvalingam-Whyatt algorithm).  

The algorithm progressively simplifies all coverage edges, while ensuring that no edges cross another edge, or touch at endpoints.  This provides the maximum amount of simplification (up to the tolerance) while still maintaining the coverage topology.

The coverage of course should be valid according to the JTS CoverageValidator class.  Invalid coverages can still be simplified, but only edges with valid topology will have it maintained.

France coverage simplified with tolerance = 0.1   ( 1,928 vertices)

France coverage simplified with tolerance = 0.5    ( 1,527 vertices)
The coverage topology (adjacency relationship) is always preserved.

Inner Simplification

The implementation also provides the interesting option of Inner Simplification.  This mode simplifies only inner, shared edges, leaving the outer boundary edges unchanged.  This allows portions of a coverage to be simplified, by ensuring that the simplified polygons will fit exactly into the original coverage.  (This GIS-SE question is an example of how this can be used.)

Inner Simplification of France coverage

It would also be possible to provide Outer Simplification, where only outer edges are simplified.  It's not clear what the use case would be for this - if you have ideas, leave a comment!

GEOS and PostGIS

As usual, this algorithm will be ported to GEOS, from where it will be available to downstream projects such as PostGIS and Shapely.  

For PostGIS, the intention is to expose this as a window function (perhaps ST_CoverageSimplify).  That will make it easy to process a set of features (records) and maintain the attributes for each feature.

1 comment:

Marco B said...

Great to see this! This will definitely be a big game changer once it is incorporated in PostGIS as well.

Seeing the area based ST_SimplifyVW in PostGIS already generating results with very few gaps and overlaps in polygonal coverages, always made me wonder if we couldn't have a simplification algorithm that was entirely faultless for these kind of cases without resorting to node/edge/junction type topologies.

Now seeing such functionality being developed, is really nice.

It also nicely supplements the recent work done on osm2pgsql and the new flex styles, where e.g. Paul Norman managed to develop a Lua style taking advantage of the new flex options of osm2pgsql to de-duplicate OSM boundary lines (, allowing for more advanced open hatch type cartographic styling of boundary lines without the issues of overlapping symbols for duplicated lines for different admin_levels and bordering polygons that would otherwise occur.