Saturday 20 June 2020

JTS OverlayNG - Tolerant Topology Transformation

This is another in a series of posts about the new OverlayNG algorithm being developed for the JTS Topology Suite. (Previous ones are here and here).  Overlay is a core spatial function which allows computing the set-theoretic boolean operations of intersection, union, difference, and symmetric difference over all geometry types. OverlayNG introduces significant improvements in performance, robustness and code design.

JTS has always provided the ability to specify a fixed-precision model for computing geometry constructions (including overlay).  This ensures that output coordinates have a defined, limited precision.  This can reduce the size of data transfers and storage, and generally leads to cleaner, simpler geometric output.  The original overlay implementation had some issues with robustness, which were exacerbated by using fixed-precision.  One of the biggest improvements in OverlayNG is that fixed-precision overlay is now guaranteed to be fully robust.  This is achieved by using an implementation of the well-known snap-rounding noding paradigm. 

Geometric algorithms which operate in a fixed-precision model can encounter situations called topology collapse.  This happens when line segments and points become coincident due to vertices or intersection points being rounded.  The OverlayNG algorithm detects occurrences of topology collapse and transforms them into valid topology in the overlay result.

Topology collapse during overlay with a fixed precision model

As a bonus, handling topology collapse during the overlay process also allows it to be tolerated when present in the original input geometries.  This means that some kinds of "mildly" invalid geometry (according to the OGC model) are acceptable as input.  Invalid geometry is transformed to valid geometry during the overlay process.

Specifically,  input geometry may contain the following situations, which are invalid in the OGC geometry model:
  • A ring which self-touches at discrete points (the so-called "inverted polygon" or "exverted hole")
  • A ring which self-touches in one or more line segments
  • Rings which touch other ones along one or more line segments 
Note that this does not extend to handling polygons that overlap, rather than simply touch.  These are "strongly invalid", and will trigger a TopologyException during overlay.

An interesting use for this capability is to process individual geometries.  By simply computing the union of a single geometry the geometry is transformed into an OGC-valid geometry.  In this way OverlayNG functions as a (partial) "MakeValid" operation.  
A polygon which self-touches in a line transforms to a valid polygon with a hole

A polygon which self-touches in a point transforms to a valid polygon with a hole

A collection of polygons which touch in lines transforms to a valid polygon with a hole

Moreover, some spatial systems use geometry models which do not conform to the OGC semantics.  Some systems (such as ArcGIS) actually specify the use of inverted polygons and exverted holes in their topology model.  And in days of yore there were systems which were unable to model holes explicitly, and so used a "connected hole" topology hack (AKA "lollipop holes".) This represented  holes as an inversion connected by a zero-width corridor to the polygon shell. Both of these models are accepted by OverlayNG. Thus it provides a convenient way to convert from these non-standard models into OGC-valid topology. 

This is one more reason why overlay is the real workhorse of spatial data processing!

Thursday 18 June 2020

JTS OverlayNG - Noding Strategies

In a previous post I unveiled the exciting new improvement in the JTS Topology Suite called OverlayNG.  This new implementation provides significant improvements to the core function of spatial overlay.  Overlay supports computing the set-theoretic boolean operations of intersection, union, difference, and symmetric difference over all geometric types.

One of the design goals of JTS is to create modular, reusable data structures and processes to implement spatial algorithms.  This increases development velocity and testability, and makes algorithms easier to understand.  In spatial algorithms it is not always obvious how to identify appropriate abstractions for reuse, so this is an on-going effort of design and refactoring.

After the implementation of spatial overlay in the very first release of JTS, it became clear that overlay can be split into the following phases:

  1. Noding, in which an set of possibly-intersecting linestrings is converted to an arrangement in which linestrings touch only at endpoints
  2. Topology Analysis, during which the topology graph of the noded arrangement is determined
  3. Result Extraction, in which the geometric components of the desired result are extracted from the topology graph

It also became clear that the Noding phase is critical, since it determines the overall performance and robustness of the overlay computation.  Moreover, tradeoffs between these two qualities can be made by using different noding strategies.  For instance, the "classic" JTS noding approach is fast, but susceptible to robustness issues.  Alternatively, noding using the well-known snap-rounding paradigm is slower, but can be made fully robust. 

To encapsulate this concept, JTS introduced the Noder API.  Since it post-dated the original overlay code, using it in overlay had to await a reworking of that codebase.  The OverlayNG project provided this opportunity.  OverlayNG allows supplying a specific Noder class to be used during overlay. 

One of the main goals of the OverlayNG project was to develop a noder to provide fully robust noding.  This would eliminate the notorious TopologyException errors which bedevil the use of overlay.  The effort has paid off with the development of not one, but two new noders.  The Snapping Noder has very good performance and (with the addition of some heuristics, and so far as is known) provides robust full-precision evaluation.  And the Snap-Rounding noder provides guaranteed robustness as well the ability to enforce a fixed-precision model for output.

So now OverlayNG can be run with the following suite of noders, depending on use case.  The images show the result of intersection and union on the following geometries:

Fast Full-Precision Noder

The MCIndexNoder noding strategy has been available since the early days of JTS. It has very good performance due to the use of monotone chains and the STRtree spatial index.  However, it is a relatively simple algorithm which due to numerical robustness issues does not always produce a valid noding.  In overlay it is always used in conjunction with a noding validator, so that noding failure can be detected and an alternative strategy used to perform the operation successfully.

Intersection and Union with full-precision floating noding

Snapping Noder

The SnappingNoder is a refinement of the MCIndexNoder which snaps existing input and computed intersection vertices together, if they are closer than a snap distance tolerance.  This dramatically improves the robustness of the noding, with only minor impact on performance. 

Noding robustness issues are generally caused by nearly coincident line segments, or by very short line segments.  Snapping mitigates both of these situations.  The choice of snap tolerance is a heuristic one.  Generally, a smaller snap distance has less chance of distorting the topology, but it need to be large enough to resolve intersection computation imprecision.  In practice, excellent robustness is provided by using a very small snap distance (e.g. a factor of 10^12 smaller than the geometry magnitude).

Snapping of course risks creating topology collapses, but OverlayNG  is designed to handle these correctly.  However, there are occasional situations where the snapped arrangement is too invalid to  be handled.  This can be detected, and with some simple heuristic adjustments (e.g. a more aggressive  snap distance) the overlay can be rerun.  This strategy has proven to be fully robust in all cases tried so far.

Intersection and Union with Snapping Noding (snap tolerance = 0.5)

Snap-Rounding Noder

The SnapRoundingNoder implements the well-known snap-rounding paradigm.  It provides fully robust noding by rounding and snapping linework to a fixed-precision grid.  This has the unavoidable effect of rounding every output vertex to the precision grid. This may or may not be desirable depending on the situation.  A useful side effect is that it provides an effective means of reducing the precision of geometries in a topologically valid way. 

In the early stages of OverlayNG design and development I expected that snap-rounding would be required to ensure fully-robust overlay, in spite of the downside of fixed-precision output.  But the development of the SnappingNoder and accompanying heuristics means that this noder need only be used when control over overlay output precision is desired. 

Using an appropriate precision model is a highly worthwhile goal in spatial data management, since it reduces the amount of memory needed to represent data, and improves robustness and portability.  This is unfortunately often neglected, mostly due to lack of tools available to enforce it.  Hopefully this capability will encourage users to maintain a precision model which is better matched to the true precision of their data.

Intersection and Union with Snap-Rounding noding (precision scale = 1)

Segment Extracting Noder

This is a special-purpose noder which is really more of a "non-noder". It simply extracts every line segment in the input.  It is used on geometry collections which form valid, fully-noded, non-overlapping polygonal coverages.  When used with OverlyNG, this has the effect of dissolving the duplicate line segments and producing the union of the input coverage. By taking advantage of the structure inherent in the coverage model the SegmentExtractingNoder offers very fast performance.  It can also operate on fully-noded linear networks. 

Union of a Polygonal Coverage with SegmentExtractingNoder

The support for pluggable noding and the development of a suite of fast and/or robust noders constitutes the biggest advance of the OverlayNG code.  It finally allows JTS to provide fully robust noding and true support for a fixed-precision model!  This has been a dream of mine for more than a decade.  It's good to think that the end of the era of TopologyException issues is in sight!