Intersection of MultiPolygons
Overlay operations are used in many kinds of spatial processes. Any system aspiring to provide full-featured geometry processing simply has to provide overlay operations. In fact, many geometry libraries exist solely to provide implementations of overlay. Notable libraries include the ESRI Java API, Clipper and wagyu. Some of these provide overlay only for polygons, which is the most difficult case to compute.
Overlay in JTS
The JTS overlay algorithm supports the full OGC SFS geometry model, allowing any combination of polygons, lines and points as input. In addition, JTS provides an explicit precision model, to allow constraining output to a desired precision. The overlay algorithm is also available in C++ in GEOS. There it provides overlay operations in numerous systems, including PostGIS, QGIS, Shapely, and r-sf. This codebase has had an long lifespan; it was developed back in 2001 for the very first release of JTS, and while there have been improvements over the years the core of the design has remained unchanged.
However, there are some long-standing issues with JTS overlay. The most serious one is that in spite of much valiant effort over the years, overlay is not fully robust. The constructive nature of overlay operations makes them particularly susceptible to the robustness issues which are notorious in geometric algorithms using floating-point numerics. It can happen that running an overlay operation on seemingly innocuous, valid inputs results in the dreaded TopologyException being thrown. There is a steady trickle of issue reports about this in JTS, and even more for GEOS (such as here, here and here...).
Another issue is that the codebase is complex, and thus hard to debug and modify. Partly this is because of the diversity of inputs and the explicit precision model. To support this the JTS overlay algorithm has a rich and detailed semantics. But some of the complexity is due to the original design of the code. This makes it difficult to incorporate new ideas for improvements in performance and robustness.
Next Generation Overlay
So for many years it's been on my mind that JTS overlay needs a thorough overhaul. I chipped away at the problem over time, but it was clear that it was going to be a major effort. Now, thanks to the support of my employer Crunchy Data, I've at last been able to focus on a complete rewrite of the JTS overlay module. It's called OverlayNG.
The basic algorithm remains the same:
- Extract the input linework, and node it together
- Build a topology graph from the noded linework
- Compute a full topological labelling of the graph
- Extract the resultant polygons, lines and points from the graph
This algorithm is time-tested and is able to handle the complexities of multiple geometry types and topology collapse. The new codebase benefits from 20 years of experience to become simpler and more modular, with increased testability, and potential for reuse.
- A snap-rounding noder is available to make overlay fully robust. This eliminates the possibility of TopologyExceptions (when an appropriate precision model is used).
- Snap-rounding allows full support for specifying the output precision model. The precision model can be specified independently for each overlay call, which is more flexible and easier to use. The use of snap-rounding also provides fully valid precision reduction for geometries. This makes it feasible for the first time to fully operate in a fixed-precision regime.
Precision Reduction turned all the way up to 11
- Significant performance optimizations are included (notably, one which makes polygon intersection much faster in many cases)
- Pluggable noding allows providing different noding strategies. One use is to run OverlayNG with the original floating-point noder, which is faster than snap-rounding (but of course has the robustness issues noted above). Another is to use a special-purpose noder to provide very fast polygonal coverage union.
Union of a polygonal coverage (10x faster with OverlayNG)
- A modular and cleaner codebase allows easier testability, maintenance, enhancement and reuse. A winged-edge graph model is used for the topology graph. This is simpler and less memory intensive.
- The rebuild gives an opportunity to make some semantic improvements:
- Empty results are returned as empty atomic geometries of appropriate type, rather than awkward-to-handle empty GeometryCollections
- Linear output is merged node-to-node. This gives union a more natural and useful semantic
A benefit of the new codebase is that it is easier to enhance and extend. For example, it should be straightforward to finally provide a SplitPolygon function for JTS. Another potential extension is overlay for Polygonal Coverages.
Code that is so widely used needs to be thoroughly tested against real-world workloads. Initially OverlayNG will be released as a separate API in JTS. This allows it to be used along with the original overlay. It can be used as a fallback for cases which fail in the original overlay process. Once the new code has been proved out in real world use, it is likely to become the standard overlay code path. Also, the code will be ported to GEOS soon, where we're hoping it will provide significant benefits to the many systems that use GEOS.
I'll be posting more articles about aspects of OverlayNG soon. The code is almost ready to release, after some final testing. In the meantime, the pre-release code is available in a Git branch. It would be great to get as much beta-testing as possible before final release, so try it out and log some feedback!