- Iterated Union: Iterate over the polygons and union them one-by-one into a result polygon
- Buffer Union: Collect the polygons into a GeometryCollection and run buffer(0) on the collection
Since this is such a common operation, I've been keen to find a faster and more robust approach. An alternative strategy is to use a concept I call Cascaded Union. The idea is to union small subsets of the input together, then union groups of the resulting polygons, and so on recursively until the final union of all areas is computed. This can be thought of as a post-order traversal of a tree, where the union is performed at each interior node. If the tree structure is determined by the spatial proximity of the input polygons, and there is overlap or adjacency in the input dataset, this algorithm can be quite efficient. This is because union operations higher in the tree are faster, since linework is "merged away" from the lower levels.
This picture shows the algorithm in action:
Handily, the spatial indexes already in JTS can be used to determine an appropriate tree structure. Currently the Sort-Tile-Recursive packed R-Tree index is used - this seems to produce an effective tree (although I suspect that the algorithm is not very senstive to the precise spatial tree used)
It turns out that Cascaded Union can be much more efficient than Iterated Union. On a test dataset with 30,000 small polygons with a high degree of overlap, the performance results were:
Cascaded Union: 20 sec
Iterated Union: 3 hours 40 min !
Not bad - well worth the effort of coding... Also, it's more robust than Buffer Union, and often much faster as well.
This will be provided via the Geometry.union() (Unary union) method in JTS 1.9.