Interior points for country boundaries
There are some important use cases for this capability:
- Constructing a "proxy point" for a polygon to use in drill-down spatial query. This has all kinds of applications:
- recovering attribution after polygon coverage generalization
- determining parentage during polygon overlay
- faster and more robust spatial join between polygonal datasets
- Cartographic rendering including:
- Creating leader lines
- Placing labels for polygons (for which it is a quick solution but not necessarily a quality one. More on this in a later post)
JTS Scan-Line Interior Point Algorithm
- Determine a Y-ordinate which is distinct to every polygon vertex Y-ordinate and close to the centre of the vertical extent
- Draw a scan line across the polygon and determine the segments of intersection
- Choose the interior point as the midpoint of the widest intersection segment
Locating a polygon interior point along a scan-line
The current code has been in the JTS codebase since the release of the very first version back in 2001. It is elegantly simple, but is quite non-optimal, since it uses the overlay intersection algorithm. This is overkill for the computation of the scan-line intersection segments. It also has a couple of serious drawbacks: slow performance, and the requirement that the input polygon be valid. These are not just theoretical concerns. They have been noticed in the user community, and have caused client projects to have to resort to awkward workarounds. It's even documented as a known limitation in PostGIS.
Thanks to Crunchy Data recognizing the importance of geospatial, we've been able to look into fixing this. It turns out that a relatively simple change makes a big improvement. The scan-line intersections can be computed via a linear-time scan of the polygon edges. This works even for invalid input (except for a few pathological situations).
Interior points of invalid polygons
(LH invalid polygon shows suboptimal point placement)
Best of all, it's much faster - providing performance comparable to the (less useful) centroid computation. The performance results speak for themselves:
Dataset | # polys | # points | Time | Prev time |
Improvement |
---|---|---|---|---|---|
World countries |
244
|
366,951 | 25 ms | 686 ms |
x 27
|
Land Cover | 64,090 | 366,951 | 78 ms | 6.35 s |
x 81
|
More Ideas
Some further improvements that could be investigated:
- Use the centroid to provide the Y-ordinate. This is probably better in some situations, and worse in others. But perhaps there's a fast way to choose the best one?
- Use multiple scan-lines (both vertical and horizontal)
- Provide better handling of short/zero-width scan-line intersections
- Support clipping the interior point to a rectangle. This would provide better results for cartographic labelling
[1] JTS was based on the original OGC SFS specification (version 1.1). The spec actually does include a method Surface.pointOnSurface. The reason for the different choice of name is lost to time, but possibly the reasoning was that the JTS function is more general, since it handles all types of geometry. (One of the design principles of JTS is Geometric Uniformity, AKA No Surprises. Wherever possible spatial operations are generalized to apply to all geometric types. There is almost always a sensible generalization that can be defined, and it's often quite useful.)
[1a] Also, the acronym IPA is much better than the one for PointOnSurface.
[2] Apparently Oracle has decided to provide blazingly fast performance by simply returning a point on the boundary of the polygon. Thus proving the maxim that for every problem there is an solution which is simple, fast, and dead wrong.
No comments:
Post a Comment