Recently JTS gained the ability to compute Concave Hulls of point sets. The algorithm used is based the Chi-shapes approach described by Duckham et al. It works by eroding border triangles from the Delaunay Triangulation of the input points, in order of longest triangle edge length, down to a threshold length provided as a parameter value.
Alpha-Shapes
Another popular approach for computing concave hulls is the Alpha-shape construction originally proposed by Edelsbrunner et al. Papers describing alpha-shapes tend to be lengthy and somewhat opaque theoretical expositions, but the actual algorithm is fairly simple. It is also based on eroding Delaunay triangles, but uses the circumradius of the triangles as the criteria for removal. Border triangles whose circumradius is greater than alpha are removed.
Another way of understanding this is to imagine triangles on the border of the triangulation being "scooped out" by a disc of radius alpha:
Given the similar basis of both algorithms, it was easy to generalize the JTS ConcaveHull class to be able to compute alpha-shapes as well. This is now available in JTS via the ConcaveHull.alphaShape function.
Seeking Alpha
An issue that became apparent was: what is the definition of alpha? There are at least three alternatives in use:
- The original Edelsbrunner 1983 paper defines alpha as the inverse of the radius of the eroding disc.
- In a later 1994 paper Edelsbrunner defines alpha as the radius of the eroding disc. The same definition is used in his 2010 survey of alpha shape research.
- The CGAL geometry library implements alpha to be the square of the radius of the eroding disc. (The derivation of this definition is not obvious to me; perhaps it is intended to make the parameter have areal units?)
The simplest option is to define alpha to be the radius of the eroding disc. This has the additional benefit of being congruent with the edge-length parameter. For both parameters, 0 produces a result with maximum concaveness, and for values larger than some (data-dependent) value the result is the convex hull of the input.
Alpha-Shape VS Concave Hull
An obvious question is: how do alpha-shapes differ to edge-length-based concave hulls? Having both in JTS makes it easy to compare the hulls on various datasets. For comparison purposes the alpha and edge-length parameters are chosen to produce hulls with the same area (as close as possible).
Here's an illustration of an Alpha-shape and a Concave Hull generated for the same dataset, with essentially identical area.
Overall the shapes are fairly similar. Alpha-shapes tend to follow the data points more closely in "shallow bays", whereas the concave hull tends to include them. Conversely, the concave hull can sometimes erode deeper bays. The effect of keeping shallow bays is to make the Concave Hull slightly smoother than the Alpha-shape.Avoiding Cavities
However, there is one way in which alpha-shapes can be less well-formed than concave hulls. The measurement of a circumradius is independent of the orientation of the triangle. This means it is not sensitive to whether the triangle has a long or short edge on the border of the triangulation. This can result in what I am calling "cavitation". Cavitation occurs where narrow triangles reaching deep into the interior of the dataset are removed. This in turn exposes more interior triangles to being eroded. This is visible in the example below, where the Alpha-shape contains a large cavity which is obviously not desirable.
Recommendation
Future Work
- Best of Both? - perhaps there is a strategy which combines aspects of both Alpha-shapes and Concave Hulls, to erode shallow bays but avoid the phenomenon of cavitation?
- Disconnected Hulls - the "classic" Alpha-shape definition allows disconnected MultiPolygons as a result. Currently, the JTS algorithm always produces a result which is a connected single polygon. This seems like the most commonly desired behaviour, but there is a possibility to extend the implementation to optionally allow a disconnected result . CGAL provides a parameter to control the number of polygons in the result, which could be implemented as well.