Now that OverlayNG has landed on JTS master, it is getting attention from downstream projects interested in porting it or using it. One of the JTS ports is the Net Topology Suite (NTS) project, and it is very proactive about tracking the JTS codebase. Soon after the OverlayNG commit an NTS developer noticed an issue while running the performance tests in JTS: the InteriorPointInAreaPerfTest was now causing a StackOverflowError. This turned out to be caused by the change to GeometryPrecisionReducer to use OverlayNG with Snap-Rounding to perform robust precision reduction of polygons. Further investigation revealed that failure was occurring while querying the KdTree used in the HotPixelIndex in the SnapRoundingNoder. Moreover, in addition to the outright failure, even when queries did succeed they were taking an excessively long time for large inputs.
The reason for using a K-D tree as the search structure for HotPixels is that it supports two kinds of queries with a single structure:
- Lookup queries to find the HotPixel for a coordinate. These queries are performed while building the index incrementally. Hence a dynamic data structure like a K-D tree is required, rather than a static index such as STRtree.
- Range queries to find all HotPixels within a given extent. These queries are run after the index is loaded.
The JTS KdTree supports both of these queries more efficiently than QuadTree, which is the other dynamic index in JTS. However, K-D trees are somewhat notorious for becoming unbalanced if inserted points are coherent (i.e. contain monotonic runs in either ordinate dimension). This is exactly the situation which occurs in large, relatively smooth polygons - which is the test data used in InteriorPointInAreaPerfTest. An unbalanced tree is deeper than it would be if perfectly balanced. In extreme cases this leads to trees of very large depth relative to their size. This slows down query performance and, because the KdTree query implementation uses recursion, can also lead to stack overflow.
A perfectly balanced K-D tree, with depth = logN. In the worst case an unbalanced tree has only one node at each level (depth = N).
I considered a few alternatives to overcome this major blocker. One was to use a QuadTree, but as mentioned this would reduce performance. There are schemes to load a K-D tree in a balanced way, but they seemed complex and potentially non-performant.
It's always great when people who file issues are also able to provide code to fix the problem. And happily the NTS developer submitted a pull request with a very nice solution. He observed that while the K-D tree was being built incrementally, in fact the inserted points were all available beforehand. He proposed randomizing them before insertion, using the elegantly simple Fisher-Yates shuffle. This worked extremely well, providing a huge performance boost and eliminated the stack overflow error. Here's the relative performance times:
|Num Pts||Randomized||In Order|
|10K||126 ms||341 ms|
|20K||172 ms||1924 ms|
|50K||417 ms||12.3 s|
|100K||1030 ms||59 s|
|200K||1729 ms||240 s|
Once the solution was merged into JTS, my colleague Paul Ramsey quickly ported it to GEOS, so that PostGIS and all the other downstream clients of GEOS would not encounter this issue.
It's surprising to me that this performance issue hasn't shown up in the other two JTS uses of KdTree: SnappingNoder and ConstrainedDelaunayTriangulator. More investigation required!
You can contact me for a peaceful time frame, a proper Call Girls in Vasant Vihar conference and even an occasion or a capacity; I will take a position right nearby and will complement and improve your area amongst the team. First of all hiring Russian Call Girls in Aerocity, Delhi does not mean you will get infected with immune diseases. Our Agency is a genuine and trusted escort agency and provides a call girl who is educated and well mannered.
Escorts Service in Aerocity
Escorts Service in Mahipalpur
Body Massage in Kapashera, Delhi
Massage Parlor in Manesar
Post a Comment