Sunday, June 10, 2007

History of JTS and GEOS

It seems a good thing to record the history and provenance of JTS and GEOS, for the following reasons:
  • JTS and GEOS have attained a certain prominence in the open-source geospatial world. Sometimes comments are made which make assumptions about their goals and project structure which indicate a lack of knowledge about the true situation. Time to set the record straight!
  • The legal attribution of code provenance bears little relationship to the true origin of the intellectual property incorporated in these projects. It seems only fair that the people who actually contributed to these projects have their participation recorded.
  • As the designer/lead developer for these projects, I have the most knowledge about their origin and history.

JTS

JTS is the result of the fortuitous convergence of two situations. The first was that I had been working on spatial algorithms since the mid-90's. At that time I held a position in the BC Ministry of Forests, with a major focus being geomatics and spatial data management. While there I developed algorithms in C++ for polygonization and the beginnings of a polygon overlay algorithm. These were based on a simple spatial data model and included implementations of several fundamental computational geometric functions. A colleague, Brian Howden, was very encouraging in this work. Brian and I both felt that what the world really needed was a good spatial library, and this motivated me to start packaging my code as a reusable library. However, progress was slow and I did not feel like I had a clear idea of what such a library should look like. Moreover, library development in C++ was quite painful, since it requires such painstaking attention to details of memory management and code structure.

The second situation was my connection with Dr. Mark Sondheim, of the Geographic Data BC (GDBC) branch of the BC Ministry of Environment, Lands and Parks. I had known Mark since the early 90's, and had interacted with him on numerous occasions while I was working for the BC Government. Over the years we had many stimulating discussions about geospatial processing, and shared a similar interest in increasing the accessibility and formality of spatial data processing.

The JTS Topology Suite formally started life as a project conceived and initiated by Mark. The name and acronym were his idea, as was the concept of using the OGC Simple Features Specification as the basis for the API design. This latter idea was a key choice, since the SFS strikes a good balance between functionality and design complexity. Mark was confident that the SFS geometry model provided everything needed for most geospatial work, and time has proven him correct.

By 2000 I had moved to Vivid Solutions Inc. as a consultant, working primarily in Java and as much as possible on geospatial projects. While there I worked on a couple of spatial projects for GDBC which were Java based (most notably, a streaming parser for SAIF, a forerunner of GML developed by Mark Sondheim and others). I was also working on various projects which utilized my C++ spatial library (including some Java-based ones, necessitating the awkward and limiting use of JNI). Howver, there was never sufficient time or budget to contemplate re-implementing the C++ geometry library in Java (and, I have to admit, perhaps not the vision on my part that this would be as useful as it has turned out. In my defense, at that time Java was still proving itself as an efficient tool for development work).

Luckily, Mark did have the vision, and the access to funding. As a result, the JTS project was initiated in Fall 2000. The project goal was very clear: to develop a Java API which implemented the OGC Simple Features Specification. Given my experience and previous work, I suspect it was an easy choice to award the contract to Vivid Solutions. The project team consisted of myself as designer and lead developer, and Jonathan Aquino as a developer.

The project charter did not highlight the requirement for efficiency and robustness, since Mark wisely realized that these goals would have to be proven to be attainable during the course of the project. In hindsight the need for efficient performance should have been obvious; the need for robustness perhaps less so. (JTS is not exceptional in this area - a disappointly large percentage of published computational geometry algorithms fail to address this issue.) In any case, once the basic functionality was achieved, these two aspects emerged as much more important. Even today, they are still the most important issues in JTS development.

David Skea (at that time also with GDBC) contributed valuable direction and guidance. Most notably, it was his recommendation that an explicit precision model be provided, and he provided much insight about implementing robust spatial algorithms.

At the outset of the project it seemed that the main challenges were going to be designing the algorithms for spatial relationships and polygon overlay. Buffering was lurking in the background with an unknown level of difficulty. Mike Butler (the "father of SDE") had warned me that buffering would be a significant challenge, but since at that time he was still under NDA to ESRI, he did not provide me with any implementation advice. Later in the project when I tackled buffering I found out how right he was!

Development proceeded fairly quickly, with myself tackling API and algorithm design, and Jon working on I/O and implementations of structural methods. Jon also worked on the Unit Test facilities. David Skea contributed the core of the line segment intersection implementation, including the key development of deVillier's robust 2x2 determinant algorithm. Yao Cui of GDBC did good work on categorizing and defining test cases for the spatial predicates (which now appear as part of the JTS validation tests, a useful body of work in its own right).

In the middle of the project I realized that spatial visualization would be essential for designing and debugging spatial algorithms, and I also realized that Java2D and Swing would provide a great platform to build a visualization tool. This led to the development on the TestBuilder as an important component in the JTS toolkit. Jon and I shared development of the TestBuilder. This experience was very valuable in our subsequent development of JUMP.

JTS 1.0 was released in February 2002. The scheduling of this release was primarily motivated by contractual requirements. As soon as it was released I continued work on validation and buffer improvements. This resulted in the release of Version 1.1 in March 2002. At this point the contract with GDBC came to an end. I was encouraged by the success of the JTS development, and I made a personal commitment to continue to enhance JTS with better and more algorithms.

Subsequent releases of JTS have been motivated by my desire to see the library grow, by bug reports and suggestions from clients and users, and occasionally by directly-funded work (notably, work I carried out in 2006 on improving the robustness of polygon overlay). JTS has moved beyond the basic SFS by providing functionality such as polygonization, simplification, linear referencing, affine transformations, and a wide variety of structural functions. One day perhaps it will even provide a true topology API !

It's fair to say that JTS has been wildly successful in fulfilling the goal of providing a full-featured, robust, efficient library of spatial operations. It is being used in numerous spatial applications, notably:

  • JUMP (and derived projects)
  • GeoTools (and derived projects),
  • the BC Goverment/Moxie Media Internet Mapping Framework
There are many, many other applications and projects using JTS as well.

GEOS

In 2003 PostGIS was emerging as a serious and useful spatial database. However, there was one thing it sorely lacked - a complete set of spatial functions. Paul Ramsey and I strategized that the functions in JTS were an excellent basis for filling this gap. The big catch was that PostGIS required a pure C implementation. Never being one to shy away from thinking big, Paul proposed that we port JTS to C++. At that time Dave Blasby was still involved with PostGIS development, and it was he who came up with the name of Geometry Engine (Open Source) - GEOS.

Initially the GEOS project was funded jointly by Refractions Research Inc. and Vivid Solutions, with myself as Technical Advisor (and to a limited extent Architect), and a University of Victoria Master's student named Yury Bychov as developer and designer. Since I was busy on other projects, and in any case would not claim to be an expert in the byzantine area of C++ code design, the plan was that Yury would provide any extra design needed to accomodate differences between Java and C++, and I would advise as much as possible on the purely functional design of the API.

We knew that C++'s lacks of automated memory management would be a major challenge, and I also expected that its much more flexible (some would say baroque) features for code organization (e.g. templates, namespaces) would require some careful thought. I had no idea that the porting process would be quite so painful and prolonged, however. In the end it took longer to port JTS to C++ than it did to develop the entirety of JTS up to Version 1.1! [This is the main reason why my strategy for designing and implementing spatial algorithms remains focussed on Java as the development language, with the GEOS port happening only after the algorithms are relatively stable and well-understood. In my opinion C++ is less effective for algorithm design since it requires too much brainpower devoted to irrelevant issues.]

Since its initial release GEOS has been incorporated in PostGIS, of course, and also in MapGuide and apparently in at least a few other C-based projects. GEOS code has been incorporated by Safe Software in their FME product. MapServer has announced plans to incorporate GEOS in their codebase (the status of this is unknown to me). Frank Warmerdam's OGR project I believe uses GEOS.

GEOS also continues to undergo development, both in the orginal mode of direct ports of JTS code, and also to a certain extent with independent functionality (primarily involving making it better adapted to C++ programming styles).

Martin Davis, May 2007

5 comments:

Jonathan said...

Subscribed!

Mark said...
This comment has been removed by the author.
Mark said...

I thank Martin for his kind comments. However, I would like to emphasize or clarify the role of Dave Skea. In addition to his technical dialogue with Martin, Dave and I spent considerable time early on discussing the need for something like JTS. GeoConnections provided partial support for both the initial JTS development and the subsequent JCS effort. The proposals to GeoConnections were written jointly by Dave and me.

Martin also describes the role played by Yao Cui. Both Martin and Yao put together a very large number of validation tests, and Yao's taxonomy is of particular note in its own right, as Martin states. The validation work done by both of them is of fundamental interest.

Finally it's worth mentioning that JTS and GEOS were - and continue to be - of tremendous practical benefit to a number of critical, operational projects because of their incorporation into the various tools that Martin describes. Suffice it to say here that the geospatial community is richer for Martin's original and continuing efforts with JTS and GEOS.

mloskot said...

Yes, I confirm OGR has an option to use GEOS as a geometry engine and in most cases users build and OGR+GEOS.

Personally, since a year, I've been testing GEOS on mobile devices based on Windows CE system with very good results, so I'd say GEOS is 100% usable here.

Peter said...

JTS and its descendants, and derived projects have already changed the world. When I first laid eyes on JUMP 2004-ish, I said to my self "holy s**t!" The future is still unfolding. Kudos to the visionairies who pulled this off! Don't stop now!