Saturday, June 9, 2007

Quirks of the "Contains" Spatial Predicate

One of the most useful OGC standards is their specification of the Dimensionally-Extended 9 Intersection Model for spatial relationships. (To give credit where is it due, this was originally developed by Egenhofer, Clementini and others). In its most general form this model is quite complex, so to make it usable for mortal programmers, a set of commonly-useful "named predicates" have been specified. These include obvious relationships such as intersects, disjoint, touches, equals, and contains.

But are they really so obvious? In fact not - several of them have subtle aspects to their definition which are contrary to intuition.

In particular, "contains" (and its converse "within") has an aspect of its definition which may produce unexpected behaviour. This quirk can be expressed as "Polygons do not contain their boundary". More precisely, the definition of contains is:

Geometry A contains Geometry B iff no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A

That last clause causes the trap - because of it, a LineString which is completely contained in the boundary of a Polygon is not considered to be contained in that Polygon!

This behaviour could easily trip up someone who is simply trying to find all LineStrings which have no points outside a given Polygon. In fact, this is probably the most common usage of contains. For this reason it's useful to define another predicate called covers, which has the intuitively expected semantics:

Geometry A covers Geometry B iff no points of B lie in the exterior of A

Its converse coveredBy is also useful as well. It's a bit of a mystery why OGC did not define this predicate. In any case, Oracle Spatial does provide this predicate. I have added it to JTS as well.

There's a further bonus to defining the covers predicate. It is much easier to optimize the common use case:

covers(Rectangle, Geometry)

All that is necessary to determine this condition is to perform a simple bounding box comparison. This is not possible with contains, because even if the bounding box of Geometry is covered by the Rectangle, a further expensive operation is required to test if the Geometry lies wholly in the boundary of the Rectangle (in which case the predicate fails).

Covers "simplifies" the defintion of contains by making it more general (inclusive). There's another possibility as well, which is to simplify in the direction of being less inclusive. This would produce a predicate which might be called containsProperly, with the definition:

Geometry A containsProperly Geometry B iff all points of B lie in the interior of A

Interestingly, some recent work I'm doing indicates that this predicate might play a very useful role - but that's a story for another post.


References

DE-9IM: Clementini, Eliseo, Di Felice, and van Osstrom; "A Small Set of Formal Topological Relationships Suitable for End-User Interaction," D. Abel and B.C. Ooi (Ed.), Advances in Spatial Database—Third International Symposium. SSD '93. LNCS 692. Pp. 277-295.

9 Intersection model: M. J. Egenhofer and J. Herring; "Categorizing binary topological relationships between regions, lines, and points in geographic databases," Tech. Report, Department of Surveying Engineering, University of Maine, Orono, ME 1991.

3 comments:

Tobias said...

Oracle seems to have a different understanding of 'Covers':

COVERS: The interior of one object is completely contained in the interior or the boundary of the other object and their boundaries intersect.

http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11830/sdo_intro.htm#i880253

Dr JTS said...

Yes, good point. I'd overlooked that detail in their definition. And it looks like their definition follows the original proposal by Egenhofer in Point-Set Topological Spatial Relations.

But I have to say that the Egenhofer definitions seem to be pedantic rather than actually useful. I can't imagine when I'd use the Oracle definition, whereas I think the JTS definition of "covers" is obviously useful (for reasons I explained in the post. Plus it's more efficient to execute.

So I'm not really inclined to change it...

bi t said...

as I can find the boundary of a polygon example a Property bordered on the north property B, east by the C, etc.... my email:bitduarte@gmail.com