Saturday, 9 June 2007

Quirks of the "Contains" Spatial Predicate

One of the most useful OGC standards is their specification of the Dimensionally-Extended 9 Intersection Model for spatial relationships. (To give credit where is it due, this was originally developed by Egenhofer, Clementini and others). In its most general form this model is quite complex, so to make it usable for mortal programmers, a set of commonly-useful "named predicates" have been specified. These include obvious relationships such as intersects, disjoint, touches, equals, and contains.

But are they really so obvious? In fact not - several of them have subtle aspects to their definition which are contrary to intuition.

In particular, "contains" (and its converse "within") has an aspect of its definition which may produce unexpected behaviour. This quirk can be expressed as "Polygons do not contain their boundary". More precisely, the definition of contains is:

Geometry A contains Geometry B iff no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A

That last clause causes the trap - because of it, a LineString which is completely contained in the boundary of a Polygon is not considered to be contained in that Polygon!

This behaviour could easily trip up someone who is simply trying to find all LineStrings which have no points outside a given Polygon. In fact, this is probably the most common usage of contains. For this reason it's useful to define another predicate called covers, which has the intuitively expected semantics:

Geometry A covers Geometry B iff no points of B lie in the exterior of A

Its converse coveredBy is also useful as well. It's a bit of a mystery why OGC did not define this predicate. In any case, Oracle Spatial does provide this predicate. I have added it to JTS as well.

There's a further bonus to defining the covers predicate. It is much easier to optimize the common use case:

covers(Rectangle, Geometry)

All that is necessary to determine this condition is to perform a simple bounding box comparison. This is not possible with contains, because even if the bounding box of Geometry is covered by the Rectangle, a further expensive operation is required to test if the Geometry lies wholly in the boundary of the Rectangle (in which case the predicate fails).

Covers "simplifies" the defintion of contains by making it more general (inclusive). There's another possibility as well, which is to simplify in the direction of being less inclusive. This would produce a predicate which might be called containsProperly, with the definition:

Geometry A containsProperly Geometry B iff all points of B lie in the interior of A

Interestingly, some recent work I'm doing indicates that this predicate might play a very useful role - but that's a story for another post.


References

DE-9IM: Clementini, Eliseo, Di Felice, and van Osstrom; "A Small Set of Formal Topological Relationships Suitable for End-User Interaction," D. Abel and B.C. Ooi (Ed.), Advances in Spatial Database—Third International Symposium. SSD '93. LNCS 692. Pp. 277-295.

9 Intersection model: M. J. Egenhofer and J. Herring; "Categorizing binary topological relationships between regions, lines, and points in geographic databases," Tech. Report, Department of Surveying Engineering, University of Maine, Orono, ME 1991.

5 comments:

Unknown said...

Oracle seems to have a different understanding of 'Covers':

COVERS: The interior of one object is completely contained in the interior or the boundary of the other object and their boundaries intersect.

http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11830/sdo_intro.htm#i880253

Dr JTS said...

Yes, good point. I'd overlooked that detail in their definition. And it looks like their definition follows the original proposal by Egenhofer in Point-Set Topological Spatial Relations.

But I have to say that the Egenhofer definitions seem to be pedantic rather than actually useful. I can't imagine when I'd use the Oracle definition, whereas I think the JTS definition of "covers" is obviously useful (for reasons I explained in the post. Plus it's more efficient to execute.

So I'm not really inclined to change it...

Jaan said...

You were not exactly correct when you wrote "Geometry A covers Geometry B iff no points of B lie in the exterior of A". The Javadoc of Geometry.covers in JTS says there is an exception: "If either geometry is empty, the value of this predicate is false." (Without the exception, the predicate would also be true whenever B is empty.) I think that's the counter-intuitive thing about "covers". Or at least contradicting the expectation that "contains" and "covers" mean the same as in set theory (ubiquitous in modern mathematics). ("Contains" has unfortunately also other reasons why it doesn't work as in set theory, even if we identify a geometry with the set of its interior points.) Without this exception, the definition of "covers" would also be simpler in terms of the DE-9IM matrices (one mask matrix needed instead of four), so I wonder, why this exception was added to the standard (it comes from OGC standard, does it?).

Dr JTS said...

@Jaan,

Actually the condition "covers is false if either geometry is empty" was something I added. The OGC SFS spec doesn't explicitly define the result if either geometry is empty - I guess they intend it to be determined by the mathematical definition.

This may have been a mistake... I agree that it makes sense that covers should be true if B is empty. This agrees with the mathematical description.

I think this leads to a DE-9IM pattern of ******FF* - do you agree?

Jaan said...

Yes, ******FF* is what I meant. Then indeed "A covers B" always means "B is a subset of A" if a geometric object is regarded as the set of its interior and boundary points. I think this definition would be more regular (then e. g. if A covers B and C then A also covers their intersection). But actually I don't remember having problems with it in practice (maybe haven't applied covers to empty geometry). We have used JTS in many projects, as it is an indispensable tool when handling polygons in Java, with good performance — thank you for the effort!