问题描述:

I have a collection of `java.awt.Shape`

objects covering a two-dimensional plane with no overlap. These are from a data set of U.S. Counties, at a fairly low resolution. For an (x,y) latitude/longitude point, I want a quick way to identify the shape which county contains that point. What's an optimal way to index this?

Brute force would look like:

`for (Shape eachShape : countyShapes) {`

if (eachShape.contains(x, y)) {

return eachShape;

}

}

To optimize this, I can store the min/max bounds of the (possibly complex) shapes, and only call `contains(x, y)`

on shapes whose rectangular bounds encompass a given x,y coordinate. What's the best way to build this index? A SortedMultiset would work for indexing on the `x`

minima and maxima, but how to also include the `y`

coordinates in the index?

For this specific implementation, doing a few seconds of up-front work to index the shapes is not a problem.

If possible you could try a bitmap with each shape in a different color. Then simply query the point and the color and lookup the shape.

This question is outside the scope of Stackoverflow but the answer is probably Binary Space Partitioning.

Roughly:

- Divide the space in two either on the x coordinate or y coordinate using the mid-point of the range.
- Create a list of counties on the two sides of that line (and divided by that line).
- On each side of that line divide the collections again by the other dimension.
- Continue recursively building a tree dividing alternately by x and y until you reach a satisfactory set of objects to examine by brute force.

The conventional algorithm actually divides the shapes lying across the boundary but that might not be necessary here.

A smart implementation might look for the most efficient line to divide on which is the one where the longest of the two lists is the smallest. That involves more up front calculation but a more efficient and consistently performing partition.

You could use an existing GIS library like GeoTools and then all the hard work is already done.

Simply load your shapefile of counties and execute queries like

```
"the_geom" contains 'POINT(x y)
```

The quickstart tutorial will show you how to load the shapes, and the query tutorial will show you how to query them.

Having min an max values of coordinates of the bounds not guarantee that you can determine if one point is in or out in any situations. If you want achieve this by yourself you should implement some algorithm. There's a good one that is called "radial algorithm", I recommend that uses this, and it isn't so complicated to implement, there are sufficient bibliography and examples. Hope this help.