Geo-shapes are completely different from geo-points. Until now, we have worked with simple geo-location and rectangle searches. With geo-shapes, however, the sky is the limit. On a map, you can simply draw a line, polygon, or circle and ask Elasticsearch to return the data that matches the coordinates in your query, as seen in the following image:
Let's see some of the most important geo-shapes.
A point is a single geographical coordinate, such as your current location shown by your smartphone. Geo-shapes follow the GeoJSON convention, so coordinates are written in [longitude, latitude] order. A point in Elasticsearch is represented as follows:
{ "location" : { "type" : "point", "coordinates" : [77.0812823, 28.498564] } }
A linestring can be defined in two ways. If it contains two coordinates, it is a straight line, but if it contains more than two points, it is an arbitrary path:
{ "location" : { "type" : "linestring", "coordinates" : [[-77.03653, 38.897676], [-77.009051, 38.889939]] } }
A circle contains a coordinate as its centre point and a radius. For example:
{ "location" : { "type" : "circle", "coordinates" : [-45.0, 45.0], "radius" : "500m" } }
A polygon is composed of a list of points with the condition that its first and last points are the same, to make it closed. For example:
{ "location": { "type": "polygon", "coordinates": [ [ [-5.756836, 49.991408], [-7.250977, 55.124723], [1.845703, 51.500194], [-5.756836, 49.991408] ] ] } }
An envelope is a bounding rectangle and is created by specifying only the top-left and bottom-right points. For example:
{"location": { "type":"envelope", "coordinates":[[-45,45],[45,-45]] } }
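Because these shape definitions are plain nested structures, they are easy to build programmatically. The following is a minimal Python sketch and not part of any Elasticsearch client; the helper names make_point, make_envelope, and make_polygon are hypothetical. It assumes coordinates are passed in [longitude, latitude] order, and it closes a polygon ring automatically when the first and last points differ:

```python
def make_point(lon, lat):
    """Return a point shape; coordinates are [longitude, latitude]."""
    return {"type": "point", "coordinates": [lon, lat]}


def make_envelope(top_left, bottom_right):
    """Return an envelope from its top-left and bottom-right corners."""
    return {"type": "envelope",
            "coordinates": [list(top_left), list(bottom_right)]}


def make_polygon(ring):
    """Return a polygon with a single outer ring, closing it if needed."""
    ring = [list(p) for p in ring]
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))  # repeat the first point to close the ring
    return {"type": "polygon", "coordinates": [ring]}


# The triangle from the polygon example, given without the closing point.
triangle = make_polygon([[-5.756836, 49.991408],
                         [-7.250977, 55.124723],
                         [1.845703, 51.500194]])
doc = {"location": triangle}
```

The resulting doc dictionary has the same structure as the polygon document shown earlier and could be used as the body of an index request.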
Like geo-points, geo-shapes are not dynamically identified by Elasticsearch, so a mapping needs to be defined before indexing any data.
The mapping for a geo-shape field can be defined in the following format:
"location": { "type": "geo_shape", "tree": "quadtree" }
The tree parameter defines which kind of grid encoding is to be used for geo-shapes. It defaults to geohash, but can also be set to quadtree.
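As a rough sketch of how such a mapping could be assembled in Python, the hypothetical helper below builds the request body as a dictionary. The function name, the "places" document type, and the nesting under mappings and the document type are assumptions for illustration, based on the typed mappings used elsewhere in this chapter:

```python
import json

VALID_TREES = ("geohash", "quadtree")


def build_geo_shape_mapping(doc_type, field="location", tree="quadtree"):
    """Build an index-creation body that maps `field` as a geo_shape."""
    if tree not in VALID_TREES:
        raise ValueError("tree must be one of %s" % (VALID_TREES,))
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "geo_shape", "tree": tree}
                }
            }
        }
    }


# "places" is a made-up document type used only for this example.
mapping = build_geo_shape_mapping("places")
print(json.dumps(mapping, indent=2))
```

Assuming es is an Elasticsearch Python client and index_name is the target index, the dictionary could then be passed along as es.indices.create(index=index_name, body=mapping).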
Geohash versus Quadtree
Geohashes transform a two-dimensional spatial point (latitude-longitude) into an alphanumeric string, or hash, and are used by Elasticsearch as the default encoding scheme for geo-point data. Geohashes divide the world into a grid of 32 cells, and each cell is given an alphanumeric character.
Quadtrees are similar to geohashes, except that they are built on quadrants; that is, there are only four cells at each level instead of 32. In my experience with geo data, quadtrees perform faster than geohashes.
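To make the difference between 32 cells and four cells per level concrete, here is a small Python sketch of both encodings. It is only an illustration of the subdivision idea and not how Elasticsearch implements its prefix trees internally. The function names are hypothetical, and the digit scheme used for the quadtree cells (longitude bit times two plus latitude bit) is an assumption made for this example:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet, 32 symbols


def _bits(lat, lon, count):
    """Yield `count` bits by halving the longitude and latitude ranges in turn."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    for i in range(count):
        # Even bits refine the longitude, odd bits refine the latitude.
        value, rng = (lon, lon_range) if i % 2 == 0 else (lat, lat_range)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            rng[0] = mid
            yield 1
        else:
            rng[1] = mid
            yield 0


def geohash_encode(lat, lon, length):
    """Each character consumes 5 bits, that is, one of 32 cells per level."""
    bits = list(_bits(lat, lon, 5 * length))
    chars = []
    for i in range(0, len(bits), 5):
        index = int("".join(str(b) for b in bits[i:i + 5]), 2)
        chars.append(BASE32[index])
    return "".join(chars)


def quadtree_encode(lat, lon, levels):
    """Each digit consumes 2 bits, that is, one of 4 cells per level."""
    bits = list(_bits(lat, lon, 2 * levels))
    return "".join(str(bits[i] * 2 + bits[i + 1])
                   for i in range(0, len(bits), 2))


print(geohash_encode(57.64911, 10.40744, 5))   # u4pru
print(quadtree_encode(57.64911, 10.40744, 2))  # 31
```

Both functions consume the same stream of bits. A geohash groups them five at a time, while the quadtree sketch groups them two at a time, so a quadtree needs more levels to reach the same cell size.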
Indexing a geo-shape value in point form is straightforward and follows this syntax:
"location": { "type": "Point", "coordinates": [13.400544, 52.530286] }
Python example
The same location data can be indexed with Python in the following way:
# es, index_name, and doc_type are assumed to have been defined earlier
doc = dict()
location = dict()
location['coordinates'] = [13.400544, 52.530286]
doc['location'] = location
doc['location']['type'] = 'Point'
es.index(index=index_name, doc_type=doc_type, body=doc)
Java example
List<Double> coordinates = new ArrayList<Double>();
coordinates.add(13.400544);
coordinates.add(52.530286);
Map<String, Object> location = new HashMap<String, Object>();
location.put("coordinates", coordinates);
location.put("type", "Point");
Map<String, Object> document = new HashMap<String, Object>();
document.put("location", location);
IndexResponse response = client.prepareIndex()
        .setIndex(indexName)
        .setType(docType)
        .setSource(document)
        .setId("1")
        .execute().actionGet();
Java programmers need to add the following dependencies to the pom.xml file to be able to work with geo-spatial data. If you are using JAR files in your classpath, the Spatial4J and JTS JAR files can be found under the lib directory of the Elasticsearch home:
<dependency>
    <groupId>com.spatial4j</groupId>
    <artifactId>spatial4j</artifactId>
    <version>0.4.1</version>
</dependency>
<dependency>
    <groupId>com.vividsolutions</groupId>
    <artifactId>jts</artifactId>
    <version>1.13</version>
    <exclusions>
        <exclusion>
            <groupId>xerces</groupId>
            <artifactId>xercesImpl</artifactId>
        </exclusion>
    </exclusions>
</dependency>
The data we stored previously can be queried using any geo-shape type. Let's see a few examples that search for the previous document in both Python and Java.
Python example
Searching on linestring is done as follows:
query = {
    "query": {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "geo_shape": {
                    "location": {
                        "shape": {
                            "type": "linestring",
                            "coordinates": [[13.400544, 52.530286], [13.4006, 52.5303]]
                        }
                    }
                }
            }
        }
    }
}
response = es.search(index=index_name, doc_type=doc_type, body=query)
Searching inside an envelope is done like this:
query = {
    "query": {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "geo_shape": {
                    "location": {
                        "shape": {
                            "type": "envelope",
                            "coordinates": [[13, 53], [14, 52]]
                        }
                    }
                }
            }
        }
    }
}
response = es.search(index=index_name, doc_type=doc_type, body=query)
Similarly, you can search with any type of shape by specifying the type and the corresponding coordinates for that shape.
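Since only the shape type and its coordinates change between these queries, the query body can be generated by a small helper. The following is a sketch under that assumption; the function name geo_shape_query and its extra keyword arguments (used here to pass a circle's radius) are hypothetical and not part of the Elasticsearch client:

```python
def geo_shape_query(field, shape_type, coordinates, **extra):
    """Build a bool query that filters `field` by the given shape."""
    shape = {"type": shape_type, "coordinates": coordinates}
    shape.update(extra)  # for example, radius="500m" for a circle
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_shape": {
                        field: {"shape": shape}
                    }
                }
            }
        }
    }


envelope_query = geo_shape_query("location", "envelope", [[13, 53], [14, 52]])
circle_query = geo_shape_query("location", "circle",
                               [13.400544, 52.530286], radius="500m")
```

Either dictionary can then be passed as the body argument of es.search, in the same way as the hand-written queries above.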
Java example
Apart from QueryBuilder, you also need the following import, which is used to build the various geo-shape queries:
import org.elasticsearch.common.geo.builders.ShapeBuilder;
Then you can build the query, as follows:
QueryBuilder lineStringQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchAllQuery())
        .filter(QueryBuilders.geoShapeQuery(geoShapeFieldName,
                ShapeBuilder.newLineString()
                        .point(13.400544, 52.530286)
                        .point(13.4006, 52.5303)));
SearchResponse response = client.prepareSearch(indexName)
        .setTypes(docType)
        .setQuery(lineStringQuery)
        .execute().actionGet();
To search using an envelope:
QueryBuilder envelopQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchAllQuery())
        .filter(QueryBuilders.geoShapeQuery(geoShapeFieldName,
                ShapeBuilder.newEnvelope()
                        .topLeft(13.0, 53.0)
                        .bottomRight(14.0, 52.0)));
As shown in the preceding code, an envelope takes top-left and bottom-right points, similar to what we saw for bounding box queries. The query is then executed in the same way as before:
SearchResponse response = client.prepareSearch(indexName)
        .setTypes(docType)
        .setQuery(envelopQuery)
        .execute().actionGet();