In the previous chapters, we discussed how to create index mappings and index the data. But what if the mappings are already created and the data is already indexed, and you want to modify the structure of the index? Of course, we could just create a new index with new mappings, but that is not always possible, especially in a production environment. Fortunately, modifying the structure of an existing index is possible to some extent. For example, by default, if we index a document with a new field, Elasticsearch will add that field to the index structure. Let's now look at how to modify the index structure manually.
For situations where a mapping change is needed but is not possible because it conflicts with the current index structure, it is a good idea to use aliases, both read and write ones. We will discuss aliasing in the Index aliasing section of Chapter 10, Administrating Your Cluster.
Let's assume that we have the following mappings for our users index stored in the user.json file:
{
  "user" : {
    "properties" : {
      "name" : {"type" : "string"}
    }
  }
}
As you can see, it is very simple. It just has a single property that will hold the user name. Now let's create an index called users and use the preceding mappings to create our type. To do that, we will run the following commands:
curl -XPOST 'localhost:9200/users'
curl -XPUT 'localhost:9200/users/user/_mapping' -d @user.json
If everything goes well, we will have our index (called users) and type (called user) created. So now let's try to add a new field to the mappings.
To illustrate how to add a new field to our mappings, let's assume that we want to add a phone number to the data stored for each user. To do that, we need to send an HTTP PUT request to the /index_name/type_name/_mapping REST endpoint with a body that includes our new field. For example, to add the mentioned phone field, we will run the following command:
curl -XPUT 'http://localhost:9200/users/user/_mapping' -d '{
  "user" : {
    "properties" : {
      "phone" : {"type" : "string", "index" : "not_analyzed"}
    }
  }
}'
Similar to the previous command we ran, if everything goes well, we should have a new field added to our index structure.
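The same request can also be prepared from application code. The following is only a minimal sketch using Python's standard library: the helper name build_put_mapping_request is our own invention for illustration, the host localhost:9200 is taken from the curl examples, and the request is only built, not sent.

```python
import json
import urllib.request


def build_put_mapping_request(host, index, doc_type, properties):
    """Build (but do not send) a PUT request for /index/type/_mapping.

    Hypothetical helper for illustration; it mirrors the curl command above.
    """
    body = {doc_type: {"properties": properties}}
    url = "http://{}/{}/{}/_mapping".format(host, index, doc_type)
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), method="PUT"
    )


request = build_put_mapping_request(
    "localhost:9200",
    "users",
    "user",
    {"phone": {"type": "string", "index": "not_analyzed"}},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Against a running node, the request could be sent with:
# urllib.request.urlopen(request)
```

Building the body with json.dumps guarantees that every key is properly quoted, which is easy to get wrong when typing JSON by hand on the command line.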
Of course, Elasticsearch won't reindex our data or populate the newly added field automatically. It will just alter the mappings held by the master node and propagate them to all the other nodes in the cluster, and that's all. Reindexing the data must be done by us or by the application that indexes the data in our environment. Until then, the old documents won't have the newly added field. This is crucial to remember. If you don't have the original documents, you can use the _source field to get the original data from Elasticsearch and index it once again.
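To make the idea of reindexing from _source more concrete, here is a rough sketch that turns the hits of a search response into a body for the bulk API. The function name hits_to_bulk_body and the sample documents are made up for illustration; a real reindexing job would also page through all the documents (for example, with scan and scroll) and add the new field where the data is available.

```python
import json


def hits_to_bulk_body(search_response, target_index):
    """Convert search hits into a bulk body that indexes each _source again.

    Hypothetical helper: each document becomes an action line followed by
    a source line, and every line ends with a newline character.
    """
    lines = []
    for hit in search_response["hits"]["hits"]:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    return "".join(line + "\n" for line in lines)


# Made-up sample shaped like a search response (only the parts we need).
sample_response = {
    "hits": {
        "hits": [
            {"_index": "users", "_type": "user", "_id": "1",
             "_source": {"name": "John"}},
            {"_index": "users", "_type": "user", "_id": "2",
             "_source": {"name": "Anna"}},
        ]
    }
}

bulk_body = hits_to_bulk_body(sample_response, "users")
print(bulk_body)
```

The resulting text could then be sent to the _bulk REST endpoint; the sketch stops short of doing so in order to stay runnable without a cluster.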
To ensure everything is okay, we can send an HTTP GET request to the _mapping REST endpoint and Elasticsearch will return the appropriate mappings. An example command to get the mappings for our user type in the users index looks as follows:
curl -XGET 'localhost:9200/users/user/_mapping?pretty'
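If we want to check the result programmatically, we can pull the field names out of the returned JSON. The sketch below assumes that the response nests the type under the index name and a mappings key; that shape and the sample data are our assumption, so compare them with what your Elasticsearch version actually returns. The mapped_fields helper is hypothetical.

```python
def mapped_fields(mapping_response, index, doc_type):
    """Return the sorted field names defined for a type in a _mapping response.

    Assumes the shape {index: {"mappings": {type: {"properties": {...}}}}}.
    """
    properties = mapping_response[index]["mappings"][doc_type]["properties"]
    return sorted(properties)


# Made-up sample that follows the assumed response shape.
sample_mapping = {
    "users": {
        "mappings": {
            "user": {
                "properties": {
                    "name": {"type": "string"},
                    "phone": {"type": "string", "index": "not_analyzed"},
                }
            }
        }
    }
}

fields = mapped_fields(sample_mapping, "users", "user")
print(fields)
```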
Our users index structure contains two fields: name and phone. Let's imagine that we indexed some data, but after a while we decided that we want to search on the phone field and would like to change its index property from not_analyzed to analyzed. Because we already know how to alter the index structure, we will run the following command:
curl -XPUT 'http://localhost:9200/users/user/_mapping?pretty' -d '{
  "user" : {
    "properties" : {
      "phone" : {"type" : "string", "store" : "yes", "index" : "analyzed"}
    }
  }
}'
What Elasticsearch will return is a response indicating an error, which looks as follows:
{
  "error" : {
    "root_cause" : [ {
      "type" : "illegal_argument_exception",
      "reason" : "Mapper for [phone] conflicts with existing mapping in other types: [mapper [phone] has different [index] values, mapper [phone] has different [store] values, mapper [phone] has different [omit_norms] values, cannot change from disable to enabled, mapper [phone] has different [analyzer]]"
    } ],
    "type" : "illegal_argument_exception",
    "reason" : "Mapper for [phone] conflicts with existing mapping in other types: [mapper [phone] has different [index] values, mapper [phone] has different [store] values, mapper [phone] has different [omit_norms] values, cannot change from disable to enabled, mapper [phone] has different [analyzer]]"
  },
  "status" : 400
}
This is because we can't change a field that was set to be not_analyzed to one that is analyzed. And not only that: in most cases, you won't be able to update the mapping of an existing field. This is a good thing, because if we were allowed to change such settings, we would confuse Elasticsearch and Lucene. Imagine that we already have many documents with the phone field set to not_analyzed and we are allowed to change the mappings to analyzed. Elasticsearch wouldn't change the data that was already indexed, but the queries that are analyzed would be processed with a different logic, and thus you wouldn't be able to properly find your data.
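The mismatch described here can be shown with a toy model. In the following sketch, toy_analyze is only a crude stand-in for a real analyzer (it lowercases the text and splits it on non-alphanumeric characters), and the phone number is made up. It is not how Elasticsearch or Lucene are implemented; it only shows why terms written without analysis cannot be matched by terms produced with analysis.

```python
import re


def toy_analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def index_terms(value, analyzed):
    """Return the terms that would be written to the index for a value."""
    return toy_analyze(value) if analyzed else [value]


phone = "555-123-456"  # made-up value

# The document was indexed while the field was not_analyzed: one whole term.
indexed = set(index_terms(phone, analyzed=False))

# After the hypothetical mapping change, the query text would be analyzed.
query_terms = toy_analyze(phone)

matches = [term for term in query_terms if term in indexed]
print("indexed terms:", indexed)
print("query terms:", query_terms)
print("matching terms:", matches)
```

None of the query terms is equal to the single term stored in the index, so a search for the very same phone number would find nothing until the documents are indexed again.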
However, to give you some examples of what is allowed and what is prohibited, let's recall the operations we have just tried. Adding a new field, as we did with phone, can be done safely. Changing the index, store, or analyzer settings of an already existing field, as in the failed request shown earlier, is prohibited or will not work.
Remember that these examples of allowed and prohibited updates do not cover all the possible uses of the mapping update API; you have to try for yourself whether the update you want to make will work.