Preface

Welcome to the world of Elasticsearch and Mastering Elasticsearch Second Edition. While reading the book, you'll be taken through different topics, all connected to Elasticsearch. Please remember, though, that this book is not meant for beginners; we treat it as a follow-up, or second part, to Elasticsearch Server Second Edition. There is a lot of new content in this book, and at times it refers back to material covered in Elasticsearch Server Second Edition.

Throughout the book, we will discuss different topics related to Elasticsearch and Lucene. We start with an introduction to Lucene and Elasticsearch and then move on to the queries provided by Elasticsearch, where we discuss topics such as filtering and which query to choose in a particular situation. Of course, querying is not everything. Because of that, this book also provides information on the newly introduced aggregations and on features that will help you give meaning to the data you have indexed in Elasticsearch and provide a better search experience for your users.

Even though, for most users, querying and data analysis are the most interesting parts of Elasticsearch, they are not all that we need to discuss. Because of this, the book also covers index architecture: choosing the right number of shards and replicas, adjusting the shard allocation behavior, and so on. We will also get into the places where Elasticsearch meets Lucene, and we will discuss topics such as the different scoring algorithms and the available store mechanisms, including how they differ and why choosing the proper one matters.

Last, but not least, we touch on the administration part of Elasticsearch by discussing the discovery and recovery modules, and the human-friendly Cat API, which allows us to very quickly get relevant administrative information in a form that most humans should be able to read without parsing JSON responses. We also talk about and use tribe nodes, which make it possible to run federated searches across multiple clusters.

Because of the title of the book, we couldn't omit performance-related topics, and we decided to dedicate a whole chapter to them. We talk about doc values and the improvements they bring, how the garbage collector works, and what to do when it does not work as we expect. Finally, we talk about scaling Elasticsearch and how to prepare it for high indexing and querying use cases.

Just as with the first edition of the book, we decided to end with the development of Elasticsearch plugins, showing you how to set up an Apache Maven project and develop two types of plugins: a custom REST action and a custom analysis plugin.

If these topics sound interesting to you, we think this is a book for you and, hopefully, you will still like it after reading the last words of the summary in Chapter 9, Developing Elasticsearch Plugins.

What this book covers

Chapter 1, Introduction to Elasticsearch, guides you through how Apache Lucene works and reintroduces you to the world of Elasticsearch, describing the basic concepts and showing you how Elasticsearch works internally.

Chapter 2, Power User Query DSL, describes how Apache Lucene scoring works, why Elasticsearch rewrites queries, what query templates are, and how we can use them. In addition to that, it explains the usage of filters and which query should be used in a particular use case.

Chapter 3, Not Only Full Text Search, describes query rescoring, multimatching control, and different types of aggregations that will help you with data analysis, such as the significant terms aggregation and the top terms aggregation, which allow us to group documents by certain criteria. In addition to that, it discusses relationship handling in Elasticsearch and extends your knowledge about scripting in Elasticsearch.

Chapter 4, Improving the User Search Experience, covers user search experience improvements. It introduces you to the world of Suggesters, which allow you to correct spelling mistakes in user queries and build efficient autocomplete mechanisms. In addition to that, you'll see how to improve query relevance by using different queries and Elasticsearch functionality, illustrated with a real-life example.

Chapter 5, The Index Distribution Architecture, covers techniques for choosing the right number of shards and replicas, how routing works, how shard allocation works, and how to alter its behavior. In addition to that, we discuss what query execution preference is and how it allows us to choose where the queries are going to be executed.

Chapter 6, Low-level Index Control, describes how to alter Apache Lucene scoring and how to choose an alternative scoring algorithm. It also covers NRT searching and indexing and transaction log usage, and helps you understand segment merging and tune it for your use case. At the end of the chapter, you will also find information about Elasticsearch caching and the request breakers that aim to prevent out-of-memory situations.

Chapter 7, Elasticsearch Administration, describes what the discovery, gateway, and recovery modules are, how to configure them, and why you should bother. We also describe what the Cat API is, how to back up and restore your data to different cloud services (such as Amazon AWS or Microsoft Azure), and how to use tribe nodes for federated search in Elasticsearch.

Chapter 8, Improving Performance, covers Elasticsearch performance-related topics, ranging from using doc values to help with field data cache memory usage, through the workings of the JVM garbage collector and query benchmarking, to scaling Elasticsearch and preparing it for high indexing and querying scenarios.

Chapter 9, Developing Elasticsearch Plugins, covers the development of Elasticsearch plugins by showing and describing in depth how to write your own REST action and language analysis plugins.
