Index configuration

The next mandatory section of the configuration file is the index section. This section defines how the data is indexed and which properties Sphinx should take into account before indexing it.

A single configuration file can define multiple indexes, and an index can extend another index, as we did in Chapter 5, Feed Search, when we created the main and delta indexing and searching scheme.
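
As a reminder of what extending an index looks like, here is a minimal sketch. The index names, source names, and paths are assumptions made for illustration and are not necessarily the ones used in Chapter 5; the source sections are omitted.

```
# Parent index (hypothetical names and paths)
index main
{
    source = main_source
    path   = /usr/local/sphinx/var/data/main
}

# "delta" inherits every setting from "main" and
# overrides only the options listed here
index delta : main
{
    source = delta_source
    path   = /usr/local/sphinx/var/data/delta
}
```

The child index must still override path, because two indexes cannot share the same index files.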

There is another powerful searching scheme that should be used if you are indexing billions of records and terabytes of data. This scheme is called distributed searching.

Distributed searching

Distributed searching is useful for searching through a large amount of data which, if kept in a single index, would cause high query latency (search time) and allow fewer queries to be served per second.

In Sphinx, the distribution is done horizontally, that is, a search is performed across different nodes and processing is done in parallel.

To enable distributed searching, use the type option in the index section of the configuration file and set its value to distributed.
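
The option can be sketched as follows. This is a minimal fragment, assuming two ordinary indexes named items_a and items_b are defined elsewhere in the same file; all three index names are hypothetical.

```
# A distributed index stores no data of its own;
# it only lists the indexes that should be searched
index items_dist
{
    type  = distributed

    # indexes served by this searchd instance (assumed names)
    local = items_a
    local = items_b
}
```

A search issued against items_dist is run against every index listed in it and the results are merged. Indexes that live on other servers are listed with the agent option instead of local.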

Set up an index on multiple servers

Let's understand the distributed searching scheme using an example. We will use the same database as we did in our previous exercise. We will use two servers for distribution.

In our example we assume the following:

  • First (primary) server's IP is 192.168.1.1
  • Second server's IP is 192.168.1.2
  • The database is served from the first server (192.168.1.1), and both servers use the same database
  • The search query will be issued on the first server
  • Both servers have Sphinx installed
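
Under these assumptions, the two configuration files might relate to each other roughly as in the following sketch. The index names, source names, paths, and port 9312 (the usual searchd default) are assumptions; source sections and the remaining searchd settings are omitted.

```
# --- sphinx.conf on the second server (192.168.1.2) ---
# Holds one part of the data. Its source (not shown) reads
# from the database served by 192.168.1.1.
index items_2
{
    source = items_2
    path   = /usr/local/sphinx/var/data/items_2
}

searchd
{
    # must be reachable from the first server
    listen = 9312
    # other searchd settings omitted
}

# --- sphinx.conf on the first server (192.168.1.1) ---
# Holds the other part of the data.
index items_1
{
    source = items_1
    path   = /usr/local/sphinx/var/data/items_1
}

# Queries are issued against this index on the first server.
index items_dist
{
    type  = distributed
    local = items_1

    # remote index, written as host:port:index-name
    agent = 192.168.1.2:9312:items_2

    # timeouts in milliseconds
    agent_connect_timeout = 1000
    agent_query_timeout   = 3000
}
```

The first server searches items_1 itself, forwards the same query to searchd on the second server, and merges both result sets before returning them to the client.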

The setup would look similar to the following schematic:

[Schematic: Set up an index on multiple servers]