Time for action - creating a basic search script

  1. Add the searchd config section to /usr/local/sphinx/etc/sphinx-blog.conf:
    source blog {
    # source options
    }
    index posts {
    # index options
    }
    indexer {
    # indexer options
    }
    # searchd options (used by search daemon)
    searchd
    {
    listen = 9312
    log = /usr/local/sphinx/var/log/searchd.log
    query_log = /usr/local/sphinx/var/log/query.log
    max_children = 30
    pid_file = /usr/local/sphinx/var/log/searchd.pid
    }
    
  2. Start the searchd daemon (as root user):
    $ sudo /usr/local/sphinx/bin/searchd -c /usr/local/sphinx/etc/sphinx-blog.conf
    
  3. Copy the sphinxapi.php file (the class with PHP implementation of Sphinx API) from the sphinx source directory to your working directory:
    $ mkdir /path/to/your/webroot/sphinx
    $ cd /path/to/your/webroot/sphinx
    $ cp /path/to/sphinx-0.9.9/api/sphinxapi.php .
    
    
  4. Create a simple_search.php script that uses the PHP client API class to search the Sphinx-blog index, and execute it in the browser:
    <?php
    require_once('sphinxapi.php');
    // Instantiate the sphinx client
    $client = new SphinxClient();
    // Set search options
    $client->SetServer('localhost', 9312);
    $client->SetConnectTimeout(1);
    $client->SetArrayResult(true);
    // Query the index
    $results = $client->Query('php');
    // Output the matched results in raw format
    print_r($results['matches']);
    
  5. The output of the given code, as seen in a browser, will be similar to what's shown in the following screenshot:
    [Screenshot: output of simple_search.php in the browser]

What just happened?

First, we added the searchd configuration section to our sphinx-blog.conf file (created in Chapter 3, Indexing). The following options were added to the searchd section:

  • listen: This option specifies the IP address and port that searchd will listen on. It can also specify a Unix-domain socket path. This option was introduced in v0.9.9 and should be used instead of the deprecated port option. If the port part is omitted, the default port 9312 is used.

    Examples:

    • listen = localhost
    • listen = 9312
    • listen = localhost:9898
    • listen = 192.168.1.25:4000
    • listen = /var/run/sphinx.s
  • log: Name of the file where all searchd runtime events will be logged. This is an optional setting and the default value is "searchd.log".
  • query_log: Name of the file where all search queries will be logged. This is an optional setting and the default value is empty, that is, do not log queries.
  • max_children: The maximum number of searches to run in parallel. This is an optional setting and the default value is 0 (unlimited).
  • pid_file: Name of the file that holds the searchd process ID. This is a mandatory setting. The file is created on startup and contains the head daemon process ID while the daemon is running. The file is unlinked (deleted) when the daemon is stopped.

Once we had added the searchd configuration options, we started the searchd daemon as the root user, passing the path of the configuration file as an argument. If no configuration file is given, searchd uses /usr/local/sphinx/etc/sphinx.conf by default.

After a successful startup, searchd listens on port 9312 on all network interfaces configured on the server. If we want searchd to listen on a specific interface, we can specify the hostname or IP address in the value of the listen option:

listen = 192.168.1.25:9312

Note

The listen setting defined in the configuration file can be overridden on the command line when starting searchd by using the -l argument.

There are other (optional) arguments that can be passed to searchd as seen in the following screenshot:

[Screenshot: searchd command-line arguments]

Note

searchd needs to be running all the time when we are using the client API. The first thing you should always check is whether searchd is running or not, and start it if it is not running.
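
A script can make this check itself before it sends any queries. The following is a minimal sketch, assuming searchd listens on a TCP port; the function name searchd_is_reachable is hypothetical and not part of the Sphinx API. It only confirms that something accepts connections on the port, not that the listener is actually searchd.

```php
<?php
// Hypothetical helper (not part of sphinxapi.php): returns true if a TCP
// connection to $host:$port can be opened within $timeout seconds.
function searchd_is_reachable($host = 'localhost', $port = 9312, $timeout = 1.0)
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the PHP warning that fsockopen() raises on failure
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

if (!searchd_is_reachable('localhost', 9312)) {
    echo "searchd does not appear to be running on port 9312\n";
}
```

If the check fails, start searchd as shown in step 2 and run the script again.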

We then created a PHP script to search the sphinx-blog index. To search a Sphinx index, we need to use the Sphinx client API. As we are working with a PHP script, we copied the PHP client implementation class (sphinxapi.php), which ships with the Sphinx source, to our working directory so that we can include it in our script. However, you can keep this file anywhere on the file system as long as you can include it in your PHP script.

Note

Throughout this book we will be using /path/to/your/webroot/sphinx as the working directory and we will create all PHP scripts in that directory. We will refer to this directory simply as webroot.

We initialized the SphinxClient class and then used the following class methods to set up the Sphinx client API:

  • SphinxClient::SetServer($host, $port): This method sets the searchd hostname and port. All subsequent requests use these settings unless this method is called again with different parameters. The default host is localhost and the default port is 9312.
  • SphinxClient::SetConnectTimeout($timeout): This sets the maximum time, in seconds, to spend trying to connect to the server before giving up.
  • SphinxClient::SetArrayResult($arrayresult): This is a PHP client API-specific method. It specifies whether the matches should be returned as an array or a hash. The default value is false, which means that matches will be returned in a PHP hash format, where document IDs are the keys and other information (attributes, weight) are the values. If $arrayresult is true, then the matches will be returned as a plain array with complete per-match information.
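
The two match formats can be compared using hard-coded sample data. The sketch below assumes the match layout described above; the document IDs, weights, and the created attribute are made up, and matches_to_list is a hypothetical helper, not a SphinxClient method.

```php
<?php
// Made-up 'matches' as returned with SetArrayResult(false):
// a hash keyed by document ID.
$hashMatches = array(
    3 => array('weight' => 2, 'attrs' => array('created' => 1270131607)),
    7 => array('weight' => 1, 'attrs' => array('created' => 1270135548)),
);

// The same matches as returned with SetArrayResult(true):
// a plain list, with the document ID stored inside each entry.
$arrayMatches = array(
    array('id' => 3, 'weight' => 2, 'attrs' => array('created' => 1270131607)),
    array('id' => 7, 'weight' => 1, 'attrs' => array('created' => 1270135548)),
);

// Hypothetical helper: converts the hash format into the array format.
function matches_to_list(array $matches)
{
    $list = array();
    foreach ($matches as $id => $match) {
        // Prepend the document ID to the per-match information
        $list[] = array('id' => $id) + $match;
    }
    return $list;
}

var_dump(matches_to_list($hashMatches) == $arrayMatches); // bool(true)
```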

After that, querying the index was straightforward using the SphinxClient::Query($query) method. It returned an array with the matched results, as well as other information such as errors, the fields in the index, the attributes in the index, the total records found, the time taken for the search, and so on. The actual results are in the $results['matches'] variable.
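
As a rough illustration of the extra information in the result, the following sketch builds a one-line summary from it. It assumes the result keys total_found and time, and that Query() returns false on failure; summarize_results is a hypothetical helper and the sample values are made up.

```php
<?php
// Hypothetical helper: builds a one-line summary from a Query() result.
// $results is either false (the query failed) or the result array.
function summarize_results($results)
{
    if ($results === false) {
        return 'Query failed';
    }
    // Guard against a result that carries no 'matches' key
    $shown = isset($results['matches']) ? count($results['matches']) : 0;
    return sprintf(
        'Showing %d of %d matches (%s sec)',
        $shown,
        $results['total_found'],
        $results['time']
    );
}

// Made-up sample in the shape of a Query() result (only some keys shown)
$sample = array(
    'error'       => '',
    'warning'     => '',
    'matches'     => array(array('id' => 3, 'weight' => 2, 'attrs' => array())),
    'total'       => 1,
    'total_found' => 1,
    'time'        => '0.001',
);

echo summarize_results($sample), "\n";
```

When Query() does return false, SphinxClient::GetLastError() gives the error message.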

We can run a loop on the results, and it is a straightforward job to get the actual document's content from the document ID and display it.
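
Such a loop might look like the following sketch. The $posts array is a made-up stand-in for the blog's posts table (a real script would query the database by document ID), the matches are sample data in the SetArrayResult(true) format, and titles_for_matches is a hypothetical helper.

```php
<?php
// Made-up stand-in for the posts table, keyed by document ID.
$posts = array(
    3 => 'PHP and Sphinx',
    7 => 'Indexing blog posts',
);

// Made-up matches in the SetArrayResult(true) format
$matches = array(
    array('id' => 7, 'weight' => 2, 'attrs' => array()),
    array('id' => 3, 'weight' => 1, 'attrs' => array()),
);

// Hypothetical helper: returns one display line per match, in result order.
function titles_for_matches(array $matches, array $posts)
{
    $lines = array();
    foreach ($matches as $match) {
        $id = $match['id'];
        // Skip matches whose document is missing from the lookup
        if (isset($posts[$id])) {
            $lines[] = sprintf('#%d %s (weight %d)', $id, $posts[$id], $match['weight']);
        }
    }
    return $lines;
}

echo implode("\n", titles_for_matches($matches, $posts)), "\n";
```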

Matching modes

When a full-text search is performed on the Sphinx index, different matching modes can be used by Sphinx to find the results. The following matching modes are supported by Sphinx:

  • SPH_MATCH_ALL—This is the default mode and it matches all query words, that is, only records that match all of the queried words will be returned.
  • SPH_MATCH_ANY—This matches any of the query words.
  • SPH_MATCH_PHRASE—This matches query as a phrase and requires a perfect match.
  • SPH_MATCH_BOOLEAN—This matches query as a Boolean expression.
  • SPH_MATCH_EXTENDED—This matches query as an expression in Sphinx internal query language.
  • SPH_MATCH_EXTENDED2—This matches query using the second version of Extended matching mode. This supersedes SPH_MATCH_EXTENDED as of v0.9.9.
  • SPH_MATCH_FULLSCAN—In this mode the query terms are ignored and no text-matching is done, but filters and grouping are still applied.
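
One way to see how the modes differ is to run the same query under several of them and compare the totals. The following is a minimal sketch, assuming the mode is selected with SphinxClient::SetMatchMode() and that the result array carries a total_found key; totals_by_mode is a hypothetical helper, not part of the Sphinx API.

```php
<?php
// Hypothetical helper: runs one query under several matching modes and
// returns the total reported for each. $client is expected to be a
// SphinxClient; $modes maps a label to an SPH_MATCH_* constant.
function totals_by_mode($client, $query, array $modes)
{
    $totals = array();
    foreach ($modes as $label => $mode) {
        $client->SetMatchMode($mode);
        $results = $client->Query($query);
        // Query() returns false on failure; record null in that case
        $totals[$label] = ($results === false) ? null : $results['total_found'];
    }
    return $totals;
}

// Usage, assuming sphinxapi.php is in the working directory and searchd
// is running:
//   require_once('sphinxapi.php');
//   $client = new SphinxClient();
//   $client->SetServer('localhost', 9312);
//   print_r(totals_by_mode($client, 'php mysql', array(
//       'all' => SPH_MATCH_ALL,
//       'any' => SPH_MATCH_ANY,
//   )));
```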