Time for action - Sphinx in action

Let's see an example of how Sphinx works. We will create an index and then search it using the Sphinx command line utility as well as the PHP client implementation. So let's begin:

  1. Firstly, create a MySQL database named test, if it is not already there:
    CREATE DATABASE test;
    

    Sphinx ships with a sample configuration file and a sample database table to be used for demo purposes. The SQL for the table is located at /usr/local/sphinx/etc/example.sql and it contains the following SQL:

    DROP TABLE IF EXISTS test.documents;
    CREATE TABLE test.documents
    (
    id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,
    group_id INTEGER NOT NULL,
    group_id2 INTEGER NOT NULL,
    date_added DATETIME NOT NULL,
    title VARCHAR(255) NOT NULL,
    content TEXT NOT NULL
    );
    REPLACE INTO test.documents ( id, group_id, group_id2, date_added, title, content ) VALUES
    ( 1, 1, 5, NOW(), 'test one', 'this is my test document number one. also checking search within phrases.' ),
    ( 2, 1, 6, NOW(), 'test two', 'this is my test document number two' ),
    ( 3, 2, 7, NOW(), 'another doc', 'this is another group' ),
    ( 4, 2, 8, NOW(), 'doc number four', 'this is to test groups' );
    DROP TABLE IF EXISTS test.tags;
    CREATE TABLE test.tags
    (
    docid INTEGER NOT NULL,
    tagid INTEGER NOT NULL,
    UNIQUE(docid,tagid)
    );
    INSERT INTO test.tags VALUES
    (1,1), (1,3), (1,5), (1,7),
    (2,6), (2,4), (2,2),
    (3,15),
    (4,7), (4,40);
    

    You can copy the SQL and paste it in your phpMyAdmin interface to run the SQL or execute the following command to import the SQL from the command line in Linux:

    $ mysql -u root < /usr/local/sphinx/etc/example.sql
    
  2. Next, create the configuration file (you may need the permissions to create the file):
    $ cd /usr/local/sphinx/etc
    $ cp sphinx-min.conf.dist sphinx.conf
    

    Now edit sphinx.conf in your favorite editor (you may need to change the permissions of the file to be able to modify it).

    The first block of the file looks something like this:

    source src1
    {
    type = mysql
    sql_host = localhost
    sql_user = test
    sql_pass =
    sql_db = test
    sql_port = 3306 # optional, default is 3306
    sql_query = 
    SELECT id, group_id, UNIX_TIMESTAMP(date_added)
    AS date_added, title, content 
    FROM documents
    sql_attr_uint = group_id
    sql_attr_timestamp = date_added
    sql_query_info = SELECT * FROM documents WHERE id=$id
    }
    
  3. Change the value of sql_host, sql_user, sql_pass and sql_db as per your system:
    sql_host = localhost
    sql_user = myuser
    sql_pass = mypass
    sql_db = test
    

    If you have not installed the Sphinx at /usr/local/sphinx then you will need to modify the paths of the following options as well:

    • path
    • log
    • query_log
    • pid_file
  4. Now run the indexer:
    $ /usr/local/sphinx/bin/indexer --all
    

    This will give output as shown in the following screenshot

    Time for action - Sphinx in action

    If you have installed Sphinx at a location other than /usr/local/sphinx, then you need to use the -c /path/to/sphinx.conf option in the previous command.

  5. Next, let's query the index to see if it works:
    $ /usr/local/sphinx/bin/search test
    
    Time for action - Sphinx in action

    To query the index from our PHP scripts, we first need to start the searchd daemon

    $ /usr/local/sphinx/bin/searchd
    
    Time for action - Sphinx in action

    Note

    To run searchd commands, you need to be the root user. You can either switch to root user using the su - command, or you could prefix all searchd commands with sudo.

  6. Now, go to the directory where you extracted the Sphinx tarball during installation (in Chapter 1, Setting Up Sphinx) and run the command as shown here:
    $ cd /path/to/sphinx-0.9.9
    $ php api/test.php test
    

    The command will output the search results, which confirms that searchd is working properly and we can search from our applications using the client API.

    Time for action - Sphinx in action

What just happened?

We created an index from the data stored in a MySQL table. We then used Sphinx's search utility to search for the test term in the index. The results showed that Sphinx is working properly and that the index we created was fine.

The major difference between search results by MySQL and Sphinx is that Sphinx does not return the actual data but only the document id. Using these document IDs, we need to fetch the actual data (from its source) to display it. Along with the document id, Sphinx also returns all the attributes and weight of each document. The higher the weight, the higher the relevance of that document with the search query.

We then used the PHP implementation of the Sphinx Client API to search for the same test term, but this time from within a PHP script.

Data to be indexed

The first thing we did was to create a MySQL database and then import the sample data in to it. This gave us the data as shown in the following screenshot:

Data to be indexed

Note

Throughout this book, the dates and times shown may differ from what you would have in your database or index. So don't worry about that.

Creating the Sphinx configuration file

Sphinx creates an index based on the options defined in the Sphinx configuration file sphinx.conf. This file is divided into different sections:

  • source: This section holds all the settings related to the source of the data to be indexed, which in our case is a MySQL database.
  • index: This section holds options which tell Sphinx where and how to save the index. These options are used during indexing-time.
  • indexer: This section holds options for the indexer program.
  • searchd: This section holds the options used while searching the index.

In this chapter we will not go into great detail about all the options used in the configuration file. However, a few options to look for are:

  • sql_*: These options are there to tell Sphinx about different MySQL settings; such as username, password, the database to use, and the port to use.
  • sql_query: This option holds the query that will be fired in order to get the data from the MySQL database.

Once the configuration file is ready, index can be created by issuing the following command.

$ /usr/local/sphinx/bin/indexer -all

During the indexing operation, some information is displayed in the console such as what configuration file is being used by the indexer, how many documents were found, how much time it took to index, and other related information.

Note

To run indexer commands, you need to be the root user. You can either switch to root user using the su - command, or you could prefix all indexer commands with sudo.

Searching the index

Sphinx provides a command-line utility search which comes in handy to quickly query the index that we created earlier. However, this utility should only be used for testing purposes. In the production environment one should always use the searchd and its client API implementation.

$ /usr/local/sphinx/bin/search test

The output of the search command gives us the results that matched the search term test. The result shows us the document id and weight, amongst other information for the queried term.

Similar information is displayed when we use the PHP client API to search.

Have a go hero -

We created a very basic example to see how Sphinx works; however, you can extend and explore this by:

  • Adding a few more records to the documents table
  • Re-indexing the documents table
  • Searching with different search phrases and examining the returned results and their weights
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset