Time for action - creating the Sphinx configuration file

  1. Create the file /usr/local/sphinx/etc/sphinx-blog.conf with the following content:
    source blog
    {
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass =
    sql_db = myblog
    sql_query = SELECT id, title, content FROM posts
    sql_query_info = SELECT id, title FROM posts WHERE id=$id
    }
    index posts
    {
    source = blog
    path = /usr/local/sphinx/var/data/blog
    docinfo = extern
    charset_type = sbcs
    }
    indexer
    {
    mem_limit = 32M
    }
    
  2. Run the following command to create the index:
    $ /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx-blog.conf --all
    
  3. Now test the index by searching from the command line search utility:
    $ /usr/local/sphinx/bin/search --config /usr/local/sphinx/etc/sphinx-blog.conf php
    

What just happened?

We created a configuration file, /usr/local/sphinx/etc/sphinx-blog.conf, which Sphinx then used to create an index.

The first block in the configuration file defines a data source named blog, which is of type mysql. We provided the options needed to connect to the database and a query (sql_query) that fetches all the records to be indexed. The last option in the source block is sql_query_info, which the search utility uses to display additional information about the matched documents. This is why we see the id and title in the search results.

The next block in the configuration file defines the index, which we named posts. The index will be saved at /usr/local/sphinx/var/data/blog on the file system. To create the actual index we used the indexer program, which takes the path of the config file as an argument.

Note

If no config file path is given, indexer looks for the file at the default location, which is /usr/local/sphinx/etc/sphinx.conf.

The other argument we passed was --all, which tells indexer to build every index defined in the configuration file. In our case there was just one index, named posts.

The indexer accepts a number of other arguments. To view the full list, run it without any arguments:

$ /usr/local/sphinx/bin/indexer

The last thing we did was search for the term "php", which returned two documents. This confirms that our index is working.

Note

The search utility is one of the helper tools for quickly testing an index from the command line, without writing code to connect to the searchd server and process its response.

search is not intended to be used in a client application. You should use searchd and the bundled client APIs to perform a search from within your application. We will look at how to use searchd and a client API to perform a search in Chapter 4, Searching.

Like the indexer program, search takes a number of arguments. Since we were not using the default configuration file, we passed the path of our configuration file as an argument to the search command. The search term, "php" in our case, should always be the last argument. To view the list of arguments, run search without any:

$ /usr/local/sphinx/bin/search
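
If you script such test searches, the argument ordering described above can be captured in a small helper. The following is only a sketch: build_search_command is a hypothetical name, not part of Sphinx, and it uses only the --config option and the paths already shown in this section.

```python
import shlex

# Path of the search utility as used throughout this section.
SEARCH_BIN = "/usr/local/sphinx/bin/search"


def build_search_command(term, config=None):
    """Return the argument list for the search utility, keeping the term last.

    Hypothetical helper for illustration only; it is not part of Sphinx.
    """
    argv = [SEARCH_BIN]
    if config is not None:
        # Only needed when the default configuration file is not used.
        argv += ["--config", config]
    # The search term always goes at the end of the argument list.
    argv.append(term)
    return argv


cmd = build_search_command("php", config="/usr/local/sphinx/etc/sphinx-blog.conf")
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd) on a machine where Sphinx is installed at that path.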

The indexing workflow

Indexing works the same way with all the SQL drivers. When indexer is run, it establishes a database connection using the credentials provided in the configuration file. It then runs the main query, sql_query, to fetch the data to be indexed. Once the data has been fetched, the connection to the database is closed and indexer carries out the sorting phase.
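
The sequence above (connect, run the main query, close the connection, sort) can be mimicked in a few lines. This is a toy sketch, not how indexer is implemented: it assumes SQLite from Python's standard library as a stand-in for MySQL, the rows are made up, and build_index and make_demo_db are hypothetical names.

```python
import os
import sqlite3
import tempfile
from collections import defaultdict


def build_index(db_path, sql_query):
    """Fetch rows with sql_query, then build a word -> [document ids] mapping."""
    # 1. Connect to the data source.
    conn = sqlite3.connect(db_path)
    try:
        # 2. Run the main query; the first column is the document ID.
        rows = conn.execute(sql_query).fetchall()
    finally:
        # 3. Close the connection before any further processing.
        conn.close()

    # 4. Collect (word, doc_id) hits and sort them.
    hits = []
    for doc_id, *fields in rows:
        for text in fields:
            for word in text.lower().split():
                hits.append((word, doc_id))
    hits.sort()

    # Sorted hits keep duplicates adjacent, so one comparison removes them.
    index = defaultdict(list)
    for word, doc_id in hits:
        postings = index[word]
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
    return dict(index)


def make_demo_db():
    """Create a throwaway posts table with made-up rows and return its path."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            (1, "Learning PHP", "php is a scripting language"),
            (2, "Sphinx basics", "indexing blog posts with sphinx"),
            (3, "PHP and Sphinx", "searching from php code"),
        ],
    )
    conn.commit()
    conn.close()
    return path


if __name__ == "__main__":
    path = make_demo_db()
    try:
        index = build_index(path, "SELECT id, title, content FROM posts")
        print(index["php"])  # prints [1, 3]
    finally:
        os.remove(path)
```

Note that the database connection is already closed by the time the sorting starts, which mirrors the order of the steps described above.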

Adding attributes to the index

The index we created for the blog posts is all good and fine, but it only works for full-text searching. What if we want to filter the results by author or date? That can't be done with the index that we created earlier. To solve this problem, Sphinx offers special fields in the index called attributes.
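
To make the distinction concrete, here is a conceptual sketch of what attributes allow: keywords are matched against the text, and the matches are then narrowed by values stored alongside each document. It does not show how Sphinx stores or applies attributes; the Document class, the search function, and all the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: int
    text: str          # full-text field: matched by keyword
    author_id: int     # attribute: used for filtering, not keyword matching
    publish_date: int  # attribute: Unix timestamp


def search(docs, term, author_id=None, published_after=None):
    """Match term against the text, then narrow the matches by attribute values."""
    term = term.lower()
    results = []
    for doc in docs:
        if term not in doc.text.lower().split():
            continue  # no keyword match
        if author_id is not None and doc.author_id != author_id:
            continue  # filtered out by author
        if published_after is not None and doc.publish_date <= published_after:
            continue  # filtered out by date
        results.append(doc.doc_id)
    return results


# Made-up documents; the timestamps are arbitrary increasing values.
docs = [
    Document(1, "learning php", author_id=10, publish_date=1262304000),
    Document(2, "php and sphinx", author_id=20, publish_date=1275350400),
    Document(3, "sphinx basics", author_id=10, publish_date=1280620800),
]

print(search(docs, "php"))                # prints [1, 2]
print(search(docs, "php", author_id=10))  # prints [1]
```

Without the author_id and publish_date values stored next to each document, the second call would have nothing to filter on, which is the limitation of our current index.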

Let's add attributes to our index to hold the author_id and publish_date.
