Time for action - weighting search results

  1. Modify display_results.php (created earlier) and add the code as highlighted next:
    if (count($cats)) {
        print "<br />Categories: " . implode(', ', $cats);
    }
    print "<br />Weight: " . $result['weight'];
    print "<hr />";
    
  2. Create a PHP script search_weighting.php in your webroot containing the following code:
    <?php
    // Include the API class
    require('sphinxapi.php');
    // Include the file which contains the function to display results
    require_once('display_results.php');

    $client = new SphinxClient();
    $client->SetServer('localhost', 9312);
    $client->SetConnectTimeout(1);
    $client->SetArrayResult(true);

    // Match documents containing any of the query words
    $client->SetMatchMode(SPH_MATCH_ANY);
    display_results(
        $client->Query('php language framework'),
        'MATCH ANY');

    // Boolean matching: no weighting is performed
    $client->SetMatchMode(SPH_MATCH_BOOLEAN);
    display_results(
        $client->Query('php | framework'),
        'BOOLEAN');

    // Extended query syntax: @* searches all fields
    $client->SetMatchMode(SPH_MATCH_EXTENDED2);
    display_results(
        $client->Query('@* php | @* framework'),
        'EXTENDED');
    
  3. Execute the script in a browser.

What just happened?

We added code to show the weight in display_results.php. We then created a script to see how the weights are calculated when different matching modes are used.

In all modes, the per-field weighted phrase rank is computed as the product of the LCS (the length of the longest common subsequence of words between the query and the field text) and the per-field weight specified by the user. Per-field weights are always integers, default to 1, and can never be less than 1.
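
The arithmetic can be illustrated with a minimal sketch. The weighted_phrase_rank() helper, the field names, and the LCS values below are hypothetical and are not part of the Sphinx API; the sketch also assumes that the per-field ranks are simply summed.

```php
<?php
// Hypothetical helper, not part of the Sphinx API: it only mirrors the
// arithmetic described in the text (LCS multiplied by per-field weight).
function weighted_phrase_rank(array $lcsByField, array $fieldWeights)
{
    $total = 0;
    foreach ($lcsByField as $field => $lcs) {
        // Per-field weights default to 1 and can never be lower than 1
        $weight = isset($fieldWeights[$field])
            ? max(1, (int) $fieldWeights[$field])
            : 1;
        $total += $lcs * $weight;
    }
    return $total;
}

// Assumed LCS values for an index with two full-text fields
$lcs = array('title' => 2, 'content' => 1);

// 'title' is weighted 3, 'content' keeps the default weight of 1
print weighted_phrase_rank($lcs, array('title' => 3)); // 2*3 + 1*1 = 7
```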

Note

You can use the SetFieldWeights($weights) API method to set per-field weights. $weights should be an associative array mapping string field names to integer values.
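
A call to this method might look like the following sketch. The field names title and content and the weight values are assumptions for illustration; substitute the full-text fields defined in your own index.

```php
<?php
require_once('sphinxapi.php');

$client = new SphinxClient();
$client->SetServer('localhost', 9312);
$client->SetMatchMode(SPH_MATCH_ANY);

// 'title' and 'content' are assumed field names; matches in the
// title are made to count ten times as much as matches in the content
$client->SetFieldWeights(array(
    'title'   => 10,
    'content' => 1,
));

$results = $client->Query('php language framework');
```

Fields that are not listed in the array keep the default weight of 1.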

When SPH_MATCH_ANY is used, Sphinx also adds the count of matching words in each field. Before that, the weighted phrase ranks are multiplied by a value large enough to guarantee that a higher phrase rank in any field ranks the match higher, even if that field's weight is low.

SPH_MATCH_BOOLEAN is a special case, wherein no weighting is performed at all and every match weight is set to 1.

The last mode we saw was SPH_MATCH_EXTENDED2, in which the final weight is a sum of weighted phrase ranks and BM25 weight. This sum is then multiplied by 1,000 and rounded to an integer. This is shown in the following screenshot:


Sphinx aims to rank results with better sub-phrase matches higher, with perfect matches pulled to the top.
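
The SPH_MATCH_EXTENDED2 calculation described above can be sketched in a few lines. The extended2_weight() helper is hypothetical and only follows the wording of this section; the phrase rank sum and the BM25 value are assumed inputs, not values read from Sphinx.

```php
<?php
// Hypothetical helper that mirrors the description in the text;
// it is not a Sphinx API call.
function extended2_weight($phraseRankSum, $bm25)
{
    // Sum of weighted phrase ranks and BM25, multiplied by 1,000
    // and rounded to an integer
    return (int) round(($phraseRankSum + $bm25) * 1000);
}

// Assumed inputs: phrase rank sum of 2 and a BM25 weight of 0.5
print extended2_weight(2, 0.5); // (2 + 0.5) * 1000 = 2500
```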

Note

At the time of writing, the ranking mode can be explicitly set for the SPH_MATCH_EXTENDED2 matching mode using the SetRankingMode() API method.
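
As a minimal sketch, the ranking mode could be changed as follows. SPH_RANK_BM25 is one of the ranker constants defined in sphinxapi.php; choosing it here is only an example, and the query is the one used earlier in this section.

```php
<?php
require_once('sphinxapi.php');

$client = new SphinxClient();
$client->SetServer('localhost', 9312);

// Ranking modes apply to the SPH_MATCH_EXTENDED2 matching mode
$client->SetMatchMode(SPH_MATCH_EXTENDED2);

// SPH_RANK_BM25 ranks by BM25 only, ignoring phrase proximity
$client->SetRankingMode(SPH_RANK_BM25);

$results = $client->Query('@* php | @* framework');
```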
