display_results.php (created earlier) and add the code as highlighted next:
if (count($cats)) {
    print "<br />Categories: " . implode(', ', $cats);
}
print "<br />Weight: " . $result['weight'];
print "<hr />";
search_weighting.php in your webroot containing the following code:
<?php
// Include the api class
require('sphinxapi.php');
// Include the file which contains the function to display results
require_once('display_results.php');

$client = new SphinxClient();
$client->SetServer('localhost', 9312);
$client->SetConnectTimeout(1);
$client->SetArrayResult(true);

$client->SetMatchMode(SPH_MATCH_ANY);
display_results(
    $client->Query('php language framework'),
    'MATCH ANY');

$client->SetMatchMode(SPH_MATCH_BOOLEAN);
display_results(
    $client->Query('php | framework'),
    'BOOLEAN');

$client->SetMatchMode(SPH_MATCH_EXTENDED2);
display_results(
    $client->Query('@* php | @* framework'),
    'EXTENDED');
We added code to show the weight in display_results.php. We then created a script to see how the weights are calculated when different matching modes are used.
In all modes, per-field weighted phrase ranks are computed as the product of the LCS (the longest common subsequence of search words between the query and the field text) and the per-field weight specified by the user. Per-field weights are always integers, default to 1, and can never be less than 1.
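The product described above can be sketched in a few lines of PHP. This is only an illustration of the arithmetic, not Sphinx's internal code: the function name weighted_phrase_ranks() and the sample LCS values are hypothetical, and the LCS lengths are assumed to be already known for each field.

```php
<?php
// Illustrative sketch only: not Sphinx's internal implementation.
// $lcs:          field name => LCS length for that field (hypothetical inputs)
// $fieldWeights: field name => user-specified weight
function weighted_phrase_ranks(array $lcs, array $fieldWeights)
{
    $ranks = array();
    foreach ($lcs as $field => $length) {
        // Fields without an explicit weight use the default of 1
        $weight = isset($fieldWeights[$field]) ? (int) $fieldWeights[$field] : 1;
        // Weights are integers and can never be less than 1
        $weight = max(1, $weight);
        $ranks[$field] = $length * $weight;
    }
    return $ranks;
}

// A two-word phrase match in the title (weight 3) and a
// one-word match in the content (default weight 1)
$ranks = weighted_phrase_ranks(
    array('title' => 2, 'content' => 1),
    array('title' => 3)
);
echo array_sum($ranks); // prints 7
```

With these sample values the title contributes 2 x 3 = 6 and the content contributes 1 x 1 = 1, which shows how raising a field's weight makes phrase matches in that field count for more.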
When SPH_MATCH_ANY is used, Sphinx additionally counts the matching words in each field and adds that count to the weight. Before that, the weighted phrase ranks are multiplied by a value big enough to guarantee that a higher phrase rank in any field will make the match rank higher, even if that field's weight is low.
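The idea of scaling phrase ranks so that they always dominate the word counts can be sketched as follows. The scaling factor used here is an assumption chosen for illustration (one more than the largest possible total word count); it is not the constant Sphinx actually uses, and match_any_weight() is a hypothetical helper.

```php
<?php
// Illustrative sketch only: the scaling factor is an assumption,
// not the value Sphinx uses internally.
// $fields:     field name => array('phrase_rank' => weighted phrase rank,
//                                  'words'       => count of matching words)
// $queryWords: number of words in the query
function match_any_weight(array $fields, $queryWords)
{
    // Assumed factor: larger than the total word count can ever be, so a
    // one-point advantage in phrase rank always outweighs the word counts
    $scale = count($fields) * $queryWords + 1;

    $weight = 0;
    foreach ($fields as $field) {
        $weight += $field['phrase_rank'] * $scale + $field['words'];
    }
    return $weight;
}

// Document A: the full three-word phrase appears in the title
$docA = array(
    'title'   => array('phrase_rank' => 3, 'words' => 3),
    'content' => array('phrase_rank' => 0, 'words' => 0),
);
// Document B: all three words appear in both fields, but scattered
$docB = array(
    'title'   => array('phrase_rank' => 1, 'words' => 3),
    'content' => array('phrase_rank' => 1, 'words' => 3),
);

echo match_any_weight($docA, 3), "\n"; // prints 24
echo match_any_weight($docB, 3), "\n"; // prints 20
```

Document B matches more words overall, yet document A still ranks higher because its phrase rank is higher, which is the behavior the scaling is meant to guarantee.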
SPH_MATCH_BOOLEAN is a special case, wherein no weighting is performed at all and every match weight is set to 1.
The last mode we saw was SPH_MATCH_EXTENDED2, in which the final weight is the sum of the weighted phrase ranks and the BM25 weight. This sum is then multiplied by 1,000 and rounded to an integer. This is shown in the following screenshot:
Sphinx's guiding principle is to rank results with better sub-phrase matches higher and to pull perfect matches to the top.
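The SPH_MATCH_EXTENDED2 calculation described earlier can be sketched as simple arithmetic. This is a minimal illustration of the description in this section, not Sphinx's internal code: extended_weight() is a hypothetical helper, the phrase ranks are sample values, and the BM25 score is assumed to be a floating point value between 0 and 1.

```php
<?php
// Illustrative sketch only: follows the description in the text,
// not Sphinx's internal implementation.
// $phraseRanks: field name => weighted phrase rank (LCS * field weight)
// $bm25:        BM25 score, assumed to be a float between 0 and 1
function extended_weight(array $phraseRanks, $bm25)
{
    // Sum the phrase ranks and the BM25 weight, scale by 1,000,
    // and round the result to an integer
    return (int) round((array_sum($phraseRanks) + $bm25) * 1000);
}

// Sample values: phrase rank 2 in the title, 1 in the content, BM25 of 0.5
echo extended_weight(array('title' => 2, 'content' => 1), 0.5); // prints 3500
```

Because the phrase ranks are integers and the BM25 part is assumed to stay below 1, the thousands in the final weight reflect phrase proximity, while the last three digits reflect the statistical BM25 score.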