Solr Scoring Formula
The scoring formula used in Solr 6.3
Formula:
\[score(q,d) = coord(q,d) \cdot queryNorm(q) \cdot \sum_{t\;in\;q} {(\;tf(t\;in\;d) \cdot idf(t)^2 \cdot t.getBoost() \cdot norm(t,d)\;)}\]
coord(q,d): coordination factor. It reflects how many of the query terms are found in the specified document, i.e. the keyword hit rate.
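The hit rate can be sketched as a simple ratio. This assumes coord is computed as the number of matched query terms divided by the total number of query terms; the names `overlap` and `max_overlap` are illustrative.

```python
def coord(overlap: int, max_overlap: int) -> float:
    """Fraction of the query terms that were found in the document.

    overlap     -- number of query terms present in the document
    max_overlap -- total number of terms in the query
    """
    if max_overlap <= 0:
        raise ValueError("max_overlap must be positive")
    return overlap / max_overlap


# A document matching 2 of 3 query terms gets a smaller factor
# than a document matching all 3.
partial = coord(2, 3)
full = coord(3, 3)
```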
queryNorm(q): query normalization. It serves as a normalization factor that attempts to make scores from different queries comparable.
\[queryNorm(q) = queryNorm(sumOfSquaredWeights) = \frac{1}{\sqrt{sumOfSquaredWeights}}\]
\[sumOfSquaredWeights = q.getBoost()^2 \cdot \sum_{t\;in\;q} {(\;idf(t) \cdot t.getBoost()\;)^2}\]
sumOfSquaredWeights adds up the squared weight of every term in the query. With the default boost of 1 (for both the query and each term), it reduces to the sum of $idf(t)^2$ over the query terms.
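A minimal sketch of the two formulas above, assuming the idf value of each query term is already known. The function and parameter names are illustrative, not Lucene API names.

```python
import math


def sum_of_squared_weights(term_idfs, term_boosts=None, query_boost=1.0):
    """q.getBoost()^2 * sum over terms of (idf(t) * t.getBoost())^2."""
    if term_boosts is None:
        # Default boost of 1 for every term.
        term_boosts = [1.0] * len(term_idfs)
    total = sum((idf * boost) ** 2 for idf, boost in zip(term_idfs, term_boosts))
    return query_boost ** 2 * total


def query_norm(sum_sq: float) -> float:
    """1 / sqrt(sumOfSquaredWeights)."""
    return 1.0 / math.sqrt(sum_sq)


# Two terms with idf 3 and 4 and default boosts: 3^2 + 4^2 = 25.
weights = sum_of_squared_weights([3.0, 4.0])
norm_factor = query_norm(weights)
```

Because the same factor multiplies every document's score for a given query, it changes the absolute score values but not the ranking of documents within that query.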
tf(t in d): term frequency. It measures how often a particular term appears in a matching document.
$tf(t\;in\;d) = frequency^{1/2}$
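As a quick illustration of the square root above, the sketch below shows that the contribution grows more slowly than the raw count. The function name is illustrative.

```python
import math


def tf(frequency: float) -> float:
    """Square root of the number of times the term occurs in the document."""
    return math.sqrt(frequency)


# Quadrupling the raw frequency only doubles the tf factor.
once = tf(1)
four_times = tf(4)
```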
idf(t): inverse document frequency. Rarer terms contribute more to the total score.
$idf(t) = 1 + \log \left(\frac{docCount\;+\;1}{docFreq\;+\;1} \right)$
- $\log$ is the natural logarithm (base e); it corresponds to Java's Math.log method
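The idf formula can be checked with a few lines of Python, where `math.log` is also the natural logarithm. The parameter names `doc_freq` and `doc_count` mirror docFreq and docCount in the formula; the function itself is only a sketch.

```python
import math


def idf(doc_freq: int, doc_count: int) -> float:
    """1 + ln((docCount + 1) / (docFreq + 1))."""
    return 1.0 + math.log((doc_count + 1) / (doc_freq + 1))


# A term found in 1 of 100 documents is weighted more heavily
# than a term found in 50 of 100 documents.
rare = idf(1, 100)
common = idf(50, 100)
```

When a term occurs in every document (docFreq equals docCount), the logarithm is 0 and idf bottoms out at 1.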
norm(t,d): normalization factor. It combines the matching document's boost, the matching field's boost, and a length-normalization factor that penalizes longer documents.
$norm(t,d) = d.getBoost() \cdot lengthNorm(f) \cdot f.getBoost()$
$lengthNorm(f) = \frac{1}{\sqrt{numTerms}}$
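A small sketch of the norm computation, assuming numTerms is the number of terms in the matching field. It ignores the fact that Lucene stores norms in a compact, lossy one-byte encoding, so values seen in real explain output may differ slightly. Names are illustrative.

```python
import math


def length_norm(num_terms: int) -> float:
    """1 / sqrt(numTerms): longer fields get a smaller factor."""
    return 1.0 / math.sqrt(num_terms)


def norm(num_terms: int, doc_boost: float = 1.0, field_boost: float = 1.0) -> float:
    """d.getBoost() * lengthNorm(f) * f.getBoost(), with boosts defaulting to 1."""
    return doc_boost * length_norm(num_terms) * field_boost


# A 4-term field versus a 100-term field, both with default boosts.
short_field = norm(4)
long_field = norm(100)
```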
t.getBoost(): the weight (boost) of each query term; it defaults to 1 if not specified.
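Putting the pieces together, the sketch below evaluates the whole formula for a flat query over a single field. It is a simplified illustration rather than Solr's implementation: it assumes q.getBoost(), d.getBoost() and f.getBoost() are all 1, it does not model the lossy norm encoding, and every function and parameter name is hypothetical.

```python
import math


def classic_score(query_terms, doc_term_freqs, doc_freqs, doc_count,
                  num_terms, boosts=None):
    """score(q,d) = coord * queryNorm * sum(tf * idf^2 * boost * norm).

    query_terms    -- list of distinct query terms
    doc_term_freqs -- {term: occurrences of the term in this document}
    doc_freqs      -- {term: number of documents containing the term}
    doc_count      -- total number of documents
    num_terms      -- number of terms in the document's field
    boosts         -- optional {term: boost}; missing terms default to 1
    """
    if not query_terms:
        raise ValueError("query_terms must not be empty")
    boosts = boosts or {}

    # idf(t) = 1 + ln((docCount + 1) / (docFreq + 1))
    idfs = {t: 1.0 + math.log((doc_count + 1) / (doc_freqs.get(t, 0) + 1))
            for t in query_terms}

    # queryNorm = 1 / sqrt(sum of (idf * boost)^2), query boost assumed 1
    sum_sq = sum((idfs[t] * boosts.get(t, 1.0)) ** 2 for t in query_terms)
    query_norm = 1.0 / math.sqrt(sum_sq)

    # coord = matched query terms / all query terms
    matched = [t for t in query_terms if doc_term_freqs.get(t, 0) > 0]
    coord = len(matched) / len(query_terms)

    # norm = lengthNorm, document and field boosts assumed 1
    norm = 1.0 / math.sqrt(num_terms)

    total = sum(math.sqrt(doc_term_freqs[t]) * idfs[t] ** 2
                * boosts.get(t, 1.0) * norm
                for t in matched)
    return coord * query_norm * total


# Hypothetical index of 1000 documents; the document has 25 terms
# and contains "solr" twice but not "score".
example = classic_score(
    query_terms=["solr", "score"],
    doc_term_freqs={"solr": 2},
    doc_freqs={"solr": 40, "score": 300},
    doc_count=1000,
    num_terms=25,
)
```

Comparing the result for a real document against the output of Solr's debugQuery=true explain section is a reasonable way to see where the simplifying assumptions matter.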