a Dynamic Bayesian Network Click Model for web search ranking: Difference between revisions

From statwiki
Jump to navigation Jump to search
Line 4: Line 4:
==Examination Hypothesis:==
==Examination Hypothesis:==
This hypothesis assumes that if a URL clicked by a user that means it’s both examined and relevant to the query . In another word , given a query q , position i and URL u the probability of examining a URL is  :
This hypothesis assumes that if a URL clicked by a user that means it’s both examined and relevant to the query . In another word , given a query q , position i and URL u the probability of examining a URL is  :
<math>P(C=1|u, p)=P(C=1|u,E=1)P(E=1|p)</math>
<math>P(C=1|u, p)=P(C=1|u,E=1)P(E=1|p)</math>



Revision as of 12:16, 14 November 2011

One of the important techniques in web-search ranking is Click modelling, which aims to estimate the probability of click to benefit a wide range of search–related application. Click modeling is a challenging task as it affected by different factors including position bias and hidden variables. Four hypotheses are common in the click modeling which are examination hypothesis, cascade hypothesis, rationality hypothesis and positional rationality hypothesis. Based on these hypotheses many models have proposed to understand user click behaviour, but, the most common models in click modeling are cascade model and examination model. In this paper, the author proposed Dynamic Bayesian Network (DBN) model to compute the value of the perceived relevance ai and user satisfaction si assuming that users scan the links top to bottom and if they satisfy with a document they will not examine the next document. This model is similar to hidden Markov model in that there is a conditional dependency between the probability of examining a URL and the probability of examining the next URL, but it differs in that the probability of examining the next URL depends on the observation and hidden variables of the current documents.

Another models :

Examination Hypothesis:

This hypothesis assumes that if a URL clicked by a user that means it’s both examined and relevant to the query . In another word , given a query q , position i and URL u the probability of examining a URL is  :

[math]\displaystyle{ P(C=1|u, p)=P(C=1|u,E=1)P(E=1|p) }[/math]

Many models have proposed based on the examination hypothesis including the Clicks Over Expected Clicks (COEC) model [1], the Examination model[2], and the Logistic model[3].

The cascade model:

This model[4] is based on the assumption that the user scans the results top to bottom, and the click depends on the relevance of all the results of a query. This model simply aggregate the click and skip of each documents in the results.

[math]\displaystyle{ P(C_i=1)=r_i\prod_{j=1}^i-1(1-r_j) }[/math] where [math]\displaystyle{ r_i }[/math] the probability of click in position i and [math]\displaystyle{ (1-r_j) }[/math] is the probability of a document being skipped

Click chain model (CCM):

This model [5]is similar to the cascade model as it assume that user scan the documents top to bottoms, but it added the transition probability from the ith url to next url i+1, which is similar to DBN approach as the probability of examining the document [math]\displaystyle{ d_i+1 }[/math] is based on the action in the current document [math]\displaystyle{ d_i }[/math]. The conditional probability defined by CCM are :

[math]\displaystyle{ P(C_i = 1|E_i = 0) = 0 }[/math]

[math]\displaystyle{ P(C_1 = 1|E_i = 1,R_i) = R_i }[/math]

[math]\displaystyle{ P(E_i+1 = 1|Ei = 0) = 0 }[/math]

[math]\displaystyle{ P(E_i+1 = 1|E_i = 1, C_i = 0) = \alpha1 }[/math] if the user skipped a url in a potion i the probability of examining the next url is [math]\displaystyle{ \alpha 1 }[/math]

[math]\displaystyle{ P(E_i+1 = 1|E_i = 1, C_i = 1, R_i) = \alpha2(1 − R_i) + \alpha3R_i }[/math] if the user clicked the url in position i the probability of examining the next depends on the relevance of document [math]\displaystyle{ d_i }[/math] range between [math]\displaystyle{ \alpha2 }[/math] and [math]\displaystyle{ \alpha3 }[/math] ([math]\displaystyle{ \alpha1 }[/math] , [math]\displaystyle{ \alpha2 }[/math],[math]\displaystyle{ \alpha3 }[/math] are fixed constants that independent of the user and query)