When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank

Ali Vardasbi, Harrie Oosterhuis and Maarten de Rijke
Published in Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM ’20), 2020. [pdf, code, video]


Besides position bias, which has been well-studied, trust bias is another type of bias prevalent in user interactions with rankings: users are more likely to click incorrectly w.r.t. their preferences on highly ranked items because they trust the ranking system. While previous work has observed this behavior in users, we prove that existing Counterfactual Learning to Rank (CLTR) methods do not remove this bias, including methods specifically designed to mitigate this type of bias. Moreover, we prove that Inverse Propensity Scoring (IPS) is principally unable to correct for trust bias under non-trivial circumstances. Our main contribution is a new estimator based on affine corrections: it both reweights clicks and penalizes items displayed on ranks with high trust bias. Our estimator is the first estimator that is proven to remove the effect of both trust bias and position bias. Furthermore, we show that our estimator is a generalization of the existing (CLTR) framework: if no trust bias is present, it reduces to the original (IPS) estimator. Our semi-synthetic experiments indicate that by removing the effect of trust bias in addition to position bias, (CLTR) can approximate the optimal ranking system even closer than previously possible.

Download the paper here.

Code is available here.

Recommended citation:

A. Vardasbi, H. Oosterhuis, M. de Rijke. "When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank." In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. ACM, 2020.