Learning to Rank has traditionally considered settings where given the relevance information of objects, the desired order in which to rank the objects is clear. However, with today’s large variety of users and layouts this is not always the case. In this paper, we consider so-called complex ranking settings where it is not clear what should be displayed, that is, what the relevant items are, and how they should be displayed, that is, where the most relevant items should be placed. These ranking settings are complex as they involve both traditional ranking and inferring the best display order. Existing learning to rank methods cannot handle such complex ranking settings as they assume that the display order is known beforehand. To address this gap we introduce a novel Deep Reinforcement Learning method that is capable of learning complex rankings, both the layout and the best ranking given the layout, from weak reward signals. Our proposed method does so by selecting documents and positions sequentially, hence it ranks both the documents and positions, which is why we call it the Double Rank Model (DRM). Our experiments show that DRM outperforms all existing methods in complex ranking settings, thus it leads to substantial ranking improvements in cases where the display order is not known a priori.
H. Oosterhuis, M. de Rijke. "Ranking for Relevance and Display Preferences in Complex Presentation Layouts." In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 2018.