Defensive Low Authority Host Predictor

Published in Microsoft Journal of Applied Research, 2022

Abstract: Responsible AI is becoming critical as AI is used in everyday life. Search, recommendation, and ranking techniques, widely used across industries, rely heavily on ML models. We need not only to improve the accuracy of these models but also to guarantee fairness, resilience to noise, explainability, and authoritative results. These objectives matter beyond model training: the results we show must themselves be fair and authoritative. In this paper, we propose a host score prediction technique that demotes unsatisfactory hosts based on their integrity, quality, and authority (where the information comes from, and whether it is credible). From multiple features extracted for each host from context and other statistics, we publish a score indicating whether the host is good enough to appear on the landing page. We demote hosts whose scores fall below a threshold (i.e., unsatisfactory hosts) and thereby reduce leakage on Bing without affecting the relevance of results. Finally, we show state-of-the-art results on our dataset built around billions of hosts. We show that this responsible AI technique is highly robust and easy to deploy. We believe we have only scratched the surface of this niche area of responsible AI, and we suggest further research challenges around this work.
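
The threshold-based demotion described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The function name, the data layout, the threshold value, the choice to move demoted hosts to the end of the list, and the handling of unscored hosts are all assumptions made for illustration. The host scores are taken as given; the model that predicts them is not shown.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, str]  # (host, url); a hypothetical layout for illustration


def demote_low_score_hosts(
    ranked_results: List[Result],
    host_scores: Dict[str, float],
    threshold: float,
) -> List[Result]:
    """Move results from hosts scoring below `threshold` to the end.

    Relative order is preserved within both groups, so results from
    satisfactory hosts keep their original ranking.
    Assumption: a host with no published score is left in place.
    """
    kept: List[Result] = []
    demoted: List[Result] = []
    for host, url in ranked_results:
        score = host_scores.get(host)
        if score is not None and score < threshold:
            demoted.append((host, url))
        else:
            kept.append((host, url))
    return kept + demoted


# Example with made-up hosts and scores.
results = [
    ("a.example", "https://a.example/page"),
    ("b.example", "https://b.example/page"),
    ("c.example", "https://c.example/page"),
]
scores = {"a.example": 0.1, "b.example": 0.9}
reranked = demote_low_score_hosts(results, scores, threshold=0.5)
```

In this sketch the low-scoring host `a.example` moves below the other two, while `c.example`, which has no score, keeps its position relative to `b.example`.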

Download paper here