Joel Machado Pires; https://orcid.org/0000-0002-8428-3516; http://lattes.cnpq.br/1733784830364973
Abstract:
Recommender systems (RecSys) enhance information retrieval efficiency across different domains by delivering personalized content that aligns with user preferences. These systems model user-item relationships through methods such as matrix factorization and graph attention networks (GAT). Despite advancements in accuracy, existing approaches often focus narrowly on predictive performance while neglecting the broader utility of confidence estimation. This estimation is crucial for quantifying the certainty behind recommendations, particularly in situations where a balance between risk and reward is required. By leveraging confidence, RecSys can mitigate uncertainties stemming from data noise and model limitations. However, existing approaches to confidence integration face critical limitations. Non-parametric techniques, such as neural network-based probabilistic calibration, remain confined to classification tasks and fail to address regression scenarios, including rating prediction and listwise learning-to-rank. Many confidence models operate independently of the core recommendation process, limiting their adaptability and calibration impact. Notably, the literature neglects the integration of confidence into GAT-based models and lacks an experimental evaluation of different distribution-based methods. Therefore, this study conducts an experimental evaluation of previous distribution-based methods and explores a suitable confidence integration into GAT-based models. We evaluate four prior solutions in terms of rating prediction accuracy, ranking accuracy, and the correlation between confidence and error. These solutions and our proposal are evaluated on public datasets with varying contexts and characteristics. Results reveal that distribution-based confidence integration often harms model accuracy and leaves room for improvement regarding the correlation between confidence and error.
Although these findings also hold for our method, it still achieves superior performance compared to all prior solutions and shows promising results in terms of negative confidence-error correlation. Furthermore, as a second part of this study, we propose and evaluate the integration of confidence into embedding models for learning-to-rank methods. This proposal and its baselines are also evaluated across various public datasets, using different ranking metrics and the correlation between confidence and error. The results reveal that both proposed methods consistently demonstrate competitive ranking performance and even outperform the baselines on some datasets. Specifically, the proposed confidence integration for rating prediction achieved improvements of at least 58.16% in ranking metrics on the Amazon Movies and TVs dataset, 34.94% on Jester Joke, and 42.98% on MovieLens. Additionally, we observed a cubic polynomial relationship between confidence and error in this latter solution.