Please use this identifier to cite or link to this item: https://repositorio.ufba.br/handle/ri/36099
metadata.dc.type: Dissertation
Title: Exploiting linked data in Dbpedia to reduce prediction error in matrix factorization recommenders
Other Titles: Explorando Linked Data na DBpedia para reduzir Erro Predito em Recomendadores baseados em Fatorização de Matriz
Exploração de dados vinculados na Dbpedia para reduzir erro de previsão em recomendadores de factorização matricial
metadata.dc.creator: Pereira, Victor Martinez Vidal
metadata.dc.contributor.advisor1: Durão, Frederico Araujo
metadata.dc.contributor.referee1: Durão, Frederico Araujo
metadata.dc.contributor.referee2: Pereira, Adriano César Machado
metadata.dc.contributor.referee3: Coimbra, Danilo Barbosa
Abstract: Recommender Systems provide suggestions for items that are most likely of interest to users. Providing personalized recommendations is a challenge that can be addressed by filtering algorithms, among which Collaborative Filtering (CF) has demonstrated much progress in the last few years. By using Matrix Factorization (MF) techniques, CF methods reduce prediction error through optimization algorithms. However, they usually face problems such as data sparsity and prediction error. Studies point to the use of data available in the Semantic Web as a path to improve recommender systems and address the challenges related to CF techniques. Motivated by these premises, the present work, conducted by me at the RecSys Research Group at UFBA, developed a data pipeline along with an algorithm that processes the Ratings Matrix, combining semantic similarities of Linked Open Data (LOD), and estimates missing ratings. The experiments took subsets of 1000 samples from three different datasets (MovieLens, LastFM and LibraryThing), calculated two semantic similarity metrics, Linked Data Similarity Distance (LDSD) and Resource Similarity (RESIM), and applied three MF-based algorithms (SVD, SVD++ and NMF). Results suggest the proposed pipeline is able to reduce the Root Mean Square Error (RMSE) on all subsets, with statistical confidence supported by a parametric one-way ANOVA test followed by Tukey’s multiple comparison test.
Keywords: Recommender systems
Matrix factorization
Open data
Prediction error
metadata.dc.subject.cnpq: CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::SISTEMAS DE INFORMACAO
metadata.dc.language: eng
metadata.dc.publisher.country: Brazil
Publisher: Universidade Federal da Bahia
metadata.dc.publisher.initials: UFBA
metadata.dc.publisher.department: Instituto de Computação - IC
metadata.dc.publisher.program: Programa de Pós-Graduação em Ciência da Computação (PGCOMP) 
Citation: PEREIRA, Victor Martinez Vidal. Exploiting linked data in Dbpedia to reduce prediction error in matrix factorization recommenders. 2022. 64 f. Dissertação (Mestrado em Ciências da Computação) Instituto de Computação, Universidade Federal da Bahia, Salvador, Ba, 2022.
URI: https://repositorio.ufba.br/handle/ri/36099
Issue Date: 21-Jun-2022
Appears in Collections: Dissertação (PGCOMP)
