Abstract:
Statistical learning (machine learning) models and frameworks are seeing ever-increasing development and adoption. At the same time, the vast amounts of data used for training can have unintended effects on model fitting time. In particular, Support Vector Machines (SVMs), which exhibit strong predictive performance, can become computationally intensive, or even infeasible, when applied to large datasets. This dissertation proposes a method to reduce the training time of an SVM classifier by combining two partitioning methods with two sampling approaches. The partitioning methods separate distinct subsets of the feature space, handling both numerical and categorical variables, while the sampling approaches reduce the size of the training set while preserving as much of its representative power as possible. The results obtained in applications to both simulated and real data are quite satisfactory, showing shorter training times and, in some cases, improved predictive performance compared to the traditional approach of training on all observations in a dataset. An important additional finding was that the proposed method mitigates the effects of the "curse of dimensionality".
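The general idea of partitioning the feature space and then sampling from each region can be sketched as follows. This is a minimal, hypothetical illustration only, not the dissertation's actual method: it partitions a single numerical feature into equal-width bins (the dissertation's partitioning methods are more elaborate and also cover categorical variables) and draws the same fraction from each bin, so every region of the space stays represented in the reduced training set handed to the SVM.

```python
import random

def partition_then_sample(points, labels, n_bins=4, frac=0.25, seed=0):
    """Illustrative sketch: equal-width binning of one numerical feature,
    then per-bin subsampling. Returns a reduced list of (x, y) pairs."""
    rng = random.Random(seed)
    lo, hi = min(points), max(points)
    width = (hi - lo) / n_bins or 1.0  # guard against zero-width range
    bins = {}
    for x, y in zip(points, labels):
        # Assign each observation to a bin; clamp the maximum into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append((x, y))
    sample = []
    for members in bins.values():
        # Keep roughly the same fraction of every bin (at least one point),
        # so no region of the feature space is dropped entirely.
        k = max(1, int(len(members) * frac))
        sample.extend(rng.sample(members, k))
    return sample

# Toy data: 200 points on [0, 10] with a simple threshold label.
random.seed(1)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [1 if x > 5 else 0 for x in xs]
reduced = partition_then_sample(xs, ys)
print(len(reduced))  # roughly a quarter of the original 200 observations
```

The reduced set would then be used to fit the SVM in place of the full data, which is where the training-time savings come from.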