Abstract:
Logistic Regression (LR) is widely used in credit risk assessment because it models the relationship between a binary response variable and a set of predictors. However, its traditional application overlooks fundamental features of geographically heterogeneous contexts, such as spatial variation in risk determinants and autocorrelation among nearby observations, both common in socioeconomic data from highly unequal countries such as Brazil. This study addresses this limitation with Geographically Weighted Logistic Regression (GWLR), which incorporates local effects into credit analysis by weighting observations according to geographic proximity. In addition, a semi-parametric version (S-GWLR) is proposed that treats some variables as global and others as local, providing greater flexibility and interpretability. The models were estimated on 3,200 records from clients of a banking correspondent distributed across 13 municipalities in the metropolitan region of Salvador, Brazil. The sample includes only clients with active credit during 2019, that is, clients who were either current or delinquent (over 90 days past due). The main objective of this research is to evaluate the impact of spatial heterogeneity and autocorrelation on credit risk modeling, investigating how these aspects influence the probability of default and how accounting for them helps avoid decisions that under- or overestimate risk in specific locations. To this end, three models were compared: LR, GWLR, and S-GWLR. The independent variables were macroeconomic indicators (Selic rate and unemployment rate), demographic data (age and education level), and a credit contract characteristic (loan term). The results indicate that the spatial models (GWLR and S-GWLR) fit better than the global model, with lower AIC and BIC values and higher accuracy in classifying defaults. Sensitivity for detecting defaulters increased from 65\% in the global model to 70\% in the S-GWLR. 
Analysis of local coefficients revealed that macroeconomic variables, such as Selic and unemployment rates, exhibit significant variation across municipalities, while demographic variables are more spatially homogeneous. The semi-parametric approach improves modeling by identifying that only age should be treated as a global effect, enhancing flexibility and interpretability without compromising performance. In summary, considering spatial effects and local heterogeneity is essential for accurately modeling credit risk in diverse regions. The S-GWLR model supports fairer and more effective credit decisions tailored to regional socioeconomic specificities, contributing to the reduction of default risks.
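
The idea of weighting observations by geographic proximity can be illustrated with a minimal sketch. It is not the estimation procedure used in this study: the Gaussian kernel, the fixed bandwidth, the small ridge term, and all function names (`gaussian_weights`, `fit_weighted_logit`, `gwlr`) are assumptions made for illustration only. The sketch fits one weighted logistic regression per location, so each location receives its own coefficient vector.

```python
import numpy as np

def gaussian_weights(coords, focal, bandwidth):
    """Kernel weights that decay with distance from the focal location.

    The Gaussian kernel and fixed bandwidth are illustrative assumptions.
    """
    d = np.linalg.norm(coords - focal, axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def fit_weighted_logit(X, y, w, n_iter=25, ridge=1e-6):
    """Weighted logistic regression fitted by Newton-Raphson.

    Returns the intercept followed by one coefficient per column of X.
    The tiny ridge term only keeps the Hessian invertible.
    """
    Xd = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)       # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = Xd.T @ (w * (y - p))             # weighted score vector
        hess = (Xd * (w * p * (1 - p))[:, None]).T @ Xd
        hess += ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def gwlr(X, y, coords, bandwidth):
    """One local coefficient vector per observation location."""
    return np.array([
        fit_weighted_logit(X, y, gaussian_weights(coords, c, bandwidth))
        for c in coords
    ])
```

With a very large bandwidth all weights approach one and every local fit collapses to the global LR fit, which is a convenient sanity check. A semi-parametric variant would additionally hold selected coefficients fixed across locations, which this sketch does not attempt.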