Abstract:
Cluster analysis has seen extensive methodological development across many fields of knowledge. This dissertation proposes a new clustering method for mixed data that takes into account the multilevel structure of the observations. Mixed data are characterized by the joint presence of quantitative and qualitative variables. How similar or close the units of analysis are can be quantified through proximity measures, which, together with the algorithms used, are essential to cluster analysis methodology. The term “multilevel clustering” is used in different fields to refer to different concepts. Our proposal adapts the k-means algorithm to multilevel data, incorporating the hierarchical structure of the data into the calculation of distances between observations through a weighting approach based on the Hellinger distance. The results of simulation studies and practical applications are satisfactory, with better groupings obtained when there is more than one quantitative variable. However, further studies in different scenarios are still needed to increase the robustness of the proposed methodology.
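
For reference, the Hellinger distance between two discrete probability distributions P = (p_1, ..., p_n) and Q = (q_1, ..., q_n) is H(P, Q) = sqrt( (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), which lies between 0 and 1. The sketch below only illustrates the distance itself. How it enters the weighting of distances in the proposed k-means adaptation is not specified in this abstract, and the function name `hellinger` and the example distributions are assumptions made for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    `p` and `q` are sequences of non-negative probabilities over the same
    categories, each assumed to sum to 1 (hypothetical helper, not taken
    from the dissertation).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    # Sum of squared differences between the square roots of the probabilities.
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / 2.0)


# Identical distributions are at distance 0; distributions with disjoint
# support are at the maximum distance of 1.
same = hellinger([0.5, 0.5], [0.5, 0.5])
disjoint = hellinger([1.0, 0.0], [0.0, 1.0])
```

Because the distance is bounded in [0, 1], it can be compared across categorical variables with different numbers of levels, which is one plausible reason to use it as a weight.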