Boaventura, Denis Robson Dantas; https://orcid.org/0000-0003-2771-1710; http://lattes.cnpq.br/2583383311984927
Abstract:
Research on smart homes and the Internet of Things has expanded the use of smart devices, creating new possibilities for personalizing and adapting homes to residents' needs. Over time, these devices have come to offer services that improve energy efficiency, security, and home comfort. This evolution has also brought challenges, notably the growing complexity of managing the multiple states of devices, such as the color, temperature, and intensity settings of smart lights. One way to address some of these challenges is to apply state or action recommendation systems in this type of environment. In this study, we propose a recommendation system that seeks to optimize the orchestration of smart devices within a home, anticipating users' needs and adapting to routines that may change. The system combines reinforcement learning algorithms with implicit feedback, the encoding of composite states, and cooperation between multiple agents that control actuator devices, exploring new approaches to improve the efficiency and adaptability of recommendation systems in smart residential environments. Using a smart environment simulator, we generated two datasets based on three distinct routines and conducted experiments with two reinforcement learning algorithms: Deep Q-Learning and Differential Semi-gradient n-step SARSA. Both algorithms were tested under two approaches: simple states and composite states. The simple state approach considers only the primary state of each device, whereas the composite state approach takes all states into account. The results show a promising capacity of the system to anticipate residents' needs and adapt to changes in their routines under both approaches.
In all tests conducted, the system achieved Hamming Scores above 94% with Deep Q-Learning agents and above 95% with Differential Semi-gradient n-step SARSA agents.
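The abstract reports results as a Hamming Score but does not define the metric. As a minimal sketch, assuming the common definition of the score as one minus the Hamming loss (the fraction of individual device-state predictions that match the target), it could be computed as below. The function name `hamming_score` and the sample data are hypothetical, and the study may use a different variant, such as the per-sample ratio of intersection to union of label sets.

```python
from typing import Sequence


def hamming_score(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> float:
    """Fraction of positions where the prediction equals the target.

    Assumed definition: 1 - Hamming loss. Each row is one time step and
    each column is the (hypothetical) encoded state of one device.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")

    matches = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("rows must have the same number of columns")
        for true_state, pred_state in zip(true_row, pred_row):
            matches += int(true_state == pred_state)
            total += 1

    if total == 0:
        raise ValueError("inputs must contain at least one label")
    return matches / total


# Illustrative data only: two time steps, three devices.
targets = [[1, 0, 2], [0, 0, 1]]
predictions = [[1, 0, 2], [0, 1, 1]]
score = hamming_score(targets, predictions)  # 5 of 6 states match
```

Under this definition, a score above 94% would mean that more than 94 of every 100 recommended device states agree with the state the resident actually chose.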