Deep Reinforcement Learning for Cell-Free Massive MIMO Network Optimization

Despite the significant advancements in wireless communication technologies, inter-cell interference remains a limiting factor due to the cell-centric design of traditional mobile networks. Cell-free massive multiple-input multiple-output (MIMO) is a paradigm shift in network architecture, where we replace the fixed cell boundaries with a seamless network of cooperating access points (APs) to achieve uniformly good performance throughout the coverage area. To harness its full potential, it is necessary to address its scalability issue and the need for dynamic optimization based on the current state of the wireless environment. Compared to conventional optimization techniques and (un-)supervised machine learning, deep reinforcement learning (DRL) is capable of operating model-free, without requiring any prior knowledge, including training datasets, and in an online manner, making it an effective tool for real-time network adaptation. Motivated by these advantages, this dissertation leverages DRL for the realization of scalable, self-adapting cell-free massive MIMO. The dissertation consists of three main parts. The first part focuses on user-centric clustering, where each user equipment (UE) is served by only a subset of APs. We demonstrate that there exists a cluster size that enables the scalable user-centric variant to stay close to the upper-bound rate performance exhibited by the canonical setup, but with significantly fewer AP-UE connections, translating to lower fronthaul requirements. While our proposed iteration-based algorithms have managed to deal with the non-convexity of the optimization problems, such methods are challenging to implement in real time. The second part of the dissertation capitalizes on single-agent RL (SARL) for cell-free network optimization. We design a framework that (de-)activates APs by jointly considering the positions of all users.
We show that by properly identifying the underutilized APs to deactivate, we reduce power consumption while obtaining a quality of service (QoS) that is close to that achieved when all APs are always turned on. We next propose a SARL system that utilizes the spatial user densities for grouping the APs in a scalable network with multiple central processing units (CPUs). We demonstrate that by tailoring the AP group sizes to the expected user concentrations in different subareas, we improve the network sum rate. We then develop a DRL-based algorithm for user-centric clustering. By optimizing the AP selection and cluster size for each UE, we obtain almost the same performance as the canonical setup while benefiting from reduced fronthaul capacity usage. Our findings show that small user-centric clusters are sufficient to achieve good QoS, implying that only a subset of APs substantially contributes to UE rate performance.

File Type: pdf
File Size: 4 MB
Publication Year: 2025
Author: Charmae Franchesca Mendoza
Supervisors: Stefan Schwarz, Markus Rupp
Institution: TU Wien
Keywords: Machine learning, Cell-free MIMO, DRL