Signal Quantization and Approximation Algorithms for Federated Learning
Distributed signal and information processing using the Internet of Things (IoT) facilitates real-time monitoring of signals such as environmental pollutants, health indicators, and electric energy consumption in a smart city. Despite the promising capabilities of IoT, these distributed deployments often face the challenges of data privacy and communication rate constraints. In traditional machine learning, training data is moved to a data center, which requires massive data movement from distributed IoT devices to a third-party location, raising concerns over privacy and inefficient use of communication resources. Moreover, the growing network size, model size, and data volume together lead to complexity in the design of optimization algorithms beyond the compute capability of a single device. This necessitates novel system architectures to ensure stable and secure operation of such networks. Federated learning (FL), a distributed learning paradigm introduced by McMahan et al., is a promising solution for enabling IoT-based smart city applications by addressing these challenges. In the FL paradigm, a global server orchestrates model training without raw data being transferred from the participating devices (or clients). Edge-deployed signal processing algorithms, such as sparse approximations and statistical learning methods, are essential for efficient management of compute and communication resources in FL. In this thesis, we seek answers to three research questions related to distributed signal processing that arise in the context of FL. First, what are the methods to speed up scalar quantizer design on resource-constrained edge devices? Second, given certain system- or application-specific constraints, what signal representations lead to near-optimal performance? Third, what tradeoffs must be considered when performing federated aggregation of models collected from individual edge devices?
These questions are considered in the context of resource-constrained edge devices in the FL model. Beginning with the Lloyd-Max quantizer, a well-known algorithm in traditional signal processing, we propose an approximate Lloyd-Max quantizer that relies on a piecewise linear approximation of the source probability density. We show that the proposed quantizer is nearly optimal and converges at an exponential rate to a fixed point close to the Lloyd-Max quantizer levels. Further, we extend the proposed algorithm to a data-driven setting in which the parameters of the piecewise linear representation are learned through batch updates. Through experiments on an Android-based edge device, we show that the developed quantizer outperforms the well-known k-means algorithm in terms of energy efficiency, runtime, and memory utilization. Next, we consider specific application-oriented system constraints and the signal representations useful in such cases. Of particular interest is the overprediction constraint that arises in network capacity planning problems. We develop two solutions: the first based on quantizer design and the second based on signal approximation. The overpredictive quantizer design hinges on stochastic approximation-based updates that provide an online algorithm for finding the quantizer levels. The designed quantizer generates a quantized signal that is always greater than or equal to the actual signal. The proposed schemes are verified and compared using an available TV whitespace dataset. The second approach to implementing the system-level overprediction constraint is through signal approximation, which we describe in the context of FL. The final part of this thesis deals with algorithms for overpredictive signal analytics in a client-server architecture motivated by FL.
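To make the quantizer-design discussion concrete, the sketch below implements the classical Lloyd-Max iteration on a discretized density. The approximate variant proposed in the thesis would replace the exact density with a learned piecewise-linear fit; the function name, signature, and discretization here are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def lloyd_max(pdf, grid, n_levels, n_iter=100):
    """Classical Lloyd-Max iteration on a discretized density.

    pdf: density values sampled on `grid` (normalization is irrelevant,
    since centroids are ratios of weighted sums). The thesis's approximate
    variant would replace `pdf` with a piecewise-linear fit, which admits
    closed-form centroid updates per linear segment.
    """
    # initialize representation levels uniformly over the support
    levels = np.linspace(grid[0], grid[-1], n_levels)
    for _ in range(n_iter):
        # decision boundaries: midpoints between adjacent levels
        bounds = (levels[:-1] + levels[1:]) / 2
        idx = np.digitize(grid, bounds)  # assign each grid point to a cell
        # centroid update: conditional mean of each quantization cell
        for k in range(n_levels):
            mask = idx == k
            w = pdf[mask]
            if w.sum() > 0:
                levels[k] = np.sum(grid[mask] * w) / w.sum()
    return levels
```

For a uniform density on [0, 1] with two levels, the iteration converges to levels near 0.25 and 0.75, the known optimum for that case.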
We propose algorithms to find signal representations that satisfy the overprediction constraint using a suitable basis representation (the Fourier basis in our application). Such overprediction constraints are typical in emerging smart grid applications where a central server monitors household electricity consumption. The signal representations computed at the edge devices (or consumer sites) help the central server draw insights from signal analytics, including time-series demand patterns and other signal statistics. We evaluate the tradeoffs between communication cost, computation cost, and mean squared error performance through experimental studies on an off-the-shelf residential energy consumption dataset.
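As one illustration of an overpredictive Fourier-basis representation, the sketch below keeps the largest-magnitude DFT coefficients of a signal and then lifts the reconstruction by a constant so that it never falls below the original. This constant-offset construction is an assumption chosen for simplicity; the thesis's algorithm may enforce the constraint differently.

```python
import numpy as np

def overpredictive_fourier(x, n_coeffs):
    """Truncated Fourier approximation lifted to satisfy x_hat >= x.

    Keeps the `n_coeffs` largest-magnitude real-FFT coefficients, then
    adds a constant offset equal to the worst-case underprediction.
    Illustrative construction only, not necessarily the thesis's method.
    """
    X = np.fft.rfft(x)
    # zero out all but the largest-magnitude coefficients (sparse code
    # sent to the server: kept indices and their complex values)
    keep = np.argsort(np.abs(X))[-n_coeffs:]
    X_trunc = np.zeros_like(X)
    X_trunc[keep] = X[keep]
    x_hat = np.fft.irfft(X_trunc, n=len(x))
    # lift by the worst-case underprediction to guarantee x_hat >= x
    offset = np.max(x - x_hat)
    return x_hat + offset
```

Here `n_coeffs` controls the communication cost (number of coefficients transmitted), while the lifting offset trades mean squared error for a guaranteed overprediction, mirroring the tradeoffs studied experimentally.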
