Unsupervised Domain Adaptation with Private Data

The recent success of deep learning is conditioned on the availability of large annotated datasets for supervised learning. Data annotation, however, is a laborious and time-consuming task. When a model fully trained on an annotated source domain is applied to a target domain with a different data distribution, generalization performance degrades substantially due to domain shift. Unsupervised Domain Adaptation (UDA) aims to mitigate the impact of domain shift when the target domain is unannotated. The majority of UDA algorithms assume joint access to source and target data, which may violate data privacy restrictions in many real-world applications. In this thesis I propose source-free UDA approaches suited to scenarios where source and target data are only accessible sequentially. I show that across several application domains, successful adaptation does not require retaining the full source dataset: it suffices to maintain a low-memory approximation of the source embedding distribution. Domain shift is then mitigated by minimizing an appropriate distributional distance between target embeddings and this approximation. First, I validate this idea on adaptation tasks in street image segmentation. I then show that improving the approximation of the source embeddings leads to superior performance when adapting medical image segmentation models. I extend this idea to multi-source adaptation, where several source domains are present and data transfer between pairs of domains is prohibited. Finally, I show that relaxing the data privacy constraint allows domain shift to be mitigated in fair classification.
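The distributional distance referenced above can be instantiated as the sliced Wasserstein distance listed in the keywords. As a rough illustration (not the thesis implementation; the function name, projection count, and the assumption that both sample sets are the same size are all choices made here for brevity), a Monte Carlo estimator between two sets of embeddings can be sketched in NumPy:

```python
import numpy as np

def sliced_wasserstein_distance(x, y, n_projections=128, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance between
    two equally sized sample sets x, y of shape (n_samples, dim)."""
    rng = np.random.default_rng(seed)
    # Draw random unit directions on the sphere.
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto each direction: shape (n_samples, n_projections).
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    # The 1-D Wasserstein-2 distance between empirical distributions is the
    # RMS difference of sorted projections; average over all slices.
    return np.sqrt(np.mean((px - py) ** 2))
```

In a source-free setting, `y` would be samples drawn from the stored low-memory approximation of the source embedding distribution rather than from the source data itself, and the target encoder would be trained to minimize this distance.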

File Type: pdf
File Size: 6 MB
Publication Year: 2023
Author: Stan Serban
Supervisor: Mohammad Rostami
Institution: University of Southern California
Keywords: domain adaptation, unsupervised domain adaptation, sequential domain adaptation, computer vision, medical image segmentation, sliced Wasserstein distance