Post-Selection Estimation Theory

Post-selection estimation refers to scenarios in which a preliminary data-based selection stage determines the specific estimation problem. In this work, we study two fundamental problems within this framework: estimation after model selection and estimation after parameter selection. In estimation after model selection, the observation model is unknown; therefore, prior to estimation, a model selection procedure is used to choose a model from a set of candidate models, and then the parameters of the selected model are estimated. In estimation after parameter selection, the observation model is known. In this case, the selection refers to choosing the “parameters of interest” based on the data, while the remaining unknown parameters are treated as nuisance parameters. In both problems, the selection stage impacts the subsequent estimation, for example, by introducing a selection bias. This research establishes a post-selection estimation theory for these two cases, including estimation methods, appropriate notions of unbiasedness, and performance bounds.

The contributions of this work are as follows. First, for the estimation-after-parameter-selection problem, we consider the post-selection mean-squared error (PSMSE) as an appropriate performance measure that takes the selection procedure into account. We introduce the corresponding unbiasedness criterion, namely unbiasedness in the Lehmann sense with respect to the PSMSE cost function. The post-selection maximum likelihood (PSML) estimator is presented and shown to reduce both the bias in the Lehmann sense and the PSMSE compared with conventional estimators. Since the PSML estimator often lacks an analytical form and has high computational complexity, we develop new low-complexity post-selection estimation methods for the estimation-after-parameter-selection architecture.
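For concreteness, the central quantities can be sketched as follows; the notation below is an illustrative reconstruction under assumed conventions (observation vector $x$, selection rule $\Psi$, selected index $m$, likelihood $f(x;\theta)$), not a quotation from the thesis. The PSMSE conditions the squared error on the selection event, and the PSML estimator maximizes the likelihood conditioned on that event:

```latex
% PSMSE: squared error of the selected parameter, conditioned on the
% event that the selection rule Psi chose index m (illustrative notation)
\mathrm{PSMSE}(\hat{\theta}) \;=\;
  \mathbb{E}\!\left[ \bigl(\hat{\theta}_m - \theta_m\bigr)^2 \,\middle|\, \Psi = m \right]

% PSML: maximize the post-selection (conditional) log-likelihood, i.e.,
% the log-likelihood penalized by the log-probability of the selection event
\hat{\theta}^{\mathrm{PSML}} \;=\;
  \arg\max_{\theta}\,
  \bigl\{ \log f(x;\theta) \;-\; \log \Pr(\Psi = m;\theta) \bigr\}
```

The penalty term $\log \Pr(\Psi = m;\theta)$ is what distinguishes the PSML estimator from the conventional ML estimator and is the source of its bias reduction after selection.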
In addition, we present an appropriate Cramér-Rao bound (CRB) for this problem, the $\Psi$-CRB, and develop a new algorithm for efficient empirical computation of an approximation of this bound. We generalize this model to the unidentifiable-model scenario, in which not all the parameters can be estimated and a selection stage that aims to identify the significant parameters is conducted prior to estimation. We present the coherent PSML estimator as an appropriate estimator for this problem and provide a practical algorithm for its implementation. We show that the presented estimator outperforms other common solutions for this problem.

Second, we consider the estimation-after-model-selection problem. Estimation after model selection is closely related to the concept of estimation under model misspecification. While the literature on estimation under misspecified models offers a framework for addressing model misspecification, it does not account for the selection process that led to the misspecified model. In post-model-selection estimation, there are several candidate models, each with the potential to be selected; consequently, the interpretation of the assumed model in this case is not straightforward. We present three different interpretations that address the non-Bayesian post-model-selection estimation problem as an estimation problem under model misspecification. Each of these interpretations induces a misspecified maximum likelihood estimator and a novel corresponding misspecified CRB. Finally, we consider a post-model-selection Bayesian parameter estimation approach for a random vector with an unknown deterministic support set, where this support set represents the model. We present different estimators and performance bounds. In particular, we develop the selective Bayesian CRB (BCRB) and the selective tighter BCRB, which are lower bounds on the mean-squared error (MSE) of any coherent estimator.
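As a hedged illustration of the bound's structure (notation assumed, as above), the $\Psi$-CRB weights per-selection-event bounds by the selection probabilities, using the Fisher information of the conditional (post-selection) likelihood:

```latex
% Post-selection Fisher information for the event {Psi = m}, computed
% from the conditional likelihood f(x | Psi = m; theta) (scalar case shown)
J_m(\theta) \;=\;
  \mathbb{E}\!\left[
    \left( \frac{\partial \log f(x \mid \Psi = m;\theta)}{\partial \theta} \right)^{\!2}
    \,\middle|\, \Psi = m
  \right]

% Psi-CRB: selection-probability-weighted sum of the inverse per-event
% Fisher informations; a lower bound on the PSMSE of Psi-unbiased estimators
\Psi\text{-}\mathrm{CRB}(\theta) \;=\;
  \sum_{m} \Pr(\Psi = m;\theta)\, J_m^{-1}(\theta)
```

Because $\Pr(\Psi = m;\theta)$ and the conditional likelihood rarely admit closed forms, the bound typically requires the empirical approximation algorithm mentioned above.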

File Type: pdf
File Size: 2 MB
Publication Year: 2024
Author: Harel, Nadav
Supervisors: Routtenberg, Tirza
Institution: Ben-Gurion University of the Negev
Keywords: Bayesian framework, Cramér-Rao bound, Lehmann-unbiasedness, lower bounds, mean-squared-error, model misspecification, model-selection, non-Bayesian framework, parameter estimation, parameter selection, selective inference