Robust multivariate and high-dimensional statistics

Key Publications

2021 - Taras Bodnar, Yarema Okhrin, Nestor Parolya - Journal of Business & Economic Statistics

Optimal Shrinkage-Based Portfolio Selection in High Dimensions

In this article, we estimate the mean-variance portfolio in the high-dimensional case using recent results from the theory of random matrices. We construct a linear shrinkage estimator which is distribution-free and is optimal in the sense of maximizing, with probability 1, the asymptotic out-of-sample expected utility, that is, the mean-variance objective function, for different values of the risk-aversion coefficient; in particular, this leads to the maximization of the out-of-sample expected utility and to the minimization of the out-of-sample variance. One of the main features of our estimator is the inclusion of the estimation risk related to the sample mean vector into the high-dimensional portfolio optimization. The asymptotic properties of the new estimator are investigated when the number of assets p and the sample size n tend simultaneously to infinity such that p/n → c ∈ (0, +∞). The results are obtained under weak assumptions on the distribution of the asset returns; only the existence of moments of order 4+ε is required. Thereafter we perform numerical and empirical studies in which the small- and large-sample behavior of the derived estimator is investigated. The suggested estimator shows significant improvements over existing approaches, including the nonlinear shrinkage estimator and the three-fund portfolio rule, especially when the portfolio dimension is larger than the sample size. Moreover, it is robust to deviations from normality.
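To illustrate the general idea of linear shrinkage in high-dimensional mean-variance portfolio selection, the sketch below shrinks the sample covariance matrix toward a scaled identity target before forming the classical unconstrained mean-variance weights. This is a simplified stand-in, not the paper's estimator: the function name, the identity target, and the fixed shrinkage intensity `shrink` are assumptions for illustration, whereas the paper derives an asymptotically optimal, data-driven shrinkage as p/n → c.

```python
import numpy as np

def shrinkage_mv_weights(returns, risk_aversion=3.0, shrink=0.5):
    """Illustrative linear-shrinkage mean-variance weights (not the
    paper's estimator).

    Shrinks the sample covariance toward a scaled identity target,
    which keeps the matrix invertible even when the number of assets p
    exceeds the sample size n; `shrink` is a fixed intensity here.
    """
    n, p = returns.shape
    mu = returns.mean(axis=0)                     # sample mean vector
    S = np.cov(returns, rowvar=False)             # p x p sample covariance
    target = np.trace(S) / p * np.eye(p)          # scaled identity target
    Sigma = (1 - shrink) * S + shrink * target    # linear shrinkage
    # unconstrained mean-variance weights: Sigma^{-1} mu / gamma
    return np.linalg.solve(Sigma, mu) / risk_aversion
```

Because the shrunk matrix is a convex combination of a positive semidefinite sample covariance and a positive multiple of the identity, it is positive definite, so the solve succeeds even in the p > n regime the abstract emphasizes.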

2017 - Gwenaël G. R. Leday, Mathisca C. M. de Gunst, Gino B. Kpogbezan, Aad W. van der Vaart, Wessel N. van Wieringen, Mark A. van de Wiel - The Annals of Applied Statistics

Gene network reconstruction using global-local shrinkage priors

Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate easily exceeds the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighborhood of each node or gene. However, estimating the many regularization parameters is often difficult and can result in large statistical uncertainty. In this paper we propose to combine local regularization with global shrinkage of the regularization parameters in order to borrow strength between genes and improve inference. We employ a simple Bayesian model with nonsparse, conjugate priors to facilitate the use of fast variational approximations to the posteriors. We discuss empirical Bayes estimation of the hyperparameters of the priors, and propose a novel approach to rank-based posterior thresholding. Using extensive model- and data-based simulations, we demonstrate that the proposed inference strategy outperforms popular (sparse) methods, yields more stable edges, and is more reproducible. The proposed method, termed ShrinkNet, is then applied to glioblastoma data to investigate interactions between genes associated with patient survival.
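The node-wise regularization idea can be sketched as follows: each gene is regressed on all the others with a ridge penalty, and a single penalty shared across all regressions mimics, in a crude way, the borrowing of strength between genes. This is an assumed simplification, not ShrinkNet itself — the paper instead places nonsparse conjugate priors on the local penalties and fits them with variational and empirical Bayes; the function name and the symmetrized absolute-coefficient edge score are illustrative choices.

```python
import numpy as np

def nodewise_ridge_network(X, lam=1.0):
    """Illustrative node-wise ridge sketch (not ShrinkNet).

    Regresses each of the p genes (columns of the n x p matrix X) on
    the remaining p-1 genes with a shared ridge penalty `lam`, then
    symmetrizes the absolute coefficients into an edge-score matrix
    that could be thresholded to obtain a network.
    """
    n, p = X.shape
    B = np.zeros((p, p))
    for j in range(p):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        # ridge solution: (Z'Z + lam I)^{-1} Z'y
        beta = np.linalg.solve(Z.T @ Z + lam * np.eye(p - 1), Z.T @ y)
        B[j, np.arange(p) != j] = beta
    return (np.abs(B) + np.abs(B.T)) / 2  # symmetric edge scores, zero diagonal
```

Sharing one penalty is the simplest form of global shrinkage; the paper's contribution is precisely to estimate such global hyperparameters from the data rather than fixing them, which stabilizes the many local fits.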