Inferential and Clustering Methods for Functional Data

Author: Andrea Ghiglietti

Position: Postdoctoral Researcher

Affiliation: Department of Mathematics, Università degli Studi di Milano.

E-mail address: andrea.ghiglietti@unimi.it

Over the last two decades, statistical research on high-dimensional data has attracted growing interest. The main reason for this popularity is the wide range of applications in biology, medicine, meteorology and finance, among others. A feature common to all these disciplines is that the observed data are points sampled from functions generated by a continuous-time stochastic process taking values in a suitable infinite-dimensional Hilbert space. Functional Data Analysis (FDA) is the area of statistics that gathers the models and tools required to deal with this kind of data. Compared with classical multivariate analysis, a functional datum can be seen as an object for which the number of features observed on each statistical unit is much larger than the sample size. Classical methodologies in FDA are concerned with the mean function and the covariance kernel of the process generating the data.
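As a rough illustration of the two classical objects just mentioned, the sketch below estimates the mean function and the covariance kernel from curves observed on a common grid. It is a minimal example in Python with NumPy; the function name `mean_and_covariance` and the synthetic sine curves are assumptions made for illustration only, not data or code from this research.

```python
import numpy as np

def mean_and_covariance(curves):
    """Pointwise sample mean and sample covariance kernel of discretized curves.

    curves: array of shape (n_curves, n_grid_points), one curve per row.
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centered = curves - mean
    # cov[s, t] estimates Cov(X(s), X(t)) at the grid points s and t
    cov = centered.T @ centered / (curves.shape[0] - 1)
    return mean, cov

# Synthetic example: 20 noisy sine curves observed at 50 grid points,
# so the number of features per unit exceeds the sample size.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
curves = np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((20, 50))
mean, cov = mean_and_covariance(curves)
```

With 20 curves and 50 grid points the estimated covariance matrix is necessarily rank deficient, which is one reason why multivariate tools that invert the covariance cannot be carried over directly.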

[Figure: functional_plots]

My recent research activity concerns inference and clustering methods for functional data. In particular, we have considered the problems of (i) testing the equality of the means of two or more functional populations and (ii) clustering samples of curves generated by different stochastic processes. To this end, we have proposed a new distance in L2 that generalizes the well-known Mahalanobis distance to infinite-dimensional spaces. The performance of the new inferential tools and clustering methods based on this distance is studied both analytically and through simulation. Moreover, we are applying the proposed methods in a clinical context, to a dataset of electrocardiographic signals (ECGs). Here the statistical unit is a multivariate function that describes the heart dynamics of each subject on eight different leads. The purpose of the analysis is to use the proposed inferential and clustering procedures to detect significant differences between the sample of healthy subjects and the samples of subjects affected by Right and Left Bundle Branch Block, two different Acute Coronary Syndromes.
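To give a concrete feel for what a Mahalanobis-type distance between curves can look like, here is a minimal sketch in Python with NumPy. It is not necessarily the distance proposed in this research: the weighting 1/(lambda_k + 1/p), the tuning parameter `p`, the function name `regularized_mahalanobis` and the synthetic data are all assumptions chosen for illustration. The idea is that the classical Mahalanobis distance divides each squared principal-component score by its eigenvalue, which breaks down when infinitely many eigenvalues tend to zero; adding 1/p to each eigenvalue keeps the sum finite, and letting p grow moves the weights toward the classical ones.

```python
import numpy as np

def regularized_mahalanobis(y, w, cov, grid_step, p):
    """Mahalanobis-type distance between two curves on a common, equispaced grid.

    Each squared score <y - w, phi_k>^2 is divided by (lambda_k + 1/p),
    so directions with near-zero eigenvalues cannot blow up the sum.
    This weighting is an illustrative assumption.
    """
    # Eigenpairs of the discretized covariance operator (kernel times grid step)
    eigvals, eigvecs = np.linalg.eigh(cov * grid_step)
    eigvals = np.clip(eigvals, 0.0, None)   # remove tiny negative round-off
    phi = eigvecs / np.sqrt(grid_step)      # eigenfunctions with unit L2 norm
    diff = np.asarray(y, dtype=float) - np.asarray(w, dtype=float)
    scores = phi.T @ diff * grid_step       # approximates <y - w, phi_k> in L2
    return float(np.sqrt(np.sum(scores**2 / (eigvals + 1.0 / p))))

# Synthetic usage example (illustration only)
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 30)
h = grid[1] - grid[0]
sample = np.sin(2 * np.pi * grid) + 0.2 * rng.standard_normal((15, 30))
cov = np.cov(sample, rowvar=False)          # 30 x 30 sample covariance kernel
d = regularized_mahalanobis(sample[0], sample[1], cov, h, p=100.0)
```

A distance of this kind can be plugged into any distance-based clustering routine, or used to measure the gap between two sample mean curves in a two-sample comparison.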

This research is carried out in collaboration with Prof. Anna Maria Paganoni, Politecnico di Milano, and Dr. Francesca Ieva, Università degli Studi di Milano.

 

 
