Partial differential equations, statistics and data in an interdisciplinary approach to data-based modelling
By Mats Ehrnström
NTNU Norwegian University of Science and Technology
IMod is a 2022–2028 interdisciplinary project for building, analyzing and testing frameworks for data-based modelling built on partial differential equations and statistical modelling. It includes work packages and applications in surface fluid mechanics and neuroscience. Its primary objective is to develop a mathematical-statistical framework for data-driven models of complex systems, guided by problems in applications.
The project involves senior investigators from four different fields, as well as junior positions, additional affiliated junior positions, and external partners from the Universities of Oxford, Notre Dame, Basel, Delaware and Edinburgh, and the Norwegian Computing Center. The principal investigators are Mats Ehrnström, Helge Holden and Espen Robstad Jakobsen in partial differential equations; Ingelin Steinsland and Geir-Arne Fuglstad in statistics; Benjamin Dunn in computational neuroscience; and Simen Ådnøy Ellingsen in fluid mechanics; all at the NTNU Norwegian University of Science and Technology.
The larger background for IMod is the successful development of statistics in the 20th century, the ever-increasing amounts of freely available data, and the more recent universal learning theories that have provided scientists with a sophisticated approach to data-driven modelling: the fitting of general models to data. For some time, attempts at decoding data in terms of explainable and controllable models have been emerging. Physics-informed neural networks, data-driven partial differential equations, and the inclusion of advanced spatial operators in statistical modelling serve as examples of how bottom-up models are supplemented by explainable systems stemming from differential equations and physical laws.
The group aims to integrate methods from partial differential equations, stochastic processes and statistics, tested and applied both in a setting where laws and equations are known (fluid mechanics) and in one where they are emerging (neuroscience).
The research front in differential equations currently lies at the intersection of nonlocal nonlinear processes [8, 9] and an increasingly advanced integration of stochastic processes. We can handle complex interactions in physical and frequency space; we can include noise in some of these equations – but a connection to data is often lacking. In statistics, methods for describing and estimating complex dependence structures have revolutionised the field, and calibration and uncertainty quantification need to be taken into account [13, 14]. The introduction of mathematical boundary-value problems has enhanced calculations with parameters that can be interpreted physically as well as in the sense of statistical covariances. But even in the case of very well understood dynamical systems, a desirable integration of stochastic processes and the nonlinear framework of partial differential equations remains unrealised.
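The boundary-value connection mentioned here can be made concrete. In the stochastic partial differential equation approach of Lindgren, Lindström and Rue [15], a Gaussian random field with Matérn covariance arises as the stationary solution of a fractional elliptic equation driven by spatial white noise; the parameter κ governs the correlation range, ν the smoothness, and K_ν is a modified Bessel function of the second kind:

```latex
% Stationary solutions of the fractional elliptic SPDE are Gaussian
% fields with Matern covariance, with smoothness nu = alpha - d/2.
\[
  (\kappa^2 - \Delta)^{\alpha/2}\, x(u) = \mathcal{W}(u),
  \qquad u \in \mathbb{R}^d,
\]
\[
  \operatorname{Cov}\bigl(x(u),\,x(v)\bigr)
  = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}
    \bigl(\kappa\,\lVert u-v\rVert\bigr)^{\nu}
    K_{\nu}\bigl(\kappa\,\lVert u-v\rVert\bigr),
  \qquad \nu = \alpha - \tfrac{d}{2}.
\]
```

It is this link that lets the differential operator's parameters be read both physically (range, anisotropy) and statistically (covariance structure).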
In the fields of fluid mechanics and neuroscience one finds important examples of processes that can be measured and in which equations and models play leading roles. The local motion of fluid particles and the signalling of neurons can to some extent be described. Yet their complex interactions cause challenges, as rules at smaller scales give rise to new order at larger scales. Full-scale governing equations are quite involved, and in some instances known equations and rules are lacking. We can sample the brain and the sea with a stunning precision of hundreds to thousands of measurements per second, but the sampling is still sparse relative to the whole system, and the data and models do not always fit. In the words of a parable, the goal of the project is to look at the sparse information at the 'surface' of the brain or sea given by measurement points, and to decipher the interior flow and causality in simple but relevant concepts.
Goals and means
The primary objective of IMod is to develop novel mathematical-statistical frameworks for data-driven models of complex systems guided by problems in fluid mechanics and neuroscience. The project aims to combine, create, study and unite. By combining partial differential equations and statistical theory one wishes to develop general models with uncertainty for capturing interactions in complex systems. By creating effective and fast methods one can identify physical parameters from sparsely observed phenomena. By studying the systems from a mathematical and physical viewpoint one can gain insight into their dynamics. By uniting theory and data one develops new models in fluid mechanics and neuroscience aided by tailor-made experiments.
There are three main work packages: Building from data, Mathematical-statistical models in neuroscience, and Surface fluid dynamics as a key to sparse modelling.
In the first, we center on building partial differential equation models from generic data. Time series u(t) are captured and described through evolution equations, with both linear and nonlinear, local or nonlocal operators. The signal is given, and the parameterized operators are sought. This package also includes couplings between stochastic partial differential equations and Gaussian random fields through stochastic, fractional and elliptic partial differential equations, which we aim to extend to the time-dependent setting. We further deal with stochastic versions of equations such as the Camassa–Holm equation, and with statistically averaged mean-field games as models of many agents.
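A minimal sketch of the "signal is given, operators are sought" idea, in the spirit of the data-driven PDE literature cited above (not the project's actual method): given snapshots of u(t, x), regress the time derivative on a library of candidate spatial terms and read off which operator generated the data. Here the synthetic data come from the heat equation, so the fit recovers the diffusion coefficient.

```python
import numpy as np

# Synthetic data: heat equation u_t = nu * u_xx on a periodic grid,
# integrated by explicit Euler (dt * nu / dx^2 ~ 0.2, stable).
nu = 0.5
nx, nt = 64, 400
dx, dt = 1.0 / nx, 1e-4
x = np.arange(nx) * dx
u = np.exp(-100 * (x - 0.5) ** 2)          # initial bump
snapshots = [u.copy()]
for _ in range(nt):
    u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    u = u + dt * nu * u_xx
    snapshots.append(u.copy())
U = np.array(snapshots)                     # shape (nt + 1, nx)

# Time derivative and a small library of candidate terms.
Ut = (U[1:] - U[:-1]) / dt                  # matches the generator's stencil
Um = U[:-1]
Ux = (np.roll(Um, -1, axis=1) - np.roll(Um, 1, axis=1)) / (2 * dx)
Uxx = (np.roll(Um, -1, axis=1) - 2 * Um + np.roll(Um, 1, axis=1)) / dx**2
library = np.stack([Um, Ux, Uxx, Um * Ux], axis=-1).reshape(-1, 4)

# Least-squares fit: u_t ~ c0*u + c1*u_x + c2*u_xx + c3*u*u_x.
coef, *_ = np.linalg.lstsq(library, Ut.ravel(), rcond=None)
print(coef)   # ≈ [0, 0, 0.5, 0]: the u_xx coefficient recovers nu
```

Real data of course lack an exactly matching stencil, which is where the statistical side of the project – noise models and uncertainty on the recovered coefficients – enters.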
In the second package, we develop both existing frameworks and novel data-driven models for forward and top-down temporal–spatial patterns and dynamics in neuroscience. Among existing mathematical models are the Hodgkin–Huxley, Fokker–Planck and Wilson–Cowan equations. Other population-agent models, such as mean-field games, are yet to be used in neuroscience. Our aim is to take a step beyond current state-space representations of the brain – which connect features such as neuronal firing rate to head direction or position – towards models that can possibly discover underlying processes and salient features, such as decision making, communication patterns and dynamical fingerprints. Another important question is the stability of neural representations under perturbations by noise.
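The Wilson–Cowan equations named above are a two-variable rate model for coupled excitatory and inhibitory populations; the short sketch below integrates them with explicit Euler. The parameter values are purely illustrative, not taken from the project.

```python
import numpy as np

def sigmoid(x):
    """Logistic response function, values in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def wilson_cowan(steps=20000, dt=0.001,
                 w_ee=16.0, w_ei=12.0, w_ie=15.0, w_ii=3.0,
                 p=1.25, q=0.0, tau_e=1.0, tau_i=1.0):
    """Euler integration of excitatory (E) / inhibitory (I) rates:
         tau_E dE/dt = -E + S(w_ee E - w_ei I + p)
         tau_I dI/dt = -I + S(w_ie E - w_ii I + q)
    """
    E, I = 0.1, 0.05
    traj = np.empty((steps, 2))
    for k in range(steps):
        dE = (-E + sigmoid(w_ee * E - w_ei * I + p)) / tau_e
        dI = (-I + sigmoid(w_ie * E - w_ii * I + q)) / tau_i
        E, I = E + dt * dE, I + dt * dI
        traj[k] = E, I
    return traj

traj = wilson_cowan()
print(traj[-1])   # late-time (E, I) activity, each bounded in [0, 1]
```

Depending on the couplings, such a system settles to a fixed point or oscillates – one reason these equations serve as a testbed for the stability questions raised above.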
The third package concerns mathematical and statistical models of surface water waves and the coupling between surface dynamics and interior motion such as turbulence. While the flow of water is governed by the Euler and Navier–Stokes equations, accounting for turbulence leads to the Reynolds-averaged Navier–Stokes equations, in which the basic concept is a covariance matrix. A main aim is to identify variables in the interaction between turbulence and surface waves that may be measured experimentally and analysed statistically. The package also connects the data-driven equations from the first package to experimental data, comparing data-driven models with known equations such as the Korteweg–de Vries, Whitham-type and other water-wave equations.
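The covariance-matrix viewpoint can be illustrated directly: in the Reynolds-averaged setting, the Reynolds stresses are (up to a density factor) the covariances of the velocity fluctuations u' = u - ⟨u⟩. The sketch below estimates that covariance from synthetic velocity samples; the numbers are invented for illustration.

```python
import numpy as np

# Synthetic "measurements": velocity samples around an illustrative
# mean flow, drawn with a known fluctuation covariance.
rng = np.random.default_rng(1)
n = 100_000
mean_flow = np.array([1.0, 0.0, 0.0])
cov_true = np.array([[0.04, 0.01, 0.0],
                     [0.01, 0.02, 0.0],
                     [0.0,  0.0,  0.01]])
u = rng.multivariate_normal(mean_flow, cov_true, size=n)

u_mean = u.mean(axis=0)          # Reynolds average <u_i>
fluct = u - u_mean               # fluctuations u_i'
stress = fluct.T @ fluct / n     # <u_i' u_j'>: the covariance matrix

print(stress)   # close to cov_true for large n
```

In the project the interesting question runs the other way: which covariance entries can actually be constrained by sparse surface measurements.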
The IMod project is co-funded by the Research Council of Norway and the NTNU Norwegian University of Science and Technology. For more info, see https://www.ntnu.edu/imf/imod.
[1] Mathematics in the 20th Century. M Atiyah. Amer. Math. Monthly 108, 2001.
[2] The big challenges of big data. V Marx. Nature 498, 2013.
[3] Deep learning. Y LeCun, Y Bengio and G Hinton. Nature 521, 2015.
[4] Universal Differential Equations for Scientific Machine Learning. C Rackauckas et al. arXiv:2001.04385.
[5] Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. M Raissi, P Perdikaris and GE Karniadakis. J. Comput. Phys. 378, 2019.
[6] Learning data-driven discretizations for partial differential equations. Y Bar-Sinai, S Hoyer, J Hickey and MP Brenner. Proc. Natl. Acad. Sci. 116, 2019.
[7] Spatial modeling with R-INLA: A review. H Bakka, H Rue, GA Fuglstad et al. WIREs Computational Statistics 10, 2018.
[8] An extension problem related to the fractional Laplacian. L Caffarelli and L Silvestre. Comm. Partial Differential Equations 32, 2007.
[9] Nonlocal and nonlinear diffusions and interactions: new methods and directions. JA Carrillo, M del Pino, A Figalli, G Mingione and JL Vázquez. Springer, 2017.
[10] Stochastic Partial Differential Equations. H Holden, B Øksendal, J Ubøe and T Zhang. Springer, 2010.
[11] On Whitham's conjecture of a highest cusped wave for a nonlocal shallow water wave equation. M Ehrnström and E Wahlén. Ann. Inst. Henri Poincaré C 36, 2019.
[12] The Hunter–Saxton equation with noise. H Holden, K Karlsen and P Pang. J. Differential Equations 270, 2021.
[13] On the statistical formalism of uncertainty quantification. JO Berger and LA Smith. Annu. Rev. Stat. Appl. 6, 2019.
[14] The Design and Analysis of Computer Experiments. TJ Santner, BJ Williams and WI Notz. Springer, 2018.
[15] An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. F Lindgren, J Lindström and H Rue. J. R. Stat. Soc. Ser. B Stat. Methodol. 73, 2011.
[16] Statistical inference for dynamical systems: A review. K McGoff, S Mukherjee and N Pillai. Statistical Surveys 9, 2015.
[17] Recurrent inhibitory circuitry as a mechanism for grid formation. JJ Couey, A Witoelar, SJ Zhang, K Zheng, J Ye, B Dunn, R Czajkowski, MB Moser, EI Moser, Y Roudi and MP Witter. Nature Neuroscience 16, 2013.
[18] Smooth stationary water waves with exponentially localized vorticity. M Ehrnström, S Walsh and C Zeng. J. Eur. Math. Soc., arXiv:1907.07335.
[19] Observation of surface wave patterns modified by sub-surface shear currents. BK Smeltzer, E Æsøy and SÅ Ellingsen. J. Fluid Mech. 873, 2019.
[20] Mathematical foundations of neuroscience. GB Ermentrout and DH Terman. Springer, 2010.
[21] Grid cells require excitatory drive from the hippocampus. T Bonnevie, B Dunn, M Fyhn, T Hafting, D Derdikman, JL Kubie, Y Roudi, EI Moser and MB Moser. Nature Neuroscience 16, 2013.
[22] Spatial models with explanatory variables in the dependence structure. R Ingebrigtsen, F Lindgren and I Steinsland. Spatial Statistics 8, 2014.