Estimation and simulation of fractional Ornstein-Uhlenbeck processes using deep neural networks

Parameter estimation of Ornstein-Uhlenbeck processes has attracted a great deal of interest in finance because of the model's flexibility (see e.g. [1]). We therefore investigated predicting the unknown parameters with neural networks, in the hope of obtaining estimators more accurate than those produced by classical statistical methods, without any assumptions on the Hurst exponent.

Owing to their many parameter configurations, Ornstein-Uhlenbeck processes have several use cases in mathematical finance: for example, a model for the time-dependent correlation between two stock prices can be obtained by applying a suitable transform to the stochastic differential equation describing the Ornstein-Uhlenbeck process. Generalising the structure of the driving noise to fractional noise makes it possible to adjust how much information from the past remains encoded in the noise process at any given time. This generalisation leads to fractional Ornstein-Uhlenbeck processes, whose machine learning-based parameter estimation is under investigation by the ELTE AI Research Group.
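To make the dynamics concrete, here is a minimal sketch of simulating such a process. It is not the group's generator described below, but a plain Euler-Maruyama discretisation of dX_t = -theta*X_t dt + sigma*dB^H_t, with the fractional Gaussian noise increments drawn exactly via a Cholesky factorisation of their covariance; all function and parameter names are illustrative, and the O(n^2) Cholesky approach is only practical for short paths.

```python
import numpy as np

def fgn_cholesky(n, hurst, dt, rng):
    """Exact fractional Gaussian noise increments via Cholesky factorisation.

    Uses the fGn autocovariance gamma(k) = 0.5*(|k+1|^2H + |k-1|^2H - 2|k|^2H),
    scaled by dt^(2H) for a grid of step dt. O(n^2) memory, so short paths only.
    """
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst)
                   - 2.0 * k ** (2 * hurst))
    cov = gamma[np.abs(k[:, None] - k[None, :])] * dt ** (2 * hurst)
    L = np.linalg.cholesky(cov)
    return L @ rng.standard_normal(n)

def fou_euler(n, hurst, theta, sigma, dt, rng, x0=0.0):
    """Euler-Maruyama scheme for dX = -theta*X dt + sigma*dB^H, X_0 = x0."""
    db = fgn_cholesky(n, hurst, dt, rng)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = x[i] - theta * x[i] * dt + sigma * db[i]
    return x
```

Paths produced this way (with varying theta, sigma and Hurst parameter) are exactly the kind of labelled training data a parameter-estimating network needs.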

We aimed to build consistent and accurate estimators by developing several neural networks that predict various parameters of processes obtainable as transformed fractional Ornstein-Uhlenbeck processes. Some network structures arise naturally, e.g. LSTM models: the information about past behaviour, i.e. the natural filtration, can be interpreted as the inner memory of the LSTM, while long- and short-term dependencies can be specified according to the corresponding autocorrelation function. For the parameter estimation of fractional Wiener processes, the following figure compares the mean squared error of an LSTM network, which is robust to the choice of time scale over which the fractional Wiener process was generated, with that of the Higuchi estimator, for sequences of length 1500 over the interval [0,1].
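For reference, the Higuchi estimator used as the classical baseline above can be sketched as follows. This is a generic textbook-style implementation, not the group's exact code: the Higuchi method estimates the fractal dimension D from the scaling of reconstructed curve lengths L(k) ~ k^(-D), and for a fractional Wiener path H = 2 - D.

```python
import numpy as np

def higuchi_hurst(x, kmax=10):
    """Estimate the Hurst exponent of a path via the Higuchi method.

    For each lag k, averages the normalised curve length over the k possible
    starting offsets; the slope of log L(k) vs log k is -D, and H = 2 - D.
    """
    n = len(x)
    ks = np.arange(1, kmax + 1)
    lk = []
    for k in ks:
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)          # subsampled path x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((len(idx) - 1) * k)  # length-normalisation factor
            lengths.append(diff * norm / k)
        lk.append(np.mean(lengths))
    slope = np.polyfit(np.log(ks), np.log(lk), 1)[0]  # slope ~ -D
    return 2.0 + slope                                 # H = 2 - D
```

On a standard Brownian path (H = 0.5) the estimate should land near 0.5, which is the kind of baseline behaviour the LSTM comparison in the figure is measured against.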

Mean squared error of the parameter estimation of fractional Wiener processes

An efficient data generator for each analysed process becomes crucial here: training large, complex neural network structures requires a huge amount of data to reach good performance. Consider fractional Ornstein-Uhlenbeck processes with zero initial value; the processes can then be written as fractional Wiener integrals via the corresponding integrated differential equations, which reduces the whole task to generating discretised fractional Wiener integrals. This assumption entails no loss of generality, since extra layers can be added to the network to handle the general case. Many discrete fractional Wiener process simulators have been published, e.g. Hosking, Cholesky and Davies-Harte (see e.g. [2]), but they perform very slowly for Hurst parameters near the endpoints of the interval [0,1]. Since the subinterval [0.01,0.03] is one of the most important in mathematical finance, we had to develop a new method that improves this behaviour near the endpoints. We generalised the well-known idea behind the Kroese fractional Wiener noise generator, which uses the Fast Fourier transform, to obtain a generator for the increments of an arbitrarily initialised isonormal process. The only assumption needed on the driving noise is that it is a Gaussian process with an isomorphism between its covariance function and the inner product of the separable Hilbert space on which the isonormal process is defined. This lets the covariance structure be handled very efficiently, and the circulant embedding of the inner product structure can be implemented with a very small memory footprint.
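The FFT/circulant-embedding idea underlying such generators can be sketched in a few lines. This is the generic Davies-Harte-style construction for fractional Gaussian noise on a unit-step grid, not the group's generalised isonormal-process method: the fGn covariance row is embedded into a symmetric circulant vector, whose FFT gives the eigenvalues needed to colour white noise in O(n log n) time. The eigenvalue clipping is a simplification for this sketch.

```python
import numpy as np

def fgn_davies_harte(n, hurst, rng):
    """Fractional Gaussian noise on a unit-step grid via circulant embedding + FFT."""
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst)
                   - 2.0 * k ** (2 * hurst))
    # Embed the covariance row into a symmetric circulant vector of length 2n.
    row = np.concatenate([gamma, [0.0], gamma[1:][::-1]])
    eig = np.fft.fft(row).real
    eig = np.maximum(eig, 0.0)  # guard against tiny negative eigenvalues
    # Colour complex white noise by the square-root spectrum; keep the real part.
    z = rng.standard_normal(2 * n) + 1j * rng.standard_normal(2 * n)
    return np.fft.fft(z * np.sqrt(eig / (2 * n))).real[:n]
```

A fractional Wiener path on [0,1] then follows by cumulatively summing the increments and rescaling by (1/n)^H; the classical Hosking and Cholesky schemes produce the same law but cost O(n^2) or worse per path.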
A further boost to the generator comes from the cacheability of the method: once the inner product structure of the considered isonormal process has been obtained, a certain transform and embedding of that structure can be cached in memory, and this cached part accounts for about 60% of the computation needed for one sequence. So to generate a discretised fractional Ornstein-Uhlenbeck process, only the covariance structure of the fractional Wiener process and the corresponding discretised integrand need to be supplied, and the process is produced in a very short time compared to the former non-FFT methods, for all Hurst parameters in the interval [0,1]. On the following plot the “Improvement” axis shows how many times faster our generalised method is compared to the Kroese FFT-based method, which itself is much faster than the classical methods such as Davies-Harte and Cholesky, especially near the endpoints of the unit interval, for different numbers of generated sequences with Hurst parameters in the interval [0.01,0.03].
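The caching idea can be illustrated on the simple circulant-embedding generator above (names and structure are illustrative, not the group's implementation): the transform-and-embedding step, i.e. the square-root spectrum of the circulant covariance, depends only on the grid size and the Hurst parameter, so it can be computed once and reused, leaving only a Gaussian draw and a single FFT per sequence.

```python
import numpy as np

def make_fgn_sampler(n, hurst):
    """Precompute the circulant spectrum once; each sample then costs one FFT."""
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst)
                   - 2.0 * k ** (2 * hurst))
    row = np.concatenate([gamma, [0.0], gamma[1:][::-1]])
    # Cached part: this is the expensive transform-and-embedding step.
    scale = np.sqrt(np.maximum(np.fft.fft(row).real, 0.0) / (2 * n))

    def sample(rng):
        # Per-sequence cost: one Gaussian draw and one FFT.
        z = rng.standard_normal(2 * n) + 1j * rng.standard_normal(2 * n)
        return np.fft.fft(scale * z).real[:n]

    return sample
```

When thousands of training sequences are drawn for the same (n, H) configuration, amortising the spectrum computation this way is exactly the kind of saving the 60% figure above refers to.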

How many times our generalised method is faster than the best published one

This research is conducted in the AI Research Group at the Institute of Mathematics, Eötvös Loránd University.

For additional information please contact:


[1] Tim Leung and Xin Li: Optimal Mean-Reversion Trading: Mathematical Analysis and Practical Applications. World Scientific Publishing Co. 2016. ISBN 978-9814725910.

[2] Peter F. Craigmile: Simulating a class of stationary Gaussian processes using the Davies-Harte algorithm, with application to long memory processes. Journal of Time Series Analysis 2003, 24(5). DOI: 10.1111/1467-9892.00318.
