Fully automated personal finance tool

LHV Bank problem for the First Estonian Study Group with Industry


LHV is researching options for developing a fully automated personal finance tool that provides an overview of a customer’s spending, helps forecast future spending and, at the same time, acts as a personal finance advisor that makes personalised suggestions to help customers increase their savings. The goal for ESGI is to suggest a theoretical framework for the second part of this project: building the personal finance advisor.

One example of what the development process might look like is as follows. Segment the clients into similar groups based on financial situation, demographic situation and/or personal preferences. Observe the financial behaviour of customers over a sufficiently long period and identify good and bad behaviour patterns, on the basis of which recommendations can be designed. During implementation, a recommendation is triggered when a particular pattern is identified.
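As a rough illustration of the segment-then-trigger idea, the sketch below flags customers whose restaurant spending share is well above the median of their own segment. The segment names, the figures and the 1.5x threshold are all made up for illustration and are not taken from the LHV data or the ESGI results.

```python
from collections import defaultdict
from statistics import median

# Hypothetical input rows: (customer_id, segment, share of monthly
# spending that goes to restaurants). All values are invented.
customers = [
    ("c1", "student", 0.10),
    ("c2", "student", 0.12),
    ("c3", "student", 0.30),
    ("c4", "family", 0.05),
    ("c5", "family", 0.06),
    ("c6", "family", 0.07),
]


def flag_outliers(rows, factor=1.5):
    """Return ids of customers whose share exceeds factor x their segment median."""
    by_segment = defaultdict(list)
    for _, segment, share in rows:
        by_segment[segment].append(share)
    medians = {seg: median(vals) for seg, vals in by_segment.items()}
    # A recommendation would be triggered for each flagged customer.
    return [cid for cid, seg, share in rows if share > factor * medians[seg]]


flagged = flag_outliers(customers)
print(flagged)
```

A data-driven version would replace the fixed threshold with patterns learned from the behaviour of customers who actually increased their savings.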


The data will consist of:

– Pseudonymised customer transactions, previously enriched with descriptive tags (e.g. „restaurant“, „supermarket“, „loan payment“) that categorise the nature of each transaction. Essentially, the transactional dataset consists of identifiers, timestamps, amounts and tags.

– Pseudonymised customer profiles, which provide additional demographic and geographical information and data on the usage of bank products.


The data only concerns private persons. If deemed necessary, the dataset can be expanded to include further data, as long as it is sufficiently anonymised.
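To make the shape of the transactional dataset concrete, here is a minimal sketch of one possible record layout and a monthly aggregation per tag. The field names, the sign convention for amounts and the sample rows are assumptions made for illustration, not the actual LHV schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    customer_id: str     # pseudonymised identifier (hypothetical format)
    timestamp: datetime
    amount: float        # assumed convention: positive means money spent
    tag: str             # descriptive tag, e.g. "restaurant"


def monthly_totals(transactions):
    """Sum amounts per (customer_id, "YYYY-MM", tag)."""
    totals = defaultdict(float)
    for t in transactions:
        key = (t.customer_id, t.timestamp.strftime("%Y-%m"), t.tag)
        totals[key] += t.amount
    return dict(totals)


# Invented sample rows, for illustration only.
sample = [
    Transaction("a1f3", datetime(2019, 2, 3, 12, 30), 14.5, "restaurant"),
    Transaction("a1f3", datetime(2019, 2, 9, 19, 5), 22.0, "restaurant"),
    Transaction("a1f3", datetime(2019, 2, 10, 10, 0), 41.25, "supermarket"),
]
totals = monthly_totals(sample)
print(totals)
```

Aggregates of this kind are a natural input both for the spending overview and for pattern detection.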

The ESGI group was asked to suggest and design a framework for developing a „smart AI“-type advisor that suggests changes in financial behaviour based on each individual customer’s profile and spending history. While the simplest option is to develop rule-based, common-sense suggestions, the task is to find solutions that involve data mining customers’ behaviour and identifying patterns that work well in reality, as well as determining how efficiently they work.


During ESGI151, the Heckit regression method (Heckman’s two-stage lambda method) for panel data was implemented. The applied Heckit model has the form below.

Regression:     ln_balance = const + β0⋅ln_start + β1⋅ln_bVC + β2⋅ln_wVC + ε⋅lambda;

Selection eq.:  balance = const + β3⋅age + β4⋅address + β5⋅gender + β6⋅wVCq + β7⋅bVCq + u;

Here bVC and wVC are the volumes of blacklisted and whitelisted clients, respectively.

The calculations were carried out under the assumptions β1, β2, β6, β7 < 0; β3, β4, β5 > 0; β0 ~ 0.
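In the textbook two-step Heckman procedure, the lambda term is the inverse Mills ratio φ(z)/Φ(z), evaluated at the fitted index z of a first-stage probit selection equation and then added as a regressor in the second stage. The sketch below computes this ratio with the Python standard library only; whether the ESGI151 implementation computed lambda in exactly this way is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(z):
    """Inverse Mills ratio: standard normal pdf(z) divided by cdf(z)."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


# The ratio shrinks as the selection index grows, i.e. the correction
# matters less for observations that are very likely to be selected.
for z in (-1.0, 0.0, 1.0):
    print(z, inverse_mills(z))
```

For very negative z the denominator approaches zero, so a production implementation would need a numerically stable formulation.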


In conclusion, some points for discussion and problems for future research are noted:

  • The theoretical model can be made more sophisticated;
    • More categories of clients would be useful;
    • A deeper regional breakdown;
    • Inclusion of cash preference;
    • Introduction of quantile panel regression;
    • Additional sample tests and a larger sample size.

Contact: Rauno.Siinmaa@lhv.ee

More detailed reports: https://sisu.ut.ee/esgi151