## The normal distribution and financial returns: the big mistake

The distribution of returns on a financial instrument tells us a lot about the risk we take by investing in it. It lets us evaluate how many times a given loss has occurred in the past, and therefore decide whether we are able to bear it.

If past fluctuations in returns were indicative of the future, and if we had a long enough history to observe, no further reasoning would be required. The empirical distribution, that is, the one actually realized over the history of the financial instrument, would give us all the information we need about the investment risk.

Unfortunately, this is not always the case. Very often we have histories limited to a few hundred observations, or even fewer, so a given loss may not have occurred often enough in the past to provide statistically significant information. The problem grows with the magnitude of the loss considered: the larger the loss, the rarer it is, and the further we move into the left tail of the distribution.
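As a minimal sketch of this point, the snippet below counts how often losses beyond a given size appear in a short history of returns. The history is synthetic: the random generator, the 250-observation length, the thresholds and the function name `count_losses` are all assumptions made for illustration, not data from a real instrument.

```python
import random

def count_losses(returns, threshold):
    """Count the observations whose return is at or below -threshold."""
    return sum(1 for r in returns if r <= -threshold)

random.seed(0)
# Synthetic stand-in for a short history: 250 daily returns (about one year).
# The generator is only a placeholder for real data (an assumption).
history = [random.gauss(0.0, 0.01) for _ in range(250)]

for threshold in (0.01, 0.02, 0.03):
    n = count_losses(history, threshold)
    print(f"losses of {threshold:.0%} or worse: {n} out of {len(history)} "
          f"({n / len(history):.1%})")
```

The counts can only shrink as the threshold grows, so the deeper we look into the left tail, the fewer observations are left to estimate a frequency from.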

Moreover, there is no guarantee that price dynamics will remain the same over time: something may have changed in the market and/or in the financial instrument itself. Events can occur that change the risk profile of the instrument, so even with a long history, what was observed in the past is not necessarily indicative of the future.

For both of these reasons, it is important to try to fit the empirical distribution of returns to a model, that is, to a functional form that is valid a priori, beyond the empirical observations that have been recorded. In short, it is a matter of moving from the simple observation of data to the use of reason: identifying, if possible, assumptions that may hold generally over time, and then using methodologies capable of reading the past through those assumptions.

By far the most widely used model for the distribution of returns is the normal distribution, the famous Gaussian “bell”. There are many theoretical reasons that could lead to this choice. The normal distribution plays an important role in all scientific fields because, as the central limit theorem shows, the sum of a very large number of independent random variables tends, under fairly general conditions, towards a normal distribution. Many quantities are thus governed by this distribution: the measurement of the length of a table, the distribution of shots around a target, the runs of bad luck in gambling …
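The central limit theorem can be illustrated with a small simulation. The sketch below sums independent uniform random numbers and measures the share of outcomes that fall within one standard deviation of the mean. For a single uniform variable that share is about 57.7%; for a Gaussian it is about 68.3%. The function names, the sample size and the choice of the uniform distribution are assumptions made for illustration.

```python
import random
import statistics

def standardized_sums(n_terms, n_samples, rng):
    """Return n_samples sums of n_terms uniform draws, rescaled to mean 0, std 1."""
    sums = [sum(rng.uniform(-1.0, 1.0) for _ in range(n_terms))
            for _ in range(n_samples)]
    mean = statistics.fmean(sums)
    std = statistics.pstdev(sums)
    return [(s - mean) / std for s in sums]

def share_within_one_std(values):
    """Fraction of standardized values lying within one standard deviation of zero."""
    return sum(1 for v in values if abs(v) <= 1.0) / len(values)

rng = random.Random(42)
for n_terms in (1, 2, 50):
    z = standardized_sums(n_terms, 20_000, rng)
    print(f"{n_terms:>2} terms: share within one std = {share_within_one_std(z):.3f}")
```

As the number of summed terms grows, the printed share should move from the uniform value towards the Gaussian one.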

There are so many areas in which this distribution appears in nature that it has been called “normal”, that is, found as a rule. It is therefore not surprising that it was immediately adopted in finance, somewhat uncritically and without being questioned … thus causing one of the most sensational measurement errors in history.

In the next article, we will see the implications for investors of this mistake and how Egonon looked for tools to remedy it.

by Riccardo Donati,

Scientific Committee Egonon

## Classify and decide! Which branch to choose?

In many fields, making decisions requires being able to classify. For example, banks need to be able to classify credit applicants into those who will be able to meet their obligations and those who might find it difficult to do so. Public administrators have to decide whether to allocate certain areas of the territory to playgrounds or to parking lots. Financial analysts need to be able to distinguish assets whose future value could increase from those whose value could decrease. Obviously, in order to produce good classifications, the professional or administrator in charge must have a deep knowledge of the area in which they operate and must know how to use the available information effectively.

In other words, classification is a complex task. Therefore, especially in the era of so-called Big Data, tools that can support the decision maker in classification activities are welcome. (Note: support, not replace!)

In the scientific literature, many types of such tools have been proposed and continue to be proposed. In particular, classification tools based on so-called Machine Learning have recently stood out for their effectiveness. In qualitative terms, and simplifying a bit, “Machine Learning” refers to a wide family of methods inspired by how higher living beings and Nature “produce” intelligent processes.

Decision Trees are one such intelligent classification method. Again in qualitative terms, a Decision Tree is a tool which, by applying Machine Learning techniques to an initial set of data coming from the objects to be classified, is able to classify these same objects autonomously. In particular, a Decision Tree is capable of extracting rules such as:

IF the variables considered take on certain values THEN a given action must be taken OTHERWISE another action must be taken.
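A rule of this kind can be sketched with the simplest possible tree, a single split on one variable (often called a decision stump). The code below is a simplified illustration, not the system discussed in this article: real Decision Trees work with many variables, several levels of splits and criteria such as impurity measures. The toy data and the names `fit_stump` and `predict` are hypothetical.

```python
def fit_stump(xs, labels):
    """Find the threshold on one variable that misclassifies the fewest samples.

    The learned rule is: IF x > threshold THEN predict +1 OTHERWISE predict -1.
    """
    best_threshold, best_errors = None, len(xs) + 1
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for t in candidates:
        errors = sum(1 for x, y in zip(xs, labels) if (1 if x > t else -1) != y)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors

def predict(x, threshold):
    """Apply the learned IF / THEN / OTHERWISE rule to a new value."""
    return 1 if x > threshold else -1

# Hypothetical toy data: one variable and a label (+1 or -1) for each object.
xs = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
labels = [-1, -1, -1, 1, 1, 1]

threshold, errors = fit_stump(xs, labels)
print(f"IF x > {threshold:.2f} THEN +1 OTHERWISE -1 ({errors} training errors)")
```

The point of the sketch is that the threshold is not chosen by hand: it is extracted from the data, which is what makes the rule "learned".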

Recently, a study was published in the international scientific literature proposing an automatic system based on Decision Trees and applying it to classify shares into those whose future price could increase (rise) and those whose price could decrease (fall).

The initial data set used is relatively “simple”: prices (closing, opening and so on), volumes and some technical analysis indicators, all relating to the various shares considered.
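To give an idea of what such inputs might look like, the sketch below builds one common technical analysis indicator (a simple moving average) and a direction label from a list of closing prices. The specific indicator, the window length, the prices and the function names are assumptions chosen for illustration; the study's actual variables are not specified here. The labels use +1 for a rise and -1 for a fall, and, as a further assumption, an unchanged price counts as a fall.

```python
def simple_moving_average(prices, window):
    """Average of the last `window` prices at each step (None until enough data)."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out

def next_day_direction(prices):
    """+1 if the next price is higher than the current one, otherwise -1."""
    return [1 if nxt > cur else -1 for cur, nxt in zip(prices, prices[1:])]

# Hypothetical closing prices, invented for illustration.
closes = [100.0, 101.5, 101.0, 102.0, 103.5, 103.0]

sma3 = simple_moving_average(closes, 3)
labels = next_day_direction(closes)

for day, (price, sma) in enumerate(zip(closes, sma3)):
    label = labels[day] if day < len(labels) else None  # last day has no "next day"
    print(day, price, sma, label)
```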

Despite this simplicity, the ability of this automatic system to correctly classify future rises / falls in share prices, and therefore to predict their direction, is very good. Obviously, since this is a recent application, it requires further confirmation and in-depth analysis. However, the signs are promising.

* Forecasts based on Decision Trees of the rises / falls of the Bitcoin price over the period from 20/01/2014 to 22/02/2021, for a total of 1779 working days. The first panel shows the logarithmic transform of the Bitcoin opening price over that period (the logarithmic transform makes the price trend more readable); the second panel shows the forecasts of the direction of the price for the same period (rise = 1; fall = -1).

by Marco Corazza,

Scientific Committee Egonon