Genetic Programming

Genetic Programming (GP) is a machine learning method based on evolutionary algorithms. It performs a stochastic search for a solution to a user-defined task, using genetic operations to produce candidates and a fitness function to rank them.

Genetic Programming algorithms mimic the evolutionary process observed in nature.

To achieve this, they apply three types of genetic operations in a tree-based development process: crossover, mutation, and reproduction.

Crossover is a process where two "parents" from one generation are used to produce a "child" strategy that carries features (characteristics) from both of them. Mutation randomly changes some portion of a candidate, thus introducing genetic diversity. Reproduction copies a candidate to the next generation without any changes.
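The three operations can be sketched on a small tree representation. This is a minimal illustration, not any particular product's implementation: the nested-list encoding, the operator names and the leaf names (`rsi_below_30` and so on) are all hypothetical and chosen only to make the example concrete.

```python
import copy
import random

# A program is a nested list [operator, left, right] or a leaf (a string).
# These building blocks are hypothetical and exist only for illustration.
OPERATORS = ["and", "or"]
LEAVES = ["rsi_below_30", "price_above_sma", "macd_cross_up"]


def random_tree(rng, depth=2):
    """Grow a random tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    return [rng.choice(OPERATORS),
            random_tree(rng, depth - 1),
            random_tree(rng, depth - 1)]


def node_paths(tree, path=()):
    """Return the path (tuple of child indices) to every node in the tree."""
    paths = [path]
    if isinstance(tree, list):
        for i in (1, 2):
            paths.extend(node_paths(tree[i], path + (i,)))
    return paths


def get_node(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace_node(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced."""
    if not path:
        return copy.deepcopy(new_subtree)
    result = copy.deepcopy(tree)
    parent = get_node(result, path[:-1])
    parent[path[-1]] = copy.deepcopy(new_subtree)
    return result


def crossover(rng, parent_a, parent_b):
    """Child is parent_a with one random subtree swapped for one from parent_b."""
    target = rng.choice(node_paths(parent_a))
    donor = get_node(parent_b, rng.choice(node_paths(parent_b)))
    return replace_node(parent_a, target, donor)


def mutate(rng, parent):
    """Replace one random subtree with a freshly generated one."""
    target = rng.choice(node_paths(parent))
    return replace_node(parent, target, random_tree(rng, depth=1))


def reproduce(parent):
    """Copy the program to the next generation unchanged."""
    return copy.deepcopy(parent)
```

All three functions return new trees and leave the parents untouched, so the same parent can be selected again within a generation.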

Evaluating the candidates is an important task.

This is done by calculating a fitness score, based on user-defined criteria, for every strategy in every generation. The score measures the quality of a strategy and determines which candidates continue in the genetic evolution process.
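As a rough sketch of how scoring and selection might fit together, the example below assumes a fitness criterion of net profit minus a drawdown penalty, followed by tournament selection. Both are assumptions made for illustration: the text above leaves the criteria to the user, and `drawdown_penalty` and the tournament size `k` are hypothetical parameters.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough drop of the cumulative profit curve."""
    equity = peak = worst = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def fitness(returns, drawdown_penalty=0.5):
    """Hypothetical criterion: net profit minus a penalty for drawdown."""
    if not returns:
        return float("-inf")  # a strategy that never trades ranks last
    return sum(returns) - drawdown_penalty * max_drawdown(returns)


def tournament_select(rng, scored, k=3):
    """Pick k random (candidate, score) pairs and return the best candidate."""
    contenders = rng.sample(scored, min(k, len(scored)))
    return max(contenders, key=lambda pair: pair[1])[0]
```

Tournament selection is only one common choice. It favours fitter candidates while still giving weaker ones an occasional chance, which helps preserve diversity.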

The process itself starts with an initial population of randomly generated strategies.

They are assembled from available building blocks (such as indicators, signals, and trading options), values drawn from defined ranges, and comparison conditions. The user sets the number of candidates in the initial population, as well as the number of evolution levels (generations) that will be produced. After a fitness score is computed for each candidate, the next generation is evolved using the three operations: crossover, mutation, and reproduction.
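The overall loop could look roughly like the sketch below. To stay short, each candidate here is a flat list of rules rather than a full tree, and the indicator names, value ranges, operation probabilities and the `fitness` callable are all hypothetical placeholders supplied by the caller.

```python
import random

# Hypothetical building blocks: indicator name -> allowed value range.
INDICATORS = {"rsi": (10, 90), "adx": (10, 50)}
COMPARISONS = ["<", ">"]


def random_rule(rng):
    """Build one rule: (indicator, comparison, value within the allowed range)."""
    name = rng.choice(sorted(INDICATORS))
    low, high = INDICATORS[name]
    return (name, rng.choice(COMPARISONS), rng.randint(low, high))


def random_strategy(rng, max_rules=3):
    return [random_rule(rng) for _ in range(rng.randint(1, max_rules))]


def crossover(rng, a, b):
    """Child takes the head of one parent and the tail of the other."""
    return a[:rng.randint(1, len(a))] + b[rng.randint(0, len(b)):]


def mutate(rng, strategy):
    """Replace one randomly chosen rule with a new random rule."""
    child = list(strategy)
    child[rng.randrange(len(child))] = random_rule(rng)
    return child


def tournament(rng, scored, k=3):
    """Return the best of k randomly drawn (strategy, score) pairs."""
    return max(rng.sample(scored, min(k, len(scored))),
               key=lambda pair: pair[1])[0]


def evolve(fitness, population_size=20, generations=10,
           p_crossover=0.7, p_mutation=0.2, seed=0):
    """Evolve a population and return the fittest strategy of the last generation."""
    rng = random.Random(seed)
    population = [random_strategy(rng) for _ in range(population_size)]
    for _ in range(generations):
        scored = [(s, fitness(s)) for s in population]
        next_generation = []
        while len(next_generation) < population_size:
            roll = rng.random()
            parent = tournament(rng, scored)
            if roll < p_crossover:
                child = crossover(rng, parent, tournament(rng, scored))
            elif roll < p_crossover + p_mutation:
                child = mutate(rng, parent)
            else:
                child = list(parent)  # reproduction: copy unchanged
            next_generation.append(child)
        population = next_generation
    return max(population, key=fitness)
```

In practice `fitness` would run a backtest of the strategy. Passing it in as a callable keeps the loop independent of how candidates are scored.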

By selecting the fittest candidates at each successive generation, the genetic programming algorithm tends to produce better-performing strategies that should be robust enough for real trading.

Robustness is measured through a series of cross checks that test the candidates on historical data. One of the biggest problems with strategies generated by machine learning algorithms is that they can be over-fitted (or curve-fitted) to the historical data on which they were developed. That is why testing them is mandatory before using them in real trading.

The tests usually include some change in the historical data or in the parameters of the strategy.

A robust strategy would maintain its good performance, while an over-fitted candidate would show deteriorating results.
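One way to picture the parameter variant of such a test is the sketch below. It assumes a hypothetical `evaluate` callable that maps a dictionary of numeric parameters to a score (higher is better), jitters each parameter by a small random percentage, and reports how often the perturbed strategy keeps most of its original score. The `jitter` and `keep_ratio` thresholds are illustrative assumptions, not recommended values.

```python
import random


def parameter_robustness(evaluate, params, trials=200, jitter=0.10,
                         keep_ratio=0.8, seed=0):
    """Share of perturbed runs that keep at least keep_ratio of the baseline score.

    evaluate: hypothetical callable mapping a parameter dict to a score
              (higher is better; the baseline is assumed to be positive).
    """
    rng = random.Random(seed)
    baseline = evaluate(params)
    if baseline <= 0:
        return 0.0  # no positive performance to preserve
    passed = 0
    for _ in range(trials):
        # Shift every parameter by a random amount within +/- jitter.
        perturbed = {name: value * (1 + rng.uniform(-jitter, jitter))
                     for name, value in params.items()}
        if evaluate(perturbed) >= keep_ratio * baseline:
            passed += 1
    return passed / trials
```

A strategy whose score sits on a broad plateau of parameter values would return a share close to 1, while one tuned to a narrow spike would return a share close to 0. The same pattern applies to data changes: perturb the historical series instead of the parameters and compare scores in the same way.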

When a strategy passes these cross checks, it can be considered ready for use in real trading conditions. Still, monitoring its performance is a good idea. Financial markets are volatile, and if market conditions change significantly, the algorithm might no longer perform as expected. In those cases, adjustments may be required.

Optimization is another useful process when working with GP algorithms.

It adjusts (or optimizes) the strategy to adapt its performance to the current market state. Regular re-optimization is worth considering, because it could keep a good strategy robust for longer.
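As an illustration only, re-optimization could be as simple as an exhaustive grid search over a strategy's parameters on a recent window of data. The `evaluate` callable, the parameter grid and the idea of a "recent window" are assumptions of this sketch; the text above does not prescribe a particular optimization method.

```python
from itertools import product


def reoptimize(evaluate, param_grid, recent_data):
    """Exhaustive grid search on a recent data window.

    evaluate:   hypothetical callable (params, data) -> score, higher is better.
    param_grid: dict mapping each parameter name to a list of candidate values.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params, recent_data)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Because the search fits the parameters to a specific stretch of data, the re-optimized strategy is itself exposed to over-fitting, so it would need to pass the same cross checks before being traded.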