FINM 33165

Reinforcement Learning and Deep Learning

Autumn Quarter
Instructor: Niels Nygaard

The course begins with a quick introduction to Deep Learning, i.e., Neural Networks (NNs). We discuss how such models are trained on data, covering loss functions and Stochastic Gradient Descent (SGD).
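As a rough illustration of what training with SGD looks like, the following sketch fits a very small network to the function y = x^2. Everything in it is an assumption chosen for illustration and not taken from the course materials: the function name `train_tiny_network`, the single hidden layer of tanh units, the squared-error loss, and the learning rate.

```python
import math
import random


def train_tiny_network(steps=5000, hidden=8, lr=0.02, seed=0):
    """Fit y = x^2 on [-1, 1] with a one-hidden-layer tanh network using SGD.

    Illustrative sketch only; all hyperparameters are assumptions.
    Returns the mean squared error on a fixed grid before and after training.
    """
    rng = random.Random(seed)
    # Parameters: hidden-layer weights and biases, output weights and bias.
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        y = sum(w2[j] * h[j] for j in range(hidden)) + b2
        return h, y

    def mean_loss(xs):
        return sum((forward(x)[1] - x * x) ** 2 for x in xs) / len(xs)

    grid = [i / 10 - 1 for i in range(21)]
    loss_before = mean_loss(grid)

    for _ in range(steps):
        x = rng.uniform(-1, 1)      # one random sample per step: the "stochastic" part
        target = x * x
        h, y = forward(x)
        dy = 2 * (y - target)       # derivative of the squared error w.r.t. the output
        for j in range(hidden):
            dh = dy * w2[j]                 # backpropagate through the output layer
            dz = dh * (1 - h[j] ** 2)       # derivative of tanh is 1 - tanh^2
            w2[j] -= lr * dy * h[j]         # gradient step on each parameter
            w1[j] -= lr * dz * x
            b1[j] -= lr * dz
        b2 -= lr * dy

    return loss_before, mean_loss(grid)


loss_before, loss_after = train_tiny_network()
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

In practice a framework with automatic differentiation would compute these gradients; they are written out by hand here only to keep the sketch self-contained.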

We then discuss Markov Decision Processes (MDPs) and the Bellman Equation. This is the foundation of Reinforcement Learning. The objective is to learn to make decisions, i.e., to select actions that maximize an expected sum of rewards.
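For orientation, one common way to write the Bellman optimality equation for the optimal value function is given below. The notation is an assumption and not taken from the course materials: s and s' denote states, a an action, P the transition probabilities, R the reward, and γ a discount factor in [0, 1) that is often applied to the sum of rewards.

```latex
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a) \, \bigl[ R(s, a, s') + \gamma \, V^{*}(s') \bigr]
```

In words, the value of a state under optimal behavior equals the best achievable expected immediate reward plus the discounted value of the state that follows.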

We introduce the value function and the Q-function and show how the problem of finding the best actions is formulated in terms of these functions. We then show how NNs can be used to approximate Q-functions. This method, Deep Q-Learning, and the associated Deep Q-Network (DQN) model were developed by researchers at DeepMind (a Google subsidiary) and contain several clever innovations, such as experience replay and a separate target network.
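To make the role of the Q-function concrete, here is a minimal sketch of tabular Q-learning on a toy problem. The environment (a five-state chain with a reward of 1 for reaching the right end), the function name `q_learning_chain`, and all hyperparameters are hypothetical choices for illustration. This is not the DQN model itself: DQN replaces the table `q[s][a]` with a neural network, but the update target has the same form.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP (illustrative sketch).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[s][a]

    def step(s, a):
        s_next = max(s - 1, 0) if a == 0 else s + 1
        reward = 1.0 if s_next == goal else 0.0
        return s_next, reward, s_next == goal

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next, r, done = step(s, a)
            # Q-learning target: reward plus discounted best next Q-value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


q_table = q_learning_chain()
# Greedy policy for the non-terminal states: pick the action with the larger Q-value.
greedy_policy = [0 if qa[0] > qa[1] else 1 for qa in q_table[:-1]]
print(greedy_policy)
```

Once the Q-values are learned, the best action in each state is read off by taking the action with the largest Q-value, which is the sense in which the decision problem is formulated in terms of the Q-function.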

We then turn to some very recent research that proposes a new approach to many RL problems. The model is known as the Decision Transformer, and it combines RL with one of the most important innovations in Deep Learning in the last five years, the Transformer architecture. We go through this architecture in some detail and show how it is used in Natural Language Processing (NLP).

The Decision Transformer applies the Transformer architecture to RL problems to create models that are much simpler to code. The Decision Transformer model appears well suited to applications in Finance, and there are many opportunities for further development.

In the course, we will implement models from some of the research papers on these subjects.