Alex Carlin

Protein transformers from scratch video series

I've been inspired by many great teachers over the years, and one of my favorite courses recently has been Andrej Karpathy's Neural Networks: Zero to Hero series. In this detailed, hands-on series, Andrej shows how to build a GPT-2 model from scratch in Python. But what's really cool is that the course starts at the very beginning, building an autograd engine, then working up from there to neural nets, and finally to the transformer architecture.

I was inspired by this to build an "onramp" to one of my software projects, which is an implementation of transformer models for protein design. Following the example of the Makemore series, I made a series of videos that walk through the problem of modeling protein sequences with transformer models step by step.

Protein transformers from scratch is a series of videos that builds up the concepts of protein sequence modeling step by step. We start with choosing an appropriate problem and dataset, preprocessing the data, and representing protein sequences for neural nets. Then, we build a simple probabilistic neural "language" model that allows us to create new proteins one amino acid at a time. We then upgrade our model to the transformer architecture, building out each component in PyTorch. We implement several kinds of evaluations that we can use during training and to compare different models on problems we actually care about. Finally, we scale up training to all 40 million protein sequences in UniRef50 (10 billion tokens) and replicate one of the well-known pre-trained protein transformer models that are currently state of the art.
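To give a flavor of the first step, here is a minimal sketch of what representing protein sequences for a neural net can look like: mapping the 20 standard amino acids to integer token ids, plus a boundary token. The vocabulary layout and names below are illustrative assumptions, not necessarily the exact choices made in the videos.

```python
# Sketch: encode protein sequences as integer tokens for a neural net.
# Vocabulary and special-token choices here are illustrative.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids

# Reserve index 0 for a "." boundary token marking sequence start/end,
# in the style of a simple character-level language model.
stoi = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
stoi["."] = 0
itos = {i: aa for aa, i in stoi.items()}

def encode(seq: str) -> list[int]:
    """Map an amino-acid string to a list of integer token ids."""
    return [stoi[aa] for aa in seq]

def decode(tokens: list[int]) -> str:
    """Map token ids back to an amino-acid string."""
    return "".join(itos[t] for t in tokens)

seq = "MKTAYIAK"          # a short example sequence
ids = encode(seq)         # integers a neural net can embed
assert decode(ids) == seq # round-trips back to the original string
```

From here, a model learns a probability distribution over the next token given the previous ones, which is what lets us sample new proteins one amino acid at a time.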