AI-powered protein structure viewer for iPad

25 Jun, 2023

I read a lot on my iPad, including a lot of papers, and I often wish that I had the equivalent of PyMOL for my iPad. Of course, I wouldn't want actual PyMOL—the interaction model would be a disaster on a touch device. What I actually want is a clean, powerful protein structure viewer that works well on a touch screen.

In terms of features, I think I'd want:

load multiple protein structures from the PDB (or locally)
align structures
good viewing options: persistent selections or layers, biochemically accurate secondary structure depiction, a range of depictions showing bonds and atoms, and a well-thought-out model for coloring and depicting different parts of the multiple loaded structures
predict and view predictions via ESMFold and/or AlphaFold server
run protein language models locally (using the Hugging Face implementation of ESM-2, for example, loaded via Swift Transformers) to predict mutational effects, and thoughtfully display this information on the structure

If you have basically real-time scoring of mutations, which is easy to do on-device using one of the smaller ESM-2 models converted to CoreML and loaded using Swift Transformers, you have a nifty little AI-powered protein design buddy. That's basically what I would like to build, using the iPad's touch screen for the UI.
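To make "scoring a mutation" a bit more concrete, here is a minimal sketch of one common way to do it with a masked language model: compare the log-probability the model assigns to the mutant residue with the one it assigns to the wild-type residue at the same position. It is written in Python for brevity rather than Swift. Everything here is an assumption for illustration: the logits are a hypothetical stand-in for model output, the function names are made up, and the log-odds score is just one plausible choice, not something taken from ESM-2 or Swift Transformers.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def log_softmax(logits):
    """Convert raw logits into log-probabilities (numerically stable)."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def score_mutation(position_logits, wild_type, mutant):
    """Log-odds of the mutant vs. the wild-type residue at one position.

    position_logits: 20 hypothetical model logits, ordered as AMINO_ACIDS.
    A positive score means the model prefers the mutant; a negative score
    means it prefers the wild type.
    """
    log_probs = log_softmax(position_logits)
    mutant_lp = log_probs[AMINO_ACIDS.index(mutant)]
    wild_type_lp = log_probs[AMINO_ACIDS.index(wild_type)]
    return mutant_lp - wild_type_lp


def parse_mutation(label):
    """Split a label such as 'A42G' into (wild_type, 1-based position, mutant)."""
    return label[0], int(label[1:-1]), label[-1]
```

With real model output, `score_mutation` would be called once per candidate substitution, so a single forward pass over the sequence is enough to score every point mutation at every position.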

So I went looking for what's out there. Here's what I've found so far.

The built-in, web-based viewer at the PDB is pretty good (example with a favorite protein). Have to mention this one right up front, as it is built in to the PDB, has many of the features above (except for loading multiple structures), and works fine on the iPad
iMolviewer and iMolviewer Lite offer many of the features I'd like, but personally I find the interface difficult to look at
The BioViewer project on GitHub is built with modern Swift but seems incomplete from a biochemical point of view. There is also a review on the app store claiming that early versions of the app displayed molecules in mirror image, which reinforces my impression
Molecules (App Store link) seems more geared towards small molecules, not proteins

So I think that means that there are two main parts of this project:

Develop the protein structure viewer. This will be able to load and view structures, and will have a rich API for coloring and different representations, which will be used to display the info from the ML models
Develop the inference mechanism. This will need to be able to load protein language models and perform inference. This is also where code for scoring mutations should live. Once loaded in a viewing session, this module will accept sequences and return scores, and it will have to do that at a rate of several per second.
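
The boundary between those two parts could look roughly like the sketch below, again in Python for brevity. It assumes the model is any callable that maps a sequence to one row of 20 log-probabilities per residue; `MutationScorer`, `score_all`, and `score_to_rgb` are hypothetical names, and the blue-white-red color ramp is just one possible way the viewer could display scores. Caching model output per sequence is one simple way to help with the several-per-second requirement when the same sequence is queried repeatedly.

```python
from functools import lru_cache
from typing import Callable, Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

# Hypothetical model type: sequence -> one row of 20 log-probabilities per
# residue, ordered as AMINO_ACIDS. A real protein language model would sit
# behind this callable.
Model = Callable[[str], List[List[float]]]


class MutationScorer:
    """Inference side: accepts sequences and returns mutation scores."""

    def __init__(self, model: Model, cache_size: int = 128):
        # Cache model output per sequence so repeated queries within one
        # viewing session do not rerun inference.
        @lru_cache(maxsize=cache_size)
        def predict(sequence: str):
            return tuple(tuple(row) for row in model(sequence))

        self._predict = predict

    def score_all(self, sequence: str) -> Dict[Tuple[int, str], float]:
        """Score every single substitution as log p(mutant) - log p(wild type).

        Keys are (1-based position, mutant residue). Raises ValueError if the
        sequence contains a character outside AMINO_ACIDS.
        """
        log_probs = self._predict(sequence)
        scores = {}
        for i, wild_type in enumerate(sequence):
            wild_type_lp = log_probs[i][AMINO_ACIDS.index(wild_type)]
            for j, mutant in enumerate(AMINO_ACIDS):
                if mutant != wild_type:
                    scores[(i + 1, mutant)] = log_probs[i][j] - wild_type_lp
        return scores


def score_to_rgb(score: float, limit: float = 5.0) -> Tuple[float, float, float]:
    """Viewer side: map a score to a blue-white-red color in [0, 1] RGB.

    Scores at or below -limit are pure blue, 0 is white, and scores at or
    above +limit are pure red.
    """
    t = max(-1.0, min(1.0, score / limit))
    if t < 0:
        return (1.0 + t, 1.0 + t, 1.0)
    return (1.0, 1.0 - t, 1.0 - t)
```

Keeping the model behind a plain callable means the viewer only ever sees scores and colors, so the model could be swapped for a different one without touching the display code.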