AI-powered protein structure viewer for iPad
I read a lot on my iPad, including a lot of papers, and I often wish that I had the equivalent of PyMOL for my iPad. Of course, I wouldn't want actual PyMOL—the interaction model would be a disaster on a touch device. What I actually want is a clean, powerful protein structure viewer that works well on a touch screen.
In terms of features, I think I'd want the following:
- load multiple protein structures from the PDB (or locally)
- align structures
- good viewing options: persistent selections or layers, biochemically accurate secondary structure depiction, a range of depictions showing bonds and atoms, and a well-thought-out model for coloring and depicting different parts of the multiple structures loaded
- predict structures via ESMFold and/or AlphaFold Server, and view the predictions
- run protein language models locally (using the Hugging Face implementation of ESM-2, for example, loaded via Swift Transformers) to predict mutational effects, and thoughtfully display this information on the structure
If you have basically real-time scoring of mutations, which is easy to do on-device using one of the smaller ESM-2 models converted to Core ML and loaded using Swift Transformers, you have a nifty little AI-powered protein design buddy. That's basically what I would like to build, using the iPad's touch screen for the UI.
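To make the scoring idea concrete, here is a minimal sketch of how one mutation could be scored from a language model's output. It is in Python for brevity (the app itself would be Swift), and it makes several assumptions: the model gives 20 logits for the position of interest (ideally with that position masked), the alphabet order in `AMINO_ACIDS` is made up for illustration (a real ESM-2 tokenizer has its own vocabulary with special tokens), and the function names are hypothetical. The score is the log-odds of the mutant residue versus the wild-type residue, which is one common way to do this, not necessarily the best one.

```python
import math

# Hypothetical alphabet order; a real ESM-2 tokenizer uses its own
# vocabulary (including special tokens), so the indices would differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def score_mutation(position_logits, wild_type, mutant):
    """Log-odds of the mutant vs. the wild-type residue at one position.

    position_logits: 20 logits for this position, assumed to come from
    the model. Positive scores mean the model prefers the mutant,
    negative scores mean it prefers the wild type.
    """
    log_probs = log_softmax(position_logits)
    # The normalizer cancels in the difference, but the log-probabilities
    # themselves are handy if you also want to display them.
    return (log_probs[AMINO_ACIDS.index(mutant)]
            - log_probs[AMINO_ACIDS.index(wild_type)])


def score_all_mutations(position_logits, wild_type):
    """Score every possible substitution at one position."""
    return {
        aa: score_mutation(position_logits, wild_type, aa)
        for aa in AMINO_ACIDS
        if aa != wild_type
    }
```

One forward pass gives logits for every position, so scoring all 19 substitutions at a residue is just a handful of subtractions. That is why the model call, not the scoring, would set the speed limit.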
So I went looking for what's out there. Here's what I've found so far.
- The built-in, web-based viewer at the PDB is pretty good (example with a favorite protein). I have to mention this one right up front, as it is built into the PDB, has many of the features above (except for loading multiple structures), and works fine on the iPad
- iMolview and iMolview Lite offer many of the features I'd like, but personally I find the interface difficult to look at
- The BioViewer project on GitHub is built with modern Swift but seems incomplete from a biochemical point of view. There is also a review on the App Store claiming that early versions of the app displayed molecules in mirror image, which reinforces my impression
- Molecules (App Store link) seems more geared towards small molecules, not proteins
So I think that means that there are two main parts of this project:
- Develop the protein structure viewer. This will be able to load and view structures, and will have a rich API for coloring and different representations, which will be used to display the info from the ML models
- Develop the inference mechanism. This will need to be able to load protein language models and perform inference. This is also where code for scoring mutations should live. Once loaded in a viewing session, this whole module will basically accept sequences and return scores, and it will have to do that at a rate of several sequences per second.
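Here is a rough sketch of how the two parts might talk to each other, again in Python for brevity. Everything in it is an assumption for illustration: `CachedScorer`, `score_to_rgb`, and `residue_colors` are hypothetical names, the scoring function is a stand-in for the real model, and the blue-white-red color scale (with a clamp at plus or minus 5) is an arbitrary choice. The idea is that the inference side takes a sequence and returns one score per residue, and the viewer side only needs a list of colors.

```python
def score_to_rgb(score, limit=5.0):
    """Map a score to a blue-white-red color as (r, g, b), each in 0..1.

    Scores are clamped to [-limit, limit]. Negative scores shade toward
    blue, positive scores toward red, and zero is white.
    """
    t = max(-limit, min(limit, score)) / limit  # now in [-1, 1]
    if t >= 0:
        return (1.0, 1.0 - t, 1.0 - t)
    return (1.0 + t, 1.0 + t, 1.0)


class CachedScorer:
    """Wraps a scoring function so repeated sequences are not re-scored."""

    def __init__(self, score_fn):
        # score_fn is a hypothetical callable: sequence -> list of
        # per-residue scores. In the app this would call the model.
        self._score_fn = score_fn
        self._cache = {}

    def scores(self, sequence):
        if sequence not in self._cache:
            self._cache[sequence] = self._score_fn(sequence)
        return self._cache[sequence]


def residue_colors(scorer, sequence):
    """One color per residue, ready to hand to the viewer."""
    return [score_to_rgb(s) for s in scorer.scores(sequence)]


if __name__ == "__main__":
    # Stand-in for the model: pretends every glycine scores badly.
    def fake_model(sequence):
        return [-5.0 if aa == "G" else 0.0 for aa in sequence]

    scorer = CachedScorer(fake_model)
    print(residue_colors(scorer, "MGA"))
```

The cache matters for the "several per second" target: while you are poking at one residue on the touch screen, many of the requested sequences will be ones the model has already seen.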