Alex Carlin

Design highly expressed genes with Espresso

I'd like to introduce Espresso, a free, open-source tool for codon optimization. Espresso lets anyone create coding sequences for synthetic genes, and it supports a variety of host organisms and mRNA design algorithms. Using Espresso, you can harness the power of generative AI to create highly expressed synthetic genes for your host of choice.

Right now, you can use Espresso to design highly expressed genes for several organisms, including Escherichia coli, Saccharomyces cerevisiae, and Yarrowia lipolytica. Using Espresso is easy: just install the Python package and off you go!

import espresso

protein = "MENFHHRPFKGGFGVGRVPTSLYYSLSDFSLSAISIFPTHYDQPYLNEAPSWYKYSLESGLVCLYLYLIYRWITRSF"
gene = espresso.design_coding_sequence(protein, model="ec")

Using the pre-trained transformer models is just as easy:

gene = espresso.design_coding_sequence(protein, model="fungi-v1")

And so is scrubbing sequences of undesired motifs, such as restriction sites:

avoid = [
    "AAAAAA",  # poly-A run (a synthesis constraint)
    "GAATTC",  # EcoRI site (palindromic, so it is its own reverse complement)
]

protein = "MENFHHRPFKGGFGVGRVPTSLYYSLSDFSLSAISIFPTHYDQP"

# first design a candidate 
candidate = espresso.design_coding_sequence(protein, model="ec")

# then scrub the candidate 
scrubbed = espresso.scrub_sequence(candidate, avoid, model="ec")
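
If you want to double-check a design independently of Espresso, a minimal sketch like the one below may help. It assumes the designed gene is a plain DNA string (an assumption about what the calls above return), and the helpers `translate`, `reverse_complement`, `encodes`, and `find_motifs` are hypothetical names for illustration, not part of Espresso. It uses the standard genetic code to confirm that a sequence still encodes the protein, and it looks for unwanted motifs on both strands.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acids ('*' = stop)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def encodes(dna, protein):
    """True if dna translates to protein, ignoring any trailing stop codon."""
    return translate(dna).rstrip("*") == protein.upper()


def find_motifs(dna, motifs):
    """Return the motifs that occur on either strand of dna."""
    forward = dna.upper()
    reverse = reverse_complement(forward)
    return [m for m in motifs if m.upper() in forward or m.upper() in reverse]


# Toy example (not an Espresso output): a short sequence with an EcoRI site.
toy = "ATGGAATTCAAA"
print(translate(toy))
print(find_motifs(toy, ["AAAAAA", "GAATTC"]))
```

Under the same assumption, `encodes(scrubbed, protein)` and `find_motifs(scrubbed, avoid)` could be run on the sequences designed above. Checking the reverse strand matters for non-palindromic motifs: a run of six T's, for example, reads as poly-A on the other strand.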

For more details on the training of the codon transformer models, please see my blog post. For more details about Espresso, please see the Espresso user guide on GitHub.