Design highly expressed genes with Espresso
I'd like to introduce Espresso, a free, open-source tool for codon optimization. Espresso lets anyone create coding sequences for synthetic genes, and it supports a variety of host organisms and mRNA design algorithms. Using Espresso, you can harness the power of generative AI to create highly expressed synthetic genes for your host of choice.
Right now, you can use Espresso to design highly expressed genes for organisms including Escherichia coli, Saccharomyces cerevisiae, and Yarrowia lipolytica. Using Espresso is easy: just install the Python package and off you go!
import espresso
protein = "MENFHHRPFKGGFGVGRVPTSLYYSLSDFSLSAISIFPTHYDQPYLNEAPSWYKYSLESGLVCLYLYLIYRWITRSF"
gene = espresso.design_coding_sequence(protein, model="ec")
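A quick sanity check on any designed gene is to translate it back and confirm it still encodes the intended protein. The sketch below does this with the standard genetic code. The helpers `translate` and `encodes` are hypothetical names written for this example and are not part of Espresso, and the sketch assumes the designed gene is a plain DNA string of A, C, G, and T.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


def encodes(dna: str, protein: str) -> bool:
    """Return True if the DNA sequence translates to exactly this protein."""
    return translate(dna) == protein


# Toy example: two different codon choices for the same four residues.
print(encodes("ATGGAAAACTTT", "MENF"))
print(encodes("ATGGAGAATTTC", "MENF"))
```

Under the assumption above, the same check could be applied to a designed gene with `encodes(gene, protein)`. Because `translate` stops at the first stop codon, the check works whether or not the sequence ends with one.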
Using the pre-trained transformer models is just as easy:
gene = espresso.design_coding_sequence(protein, model="fungi-v1")
And so is scrubbing sequences of undesired motifs, such as restriction sites:
avoid = [
    "AAAAAA",  # avoid poly-A runs (a synthesis constraint)
    "GAATTC",  # avoid EcoRI; the site is palindromic, so it is its own reverse complement
]
protein = "MENFHHRPFKGGFGVGRVPTSLYYSLSDFSLSAISIFPTHYDQP"
# first design a candidate
candidate = espresso.design_coding_sequence(protein, model="ec")
# then scrub the candidate
scrubbed = espresso.scrub_sequence(candidate, avoid, model="ec")
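To double-check a scrubbed sequence independently, you can scan it for each forbidden motif on both strands. The sketch below is one minimal way to do that. `reverse_complement` and `find_sites` are hypothetical helpers written for this example, not Espresso functions, and the sketch assumes the sequence is a plain DNA string.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_sites(dna: str, motifs: list[str]) -> list[tuple[str, int]]:
    """Find every occurrence of each motif or its reverse complement.

    Returns (matched_pattern, start_index) pairs, including overlapping hits.
    """
    dna = dna.upper()
    hits = []
    for motif in motifs:
        # A set avoids double-counting palindromic sites such as GAATTC.
        for pattern in {motif.upper(), reverse_complement(motif)}:
            start = dna.find(pattern)
            while start != -1:
                hits.append((pattern, start))
                start = dna.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


avoid = ["AAAAAA", "GAATTC"]

# Toy sequences: one containing an EcoRI site, one containing neither motif.
print(find_sites("ATGGAATTCAAA", avoid))
print(find_sites("ATGGAGAACTTC", avoid))
```

Under the assumption above, an empty result from `find_sites(scrubbed, avoid)` would indicate that none of the listed motifs remain on either strand.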
For more details on the training of the codon transformer models, please see my blog post. For more details about Espresso, please see the Espresso user guide on GitHub.