Alex Carlin

How to fit data to find Michaelis-Menten kinetics

In biochemistry, the Michaelis–Menten equation is one of the best-known models of enzyme kinetics. The model describes the rate of an enzymatic reaction by relating the reaction rate v to [S], the concentration of a substrate S. The rate depends on two parameters: the turnover number kcat and the Michaelis constant KM. The most useful form of the equation for the biochemist relates the quotient of velocity v and enzyme concentration [E] to the substrate concentration [S], the catalytic rate kcat, and the Michaelis constant KM by

$$\frac{v}{[E]} = \frac{k_{cat} [S]}{K_M + [S]}$$

This notebook is a tutorial for the experimental biochemist who has collected data on the rate of product formation under different concentrations of substrate, as we frequently do in the Siegel group. Let's begin with data workup.

Data workup from a UV/vis plate assay

Our UV/vis plate assay follows the accumulation of 4-nitrophenol (pNP), and provides rates in units of OD/min. We need to convert these into M/min since what we actually care about is product concentration (not OD). A standard curve relating OD/min to concentration of product in M/min can be used to transform the OD/min data.

  1. Convert your rates from OD/min into M/min (molar per minute) using a standard curve.

In this example, we have experimentally determined that $\textrm{pNP } \frac{\textrm{M}}{\textrm{min}} = 0.0002 \times \textrm{pNP } \frac{\textrm{OD}}{\textrm{min}}$.

  2. Convert your enzyme concentration from mg/mL into M by dividing by the extinction coefficient of the protein and correcting for any dilution in your assay procedure.

In this case, the extinction coefficient of BglB is 113330, and we diluted the enzyme twice: first 100-fold and then 4-fold in the assay plate.

  3. Finally, divide the rates you observe (in units of M/min) by the enzyme concentration that you are testing (in units of M), to obtain rates in min⁻¹.

  4. Convert your substrate concentrations into M as well.

  5. Put them in a CSV table like the one in this repository called example_data.csv.

In this case, these unit conversions have already been performed on the example data. For raw data, Python code like the following would do the unit conversions for us, replacing the raw values in the "rate" column of our data with the converted rates.

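# assumes df is a DataFrame whose "rate" column holds raw rates in OD/min
ext_coef = 113330  # extinction coefficient of BglB
standard_curve_slope = 0.0002  # (M/min) per (OD/min), from the standard curve
protein_yield = 1.0  # units of mg/mL (a float, so the division is never integer division)
dilution_1 = 100  # 100-fold dilution
dilution_2 = 4  # 4-fold dilution in the assay plate
protein_conc = protein_yield / ext_coef / dilution_1 / dilution_2
df["rate"] = standard_curve_slope * df.rate / protein_conc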
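As a worked example of the arithmetic, here is a minimal sketch that applies the same conversion to a single number. The function name od_rate_to_turnover and the 0.5 OD/min reading are made up for illustration; the constants are the ones quoted above, and the enzyme concentration is handled exactly as in the code above.

```python
def od_rate_to_turnover(od_per_min, slope=0.0002, protein_yield=1.0,
                        ext_coef=113330, dilutions=(100, 4)):
    """Convert a raw rate in OD/min into a rate per enzyme in 1/min.

    Hypothetical helper: defaults are the values quoted in this tutorial.
    """
    # product formation rate in M/min, from the standard curve slope
    rate_molar = slope * od_per_min
    # enzyme concentration in the assay, after each dilution step
    enzyme_molar = protein_yield / ext_coef
    for fold in dilutions:
        enzyme_molar /= fold
    # rate per enzyme, in 1/min
    return rate_molar / enzyme_molar


# 0.5 OD/min is an illustrative reading, not a measured value
example_rate = od_rate_to_turnover(0.5)
print(example_rate)
```

Wrapping the conversion in a function makes it easy to check one well by hand before applying it to a whole column.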

Example using real data

Let's dive into a real example. The dataset that I'm going to use was generated by myself and others in the Siegel Lab at UC Davis as part of the Bagel Project, where I led a project to generate a large dataset of kinetic constants for designed enzymes. See the Bagel Project repository for experimental details, including complete lab protocols for generating high quality kinetic data for designed enzymes.

import pandas 


df = pandas.read_csv("example_data.csv") 

Now that we have our data in a DataFrame, we can use the built-in method plot() of the DataFrame to make a quick scatter plot of our data and display it on screen.
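A minimal sketch of that plotting step might look like the following. It assumes the columns are named "substrate" and "rate" (the names used in the fitting code later in this notebook); the helper name plot_rates and the axis labels are my own additions.

```python
import pandas


def plot_rates(df):
    """Scatter plot of rate against substrate concentration.

    Hypothetical helper; assumes df has "substrate" (M) and "rate" (1/min) columns.
    """
    ax = df.plot(x="substrate", y="rate", kind="scatter")
    ax.set_xlabel("[S] (M)")
    ax.set_ylabel("rate (1/min)")
    return ax


# In the notebook, this would be called on the example data:
# plot_rates(pandas.read_csv("example_data.csv"))
```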

That looks like it will fit the Michaelis-Menten equation. Since we've converted all our values to use the same units, we can estimate kcat and KM from looking at the plot.

I estimate kcat to be about 700 min⁻¹ and KM, the substrate concentration where the rate is half its maximum, to be 0.005 M, or 5 mM.
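If you would rather not read the estimates off the plot, the same reasoning can be sketched in a few lines of plain Python: take the largest observed rate as a rough kcat, and the substrate concentration whose rate is closest to half of that as a rough KM. The function name rough_guesses is hypothetical, and the concentrations and rates below are synthetic values generated from the model itself, not measurements.

```python
def rough_guesses(substrate, rate):
    """Return crude (kcat, KM) estimates from paired lists of [S] and rate."""
    # the largest observed rate stands in for kcat
    kcat_guess = max(rate)
    half_max = kcat_guess / 2
    # index of the data point whose rate is closest to half the maximum
    closest = min(range(len(rate)), key=lambda i: abs(rate[i] - half_max))
    return kcat_guess, substrate[closest]


# synthetic, noise-free example: 8 concentrations (M) and model rates (1/min)
substrate = [0.0005, 0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.1]
rate = [700 * s / (0.005 + s) for s in substrate]

kcat_guess, km_guess = rough_guesses(substrate, rate)
print(kcat_guess, km_guess)
```

Note that the largest observed rate underestimates kcat whenever the enzyme is not fully saturated at the highest substrate concentration, which is fine for an initial guess. With a DataFrame, the call would be rough_guesses(list(df.substrate), list(df.rate)).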

Fit your data to the Michaelis-Menten equation to determine kcat and KM

Next, we'll use the Python module SciPy to perform a nonlinear least-squares optimization to determine the values of kcat and KM that fit our data best. First, we'll import curve_fit from SciPy and a couple of NumPy functions that we'll need later

from scipy.optimize import curve_fit 
from numpy import diag, sqrt

and define the Michaelis-Menten equation in Python code

def v(s, kcat, km):
    return (kcat * s)/(km + s)
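A quick way to convince yourself that the function is written correctly is to check two properties of the model: at [S] = KM the rate should be half of kcat, and at very high [S] the rate should approach kcat. The sketch below does this with the rough values estimated from the plot; the variable names are my own.

```python
def v(s, kcat, km):
    """Michaelis-Menten rate per enzyme at substrate concentration s."""
    return (kcat * s) / (km + s)


kcat_guess, km_guess = 700.0, 0.005  # rough values estimated from the plot

# at s = KM, the rate should be half of kcat
half_max = v(km_guess, kcat_guess, km_guess)

# at s = 1000 * KM, the rate should be just under kcat
near_saturation = v(1000 * km_guess, kcat_guess, km_guess)

print(half_max, near_saturation)
```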

We need to provide curve_fit with initial guesses at the parameters, which we estimated from the scatter plot above. The value for kcat (the maximum rate observed) appears to be about 700 min⁻¹, and the value for KM looks to be about 0.005 M.

Put these into a new tuple, called p0.

p0 = (700, 0.005)

Now we're ready to use curve_fit. The curve_fit documentation indicates that the function returns two arrays, called popt (the optimal parameter values) and pcov (the parameter covariance). From the documentation:

popt : array Optimal values for the parameters so that the sum of the squared error of f(xdata, *popt) - ydata is minimized

pcov : 2d array The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).

I recommend setting up your code to have the function return into two variables called popt and pcov.

In this next step, we'll perform the optimization and convert our one standard deviation errors into percent, which I find easier to make sense of for large data sets.

popt, pcov = curve_fit( v, df.substrate, df.rate, p0=p0 )
perr = sqrt( diag( pcov ) )

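# convert one standard deviation errors into percent of the fitted value
for i in range(len(popt)):
    if not popt[i] or perr[i] > popt[i]:
        # discard the estimate if the parameter is zero or the error exceeds it
        popt[i] = perr[i] = float("nan")
    else:
        perr[i] = perr[i] / popt[i] * 100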

Whew! That's it! Let's print out our results in a nice format, rounding numbers as necessary

results = { 
    'kcat': '{:0.1f}'.format(popt[0]), 
    'kcat_std_err': '{:0.1f}%'.format(perr[0]),
    'km': '{:0.4f}'.format(popt[1]), 
    'km_std_err': '{:0.1f}%'.format(perr[1]) 
}

print(pandas.Series(results))

And, with that, we have taken our experimental measurements (observed product formation rates at 8 different substrate concentrations) and fit them to the Michaelis-Menten equation, in order to determine the kcat and KM for our designed enzyme.
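One way to gain confidence in this procedure is to run it on synthetic data where the answer is known in advance. The sketch below generates noise-free rates from the Michaelis-Menten equation using made-up "true" values (kcat = 700 min⁻¹, KM = 0.005 M) at 8 made-up substrate concentrations, then checks that curve_fit recovers them. Nothing here comes from the example dataset.

```python
import numpy as np
from scipy.optimize import curve_fit


def v(s, kcat, km):
    """Michaelis-Menten rate per enzyme at substrate concentration s."""
    return (kcat * s) / (km + s)


# made-up "true" parameters and substrate concentrations (M)
true_kcat, true_km = 700.0, 0.005
substrate = np.array([0.0005, 0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.1])

# noise-free synthetic rates generated from the model itself
rate = v(substrate, true_kcat, true_km)

# fit, starting from deliberately imperfect initial guesses
popt, pcov = curve_fit(v, substrate, rate, p0=(600, 0.01))
perr = np.sqrt(np.diag(pcov))  # one standard deviation errors

print("kcat = {:.1f}, KM = {:.4f}".format(*popt))
```

Because the synthetic data contain no noise, the reported errors will be close to zero; real measurements will give larger errors.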