How to fit data to find Michaelis-Menten kinetics
In biochemistry, the Michaelis–Menten equation is one of the best-known models of enzyme kinetics. The model takes the form of an equation describing the rate of enzymatic reactions by relating the reaction rate $v$ to $[S]$, the concentration of a substrate $S$. The velocity is parameterized by the turnover number $k_{cat}$ and the Michaelis constant $K_M$. The most useful form of the equation for the biochemist relates the quotient of velocity and enzyme concentration $[E]$ to the substrate concentration $[S]$, the catalytic rate $k_{cat}$, and the Michaelis constant $K_M$ by

$$ \frac{v}{[E]} = \frac{k_{cat}[S]}{K_M + [S]} $$
This notebook is a tutorial for the experimental biochemist who has collected data on the rate of product formation under different concentrations of substrate, as we frequently do in the Siegel group. Let's begin with data workup.
Data workup from a UV/vis plate assay
Our UV/vis plate assay follows the accumulation of 4-nitrophenol (pNP), and provides rates in units of OD/min. We need to convert these into M/min since what we actually care about is product concentration (not OD). A standard curve relating OD/min to concentration of product in M/min can be used to transform the OD/min data.
- Convert your rates from OD/min into M/min (molar per minute) using a standard curve.
In this example, we have experimentally determined that $ \textrm{pNP } \frac{M}{min} = 0.0002 \times \textrm{pNP } \frac{OD}{min} $.
- Convert your enzyme concentration from mg/mL into M by dividing by the extinction coefficient of the protein and correcting for any dilution in your assay procedure.
In this case, the extinction coefficient of BglB is 113330, and we diluted the enzyme twice: first 100-fold and then in the assay plate 4-fold.
- Finally, divide the rates you observe (in units of M/min) by the enzyme concentration that you are testing (in units of M) to obtain rates in min$^{-1}$.
- Convert your substrate concentrations into M as well.
- Put them in a CSV table like the one in this repository called example_data.csv.
In this case, these unit conversions have already been performed on the example data. We could use Python code like the following to do the unit conversions for us, replacing the raw OD/min values in the "rate" column of our data with rates in min$^{-1}$.
ext_coef = 113330
standard_curve_slope = 0.0002
protein_yield = 1 # units of mg/mL
dilution_1 = 100 # 100-fold dilution
dilution_2 = 4 # 4-fold dilution
protein_conc = protein_yield / ext_coef / dilution_1 / dilution_2
df["rate"] = standard_curve_slope * df.rate / protein_conc
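The same conversion can be wrapped in a small function so it runs on any table of raw rates. This is only a sketch, not the workup script we actually use: the function name od_rates_to_per_min, the column name rate_od_per_min, and the three-row table are made up for illustration. The constants are the ones quoted above.

```python
import pandas


def od_rates_to_per_min(df, standard_curve_slope, protein_yield, ext_coef, dilutions):
    """Return a copy of df with a "rate" column in 1/min.

    Hypothetical helper: assumes the raw rates (OD/min) live in a
    column named "rate_od_per_min".
    """
    # enzyme concentration in the assay well, after every dilution step
    protein_conc = protein_yield / ext_coef
    for fold in dilutions:
        protein_conc /= fold

    out = df.copy()
    # OD/min -> M/min via the standard curve, then divide by enzyme concentration
    out["rate"] = standard_curve_slope * out["rate_od_per_min"] / protein_conc
    return out


# made-up raw readings, for illustration only
raw = pandas.DataFrame({
    "substrate": [0.001, 0.005, 0.025],
    "rate_od_per_min": [0.01, 0.04, 0.07],
})

converted = od_rates_to_per_min(
    raw,
    standard_curve_slope=0.0002,
    protein_yield=1,        # mg/mL
    ext_coef=113330,
    dilutions=(100, 4),     # 100-fold, then 4-fold in the plate
)
print(converted)
```

Returning a copy leaves the raw OD/min readings untouched, which makes it easier to redo the workup if the standard curve changes.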
Example using real data
Let's dive into a real example. The dataset that I'm going to use was generated by myself and others in the Siegel Lab at UC Davis as part of the Bagel Project, where I led a project to generate a large dataset of kinetic constants for designed enzymes. See the Bagel Project repository for experimental details, including complete lab protocols for generating high quality kinetic data for designed enzymes.
import pandas
df = pandas.read_csv("example_data.csv")
Now that we have our data in a DataFrame, we can use the built-in plot() method of the DataFrame to make a quick scatter plot of our data and display it on screen.
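A plotting cell for this step might look like the sketch below. The column names substrate and rate are the ones used in the rest of this notebook. The table here is synthetic stand-in data, generated from the Michaelis-Menten equation with made-up parameters, so that the example runs without example_data.csv.

```python
import pandas

# Synthetic stand-in for example_data.csv (made-up values, not real measurements)
substrate = [0.0001, 0.0004, 0.0016, 0.0063, 0.025, 0.1]
demo = pandas.DataFrame({
    "substrate": substrate,
    "rate": [700 * s / (0.005 + s) for s in substrate],
})

# scatter plot of rate against substrate concentration
ax = demo.plot(kind="scatter", x="substrate", y="rate")
ax.set_xlabel("[S] (M)")
ax.set_ylabel("rate (1/min)")
```

In a notebook the figure appears inline. In a plain script, a call to matplotlib.pyplot.show() is needed to display it.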
That looks like it will fit the Michaelis-Menten equation. Since we've converted all our values to use the same units, we can estimate $k_{cat}$ and $K_M$ from looking at the plot.
I estimate $k_{cat}$ to be about 700 min$^{-1}$ and $K_M$, the substrate concentration where the rate is half its max, to be 0.005 M, or 5 mM.
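The same eyeball estimate can be roughed out numerically: take the largest observed rate as a stand-in for the maximum rate, and the substrate concentration whose rate is closest to half of that as a stand-in for the Michaelis constant. This is only a sketch for generating starting guesses, not a substitute for the fit, and the table is synthetic stand-in data with made-up parameters.

```python
import pandas

# Synthetic stand-in data (made-up values, not real measurements)
substrate = [0.0001, 0.0004, 0.0016, 0.005, 0.025, 0.1]
demo = pandas.DataFrame({
    "substrate": substrate,
    "rate": [700 * s / (0.005 + s) for s in substrate],
})

# largest observed rate as a rough guess at the maximum rate
kcat_guess = demo["rate"].max()

# substrate concentration whose rate is closest to half of that maximum
half_max_row = (demo["rate"] - kcat_guess / 2).abs().idxmin()
km_guess = demo.loc[half_max_row, "substrate"]

print(kcat_guess, km_guess)
```

Because the highest substrate concentration rarely saturates the enzyme completely, the largest observed rate tends to underestimate the true maximum. That is fine for an initial guess.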
Fit your data to the Michaelis-Menten equation to determine $k_{cat}$ and $K_M$
Next, we'll use the Python module SciPy to perform a nonlinear least-squares optimization to determine the values of $k_{cat}$ and $K_M$ that fit our data best. First, we'll import curve_fit from SciPy and a couple of NumPy functions that we'll need later
from scipy.optimize import curve_fit
from numpy import diag, sqrt
and define the Michaelis-Menten equation in Python code
def v(s, kcat, km):
    # Michaelis-Menten rate (1/min) at substrate concentration s (M)
    return (kcat * s) / (km + s)
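A quick way to check a function like this is to evaluate it where the answer is known: when the substrate concentration equals the Michaelis constant, the rate should be half of the maximum, and at very high substrate it should approach the maximum. The sketch below redefines the function so it runs on its own, and the parameter values are just the rough estimates read off the plot.

```python
def v(s, kcat, km):
    # Michaelis-Menten rate at substrate concentration s
    return (kcat * s) / (km + s)


kcat, km = 700, 0.005              # rough estimates, for illustration

half = v(km, kcat, km)             # rate at [S] = KM
near_max = v(1000 * km, kcat, km)  # rate at [S] much larger than KM

print(half, near_max)
```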
We need to provide curve_fit with initial guesses at the parameters, which we estimated from the scatter plot above. The value for $k_{cat}$ (the maximum rate observed) appears to be about 700 min$^{-1}$, and the value for $K_M$ looks to be about 0.005 M.
Put these into a new tuple called p0.
p0 = (700, 0.005)  # initial guesses: kcat (1/min), KM (M)
Now we're ready to use curve_fit. The curve_fit documentation indicates that the function returns two arrays, called popt (optimal parameter values) and pcov (parameter covariance). From the documentation:
popt : array Optimal values for the parameters so that the sum of the squared error of f(xdata, *popt) - ydata is minimized
pcov : 2d array The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).
I recommend setting up your code to unpack the function's return value into two variables called popt and pcov.
In this next step, we'll perform the optimization and convert our one standard deviation errors into percent, which I find easier to make sense of for large data sets.
popt, pcov = curve_fit(v, df.substrate, df.rate, p0=p0)
perr = sqrt(diag(pcov))
# convert standard errors to percent of the fitted value;
# blank out any parameter that is zero or smaller than its own error
for i in range(len(popt)):
    if not popt[i] or perr[i] > popt[i]:
        popt[i] = perr[i] = float("nan")
    else:
        perr[i] = perr[i] / popt[i] * 100
Whew! That's it! Let's print out our results in a nice format, rounding numbers as necessary
results = {
    'kcat': '{:0.1f}'.format(popt[0]),
    'kcat_std_err': '{:0.1f}%'.format(perr[0]),
    'km': '{:0.4f}'.format(popt[1]),
    'km_std_err': '{:0.1f}%'.format(perr[1])
}
print(pandas.Series(results))
And, with that, we have taken our experimental measurements (observed product formation rates at 8 different substrate concentrations) and fit them to the Michaelis-Menten equation, in order to determine the $k_{cat}$ and $K_M$ for our designed enzyme.
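One way to build confidence in a procedure like this is to run it on data where the answer is known. The sketch below generates noise-free rates from made-up parameters (these are not results from the example dataset), fits them with curve_fit starting from deliberately off-target guesses, and lets us compare the fitted values with the inputs.

```python
import numpy as np
from scipy.optimize import curve_fit


def v(s, kcat, km):
    # Michaelis-Menten rate at substrate concentration s
    return (kcat * s) / (km + s)


# made-up "true" parameters, for illustration only
true_kcat, true_km = 700.0, 0.005

# 8 substrate concentrations (M) and the noise-free rates they would give
s = np.array([0.0001, 0.0004, 0.0016, 0.0063, 0.025, 0.05, 0.075, 0.1])
rates = v(s, true_kcat, true_km)

# start the fit away from the true values on purpose
popt, pcov = curve_fit(v, s, rates, p0=(500, 0.01))
fit_kcat, fit_km = popt

print(fit_kcat, fit_km)
```

With real, noisy measurements the fitted values will not match so closely, and the standard errors from pcov become the better guide to how well each parameter is determined.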