Cell culture media is to mammalian cells what a workout smoothie is to your human cells.
Media's purpose is to make a stainless steel bioreactor hospitable to CHO cells by providing volume where temperature, dissolved oxygen (dO2), and pH can be controlled.
It must also provide nutrients for cellular uptake and a place to absorb metabolic waste. At the start of cell culture, the nutrient supply is defined; by the end of cell culture, the nutrients are depleted. Despite the added nutrients, media is still largely water and can thus be modeled at a density of 1 g/mL.
You make cell culture media the same way you'd make a smoothie, except that at large scale you're making 100 or 500 or even 15,000 liters, so there are differences.
- Initial QS.
This is where you add water-for-injection (WFI) into a clean media preparation tank.
- Add media powder.
Once the powder touches the water, the media can promote growth. Since the media prep tank is only clean (and not sterile), if you don't proceed quickly, you may have a contamination on your hands.
- Add bicarbonate powder
Bicarbonate is the buffer. This whole time, you are agitating, and pH control is OFF.
- Add peptones (optional).
Over the past decade, we've seen movement away from bovine (cow) to porcine (pig) peptone. I've read that we now use veggie peptone, but have never seen it.
- Adjust pH
Some will dispute the necessity of adjusting the pH in the media prep tank because once the media is transferred to the bioreactor the pH will get adjusted there.
- Final QS
This is where you add the rest of the water. If you have an osmolality specification, you'd measure it here as an in-process test before transferring the media to the bioreactor.
- Transfer/sterile filter the media
While pumping the media over to the bioreactor, there will be sterile filters that remove 0.1 micron particles so that the media that ends up in the bioreactor is free of microbes. In some cases, the media is virally inactivated by passing through a "pasteurizer" that raises the temperature to 121 degC, holds it for 1 minute and cools it down.
It's a bit more than making a smoothie since mixing in a blender is forgiving. But in the preparation of cell culture media where you are making thousands of liters of this stuff at 7 bucks per gallon ($2/liter), the large-scale media preparation procedure has to be written to be highly reproducible.
Credits: Image above is from the greatest movie of all time - The Matrix (1999).
Suppose you support a batch process. The way you likely measure performance is to sample each batch and measure different parameters. These measurements are ideal for plotting on an IR control chart - one control chart for each parameter and each batch would be represented by one point on the control chart.
If you have statistical software like JMP, then you can just click around on the menu and control charts appear like magic.

But suppose Wall Street bankers crashed the economy by securitizing AAA-rated subprime mortgages and you are the collateral damage; forking over $1,250 for a single-user annual license or $1,895 for a single-user perpetual license of JMP isn't in the cards. What do you do?
Good news. William Shewhart developed control charting principles long before computers, so if worst comes to worst, you could probably create a control chart from graph paper and a grease pencil.
Here's what you do:
- Get the data into a column
- Compute moving range
- Multiply MR by 3 and divide by 1.128
We're not going to do it with a grease pencil and graph paper. We're going to do it with a spreadsheet.
Step 1: Get the data into a column
We haven't talked about this yet, but data for analysis needs to be structured. If you look at the numbers in a column and they represent what the column headers describe, then you got it right.

Step 2: Compute the Moving Range

This is where you take the absolute value of the difference between consecutive measurements. =ABS(B3-B2) would be the formula that you'd drag down column C. The average of the moving range is used to determine the width of the control limits.
Step 3: Compute distance to control limits
To get the distance to each control limit, compute 3 * Average( MovingRange ) / 1.128.

In this case, the average of the moving range is 3.90. Take 3.90 * 3 / 1.128 = 10.37.
The Upper Control Limit (UCL) is 296 + 10.37 = 306
The Lower Control Limit (LCL) is 296 - 10.37 = 286
Calculate limits for every parameter you measure. Once the process is steady, lock the limits and monitor the process against those locked-down limits to detect drift.
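If a spreadsheet isn't handy either, the same three steps fit in a few lines of Python. This is a minimal sketch, not a replacement for JMP: the baseline numbers are hypothetical (they are not the data set from the screenshots), and the function names `ir_limits` and `out_of_control` are made up for illustration.

```python
from statistics import mean

D2 = 1.128  # constant for moving ranges computed from 2 consecutive points


def ir_limits(values):
    """Return (center, lcl, ucl) for an IR (individuals) control chart."""
    # Step 2: moving range = absolute difference between consecutive points
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # Step 3: distance to each control limit
    distance = 3 * mean(moving_ranges) / D2
    return center, center - distance, center + distance


def out_of_control(new_values, lcl, ucl):
    """Return the new points that fall outside the locked limits."""
    return [v for v in new_values if v < lcl or v > ucl]


# Step 1: data in a column (hypothetical baseline batches)
baseline = [296, 300, 293, 297, 292, 298, 295, 299, 294, 296]
center, lcl, ucl = ir_limits(baseline)
print(f"center={center:.1f}  LCL={lcl:.1f}  UCL={ucl:.1f}")

# Lock the limits, then check later batches against them
print(out_of_control([297, 301, 310, 284], lcl, ucl))
```

The limits are computed once from the steady baseline and then reused as-is for later batches, which is the "lock the limits" step.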
Speaking to a lot of prospects, I get this sense that the purpose of building and commissioning their plant is so the project engineer can finish their project. That the purpose of deploying control systems is to fill up their PI archives.
To me, that's wagging the dog.
You see, I'm the guy who uses the system. The cell culture engineer applying statistical process control concepts to large-scale cell culture manufacturing. The guy who comes in after the system integrator does the installation qualification (IQ) and operation qualification (OQ), and who applies all that data to get a return for the company.
The system integrator is some automation engineer that works for a system integration company and his core competency is to deploy these systems. When that system integrator is done, he's off to another project at another company. He's not sticking around to actually use the system that he helped build.
Why is this important? A lot of prospects I meet are leaving value on the table. They have uber-powerful systems like OSIsoft PI, but for the most part, don't use them. A lot of prospects think it is there as a CYA requirement... like something required to comply with cGMP.
The system integrator can tell you all about scan classes or data compression or instrument tags. The system integrator can't tell you about how best to build PI ProcessBook displays so your engineers can apply the data immediately... or the principles of PI Alias Nomenclature... or best practices for defining your equipment hierarchy.
Those items... crucial for getting a high ROI on your system... sit in this immense vacuous gap.
My opinion for what the solution is? A system integrator with real experience in your world, solving your problems with the tools you are trying to deploy.
For SCADA integrators, you're probably okay. For OSIsoft PI system integration? Go with a domain expert.
Biologics are medicinal products created by a biological process - as opposed to chemical synthesis. What this means is that you can't get a couple beakers, Petri dishes and Bunsen burners in a lab and produce drugs like growth hormone.
Say you wanted growth hormone before Watson and Crick worked out the structure of DNA. You'd have to squeeze growth hormone out of the pituitary glands of pigs. This sucked because after you've gone through all the trouble of a sterile preparation, you're still left with porcine growth hormone. Worse, the manufacturing process is unscalable.
Now that we know about DNA, the manufacturing of biologics at large-scale is possible.
You see, medicines these days - as in drugs - are often complex proteins.
Proteins are molecules - composed of a sequence of amino acids - whose shape is determined by the sequence of said amino acids.
Amino acid sequence is determined by the DNA sequence.
So if you want to make biologics, you basically have to start with the protein... reverse engineer the amino acid sequence and then string together DNA that encodes for the amino acid sequence.
If you transfect (i.e. poke your DNA into) CHO cells (or other mammalian cells) and the DNA lands on a part of the chromosome that is heavily transcribed, then you have a cell line capable of producing the active pharmaceutical ingredient (API).
Biologics manufacturing then becomes growing the cells/having them secrete the API (a.k.a. cell culture) and subsequently purifying the drug.
Biologics made with recombinant DNA technology can be made reliably. Lean manufacturing principles can be applied and significant medical needs can be met.
Companies that own the infrastructure to identify marketable proteins and manufacture tons of it (while meeting regulatory requirements) are the ones with dominant positions.
More on cell culture.
Buzzwords are aplenty in this line of work: Lean manufacturing, lean six sigma, value stream mapping, business process management... Class A. But at the end of the day, we're talking about exactly one thing: continuous process improvement:
How to get your manufacturing processes (as well as your business process) to be better each day.
And to that, I say, "Pick your weapon and let's get to work." For me, I prefer statistical process control because SPC was invented in the days before continuous process improvement collided with information technology.
Back in those days, things had to be done by hand and concepts had to be boiled down to simple terms: special cause vs. common cause variability could simplify what was going on and clarify decision making. In those days, there was no time for complexity of words and thought. Having just Excel spreadsheets is a vast technological improvement over paper and pencil.
If we say words from the slower days of yesteryear, but use tools from today, we can solve a lot of problems and make a lot of good decisions.
Companies like Zymergi are third-party consultants who can help develop your in-house continuous process improvement strategy... especially for cell culture and fermentation companies. We focus on applying statistical process control as well as knowledge management so that once we reduce process variability and increase reliability, the gains stay in place.
The technology is there to institutionalize the tribal knowledge so that when people leave... your high-paid consultants leave, the continuous process improvement know-how stays.
We use SPC and statistical analysis because it has been proven by others and it is proven by us. Data-driven decisions deliver real results.
7 Tools of SPC
- Control Charts
- Histograms
- Correlations
- Ishikawa Diagrams
- Run Charts
- Process Flow Diagrams
- Pareto Charts
Continuing on banalities of sub-second data, I got a really good question today about excmin vs. scan rate:
What is the relationship between scan rate and excmin?
- Scan Rate
- tells the interface how frequently to fetch the data
- Excmin
- tells the OSIsoft PI snapshot how frequently to ignore the data
It stands to reason that if you tell the interface to fetch sub-second data from the data source, then you don't want to throw out that data. If you do, you're simply clogging your network pipes with traffic that you end up tossing.
In practice, I have not seen excmin used. I'm sure it's useful when you absolutely, positively don't want data within excmin seconds of the previous point. But when you're worried about the 22 tag attributes for 5000 tags, that's a lot to worry about.
What typically happens is that PI tag data compression is handled with:
- Excmin=0
- Compmin=0
- Excdev>0
- Compdev>0
Here's why:
We want to see what we're getting, so filtering out primary data just because it didn't fall inside of some arbitrary time-window is actually not GMP.
Secondly, filtering out primary data because it falls within the instrument accuracy of the measurement is valid: you are determining what the instrument noise is and rationally filtering it out.
So there it is:
- If you have sub-second data, set excmin and compmin = 0.
- Generally speaking, it's good practice to set both equal to 0 anyway
- Use excdev/compdev for data compression
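To see why a non-zero excmin can throw away exactly the data you wanted, here is a rough sketch of a time-window-plus-deadband filter. It is a deliberately simplified stand-in and not the actual PI exception or compression algorithm; the function name `exception_filter` and the pH readings are hypothetical.

```python
def exception_filter(points, excdev, excmin=0.0):
    """Simplified illustration (NOT the real PI exception test).

    Keep a (time_s, value) point only if it arrives at least `excmin`
    seconds after the last kept point AND differs from the last kept
    value by more than `excdev`.
    """
    kept = []
    for t, value in points:
        if not kept:
            kept.append((t, value))  # always keep the first point
            continue
        last_t, last_value = kept[-1]
        if t - last_t < excmin:
            continue  # dropped purely because of the time window
        if abs(value - last_value) > excdev:
            kept.append((t, value))
    return kept


# Hypothetical pH readings scanned every 0.1 s, with a brief spike at t=0.3
readings = [(0.0, 7.00), (0.1, 7.01), (0.2, 7.00),
            (0.3, 7.30), (0.4, 7.01), (1.0, 7.02)]

print(exception_filter(readings, excdev=0.05, excmin=0.0))  # spike survives
print(exception_filter(readings, excdev=0.05, excmin=1.0))  # spike is lost
```

With excmin=0, only the deviation test decides what is kept, so noise inside the deadband is dropped and the spike is retained. With excmin=1, the spike is discarded simply because it arrived too soon after the previous point.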
I found out today that the PI-OPC Interface supports sub-second data. I'd imagine that this comes as no surprise to many of you, but it certainly does to me.
OSIsoft PI has supported the archival of sub-second data for quite some time. And for cell culture/fermentation processes, sub-second data is overkill. Cell culture happens over the course of days, production culture...weeks. Fermentations happen in hours, so very few things happen in between seconds.
Actually, there was this one time there was a rupture disk on a pasteurizing unit that had a setting of 50 psi. When the transfer of media through the pasteurizer failed due to the rupture disk, the highest pressure reading on OSIsoft PI was 29 psi. As it turns out, there was a pressure spike that happened on a sub-second basis and was not captured... I suppose some large-scale manufacturing activities may require sub-second data for troubleshooting.
But ever since moving to sub-second data, it has been a pain because an event may happen at
18-Apr-12 01:24:03.5566
but if you were searching between
18-Apr-12 and 18-Apr-12 1:24:03, then you'd miss this event.
As is, people despise typing out more characters for specifying time. And in cell culture processes, it is simply not necessary. But as OSIsoft PI evolves to serve multiple industries, nuisances like sub-second data start cropping up.
In any case, the way to specify a sub-second scan-rate is at the interface. In the case of the PI-OPC Interface, you can specify the scan class as a fraction. If you wanted to specify scan rates at 1/10th, 1, 2, 5, 10, 30 and 60 seconds, your interface configuration file should read:
/f=0.1 /f=1 /f=2 /f=5 /f=10 /f=30 /f=60
Then any tag that you want to scan 10 times per second should have location4=1 (since 0.1 is the first scan class).
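If you maintain a lot of tags, the scan class lookup is easy to get wrong by one. Below is a small sketch that derives the Location4 value from the `/f=` entries. It assumes every `/f=` value is written as plain seconds, as in the line above; the helper names are hypothetical and not part of any OSIsoft tool.

```python
def scan_classes(config_line):
    """Return the /f= scan periods (in seconds) in the order they appear."""
    return [float(token[3:]) for token in config_line.split()
            if token.startswith("/f=")]


def location4_for(config_line, seconds):
    """Return the 1-based scan class number for a given scan period."""
    # list.index raises ValueError if no scan class has that period
    return scan_classes(config_line).index(seconds) + 1


config = "/f=0.1 /f=1 /f=2 /f=5 /f=10 /f=30 /f=60"
print(location4_for(config, 0.1))  # first scan class
print(location4_for(config, 10))
```

Scan classes are numbered by position, so reordering the `/f=` entries changes which Location4 value a tag needs.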
In any case,
- Few cell culture/fermentation processes require sub-second scan rates.
- Well duh: a PI system capable of archiving sub-second data has interfaces engineered to deliver data at sub-second scan rates.
It's the year 2012 and still, I see customers with batch processes not using PI Batch... the proven system for navigating batches in PI. Truth be told, some of these customers are not using OSI PI, which is in itself a problem.
Batch Relativity is having the start/end times of a time-window so that when you need to look at a trend, you can plot it without having to manually input the timestamps:
Of course for time-windows in the recent past, you can use the arrows:
But for precise review of trends in the past, there are few alternatives to manual input.
When I was first starting out as a fermentation engineer, I distinctly recall getting the Gantt charts from the Planning & Scheduling department at the morning meetings and typing in estimates of the start/end times from the 11x17" paper I got each morning thinking there must be a better way. And there is: you can programmatically specify start/endtimes from PI Batch into the PI Trends.
If you have a batch manager, you can purchase software that writes to the PI Batch Database. For example, if you have an Emerson DeltaV system, you can purchase the Emerson DeltaV Batch Interface (EMDVB) that reads from EVT files and inserts records into the PI Batch database. Otherwise, you can use the native PI Batch Generator (PI BaGen) Interface that comes with the PI Server.
PI Batch Generator
To use the PI Batch Generator, there are several prerequisites. The first is having an Active Tag: a tag whose value = 1 when the batch is running and 0 when the batch is stopped.
This is the minimum requirement for PI Batch Generator to work:
- A PI Unit for each unit you wish to track batches
- An Active Tag for each unit
- PI Batch Generator Interface installed as a service
For bioreactors (i.e. fermentors), if you don't have a tag that specifically starts/ends a batch, the tag you can use is the pH Controller Mode. Here's why:
You are generally interested in what goes on in the fermentor when there's something going on. And something is going on when it is batched with media. And when it is batched with media, pH control is typically ON, which means the pH Controller Mode = 1. On the back end of the batch, you typically turn off pH control after transfer or harvest, so pH Controller Mode = 0 when the batch ends.
You'll know that you've picked the right point when your process values change when Active Tag = 1 and they flatline when Active Tag = 0:
For other types of process equipment, be clever with your existing tags to figure out the best Active Tag; for example, volume tends to be a good Active Tag.
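The logic behind an Active Tag is just edge detection: a batch starts when the tag goes from 0 to 1 and ends when it goes from 1 to 0. The sketch below illustrates that idea only. It is not how PI Batch Generator is implemented, and the function name and timestamps are hypothetical.

```python
from datetime import datetime


def batch_windows(active_tag_events):
    """Turn time-ordered (timestamp, value) events into (start, end) windows.

    A 0 -> 1 transition opens a batch; a 1 -> 0 transition closes it.
    A batch that is still running gets an end time of None.
    """
    windows = []
    start = None
    previous = 0  # assumption: the unit is idle before the first event
    for timestamp, value in active_tag_events:
        if previous == 0 and value == 1:
            start = timestamp
        elif previous == 1 and value == 0:
            windows.append((start, timestamp))
            start = None
        previous = value
    if start is not None:
        windows.append((start, None))
    return windows


# Hypothetical pH Controller Mode history for one bioreactor
events = [
    (datetime(2012, 4, 1, 8, 0), 0),
    (datetime(2012, 4, 2, 9, 30), 1),   # batched with media, pH control ON
    (datetime(2012, 4, 5, 14, 0), 0),   # harvested, pH control OFF
    (datetime(2012, 4, 6, 10, 15), 1),  # next batch, still running
]
for start, end in batch_windows(events):
    print(start, "->", end)
```

Once you have the start/end pairs, they are exactly the timestamps you would otherwise type into a trend by hand.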
With the Active Tag, you have satisfied the only requisite for using PI Batch Generator; all others are optional:
- Batch ID tag - a tag whose value equals the batch id at the time the batch is started.
Typically some gibberish word that uniquely identifies this batch.
- Product tag - a tag whose value equals the name of the product being produced. (e.g. 'HER2', 'E25', 'VEGF')
- Procedure tag - a tag whose value equals manufacturing formula used
These values can be programmatically inserted in the event you don't want to consume tags for infrequent data.
While control limits are approximations of 3 standard deviations, they are not 3 standard deviations.
In thermodynamics, we talk about state variables and path variables. A state variable like internal energy (U) "is what it is." Other variables like work (w) are path variables: the value "depends on how you got there."
Standard deviation is a "state"-like parameter… if you have a set of points, the standard deviation is the standard deviation; it does not matter the order in which the data happened.

Using the same data from our previous control charting example, we see a standard deviation of 2.9 and a mean of 295. The band of 3 standard deviations around the average is 286 - 303.
Control limits, on the other hand, are path-like parameters that depend on the order in which the data were received. In the case of pretty random data, the control limits are 285 - 306... which is pretty close to the 3 standard deviations, but not exact.

Viewing the control chart, it's obvious there are no special cause signals and there are no patterns in the data that indicate the data is out of the ordinary.
But suppose we got the same exact measurements... except this time, we found that each observed value was equal to or higher than the previous:

The standard deviation remains the same and therefore average +/- 3 standard deviations remains the same: 286 - 303. But look at the control limits... they have tightened significantly to 292 - 298.
This is because the control limits are computed from the moving range. When the same data shows an ascending pattern, the moving ranges are small, so the control limits shrink and flag special cause signals where the standard deviations do not.
Apply 3 standard deviations where they are applicable; they are not applicable when identifying special cause signals of stable processes.
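You can check the state-vs-path behavior yourself with a few lines of Python. This is a sketch using hypothetical numbers (not the data set charted above): the same ten values are evaluated once in a scrambled order and once sorted ascending.

```python
from statistics import mean, stdev


def sigma_band(values):
    """Mean +/- 3 standard deviations: ignores the order of the data."""
    m, s = mean(values), stdev(values)
    return m - 3 * s, m + 3 * s


def control_limits(values):
    """IR chart limits from the average moving range: depends on the order."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    distance = 3 * mean(moving_ranges) / 1.128
    return mean(values) - distance, mean(values) + distance


scrambled = [296, 300, 293, 297, 292, 298, 295, 299, 294, 296]
ascending = sorted(scrambled)  # same measurements, trending upward

for label, data in (("scrambled", scrambled), ("ascending", ascending)):
    lo_s, hi_s = sigma_band(data)
    lcl, ucl = control_limits(data)
    flagged = [v for v in data if v < lcl or v > ucl]
    print(f"{label}: 3-sigma {lo_s:.1f}-{hi_s:.1f}, "
          f"limits {lcl:.1f}-{ucl:.1f}, flagged {flagged}")
```

The 3-sigma band is identical for both orderings, while the control limits tighten for the ascending series and start flagging points at both ends of the trend.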
A control chart is a graphical tool that helps you visualize process performance. Specifically, control charts help you visualize the expected variability of a process and unambiguously tell you what is normal (a.k.a. "common cause variability") and what is abnormal (a.k.a. "special cause variability").
Discerning common-cause from special-cause variability is crucial because responding to low results that are within expectation often induces more variability.
So up to this point, we know that low process variability allows us to detect changes to the process sooner. We also know that low process variability enables processes with higher capability.
Below is the control chart of the buffer osmo data from a previous blog post on reducing process variability.

The green horizontal line is the average of the population and the red lines are the control limits (upper control limit and lower control limit). Points that are within the UCL and LCL are expected (a.k.a. "common"). Points outside of the limits are unexpected (a.k.a. "special"). From the control chart, you can immediately see that the latest value of 301 mOsm/kg is "normal" or "common", and that no response is necessary.
Below, you see the control chart for the second set of data and how a reading of 297 mOsm/kg after 8 consecutive readings of 295 mOsm/kg is anomalous and certainly worth an extra look.

There are all kinds of control charts and they have a rich history - worth reading if you're into that kind of thing. In batch/biologics processes, each data point corresponds with exactly one batch and so the type of control chart used is the IR chart.
It is important to know that the control limits are not computed from standard deviations - they are computed from the moving range... without going full nerd, the reason behind this is that control limits are sensitive to the order in which the points were observed and narrow when there is a trending pattern in the data.
Control charts for key process performance indicators are a must for any organization serious about reducing process variability. Firstly, control charts quantify variability. Secondly, control charts are easy to understand. Lastly, and most importantly, control charts help marshal scarce resources by distinguishing common cause from special cause.