In the last week or so, I caught this note from the process development team at BI, who used automation to screen cation exchange resins for conditions that led to significant improvements in their main isoform. I love the experimental layout.
One of these days, I've got to set up a 96-well plate with multiple resins. The options are just too fun to consider. For this example, though, I would have proposed a different design: a central composite design with 8 center points, run in duplicate, would use up all 32 wells. Because pH is the harder variable to change, JMP can take that constraint into account and generate a design that is easily programmed. The result would be data with significant statistical power and improved predictive ability for scale-up.
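To make the well count concrete, here is a minimal sketch of how such a layout could be enumerated. It assumes two factors (the only factor count for which 4 factorial + 4 axial + 8 center points, duplicated, gives exactly 32 runs) and an axial distance of sqrt(2); neither assumption comes from the original note, and the function name is purely illustrative. It does not reproduce JMP's handling of hard-to-change factors.

```python
from itertools import product
import math


def central_composite(n_factors=2, n_center=8, alpha=None):
    """Return the coded runs of a central composite design (hypothetical helper)."""
    if alpha is None:
        # Assumed axial distance; for two factors sqrt(2) gives a rotatable design.
        alpha = math.sqrt(n_factors)
    # Factorial corners at -1/+1 for every factor.
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=n_factors)]
    # Axial (star) points: one factor at +/- alpha, the rest at the center.
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))
    # Replicated center points.
    center = [tuple([0.0] * n_factors)] * n_center
    return factorial + axial + center


single = central_composite()   # 4 + 4 + 8 = 16 runs
plate = single * 2             # every run in duplicate
print(len(single), len(plate))
```

In practice the run order would also be arranged so that the hard-to-change factor (pH here) is switched as rarely as possible, which is the part JMP's design tools take care of.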
Discussions about the latest in biologics process development along with discussions of multivariate experimental designs applied to reducing manufacturing costs.
Wednesday, May 29, 2013
Tuesday, May 28, 2013
Critical versus non-critical process parameters
A critical process parameter is one "whose variability has an impact on a critical quality attribute and therefore should be monitored or controlled to ensure the process produces the desired quality."
The previous post related the changes of the aggregate percentages and CHO HCP levels, along with the recovery. Of these, only the former two would be considered as quality attributes (recovery being a business factor).
Protein aggregation is hard to argue against as a critical quality attribute and that means the load and load conductivity would be critical process parameters. The next step is to scientifically justify the greatest acceptable operating range for the process and JMP provides some exceptionally useful tools to assist in this effort.
We could begin with a graphical representation of the space we want to claim as our acceptable operating space:
How much do our product quality attributes vary within this region? The prediction profiler within JMP allows the two prediction curves to be shown simultaneously. Translating the limits from the previous figure establishes the initial lower and upper operating limits (LOL, UOL). These are shown as dashed, red vertical lines in the figure. The cross-hairs are set approximately at the center point of this region.
From a quality perspective, the design space is really one-sided: all product quality attributes meet our design criteria below the lower operating limit. The yield may take a hit, but that's a business problem, not a quality one. Examining the lower and upper edges of the design space offers predictions as to how the product quality attributes vary with the process:
Product Quality at Lower Operating Limit
Product Quality at Upper Operating Limit
The result is a graphical representation of the expected process variation and its impact upon the product quality attributes. These can then be used in supporting the overall process control strategy. When the process is at its upper limits, a significant portion of the samples would not meet the upper limit of 0.8% aggregate. The operating limits may have to be adjusted lower, at the expense of recovery, to ensure the maintenance of the product quality.
Thus, an iterative process of defining and modeling enables a scientific justification for the acceptable operating range of the critical process parameters.
Monday, May 27, 2013
Need some motivation?
I made some figures for those days when I need a refresher because the noise of the day has distracted me from the big picture. I hope all y'all find them as motivating.
Here's a picture of how cancer is distributed around the globe
I was fairly amazed at that distribution until I looked into where the majority of money is spent to treat cancer:
The impact of these numbers only begins to paint the picture of the challenges to be overcome. For example, the distribution of public health:
A significant contribution to this situation is tied to the distribution of wealth on the planet.
- Over 1 billion people live on less than $1/day
- Over 2.7 billion people live on less than $2/day
Along with those very sobering numbers is how the rest of the world is spending its money to take care of its citizens.
The impact is that a patient has to rely on themselves, or on family members, when it comes to treating a problem.
Looking at the dose cost of some anti-cancer drugs, these prices aren't accessible for the rest of the world even if there were medical personnel who were available to offer them:
To bring anti-cancer treatments to the 2.7 billion people living on less than $2/day, the cost of treatment has to be about $1/dose (my estimate, based upon UNICEF numbers for vaccines). The implication is that the cost of making the drug has to be about $0.1/dose (again, my estimate). As a process development engineer, that's the number I'm chasing. As a result, I'll use scale, cost of goods, and anything else I can find to get me as close as possible to that value because the closer I am to there, the greater the patient opportunity for the medication.
If you need a couple of outstanding references, check out this paper by Brian Kelley as well as Suzanne S. Farid. These papers offer valuable insight into the economic forces at play in biopharmaceutical manufacturing.
Sunday, May 26, 2013
Optimizing recovery and aggregation in the stream
The previous post demonstrated the ability of Capto Adhere to recover the antibody with minimal losses when used in the weak exchange partitioning mode. Product quality is also an important criterion for a purification step. For antibodies, one of the essential product qualities is aggregation. Aggregation is a major concern in the field and will probably need an entire post (with refs) detailing how it impacts the development process as well as the risk to patients. With that said, the analysis of the aggregation percentages (determined by size exclusion chromatography) using the response surface model results in the following ANOVA and parameter estimates, respectively:
ANOVA Results
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 5 | 0.00002046 | 4.092e-6 | 4.6700
Error | 4 | 0.00000350 | 8.7624e-7 | Prob > F
C. Total | 9 | 0.00002397 | | 0.0803
Parameter Estimates
Term | | Estimate | Std Error | t Ratio | Prob>|t|
Intercept | | 0.0078988 | 0.00054 | 14.61 | 0.0001*
Load(93,312) | | 0.0004262 | 0.000453 | 0.94 | 0.4002
Load cond(10,30) | Biased | -0.000279 | 0.000415 | -0.67 | 0.5386
Load pH(6,7.5) | Zeroed | 0 | 0 | . | .
Load*Load | | -0.001182 | 0.001377 | -0.86 | 0.4389
Load*Load cond | Biased | 0.0020644 | 0.000462 | 4.47 | 0.0111*
Load cond*Load cond | Biased | 0.000571 | 0.001144 | 0.50 | 0.6439
Load*Load pH | Zeroed | 0 | 0 | . | .
Load cond*Load pH | Zeroed | 0 | 0 | . | .
Load pH*Load pH | Zeroed | 0 | 0 | . | .
The parameter estimates show that the only statistically significant term is the pairwise interaction of the load and the load conductivity. Based upon the chemistry of the Capto Adhere resin, this should make a fair amount of sense. The antibody binds through electrostatic interactions when operating in the weak exchange partitioning mode, so the conductivity is going to play a significant role; however, I wouldn't have expected the pairwise term to be the only significant one. As a result, the final model will have all three terms: the load, the load conductivity, and their pairwise term (despite working from the belief that only statistically significant terms should be included in a final model). The resulting ANOVA shows a reasonably good fit to the data, and the parameter estimates continue to show that only the pairwise interaction is statistically significant.
ANOVA Results
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 3 | 0.00001970 | 6.5664e-6 | 9.2358
Error | 6 | 0.00000427 | 7.1097e-7 | Prob > F
C. Total | 9 | 0.00002397 | | 0.0115*
Parameter Estimates
Term | Estimate | Std Error | t Ratio | Prob>|t|
Intercept | 0.0075547 | 0.000268 | 28.23 | <.0001*
Load(93,312) | 0.0005093 | 0.000353 | 1.44 | 0.1989
Load cond(10,30) | -0.000359 | 0.000363 | -0.99 | 0.3614
Load*Load cond | 0.0019552 | 0.000401 | 4.88 | 0.0028*
Fit of Aggregate Results to Model Prediction
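For anyone who wants to play with the three-term model outside of JMP, here is a minimal sketch that evaluates it by hand. It assumes the JMP notation Load(93,312) and Load cond(10,30) means each factor is coded from -1 to +1 over that range, and that the response is stored as a fraction (so 0.0076 is 0.76% aggregate). The function and variable names are my own, not JMP's.

```python
# Coefficients copied from the three-term parameter estimates (coded units).
COEF = {
    "intercept": 0.0075547,
    "load": 0.0005093,
    "cond": -0.000359,
    "load_cond": 0.0019552,
}
LOAD_RANGE = (93.0, 312.0)   # low/high levels from Load(93,312)
COND_RANGE = (10.0, 30.0)    # low/high levels from Load cond(10,30)


def to_coded(value, low, high):
    """Map a natural-unit value onto the assumed -1..+1 coded scale."""
    midpoint = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return (value - midpoint) / half_range


def predict_aggregate(load, cond):
    """Predicted aggregate (as a fraction) at a given load and load conductivity."""
    x1 = to_coded(load, *LOAD_RANGE)
    x2 = to_coded(cond, *COND_RANGE)
    return (COEF["intercept"]
            + COEF["load"] * x1
            + COEF["cond"] * x2
            + COEF["load_cond"] * x1 * x2)


center = predict_aggregate(202.5, 20.0)
high_corner = predict_aggregate(312.0, 30.0)
print(f"center: {center * 100:.2f} %, high load / high cond: {high_corner * 100:.2f} %")
```

Under these assumptions the center of the design predicts about 0.76% aggregate, while the high-load, high-conductivity corner predicts roughly 0.97%, which is the interaction term doing most of the work.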
Does the model improve using only the pairwise interaction? Yes! The F ratio has increased significantly, primarily because our model now takes only a single degree of freedom!
ANOVA Results using only Statistically Significant Term
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 1 | 0.00001758 | 0.000018 | 22.0356
Error | 8 | 0.00000638 | 7.979e-7 | Prob > F
C. Total | 9 | 0.00002397 | | 0.0016*
Notice that the Mean Square of the Pure Error is 7.979e-7? The square root of this value gives an estimate of the pure error in the model. The result (0.09%) is in pretty good agreement with what I'd estimate as the precision of the SEC assay. With the model established, the next step is to optimize the process.
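The arithmetic behind that 0.09% is short enough to check by hand. This sketch assumes, as the parameter estimates suggest, that the aggregate response was entered as a fraction rather than as a percentage.

```python
import math

# Mean square error from the single-term ANOVA table (response assumed to be a fraction).
mean_square_error = 7.979e-7

rmse = math.sqrt(mean_square_error)   # same units as the response
rmse_percent = rmse * 100.0           # expressed as % aggregate
print(f"RMSE = {rmse:.6f} (fraction) = {rmse_percent:.2f} % aggregate")
```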
Now, JMP allows us to combine the model for the recovery and the aggregate to guide us in our decisions about the process limits. In this case, a contour plot allows for a good visualization.
The lower limit for the recovery is set to 95% and the upper limit for the aggregate is 0.8%. The resulting contour plot then shades out the region that won't give 95% recovery (in red) and the region that won't give less than 0.8% aggregate (in green). The white space represents a first step towards establishing an acceptable operating space for the process. When the results from the CHO HCP ELISA are included, the operating space undergoes even greater definition.
The upper limit to the HCP content is set at 20 ppm with the assumption that a downstream purification step would provide additional clearance. The process space then becomes restricted to a region centered around the cross hairs. The contour plot may then be used to provide a scientific justification to the regulatory agencies for the manufacturing control strategy and process validation approach. In fact, the relationship of this data to the commercial strategy will probably take up several postings in the future!
JMP also allows the exporting of these results as a flash file. These can be particularly useful when trying to message the results to management. I'll try to upload a flash file in the future.
Friday, May 24, 2013
The Capto Adhere Evaluation, continued
*NOTE - This may show up as a new post, but I had to do some edits to make sure the figures went in properly. I'm still building my workflow.*
In the previous post, I was completing an analysis of a central composite design using the load amount, load conductivity, and load pH as inputs for the study. The initial results identified that all three factors were important; however, there were no pairwise interactions present (nor higher terms, such as quadratic). In the original paper, the authors point out that one of their runs had an abnormally low recovery percentage and so was omitted. Let's take a look at how the model changes when this is done.
First, notice where the point (red triangle) falls in the plot of the actual vs predicted:
It's pretty clear this point is having a significant impact on the model through leverage. When the model of main effects only is applied to the results, the ANOVA results are
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 2 | 0.06547415 | 0.032737 | 8.7720
Error | 7 | 0.02612395 | 0.003732 | Prob > F
C. Total | 9 | 0.09159810 | | 0.0124*
and the parameter estimates, along with their probabilities, are
Term | | Estimate | Std Error | t Ratio | Prob>|t|
Intercept | | 0.8820829 | 0.019367 | 45.55 | <.0001*
Load(93,312) | | 0.1042488 | 0.025469 | 4.09 | 0.0046*
Load cond(10,30) | Biased | -0.028196 | 0.024983 | -1.13 | 0.2963
Load pH(6,7.5) | Zeroed | 0 | 0 | . | .
The load conductivity and load pH are no longer statistically significant; however, looking at the actual vs predicted, several points are observed to be outside the confidence intervals.
As this was a CCD, we are justified in adding some higher-order terms. Restarting the analysis with a response surface model, the ANOVA shows the following results
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 5 | 0.08642340 | 0.017285 | 13.3609
Error | 4 | 0.00517470 | 0.001294 | Prob > F
C. Total | 9 | 0.09159810 | | 0.0132*
with the parameter estimates as
Term | | Estimate | Std Error | t Ratio | Prob>|t|
Intercept | | 0.9453904 | 0.020768 | 45.52 | <.0001*
Load(93,312) | | 0.1032109 | 0.017412 | 5.93 | 0.0041*
Load cond(10,30) | Biased | -0.026721 | 0.01594 | -1.68 | 0.1690
Load pH(6,7.5) | Zeroed | 0 | 0 | . | .
Load*Load | | -0.135078 | 0.052907 | -2.55 | 0.0631
Load*Load cond | Biased | 0.0373873 | 0.017758 | 2.11 | 0.1030
Load cond*Load cond | Biased | 0.023048 | 0.043967 | 0.52 | 0.6278
Load*Load pH | Zeroed | 0 | 0 | . | .
Load cond*Load pH | Zeroed | 0 | 0 | . | .
Load pH*Load pH | Zeroed | 0 | 0 | . | .
Now, what to do? The dominant effect is the load, and maybe the load*load term (p=0.0631). There are some camps that say a model should contain only the statistically significant terms, while others would argue that the fit should be of the entire model. Taking only the load and load*load terms, the ANOVA is
Source | DF | Sum of Squares | Mean Square | F Ratio
Model | 2 | 0.07851582 | 0.039258 | 21.0059
Error | 7 | 0.01308228 | 0.001869 | Prob > F
C. Total | 9 | 0.09159810 | | 0.0011*
What is interesting to note is how the F ratio has increased as we reach our best fit (8.77, then 13.36, then 21.01). Relative to the full response surface model, this comes with the increased degrees of freedom for the error term. The final model has only two degrees of freedom (load, load^2), leaving the error term 7 degrees of freedom to estimate the mean square error (0.001869). The estimated RMSD is obtained by taking the square root of the mean square error.
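As a quick sanity check, the F ratios and the RMSD can be recomputed from the sums of squares and degrees of freedom in the three ANOVA tables. This is only a sketch of the standard F = (SS_model/DF_model) / (SS_error/DF_error) arithmetic; the model labels and function name are mine, not JMP's.

```python
import math

# (SS model, DF model, SS error, DF error) copied from the three recovery ANOVA tables.
models = {
    "main effects": (0.06547415, 2, 0.02612395, 7),
    "response surface": (0.08642340, 5, 0.00517470, 4),
    "load + load^2": (0.07851582, 2, 0.01308228, 7),
}


def f_ratio(ss_model, df_model, ss_error, df_error):
    """F ratio = mean square of the model divided by mean square of the error."""
    return (ss_model / df_model) / (ss_error / df_error)


f_values = {name: f_ratio(*values) for name, values in models.items()}
for name, f in f_values.items():
    print(f"{name:>16}: F = {f:.4f}")

# RMSD of the final model: square root of its mean square error.
_, _, ss_error, df_error = models["load + load^2"]
rmsd = math.sqrt(ss_error / df_error)
print(f"RMSD = {rmsd:.4f} (fraction recovered), i.e. about {rmsd * 100:.1f} % recovery")
```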
Finally, the Prediction Profiler may be used to predict how the recovery changes as a function of load:
The sweet spot is around 200 g/L; beyond that point, a higher load gives only a modest improvement in recovery. Of note: look at how quickly the recovery drops at lower loads! At 175 g/L, it's down to 90% and then continues to plummet.