In the Process Stream: The Capto Adhere Evaluation, continued

*NOTE - This may show up as a new post, but I had to do some edits to make sure the figures went in properly. I'm still building my workflow.*

In the previous post, I was completing an analysis of a central composite design using the load amount, load conductivity, and load pH as inputs for the study. The initial results identified that all three factors were important; however, there were no pairwise interactions present (nor higher terms, such as quadratic). In the original paper, the authors point out that one of their runs had an abnormally low recovery percentage and so was omitted. Let's take a look at how the model changes when this is done.

First, notice where the point (red triangle) falls in the plot of the actual vs predicted:

It's pretty clear this point has a large influence on the model through its leverage. With that run omitted and a main-effects-only model fitted, the ANOVA results are

Source	DF	Sum of Squares	Mean Square	F Ratio	Prob > F
Model	2	0.06547415	0.032737	8.7720	0.0124*
Error	7	0.02612395	0.003732		
C. Total	9	0.09159810			
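
The F ratio and its p-value can be reproduced by hand from the sums of squares and degrees of freedom in the table. Here is a minimal Python sketch; the function name is my own, and the closed-form tail probability only holds when the model has exactly 2 degrees of freedom, as it does here.

```python
def anova_f_test(ss_model, df_model, ss_error, df_error):
    """Return (F ratio, p-value) from the Model and Error rows of an ANOVA table.

    The p-value uses the closed form for an F distribution with
    2 numerator degrees of freedom, so df_model must be 2.
    """
    if df_model != 2:
        raise ValueError("closed-form p-value here assumes df_model == 2")
    ms_model = ss_model / df_model  # mean square for the model
    ms_error = ss_error / df_error  # mean square error
    f_ratio = ms_model / ms_error
    # Upper tail of F(2, df_error): (1 + 2F/df_error) ** (-df_error / 2)
    p_value = (1 + 2 * f_ratio / df_error) ** (-df_error / 2)
    return f_ratio, p_value


# Values copied from the main-effects ANOVA table
f_ratio, p_value = anova_f_test(0.06547415, 2, 0.02612395, 7)
print(f"F = {f_ratio:.4f}, Prob > F = {p_value:.4f}")
```

A model with more than 2 degrees of freedom would need a general F distribution routine instead of this shortcut.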

and the parameter estimates, along with their p-values, are

Term		Estimate	Std Error	t Ratio	Prob>\|t\|
Intercept		0.8820829	0.019367	45.55	<.0001*
Load(93,312)		0.1042488	0.025469	4.09	0.0046*
Load cond(10,30)	Biased	-0.028196	0.024983	-1.13	0.2963
Load pH(6,7.5)	Zeroed	0	0	.	.
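
Each t ratio in the table is the estimate divided by its standard error, and the p-value is the two-sided tail probability of a t distribution on the error degrees of freedom (7, from the ANOVA table). The sketch below checks this for the two estimable factor terms. The helper name and the Simpson's-rule integration are my own choices to stay within the Python standard library; this is not a description of what JMP does internally.

```python
import math


def two_sided_t_pvalue(t_ratio, df, steps=2000):
    """Two-sided p-value for a t ratio, via Simpson's rule on the t density."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_ratio)
    h = upper / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_area = 2 * total * h / 3  # probability that |T| <= |t_ratio|
    return max(0.0, 1 - central_area)


ERROR_DF = 7  # error degrees of freedom for the main-effects fit
terms = {
    "Load": (0.1042488, 0.025469),        # (estimate, std error) from the table
    "Load cond": (-0.028196, 0.024983),
}

results = {}
for name, (estimate, std_error) in terms.items():
    t_ratio = estimate / std_error
    results[name] = (t_ratio, two_sided_t_pvalue(t_ratio, ERROR_DF))
    print(f"{name}: t = {t_ratio:.2f}, p = {results[name][1]:.4f}")
```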

The load conductivity is no longer statistically significant (p=0.2963), and the load pH term has been zeroed out of the fit entirely; however, looking at the actual vs predicted plot, several points fall outside the confidence intervals.

As this was a CCD, we are justified in adding some higher order terms. Restarting the analysis with a response surface model, the ANOVA shows the following results

Source	DF	Sum of Squares	Mean Square	F Ratio	Prob > F
Model	5	0.08642340	0.017285	13.3609	0.0132*
Error	4	0.00517470	0.001294		
C. Total	9	0.09159810			

with the parameter estimates

Term		Estimate	Std Error	t Ratio	Prob>\|t\|
Intercept		0.9453904	0.020768	45.52	<.0001*
Load(93,312)		0.1032109	0.017412	5.93	0.0041*
Load cond(10,30)	Biased	-0.026721	0.01594	-1.68	0.1690
Load pH(6,7.5)	Zeroed	0	0	.	.
Load*Load		-0.135078	0.052907	-2.55	0.0631
Load*Load cond	Biased	0.0373873	0.017758	2.11	0.1030
Load cond*Load cond	Biased	0.023048	0.043967	0.52	0.6278
Load*Load pH	Zeroed	0	0	.	.
Load cond*Load pH	Zeroed	0	0	.	.
Load pH*Load pH	Zeroed	0	0	.	.
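
As I understand JMP's notation, Load(93,312) means the factor is coded so that 93 maps to -1 and 312 maps to +1, and the estimates apply to the coded values. As a rough sketch of how a prediction would be computed from this table, the snippet below evaluates only the intercept, Load, and Load*Load terms, with load conductivity assumed to sit at its coded center (0) so that its terms drop out. The function names are illustrative, and because several of these estimates are flagged as Biased or Zeroed, the output should be read as an illustration of the arithmetic rather than as a validated model.

```python
LOAD_LOW, LOAD_HIGH = 93.0, 312.0  # g/L, taken from the Load(93,312) label


def code_load(load_g_per_l):
    """Map a load in g/L onto the coded -1..+1 scale."""
    center = (LOAD_HIGH + LOAD_LOW) / 2
    half_range = (LOAD_HIGH - LOAD_LOW) / 2
    return (load_g_per_l - center) / half_range


def predicted_recovery(load_g_per_l):
    """Recovery fraction from the intercept, Load, and Load*Load estimates only.

    Assumption: load conductivity is held at its coded center (0).
    """
    x = code_load(load_g_per_l)
    return 0.9453904 + 0.1032109 * x - 0.135078 * x * x


for load in (93, 150, 175, 200, 250, 312):
    print(f"{load:>3} g/L -> coded {code_load(load):+.2f}, "
          f"recovery {predicted_recovery(load):.3f}")
```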

Now, what to do? The dominant effect is the load, and maybe the load*load term (p=0.0631). Some camps say a model should contain only the statistically significant terms, while others argue that the entire model should be fit. Keeping only the load and load*load terms, the ANOVA is

Source	DF	Sum of Squares	Mean Square	F Ratio	Prob > F
Model	2	0.07851582	0.039258	21.0059	0.0011*
Error	7	0.01308228	0.001869		
C. Total	9	0.09159810			

What is interesting to note is how the F ratio has increased (8.77, then 13.36, then 21.01) as we reach our best fit. This goes along with dropping the non-significant terms: the final model uses only two degrees of freedom (load, load^2), which returns the error term to 7 degrees of freedom (from 4 in the response surface fit) for estimating the mean square error (0.001869). The estimated RMSD is the square root of the mean square error, about 0.043.
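
To make the comparison concrete, the mean square error, its square root, and the F ratio can be recomputed for all three fits from the sums of squares and degrees of freedom in the ANOVA tables. This is a small sketch; the labels for the three models are my own.

```python
import math

# (label, model SS, model DF, error SS, error DF) copied from the ANOVA tables
fits = [
    ("main effects", 0.06547415, 2, 0.02612395, 7),
    ("response surface", 0.08642340, 5, 0.00517470, 4),
    ("load + load*load", 0.07851582, 2, 0.01308228, 7),
]

summary = {}
for label, ss_model, df_model, ss_error, df_error in fits:
    ms_error = ss_error / df_error             # mean square error
    f_ratio = (ss_model / df_model) / ms_error
    root_ms_error = math.sqrt(ms_error)        # same units as the response
    summary[label] = (f_ratio, ms_error, root_ms_error)
    print(f"{label:<17} error DF = {df_error}  F = {f_ratio:7.4f}  "
          f"MSE = {ms_error:.6f}  sqrt(MSE) = {root_ms_error:.4f}")
```

The response surface fit has the smallest mean square error of the three, but only 4 error degrees of freedom with which to estimate it.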

Finally, the Prediction Profiler may be used to predict how the recovery changes as a function of load:

The sweet spot is around 200 g/L; above that point, a higher load gives only a modest improvement in recovery. Of note: look at how quickly the recovery drops as the load decreases! At 175 g/L, it's down to 90%, and it continues to plummet below that.


Friday, May 24, 2013
