Friday, May 24, 2013

The Capto Adhere Evaluation, continued

*NOTE - This may show up as a new post, but I had to do some edits to make sure the figures went in properly.  I'm still building my workflow.*

In the previous post, I was completing an analysis of a central composite design using the load amount, load conductivity, and load pH as inputs for the study.  The initial results identified that all three factors were important; however, there were no pairwise interactions present (nor higher terms, such as quadratic).  In the original paper, the authors point out that one of their runs had an abnormally low recovery percentage and so was omitted.  Let's take a look at how the model changes when this is done.

First, notice where the point (red triangle) falls in the plot of the actual vs predicted:

It's pretty clear this point is having a significant impact on the model through leverage.  When the main-effects-only model is refit with this run omitted, the ANOVA results are

| Source | DF | Sum of Squares | Mean Square | F Ratio |
|---|---|---|---|---|
| Model | 2 | 0.06547415 | 0.032737 | 8.7720 |
| Error | 7 | 0.02612395 | 0.003732 | Prob > F |
| C. Total | 9 | 0.09159810 | | 0.0124* |

and the parameter estimates, along with their probabilities, are

| Term | | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|---|
| Intercept | | 0.8820829 | 0.019367 | 45.55 | <.0001* |
| Load(93,312) | | 0.1042488 | 0.025469 | 4.09 | 0.0046* |
| Load cond(10,30) | Biased | -0.028196 | 0.024983 | -1.13 | 0.2963 |
| Load pH(6,7.5) | Zeroed | 0 | 0 | . | . |

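Load conductivity is no longer statistically significant (p = 0.2963), and load pH could not be estimated at all: it is reported as zeroed, and the conductivity estimate is flagged as biased.  Still, looking at the actual vs predicted, several points are observed to be outside the confidence intervals.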
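As a quick sanity check on the parameter table, the t ratio for each term is just the estimate divided by its standard error. The snippet below is a minimal sketch that recomputes those ratios from the values reported above; the names `estimates`, `t_ratio`, and `t_ratios` are mine for illustration and are not part of the JMP output.

```python
# Estimate and standard error for each estimable term, copied from the
# parameter estimates table above (the zeroed pH term is left out).
estimates = {
    "Intercept": (0.8820829, 0.019367),
    "Load(93,312)": (0.1042488, 0.025469),
    "Load cond(10,30)": (-0.028196, 0.024983),
}


def t_ratio(estimate, std_error):
    """Return the t ratio: the estimate divided by its standard error."""
    return estimate / std_error


# Round to two decimals so the values can be compared with the table.
t_ratios = {
    term: round(t_ratio(est, se), 2) for term, (est, se) in estimates.items()
}

for term, t in t_ratios.items():
    print(f"{term:18s} t = {t:6.2f}")
```

The recomputed ratios should line up with the t Ratio column, which is a handy way to catch transcription mistakes when copying numbers out of a report.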



As this was a CCD, we are justified in adding some higher-order terms.  Restarting the analysis with a response surface model, the ANOVA shows the following results:

| Source | DF | Sum of Squares | Mean Square | F Ratio |
|---|---|---|---|---|
| Model | 5 | 0.08642340 | 0.017285 | 13.3609 |
| Error | 4 | 0.00517470 | 0.001294 | Prob > F |
| C. Total | 9 | 0.09159810 | | 0.0132* |

with the parameter estimates as follows:

| Term | | Estimate | Std Error | t Ratio | Prob>\|t\| |
|---|---|---|---|---|---|
| Intercept | | 0.9453904 | 0.020768 | 45.52 | <.0001* |
| Load(93,312) | | 0.1032109 | 0.017412 | 5.93 | 0.0041* |
| Load cond(10,30) | Biased | -0.026721 | 0.01594 | -1.68 | 0.1690 |
| Load pH(6,7.5) | Zeroed | 0 | 0 | . | . |
| Load*Load | | -0.135078 | 0.052907 | -2.55 | 0.0631 |
| Load*Load cond | Biased | 0.0373873 | 0.017758 | 2.11 | 0.1030 |
| Load cond*Load cond | Biased | 0.023048 | 0.043967 | 0.52 | 0.6278 |
| Load*Load pH | Zeroed | 0 | 0 | . | . |
| Load cond*Load pH | Zeroed | 0 | 0 | . | . |
| Load pH*Load pH | Zeroed | 0 | 0 | . | . |

Now, what to do?  The dominant effect is the load and maybe the load*load term (p=0.0631).  Some camps say a model should contain only the statistically significant terms, while others argue that the entire model should be fit.  Keeping only the load and load*load terms, the ANOVA is

| Source | DF | Sum of Squares | Mean Square | F Ratio |
|---|---|---|---|---|
| Model | 2 | 0.07851582 | 0.039258 | 21.0059 |
| Error | 7 | 0.01308228 | 0.001869 | Prob > F |
| C. Total | 9 | 0.09159810 | | 0.0011* |

What is interesting to note is how the F ratio has increased (8.77, then 13.36, then 21.01) as we reach our best fit.  Part of this comes from how the degrees of freedom are split between the model and the error term.  The final model uses only two degrees of freedom (load, load^2), leaving the error term with 7 degrees of freedom (versus 4 for the response surface model) to estimate the mean square error (0.001869).  The estimated RMSD is the square root of the mean square error, about 0.043.
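The arithmetic behind these ANOVA tables is easy to reproduce by hand. The sketch below recomputes the mean squares, F ratio, and root mean square error for all three fits from the sums of squares and degrees of freedom reported above. It also recomputes Prob > F, but only for the two fits with 2 model degrees of freedom, where the upper tail of the F distribution reduces to the closed form (1 + 2F/df_error)^(-df_error/2); the 5-DF response surface fit would need a statistics library and is left as `None`. The names `ANOVA_TABLES`, `summarize`, and `summaries` are illustrative choices of mine, not anything produced by JMP.

```python
import math

# (model SS, model DF, error SS, error DF), copied from the three ANOVA
# tables in this post.
ANOVA_TABLES = {
    "main effects": (0.06547415, 2, 0.02612395, 7),
    "response surface": (0.08642340, 5, 0.00517470, 4),
    "load + load*load": (0.07851582, 2, 0.01308228, 7),
}


def summarize(ss_model, df_model, ss_error, df_error):
    """Recompute mean squares, F ratio, RMSE and (when possible) Prob > F."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    f_ratio = ms_model / ms_error
    rmse = math.sqrt(ms_error)
    # The upper-tail F probability has this simple closed form only when
    # the numerator (model) DF is 2; other cases are left as None here.
    if df_model == 2:
        prob_f = (1 + 2 * f_ratio / df_error) ** (-df_error / 2)
    else:
        prob_f = None
    return {
        "ms_error": ms_error,
        "f_ratio": f_ratio,
        "rmse": rmse,
        "prob_f": prob_f,
    }


summaries = {name: summarize(*row) for name, row in ANOVA_TABLES.items()}

for name, s in summaries.items():
    p = "n/a" if s["prob_f"] is None else f"{s['prob_f']:.4f}"
    print(
        f"{name:17s} F = {s['f_ratio']:7.4f}  "
        f"RMSE = {s['rmse']:.4f}  Prob > F = {p}"
    )
```

Working from the sums of squares rather than the rounded mean squares keeps the recomputed F ratios in agreement with the reported values to four decimals.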

Finally, the Prediction Profiler may be used to predict how the recovery changes as a function of load:

The sweet spot is around 200 g/L; beyond that point, higher loads yield only a modest improvement in recovery.  Of note: look at how quickly the recovery drops at lower loads!  At 175 g/L, it's down to 90% and then continues to plummet.
