August 27, 2008
Interpreting nonlinear results when the fit hit a constraint

 Usefulness of constraining a parameter in nonlinear regression

Prism lets you constrain the parameters of a nonlinear fit in the Constrain tab of the nonlinear regression dialog. When you constrain a parameter to a constant value, Prism just uses that value and doesn't try to fit the parameter. Constraining a parameter to a range of values is trickier. When you do this, there are four possible outcomes:

  • The constraint is irrelevant, as the parameter never would have taken on a value in the forbidden range. 
  • The constraint helped speed up the fit. Nonlinear regression works by iteratively changing the values of the parameters. With some complicated fits, the nonlinear regression process can 'get confused' and end up spending time exploring parameter values that make no sense. Constraining the values of one or more parameters can prevent the nonlinear regression process from being led astray. With huge numbers of data points, you might see a noticeable speedup of the fitting process. 
  • The constraint helped nonlinear regression choose from several local minima. Nonlinear regression works by changing parameter values step by step until no small change affects the sum-of-squares (which quantifies goodness-of-fit). With some models, there can be two sets of parameter values that lead to local minima in sum-of-squares. Applying a constraint can ensure that nonlinear regression finds the minimum with biologically relevant values. 
  • The constraint prevents nonlinear regression from finding a minimum sum-of-squares. Instead, the best the program can do (while obeying the constraint) is set the parameter to the limit of the constrained range. Prism 5 then reports that the fit 'hit constraint'. 

In the first case, the constraint is harmless but useless.

In the next two cases, the constraint helps the nonlinear regression reach sensible results. Essentially, the constraint can give the nonlinear regression process some scientific judgement about which parameter values are simply impossible. These cases are really what constraints are for. 

The last case, when the fit ends with a parameter set to one end of its constrained range, is where the results become tricky to interpret. 
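The difference between an irrelevant constraint and a fit that hits a constraint can be seen in a toy example. The sketch below is not Prism's algorithm; it assumes the simplest possible model, Y = c, where the unconstrained least-squares answer is the mean of the data and the constrained answer is that mean clipped to the allowed range. The function names are made up for this illustration.

```python
def fit_constant_with_range(y_values, lower, upper):
    """Least-squares fit of the one-parameter model Y = c, with c limited to [lower, upper].

    For this model the sum-of-squares is a parabola in c, so the constrained
    best fit is the unconstrained best fit (the mean) clipped to the range.
    """
    unconstrained = sum(y_values) / len(y_values)
    best = min(max(unconstrained, lower), upper)
    hit_constraint = best != unconstrained
    return best, hit_constraint


def sum_of_squares(y_values, c):
    """Sum of squared vertical distances between the data and the line Y = c."""
    return sum((y - c) ** 2 for y in y_values)


if __name__ == "__main__":
    # Constraint is irrelevant: the best-fit value lies inside the range anyway.
    print(fit_constant_with_range([1.0, 2.0, 3.0], 0.0, 10.0))
    # Fit hits the constraint: the true minimum (c = -2) is forbidden, so c ends at 0.
    data = [-1.0, -2.0, -3.0]
    print(fit_constant_with_range(data, 0.0, 10.0))
    print(sum_of_squares(data, 0.0), sum_of_squares(data, -2.0))
```

In the second call the reported value sits on the limit, and the sum-of-squares there is larger than at the forbidden minimum, which is why the result is not a true minimum.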

Hitting a constraint is not the same as constraining a parameter to a constant value

When a parameter hits a constraint, Prism still counts the parameter that hit the constraint when it determines the number of degrees of freedom. However, parameters that are constrained to a constant value are not counted. So the confidence interval of other parameters will not be exactly the same in the two cases. 
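A small sketch of why the two cases differ. It assumes degrees of freedom are computed as the number of data points minus the number of parameters counted, and that the residual standard deviation is sqrt(SS/df); the numbers (12 points, a four-parameter model, sum-of-squares of 240) are invented for illustration.

```python
import math


def residual_sd(sum_of_squares, n_points, n_params_counted):
    """Return (df, residual SD), with df = points minus counted parameters."""
    df = n_points - n_params_counted
    return df, math.sqrt(sum_of_squares / df)


if __name__ == "__main__":
    ss, n = 240.0, 12
    # Bottom hit its constraint: it still counts as one of the 4 fitted parameters.
    print(residual_sd(ss, n, 4))
    # Bottom constrained to a constant: only 3 parameters are counted.
    print(residual_sd(ss, n, 3))
```

Because the confidence intervals of the remaining parameters depend on df, the same best-fit values can come with slightly different intervals in the two cases.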

Interpreting results when the fit hit a constraint

When a fit hits a constraint, the results are unlikely to provide useful information. If you had a solid reason to constrain a parameter within a range of values, it ought to end up in that range. If the fit hit the constraint limit, that means the true best-fit value is some value forbidden by the constraint.

Prism does not compute confidence and prediction bands when a parameter hits a constraint. The best-fit values are not a local minimum, so any attempt to compute confidence or prediction bands would give misleading results. Prism does compute the confidence intervals for the other parameters (the ones that didn't hit a constraint), but these need to be viewed with caution. 

When a fit ends up hitting a constraint, it is likely that you set the constraint incorrectly. So the first thing to do is make sure the constraint is sensible and correctly entered. Another possibility is to change the constraint from an inequality (Bottom>0) to a constant constraint (Bottom=0). You'll get the same parameter values, but different confidence intervals, and you can get confidence and prediction bands. 

 The best-fit value of logEC50 is not the X value that corresponds to Y=50

When you fit a log(dose) response curve with Prism, it finds the best-fit values of the parameters that make the curve come as close as possible to the points. One of the parameters is the logEC50 (or logIC50 if the curve is inhibitory). The meaning of the logEC50 is commonly misunderstood. It is the X value (log of concentration) where the Y value of the curve (response) is halfway between the Top and Bottom plateaus of the curve. 

If you fit a normalized model, the curve is forced to have plateaus at 0 and 100, so the halfway point is 50. Therefore the best-fit value of the logEC50 (or logIC50) is the X value that makes the Y value of the curve equal 50. 

If you don't pick a normalized model (and don't constrain Top and Bottom to 100 and 0), then the logEC50 (or logIC50) is not the X value that makes the curve Y equal 50. Instead, it is the X value that makes the Y value of the curve equal a value that is halfway between the best-fit values of Top and Bottom. This will almost never equal exactly 50 and will often be very far from 50. If you don't constrain Bottom to be a constant value of zero, Prism may find a best-fit value that is far from zero. And it may find a best-fit value of Top that is far from 100. The average of Top and Bottom can be very far from 50.

Note that Top and Bottom refer to the best-fit values that correspond to the Y value of the curve when extrapolated very far to the left and right. They are not the same as the highest and lowest Y values in your data set. 
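To make this concrete, here is a short sketch that evaluates a dose-response curve at X = logEC50. It assumes the usual four-parameter variable-slope form, Y = Bottom + (Top - Bottom)/(1 + 10^((LogEC50 - X)*HillSlope)); the parameter values are invented.

```python
def dose_response(x, bottom, top, log_ec50, hill_slope):
    """Four-parameter log(dose)-response curve (assumed variable-slope form)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))


if __name__ == "__main__":
    bottom, top, log_ec50, hill_slope = 20.0, 130.0, -6.0, 1.0
    # At X = logEC50 the exponent is zero, so Y is the average of Top and Bottom.
    print(dose_response(log_ec50, bottom, top, log_ec50, hill_slope))  # 75.0, not 50
```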

What if you really want the X value where the curve Y is 50.0 rather than the logEC50 as defined above? Prism 5 can fit that for you. You'll need to clone the built-in equation so you can edit it. Then go to the dialog where you edit the equation, and enter the transforms shown on lines two and three in this screen shot:

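[Screen shot of the equation dialog, showing the transforms logX50 and X50, is not reproduced here.]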

The transform with the name logX50 will be the X value which corresponds to Y=50. Since the X axis represents the log of concentrations, so will this value. The transform with the name X50 is the antilog of the previous value, so is in concentration units rather than log(concentration) units. 
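The same two quantities can be worked out by hand. The sketch below is not a copy of the transforms entered in Prism; it assumes the four-parameter variable-slope model, Y = Bottom + (Top - Bottom)/(1 + 10^((LogEC50 - X)*HillSlope)), and solves it algebraically for the X that gives Y = 50. The function names and example values are hypothetical.

```python
import math


def dose_response(x, bottom, top, log_ec50, hill_slope):
    """Four-parameter log(dose)-response curve (assumed variable-slope form)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))


def log_x50(bottom, top, log_ec50, hill_slope, target=50.0):
    """X (log concentration) where the curve equals `target` (default Y = 50).

    Obtained by solving the model equation for X. The target must lie
    strictly between Bottom and Top, or the curve never reaches it.
    """
    if not min(bottom, top) < target < max(bottom, top):
        raise ValueError("target Y is not between Bottom and Top")
    ratio = (top - target) / (target - bottom)
    return log_ec50 - math.log10(ratio) / hill_slope


if __name__ == "__main__":
    bottom, top, log_ec50, hill_slope = 10.0, 120.0, -6.0, 1.0
    lx50 = log_x50(bottom, top, log_ec50, hill_slope)
    x50 = 10.0 ** lx50  # antilog: concentration units rather than log units
    print(lx50, x50)
    print(dose_response(lx50, bottom, top, log_ec50, hill_slope))  # check: close to 50
```

With Bottom = 0 and Top = 100 the ratio is 1, its log is 0, and the result reduces to the logEC50, which matches the normalized case described above.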

August 19, 2008

How can I compare the slopes of linear regression lines? With post tests!

Prism can automatically test whether slopes and intercepts differ. The instructions below work with Prism 3, 4 and 5.

Overall comparison

Enter data onto an XY table, with X in one column and one Y column for each of your two (or more) data sets.

Click Analyze and choose linear regression.

In the "Parameters: Linear Regression" dialog, check the box labeled "Test whether slopes and intercepts are significantly different". You can see the results by choosing the "Are lines different?" subpage in the Navigator. 

Note that all the data must be entered on one data table. This article explains how to enter the data.

The P value answers the question: If all the data sets really came from populations with identical slopes, what is the probability that random sampling would result in slopes as disparate as the ones observed in this experiment? 

 

Post tests

If you are only comparing two groups, you are done. 

If you are comparing more than two groups, you might want to test them two at a time with post tests. Prism cannot do this automatically, but here is a way to get the job done by entering the slopes and running an ANOVA:

  1. Create a new grouped data table ('two grouping variables' in Prism 4) with Y subcolumns formatted for entry of mean, SEM and N.
  2. Enter data into the first row only. Enter slopes into "mean", SE of slopes into "SEM", and df+1 into "N". (Why df+1? Because Prism will subtract 1 from whatever you enter to calculate the df value it uses in ANOVA.)
  3. Click Analyze, and choose one-way ANOVA. Choose an appropriate post test. 
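If you want to check the three values that go into the first row, they can be computed from the raw data. The sketch below uses the ordinary least-squares formulas for a straight line and assumes df = n - 2 for linear regression (two fitted parameters), so the value entered as N is n - 1. The function name and the example data are invented.

```python
import math


def slope_row(x, y):
    """Return the values to enter as mean, SEM and N for one regression line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2  # two parameters (slope and intercept) were fitted
    se_slope = math.sqrt(ss_residual / df / sxx)
    return {"mean": slope, "sem": se_slope, "n": df + 1}


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    print(slope_row(x, y))
```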

 Note: Elsewhere, we explain how to test whether the slope of a linear regression differs from a specific, hypothetical value.

August 18, 2008

R² of weighted nonlinear fits

When you fit a model to data with nonlinear regression, it is often useful to weight the data. This is most often done when the amount of variation is proportional to the Y value, so there is more scatter with large values. 

Prism performs weighted nonlinear regression, but it doesn't report a weighted R². Instead it reports the unweighted R² for the fit determined by weighted nonlinear regression.

Why? In versions 1-4 of Prism, nonlinear regression was weighted by the Y values of the data. With this scheme, a weighted R² would make no sense. With Prism 5, we switched to weighting by the Y values of the curve (and adjusting those weights as the nonlinear regression progresses). With weighting by the Y values of the curve, it makes perfect sense to compute a weighted R², but we didn't realize this when we created Prism 5. Prism 6 will do this calculation correctly.

If you are using relative weights (weight by 1/Y²), it is easy to compute the weighted R² yourself. Follow these steps:

  1. Fit your model using relative (1/Y²) weighting and record the weighted sum-of-squares that Prism reports. Call this wSSmodel. 
  2. Fit the data again, and this time choose the Horizontal line model from the Lines set of models. Again choose 1/Y² weighting. 
  3. Note the weighted sum-of-squares from step 2. Call this wSShorizontal. 
  4. Compute the weighted R² as:

                1.0 - (wSSmodel/wSShorizontal)

 

Note that this will only work with Prism 5, as Prism 4 and earlier did the weighting differently. 
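The arithmetic in step 4 is small enough to script. The sketch below takes the two weighted sums-of-squares you recorded and applies the formula above. The helper relative_weighted_ss is included only to show what relative weighting by the Y value of the curve means, assuming each residual is divided by the curve's Y before squaring; both function names are made up for this example.

```python
def relative_weighted_ss(y_observed, y_curve):
    """Weighted sum-of-squares with each residual divided by the curve's Y value."""
    return sum(((yo - yc) / yc) ** 2 for yo, yc in zip(y_observed, y_curve))


def weighted_r2(wss_model, wss_horizontal):
    """Weighted R-squared from the two weighted sums-of-squares."""
    return 1.0 - wss_model / wss_horizontal


if __name__ == "__main__":
    # Example values standing in for the numbers recorded in steps 1 and 3.
    print(weighted_r2(2.0, 8.0))  # 0.75
    # What the weighted sum-of-squares measures for two points near a curve.
    print(relative_weighted_ss([11.0, 18.0], [10.0, 20.0]))
```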

August 7, 2008
Computing a pA2 using global fitting.

General information:

This article contains a fairly detailed example, with screen shots.

Troubleshooting fits that don't converge:

If you are using Prism 5.00 for Windows, update to Prism 5.01 (Mac 5.0a is fine). This update introduced a new rule for initial values, which is used to obtain an initial value for the pA2. This rule is much better than the one used in previous versions, and will enable fits to converge that otherwise wouldn't. 

What units should I use to enter B? What units is the pA2 in?

You can enter B in any concentration units you want, but molar is standard. The units of 10^(-1*pA2) will be the same as the units of B. So if B is in molar, then pA2 is the negative log of the Kb. 
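As a quick illustration of the units, the conversion between pA2 and Kb is just a negative log and its inverse. The function names below are made up, and the example assumes B was entered in molar.

```python
import math


def kb_from_pa2(pa2):
    """Kb in the same concentration units as B (molar here)."""
    return 10.0 ** (-pa2)


def pa2_from_kb(kb):
    """pA2 as the negative log10 of Kb."""
    return -math.log10(kb)


if __name__ == "__main__":
    print(kb_from_pa2(9.0))   # about 1e-09, that is 1 nM if B is in molar
    print(pa2_from_kb(1e-9))  # about 9.0
```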

Creating a column bar graph with SD or SEM entered directly

When you create a new table and graph in Prism 5, one choice is to create a Column data table. Each column in the table creates one column in the graph, plotted as individual points, bars, box-and-whisker etc. This kind of graph can include error bars computed directly from the raw data you enter in each column. But there is no way to enter SD or SEM values computed elsewhere onto a column table.

If you want to enter SD or SEM values directly, start with a Grouped table instead of a Column table. With this kind of table, you can choose to format the table for direct entry of SD or SEM (and N) rather than raw replicates. 

This will create a data table with the appropriate columns. Use just the top row to enter your data.

If you enter data onto only one row of a Grouped data table, the graph will be almost the same as a column graph but with the error bars coming from SD or SEM values you entered directly. The difference is that Prism labels the graph like a Grouped graph, with different data sets identified with a legend, rather than with column labels below each bar.

There are two ways to alter the graph to label each bar with its column title:

  • Alternative 1: Click the Change graph type button, and change to a column bar graph. The data table will still be Grouped with subcolumns for SD or SEM, but now the graph will be Column, and so label each bar with its column title. 
  • Alternative 2: Double-click on the X axis to bring up the Format Axis dialog. Change the Number format to 'Column titles'.