Prism can give different results for nonlinear regression when you ask it to identify or remove outliers -- even if no outliers are identified.
Prism 5 offers the choice (on the Fit tab of the nonlinear regression dialog) to automatically eliminate outliers using the ROUT method. You can also choose (on the Diagnostics tab of the nonlinear regression dialog) to identify and count outliers without eliminating them.
Surprisingly, in rare cases, asking Prism to identify or eliminate outliers can lead to different regression results -- even if no outliers are identified. If you ask Prism to identify or eliminate outliers, here are the steps it uses:
1. Generate the curve defined by the initial parameter values set on the Initial Values tab.
2. Use least-squares regression to fit a curve. Use the best-fit parameters from least squares as initial values for robust regression.
3. Use robust nonlinear regression to adjust the values of those parameters to improve the fit of the curve to most of the data points, while giving outliers little weight.
4. Identify any outliers from the curve defined by robust nonlinear regression.
5. Eliminate those outliers, if you chose to do so.
6. Use the best-fit values of the parameters from the robust fit as the starting place for a standard least-squares fit.
7. Fit the curve using standard Marquardt nonlinear regression.
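The sequence above can be sketched in Python with SciPy. This is only a rough stand-in for what Prism does: the exponential model, the noise, and the MAD-based outlier cutoff are illustrative assumptions, and Prism's actual ROUT method uses a false-discovery-rate test rather than this simple threshold.

```python
# Sketch of the outlier-handling fit sequence described above, approximated
# with SciPy. The model, data, and cutoff rule are illustrative assumptions.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

def model(p, x):                      # hypothetical exponential-decay model
    span, k = p
    return span * np.exp(-k * x)

x = np.linspace(0, 10, 30)
y = model([10.0, 0.5], x) + rng.normal(0, 0.3, x.size)
y[5] += 5.0                           # plant one obvious outlier

resid = lambda p: model(p, x) - y
p0 = [5.0, 1.0]                       # the "Initial Values tab" starting point

# Step 2: ordinary least-squares fit from the user's initial values.
ls1 = least_squares(resid, p0)

# Step 3: robust fit (soft-L1 loss gives outliers little weight),
# seeded with the least-squares parameters.
robust = least_squares(resid, ls1.x, loss='soft_l1', f_scale=0.3)

# Steps 4-5: flag and drop points far from the robust curve
# (a crude MAD-based stand-in for ROUT's FDR-based test).
r = model(robust.x, x) - y
keep = np.abs(r) < 3 * np.median(np.abs(r)) / 0.6745

# Steps 6-7: final least-squares fit, seeded from the robust parameters.
resid_clean = lambda p: model(p, x[keep]) - y[keep]
final = least_squares(resid_clean, robust.x)
print(final.x)
```

Note that the final least-squares fit starts from `robust.x`, not from `p0`; that seeding is exactly what can make the results differ from a plain fit, as discussed below.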
The key point: Even if no outliers are identified (or one or more are identified, but they are not excluded), the least-squares regression fit of step 7 starts with the best-fit parameter values obtained by robust nonlinear regression in step 3. If you didn't ask Prism to identify outliers, the initial values of the parameters are those set in the Initial Values tab of the nonlinear regression dialog.
Using the best-fit values from the robust fit as the starting place for the least-squares fit can make the fitting process go much faster. With huge data sets, the speedup can be noticeable. In rare cases, however, this affects the results: Prism could report different best-fit values depending on whether you asked it to identify outliers, or nonlinear regression might converge on a best-fit solution in one case but not in the other.
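A hypothetical one-parameter toy model (not Prism's algorithm, and deliberately exaggerated) shows how the starting value alone can steer nonlinear regression to different answers when the sum-of-squares surface has more than one minimum:

```python
# Illustrative only: the same model and data, fit from two starting values.
# A start near the true k finds it; a distant start settles in a different,
# worse local minimum of the sum-of-squares surface.
import numpy as np
from scipy.optimize import least_squares

x = np.linspace(0, 10, 200)
y = np.sin(2.0 * x)                   # noiseless data from y = sin(k*x), k = 2

resid = lambda k: np.sin(k[0] * x) - y

fit_near = least_squares(resid, [1.9])  # start close to the true value
fit_far = least_squares(resid, [0.5])   # start far away

print(fit_near.x[0], fit_far.x[0])
```

Real dose-response or decay models rarely have minima this dramatic, but the same mechanism explains why a fit seeded from robust-regression parameters can land somewhere different than a fit seeded from the Initial Values tab.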
These discrepancies should only happen in two situations:
- The initial values entered on the Initial Values tab define a curve that is far from the data. You can view the curve defined by the initial values by checking an option at the top of the Diagnostics tab.
- The data don't really define the entire curve. If the data have a lot of scatter, or don't cover a wide enough range of X values, the best-fit curve can be somewhat ambiguous, so minor differences in the initial parameter values can change the result. In this case, both fits will report wide confidence intervals for some parameters.
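The second situation can be illustrated numerically. In this sketch (a hypothetical saturation-binding model, fit with SciPy rather than Prism), the standard error of the plateau parameter balloons when the X range stops well short of the plateau, which is what wide confidence intervals are reporting:

```python
# Illustrative assumption: a simple saturation model, fit over two X ranges.
# When X never approaches the plateau, the "top" parameter is poorly
# determined and its standard error becomes very large.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)

def sat(p, x):                        # hypothetical saturation model
    bottom, top, half = p
    return bottom + (top - bottom) * x / (half + x)

def fit_se(x):
    y = sat([0.0, 100.0, 5.0], x) + rng.normal(0, 2.0, x.size)
    res = least_squares(lambda p: sat(p, x) - y, [0.0, 50.0, 1.0])
    # Approximate parameter standard errors from the Jacobian at the solution.
    dof = x.size - res.x.size
    s2 = 2 * res.cost / dof           # res.cost is half the sum of squares
    cov = s2 * np.linalg.pinv(res.jac.T @ res.jac)
    return np.sqrt(np.diag(cov))

se_full = fit_se(np.linspace(0.5, 50, 25))   # X range reaches the plateau
se_trunc = fit_se(np.linspace(0.5, 3, 25))   # X stops before the plateau
print(se_full[1], se_trunc[1])               # standard error of "top"
```

Either fit can look reasonable on screen; the wide standard error of the truncated fit is the warning sign that the curve is ambiguous and sensitive to initial values.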