Note the difference between confidence and prediction bands:
• The 95% confidence bands enclose the area that you can be 95% sure contains the true curve. If you have many data points, the confidence bands will be near the line or curve, and most of your data will lie outside the confidence bands.
• The 95% prediction bands enclose the area that you expect to enclose 95% of future data points. They are wider than confidence bands -- much wider with large data sets.
Also distinguish between the 95% confidence interval of the parameters (a range of values), and the 95% confidence bands around the curve.
The calculation of the confidence and prediction bands is fairly standard, but can only be expressed with matrices. A brief explanation follows. More details can be found here.
First, define G|x, which is the gradient of Y with respect to the parameters, evaluated at a particular value of X using the best-fit values of all the parameters. The result is a vector, with one element per parameter. Each element is dY/dP, where Y is the Y value of the curve at that particular value of X given all the best-fit parameter values, and P is one of the parameters.
G'|x is that gradient vector transposed, so it is a column rather than a row of values.
Cov is the covariance matrix (inverted Hessian from last iteration). It is a square matrix with the number of rows and columns equal to the number of parameters. Each item in the matrix is the covariance between two parameters. Note that this is the actual covariance matrix, which is distinct from the normalized covariance matrix (where each value is between -1 and 1) that Prism can report.
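To make the distinction between the two matrices concrete, here is a minimal Python sketch. It assumes the usual normalization, in which each covariance is divided by the square root of the product of the two corresponding diagonal entries; the text above does not spell out the exact formula Prism uses. The function name normalize_covariance and the numbers in cov are made up for illustration.

```python
import math

def normalize_covariance(cov):
    """Scale a covariance matrix so every entry lies between -1 and 1.

    Assumed normalization: cov[i][j] / sqrt(cov[i][i] * cov[j][j]).
    """
    n = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 2-parameter covariance matrix (illustrative values only)
cov = [[4.0, 1.0],
       [1.0, 9.0]]
norm = normalize_covariance(cov)
```

The diagonal of the normalized matrix is always 1, which is why it cannot be used in place of the actual covariance matrix in the band calculation.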
Now compute c = G|x * Cov * G'|x. The result is a single number for any value of X.
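As a rough illustration of this product, the Python sketch below evaluates c for a straight line, Y = intercept + slope*X, whose gradient with respect to the two parameters is simply (1, X). The helper names quadratic_form and line_gradient and the covariance values are assumptions chosen for the example, not output from a real fit.

```python
def quadratic_form(g, cov):
    """Return the single number g * cov * g' for gradient g and matrix cov."""
    n = len(g)
    return sum(g[i] * cov[i][j] * g[j] for i in range(n) for j in range(n))

def line_gradient(x):
    """Gradient of Y = intercept + slope * X with respect to the parameters.

    dY/d(intercept) = 1 and dY/d(slope) = x.
    """
    return [1.0, x]

# Hypothetical covariance matrix for (intercept, slope); illustrative only
cov = [[0.5, -0.1],
       [-0.1, 0.04]]

# c must be recomputed for every X at which the bands are drawn
c_at_2 = quadratic_form(line_gradient(2.0), cov)
c_at_5 = quadratic_form(line_gradient(5.0), cov)
```

For a nonlinear model the gradient would depend on the best-fit parameter values as well as on X, but the product is formed the same way.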
The confidence and prediction bands are centered on the best fit curve, and extend above and below the curve an equal amount.
The confidence bands extend above and below the curve by:
= sqrt(c)*sqrt(SS/DF)*CriticalT(Confidence%, DF)
The prediction bands extend farther above and below the curve, by:
= sqrt(c+1)*sqrt(SS/DF)*CriticalT(Confidence%, DF)
In both these equations, the value of c (defined above) depends on the value of X, so the confidence and prediction bands are not a constant distance from the curve. The value of SS is the sum-of-squares for the fit, and DF is the number of degrees of freedom (number of data points minus number of parameters). CriticalT is a constant from the t distribution based on the amount of confidence you want and the number of degrees of freedom. For 95% limits, and a fairly large DF, this value is close to 1.96. If DF is small, this value is higher.
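The two equations can be sketched in Python as follows. This is only an illustration: the function name band_half_widths is hypothetical, the values of c, SS and DF are made up, and the critical t value is passed in as a number looked up from a t table (about 2.776 for 95% confidence with 4 degrees of freedom) rather than computed.

```python
import math

def band_half_widths(c, ss, df, critical_t):
    """Return (confidence, prediction) distances above and below the curve.

    c          -- G|x * Cov * G'|x at the X value of interest
    ss         -- sum-of-squares of the fit
    df         -- degrees of freedom (data points minus parameters)
    critical_t -- critical value of the t distribution for the chosen
                  confidence level and df (looked up by the caller)
    """
    scale = math.sqrt(ss / df) * critical_t
    confidence = math.sqrt(c) * scale
    prediction = math.sqrt(c + 1.0) * scale
    return confidence, prediction

# Illustrative inputs only: 6 data points, 2 parameters, so DF = 4
conf, pred = band_half_widths(c=0.26, ss=8.0, df=4, critical_t=2.776)

# With the large-DF value of 1.96, both bands are narrower
conf_large, pred_large = band_half_widths(c=0.26, ss=8.0, df=4, critical_t=1.96)
```

Because c + 1 is always larger than c, the prediction band computed this way is always wider than the confidence band at the same X.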