July 16, 2008. Tips to make Prism 5 files much smaller! Prism 5 files can get large, but we offer two ways to make them smaller. The first is to save in the PZFX format. With this format, the data and info tables are in plain text XML, but the rest of the file is greatly compressed. Overall, PZFX files tend to be MUCH smaller than PZF files. Typically, a PZFX file is less than 10% of the size of the corresponding PZF file. If you want to use PZF files (for compatibility with Prism 4), here is a trick to reduce the file size a bit. Go to the Preferences dialog, File tab, and check the option to "Save compact". With this option checked, Prism won't save the results of analyses, but rather recomputes the results when it opens the file. The difference in file size depends on how many analyses your file contains.
July 15, 2008. How to compare two (or more) means when the groups have different standard deviations. The unpaired t test depends on the assumption that the samples come from populations that have identical standard deviations (and thus identical variances). One-way ANOVA makes the same assumption. Prism tests this assumption using an F test (to compare the variances of two groups) and Bartlett’s test (three or more groups). The P value answers the question: If the populations really had identical standard deviations, what is the chance of observing as large a discrepancy among sample standard deviations as occurred in the data (or an even larger discrepancy)? Note: Don’t mix up the P value testing for equality of the standard deviations of the groups with the P value testing for equality of the means. That latter P value is the one that answers the question you most likely were thinking about when you chose the t test or one-way ANOVA. If the P value is large (>0.05), you conclude that there is no evidence that the standard deviations differ. If the P value is small, you conclude that the data are likely to be sampled from populations with different standard deviations. Then what? There are four answers.
Note that switching to a nonparametric test is not an appropriate approach. If the groups are sampled from populations with distinct standard deviations, then the nonparametric tests simply test whether the distributions are different. They do not test whether the medians differ.
July 8, 2008. Asymmetrical (five-parameter) logistic dose-response curves. The standard log(dose) vs. response curve is defined by the bottom, top, EC50, and slope. In this curve, the top and bottom parts are mirror images of each other -- the curve is symmetrical. Some log(dose) vs. response curves are not symmetrical. This can be modeled by including a fifth parameter that describes the asymmetry of the curve. The standard curve is sometimes called a four-parameter logistic model, so the asymmetrical curve is called a five-parameter logistic model. Of course, an equation should not be referred to by its number of parameters. Some authors assume that any nonspecific signal is already subtracted off, and so present the equations in a form where Bottom is defined to be zero and doesn't appear in the equation. Then the symmetrical variable slope equation has three (not four) parameters, and the asymmetrical form has four (not five) parameters. Asymmetrical dose-response curves can be described by several equations. Prism uses the Richards version (from Giraldo et al.), which is built into the 'Dose-response -- Special' group of equations in Prism 5. This is also called the generalized Hill equation.
This equation assumes that X has been entered as (or transformed to) the logarithm of concentration, and that Y is response in any convenient units. S is the asymmetry parameter. If S=1.0, then this is the same as the four-parameter equation. When S is not 1.0, the curve will be asymmetrical. S must be greater than zero, but can be less than, or greater than, 1.0.
Top and Bottom are the Y values at the top and bottom plateaus of the curve. If you have normalized the data, you may want to constrain these values to 100 and 0. The equation above fits the logEC50, which is the X value when Y is half-way between the Top and Bottom plateaus. It is in the same units as the X values -- the logarithm of concentration. Note that the logEC50 is not the same as the inflection point. The first line in the equation computes the inflection point from the logEC50, HillSlope, and S. Like the logEC50, this inflection point (called logXb) is in the same scale as the X values (the logarithm of concentration).
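As a minimal sketch of the calculation described above, here is the model in Python. It assumes the Richards form in which logXb is computed first from the logEC50, HillSlope, and S; the exact algebra is an assumption to check against Prism's built-in equation, and the function and argument names are illustrative rather than Prism's.

```python
import math

def five_parameter_logistic(x, bottom, top, log_ec50, hill_slope, s):
    """Response at x = log10(concentration) for an asymmetrical curve.

    Assumed Richards form; s is the asymmetry parameter and must be > 0.
    """
    if s <= 0:
        raise ValueError("s must be greater than zero")
    # First step: derive logXb from logEC50, HillSlope, and S.
    log_xb = log_ec50 + (1.0 / hill_slope) * math.log10(2.0 ** (1.0 / s) - 1.0)
    denominator = (1.0 + 10.0 ** ((log_xb - x) * hill_slope)) ** s
    return bottom + (top - bottom) / denominator

def four_parameter_logistic(x, bottom, top, log_ec50, hill_slope):
    """Symmetrical variable-slope curve, for comparison."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))
```

With this form, the response at X = logEC50 is half-way between Bottom and Top for any S greater than zero, and setting S to 1.0 reproduces the four-parameter curve.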
If your goal is to obtain meaningful best-fit parameters, then you'll need lots of high quality data. It is very hard to fit both slope and asymmetry with tight confidence intervals. If your goal is just to interpolate unknowns from a standard curve, the width of the confidence intervals of the parameters doesn't really matter. What you want is a curve that follows the data, and in some cases an asymmetrical five parameter model does so better than a four parameter model.
Other formulations of asymmetrical dose-response curves have been developed. For example, Ricketts and Head developed a model for use in baroreflex studies. Bindslev has written a lengthy on-line text, Drug-Acceptor Interactions. Chapter 10, "Hill in Hell", discusses many models of dose-response curves, including asymmetrical ones.
June 8, 2008. Centered polynomial regression. The standard polynomial models look like this:
Y = B0 + B1*X + B2*X^2
The centered models replace X with XC, which is X minus the mean of all the X values:
Y = B0 + B1*XC + B2*XC^2
The centered model has reparameterized the equation. The parameters have different meanings, so they have different best-fit values (except the last parameter, which is the same), different standard errors and confidence intervals, smaller covariances and dependencies, and narrower confidence/prediction bands. Here is a Prism file that demonstrates centered polynomial fitting. Open it, go to one of the fits, change parameters, and then "edit" the equation without changing anything. This will place the equation into your user-defined equation list.
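To see why only the last parameter keeps its value, expand the centered model and collect powers of X. Here is a minimal sketch in Python, assuming a second-order model and XC defined as X minus the mean of X. The function names and the C0, C1, C2 labels for the centered parameters are illustrative.

```python
def centered_to_standard(c0, c1, c2, x_mean):
    """Convert centered coefficients (in XC = X - x_mean) to standard ones (in X).

    Expanding c0 + c1*(X - m) + c2*(X - m)^2 and collecting powers of X
    gives the standard B0, B1, B2.
    """
    b0 = c0 - c1 * x_mean + c2 * x_mean ** 2
    b1 = c1 - 2.0 * c2 * x_mean
    b2 = c2  # the highest-order coefficient is unchanged by centering
    return b0, b1, b2

def predict_standard(x, b0, b1, b2):
    """Y from the standard model."""
    return b0 + b1 * x + b2 * x ** 2

def predict_centered(x, c0, c1, c2, x_mean):
    """Y from the centered model."""
    xc = x - x_mean
    return c0 + c1 * xc + c2 * xc ** 2
```

Both forms predict the same Y for any X; only the lower-order parameters change their values and meanings.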
June 6, 2008. Are outlier tests useful when data come from a distribution that is not Gaussian? No. Most outlier tests are based on the assumption that the data, except the potential outlier(s), come from a Gaussian distribution. If the distribution is not Gaussian, outlier tests are misleading. Here is an example. Grubbs' outlier test found an outlier in three of these four data sets.
But these data are not sampled from a Gaussian distribution with an outlier. Rather they are sampled from a lognormal distribution. Transform all the values to their logarithms, and the distribution becomes Gaussian:
The apparent outliers are gone. Grubbs' test finds no outliers. The extreme points only appeared to be outliers because extreme values are common in a lognormal distribution but rare in a Gaussian distribution. If you don’t realize the distribution is lognormal, an outlier test will be very misleading.
When analyzing data, sometimes you want to graph or analyze only a portion of the values, and remove any values that are higher (or lower) than some threshold. You can do this with a user-defined Prism transform. Here is a transform that removes any data with Y greater than 100:
Y=IF(Y>100, 0/0, Y)
That transforms any value greater than 100 to 0/0, which is undefined and so becomes blank in the results table. The other values get transformed to equal Y (no change). Here is a transform that removes any data with Y greater than 100 or less than 10:
Y=IF(Y>100, 0/0, IF(Y<10, 0/0, Y))
This simply nests two IF functions in the transform. To enter a user-defined transform, go to a data table, click Analyze, and choose Transform. At the top of the dialog, choose User-defined Y transforms. On the new dialog, click Add to create a new transform. Of course, you could create an X transform and use similar syntax to remove rows where X is too high or too low (or meets some other criterion).
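Outside Prism, the same logic can be sketched in Python, with NaN playing the role of the blank that 0/0 produces. This is only an illustration of what the two transforms do; the function names are hypothetical.

```python
import math

def remove_above(values, upper):
    """Mimic Y=IF(Y>upper, 0/0, Y): values above `upper` become NaN (blank)."""
    return [math.nan if y > upper else y for y in values]

def keep_within(values, lower, upper):
    """Mimic the nested form: blank values above `upper` or below `lower`."""
    return [math.nan if (y > upper or y < lower) else y for y in values]
```

Values exactly equal to a threshold are kept, because both transforms use strict comparisons.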
June 4, 2008. How to make the right and left Y axes look different. When you create a graph with two Y axes, Prism always creates them with the same length and the same color. To make the lengths appear different:
To give the axes different colors:
To give the axis numbering a different color or font:
To have only one axis but put it on the right side of the graph: The first axis created is called the 'left Y axis', but in fact it does not need to be placed on the left side of the graph. It can be anywhere. To put this axis on the right side of a graph:
To delete the right Y axis:
June 1, 2008. Plotting t, z, F, or chi-square distributions with Prism. GraphPad Prism can generate probability distributions. This demonstrates Prism's ability to plot curves from user-defined functions, and also how to hook info constants to analyses.
In each case, the simulation generates two (or three) data sets. The first (A) data set plots the entire curve. The second (and third) data sets only plot values where X is greater than (less than) a specified cutoff value. The second (and third) data sets are plotted with area fill to shade the tails of the distributions. Remove data set B or C from the graph if you only want to shade one tail. Change the numbers of degrees of freedom and the cutoff values (for shading) in the Info sheet. This demonstrates how values entered into an info sheet can be 'hooked' to constants used in analyses.
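The layout of those data sets can be sketched in Python for the z (standard normal) distribution. This is an illustration of the idea, not the equations stored in the Prism file; the function name, X range, and number of points are assumptions.

```python
from statistics import NormalDist

def z_curve_with_tail(cutoff, x_min=-4.0, x_max=4.0, n_points=161):
    """Return (full_curve, upper_tail) as lists of (x, density) points.

    full_curve plays the role of data set A; upper_tail keeps only the
    points with x greater than the cutoff, like the shaded data set.
    """
    z = NormalDist()  # standard normal: mean 0, SD 1
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    full_curve = [(x, z.pdf(x)) for x in xs]
    upper_tail = [(x, y) for (x, y) in full_curve if x > cutoff]
    return full_curve, upper_tail

full, tail = z_curve_with_tail(cutoff=1.96)
tail_area = 1.0 - NormalDist().cdf(1.96)  # area of the shaded upper tail
```

The t, F, and chi-square curves would follow the same pattern with their own density functions and degrees of freedom.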
Graphing a Binomial or Poisson distribution with Prism. Prism can graph a Binomial or Poisson distribution. Download the file that generated this pair of graphs.
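For reference, the probabilities behind graphs like these can be sketched in Python. The settings below are hypothetical stand-ins for the values on the Info sheet, and treating the cutoff as "this many successes or more" is an assumption.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam (lambda)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical settings, standing in for the values on the Info sheet.
n, p, cutoff = 10, 0.3, 5
binomial_bars = [(k, binomial_pmf(k, n, p)) for k in range(n + 1)]
binomial_tail = sum(prob for k, prob in binomial_bars if k >= cutoff)
poisson_bars = [(k, poisson_pmf(k, 2.0)) for k in range(15)]
```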
To modify this file, change the value of lambda (for Poisson) or the probability, n, and cutoff (Binomial) in the Info sheet. Enter new values there, and the graph updates. This is a good example of the usefulness of hooking an info constant to an analysis. If you want to recreate graphs like these, keep in mind these points:
May 25, 2008. P values. One-tail or two-tail? When comparing two groups, you must distinguish between one- and two-tail P values. Some books refer to one-sided and two-sided P values, which mean the same thing. What does one-sided mean? It is easiest to understand by comparing the two questions. The two-tail P value answers this question: Assuming the null hypothesis is true, what is the chance that randomly selected samples would have means as far apart as (or further than) you observed in this experiment with either group having the larger mean?
The one-tail P value answers this question: Assuming the null hypothesis is true, what is the chance that randomly selected samples would have means as far apart as (or further than) observed in this experiment with the specified group having the larger mean?
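The relationship between the two P values can be sketched in Python. This sketch assumes a z statistic and the standard normal distribution for simplicity, whereas a t test would use the t distribution; the function name is illustrative.

```python
from statistics import NormalDist

def tail_p_values(z):
    """Return (two_tail, one_tail) P values for a z statistic.

    z is (mean of the specified group - mean of the other group) divided
    by the standard error, so a positive z is the predicted direction.
    """
    standard_normal = NormalDist()
    two_tail = 2.0 * (1.0 - standard_normal.cdf(abs(z)))
    one_tail = 1.0 - standard_normal.cdf(z)
    return two_tail, one_tail
```

When the observed difference goes in the predicted direction, the one-tail P value is half the two-tail P value. When it goes the other way, the one-tail P value is greater than 0.5.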
When is it appropriate to use a one-sided P value?
Here is an example in which you might appropriately choose a one-tailed P value: You are testing whether a new antibiotic impairs renal function, as measured by serum creatinine. Many antibiotics poison kidney cells, resulting in reduced glomerular filtration and increased serum creatinine. As far as I know, no antibiotic is known to decrease serum creatinine, and it is hard to imagine a mechanism by which an antibiotic would increase the glomerular filtration rate. Before collecting any data, you can state that there are two possibilities: Either the drug will not change the mean serum creatinine of the population, or it will increase the mean serum creatinine in the population. You consider it impossible that the drug will truly decrease mean serum creatinine of the population and plan to attribute any observed decrease to random sampling. Accordingly, it makes sense to calculate a one-tailed P value. In this example, a two-tailed P value tests the null hypothesis that the drug does not alter the creatinine level; a one-tailed P value tests the null hypothesis that the drug does not increase the creatinine level. Recommendation
Common misunderstandings about P values. Kline (see book listing below) lists commonly believed fallacies about P values, which I summarize here: Reference: RB Kline, Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research, 2004.
The Mann-Whitney test doesn't really compare medians. You'll sometimes read that the Mann-Whitney test compares the medians of two groups. But this is not precisely correct. Consider this example:
The graph shows each value obtained from control and treated subjects. The two-tail P value from the Mann-Whitney test is 0.0288, so you conclude that there is a statistically significant difference between the groups. But the two medians, shown by the horizontal lines, are identical. The Mann-Whitney test compared the distributions of ranks, which are quite different in the two groups even though the medians are the same. It is not correct, however, to say that the Mann-Whitney test asks whether the two groups come from populations with different distributions. The two groups in the graph below clearly come from different distributions, but the P value from the Mann-Whitney test is high (0.46).
The Mann-Whitney test compares sums of ranks -- it does not compare medians and does not compare distributions. The Mann-Whitney test is a comparison of medians only when you assume that the distributions of the two populations have the same shape, even if they are shifted (have different medians). If you accept this assumption, then a small P value from a Mann-Whitney test leads you to conclude that the difference between medians is statistically significant. More generally, the test addresses this question: What is the chance that a randomly selected value from the population with the larger median is greater than a randomly selected value from the other population?
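The two quantities mentioned above, the sums of ranks and the chance that a value from one group exceeds a value from the other, can be sketched in Python. This is an illustration of the arithmetic only; it does not compute the P value, and the function names are hypothetical.

```python
def rank_sums(group_a, group_b):
    """Rank all values together (ties share the average rank); return both rank sums."""
    pooled = sorted([(v, "a") for v in group_a] + [(v, "b") for v in group_b])
    sums = {"a": 0.0, "b": 0.0}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for _, label in pooled[i:j]:
            sums[label] += average_rank
        i = j
    return sums["a"], sums["b"]

def chance_a_exceeds_b(group_a, group_b):
    """Fraction of (a, b) pairs in which a is larger; ties count as half."""
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in group_a for b in group_b)
    return wins / (len(group_a) * len(group_b))
```

The two are directly linked: subtracting n(n+1)/2 from a group's rank sum, where n is that group's size, and dividing by the number of pairs gives the same fraction.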
May 20, 2008. Before-after graphs with different colors for different subjects. When you enter data on a column table and choose a before-after graph, Prism plots all the symbols the same way. You can choose different colors or shapes for "before" than for "after" (which is not helpful). And you can right-click on each symbol and change its color (and that of the connecting line). But this approach would be very tedious. Prism 5 can, in fact, create before-after graphs with multiple colors for different subjects. The trick is to enter the data on a Grouped table. Follow these steps or examine this Prism file.
1. Create a Grouped table. Choose the appropriate number of "replicates" (subjects) for your data. Be sure to choose to plot each replicate, and to connect each replicate.
2. Enter the data. Note that the arrangement of data is different than with a column table. The before-after pairs are stacked into subcolumns. This table has two rows, because it plots just before and after. If you have more time points, add more rows. This table has two data sets, male and female, because we want symbols with two different appearances. Use as many data sets as you want. If you want each subject to have its own appearance, create a table with no subcolumns, and enter each subject into its own data set.
3. Polish the graph.
Also see this related example for creating column scatter graphs with multiple symbol colors.
How to turn off automatic snapping. When you move a text object in Prism 5, it will automatically snap into alignment with bars on graphs, groups of bars, the center of the page, and other text objects. This almost always is a great feature, as it lets you quickly move text to an appropriate spot. But sometimes, you may find that the automatic snapping prevents you from fine-tuning a graph.
The Standard Addition Method for determining concentrations. Prism can easily interpolate from a linear or nonlinear standard curve. You perform the assay at a number of known concentrations, fit a line or curve, and interpolate the unknown values. But there is a problem with interpolating from a standard curve. The results can be incorrect when the unknown samples are contaminated with other substances that alter the assay. This is known as the 'matrix effect problem'. The Standard Addition Method is a way to bypass this problem. You don't need to perform the assay with known concentrations of substance. Instead, you add various known concentrations (including zero) of the known substance to a constant amount of the unknown. This ensures that all the samples have the same amount of unknown, including any substances that interfere with the assay. Fit the data with linear regression. The value you want to know is how much of the known substance has to be added to double the signal, because the signal doubles when the amount added equals the amount already present. There is an easier, if less obvious, way to find out: Extrapolate the line down to Y=0. One of the parameters that Prism reports is the X intercept, which will be negative. Take the absolute value, and that is the concentration of the unknown substance. The confidence interval for the X intercept gives you the confidence interval for the concentration of the unknown. Simply multiply both confidence limits by -1. To plot the data in Prism, you'll want to extend the linear regression line to start at an X value equal to the X intercept (a choice in the Linear regression parameters dialog). You may also want to move the origin to the lower left, a choice on the first tab of the Format Axis dialog. Here is a graph created with this Prism file.
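The calculation can be sketched in Python with an ordinary least-squares line. The readings below are hypothetical, chosen so the arithmetic is easy to follow, and the function name is illustrative. Confidence intervals are not computed here.

```python
def standard_addition(added, signal):
    """Fit signal = intercept + slope * added by least squares.

    Returns (slope, intercept, x_intercept, concentration), where
    concentration is the absolute value of the X intercept.
    """
    n = len(added)
    mean_x = sum(added) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in added)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(added, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    x_intercept = -intercept / slope  # where the fitted line crosses Y = 0
    return slope, intercept, x_intercept, abs(x_intercept)

# Hypothetical readings: known amounts added (including zero) and the signal.
added = [0.0, 2.0, 4.0, 6.0]
signal = [2.0, 3.0, 4.0, 5.0]
slope, intercept, x_intercept, concentration = standard_addition(added, signal)
```

In this example the line crosses Y=0 at X = -4, so the estimated concentration is 4. That is also the amount that must be added to double the signal from 2 to 4.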