page sections

introduction
chi-squared tests
z scores
t test and ANOVA
confidence intervals
correlation and linear regression
logistic regression
measures of central tendency
standard deviation and standard error
resources and references

Statistical Testing

last authored:
last reviewed:

Introduction

Statistics allow us to collect, organize, and analyze quantititive data after they are converted, or standardized, into test statistics.

Measures whether there is a difference between groups.

The point of most statistical tests is to determine how likely it is that a given test statistic is within a relevant distribution.

When to use different tests:

testing frequencies or percentages:

chi-square test
- how many people who undergo surgery get pneumonia?
- how many people born in Nova Scotia live until they are 80%

comparing means:

z-test with small samples (<30)
- cholesterol levels, over 10 years, among members of a family
t-test for larger samples (>30)
- cholesterol levels between Halifax and Dartmouth
ANOVA if more than two groups
- cholesterol levels across the provinces of Canada

return to top

Chi-Square Tests

Chi-square tests examine dichotomous dependent variables. It is good for frequencies or percentages.

A 2x2 table is normally used, with actual values compared with expected values to examine whether the difference is statistically significant.

Good for frequencies.

The test statistic is χ², and significance is judged by the p value.

The Fisher's exact test is similar to two dichotomous variables, but used when there are small expected frequencies in one or more cells.

Z Scores

Z scores are used to test two sample means of normally distributed, continuous data, and describe how far away a sample mean is away from the population mean, using standard error as the measurement. The mean score is standardized to 0, and the standard error is 1.

Z score = (sample mean - population mean) / se

A positive z score is a value above the mean, while a negative z score is a value below.

Charts provide the area under the curve to give the distance the sample mean is from the reference mean.

P values represent the probability that the observed value could be at least as extreme as it is, due to sampling alone.

T Test and ANOVA

T-tests are used for continuous dependent variables across two independent variables, with means often being compared. If there are more than two independent variables, ANOVA is used instead. ANOVA, or analysis of variance, is used to compare continuous variables across different levels of one or more categorical variables.

The 'paired' t-test is often used to compare two groups of subjects, with two sets of measurements on each group - a before/after type thing.

A one-tailed T test is not as rigorous as a two tailed test, as all of α is in one tail and therefore has relaxed criteria for accepting the alternative hypothesis.

F ratios are obtained and p values are used to judge significance.

return to top

Confidence Intervals

Confidence intervals are used to show the precision with which the mean has been derived. Given a 95% CI, there is a 5% chance that the population mean is outside the CI, or 2.5% in each tail.

90% CI = sample mean +/- 1.64 se
95% CI = sample mean +/- 1.96 se
99$ CI = sample mean +/- 2.58 se

return to top

Correlation and Linear Regression

Correlations measure how one continuous variable is affected by another continuous variable, ie systolic bp and heart rate. Correlation is actually a type of linear regression.

Test statistic is r, the correlation coefficient. r =0 is no correlation while an r = - / + 1 is the extremes. P values are sued to judge statistical significance.

Linear regression examines continuous dependent variables and (usually) continuous independent variables and shows how useful independent variables are in predicting dependent variables.

The main outcome is R², or the coefficient of determination, which describes the percentage of variation in the dependent variable explained by the variation in the independent variable.

Slope coefficients for each independent variable describe the effect of the independent variable. For example a coefficient of 0.5 means that for every 1.0 increase in the independent, there is a 1.0 increase int he dependent variable.

return to top

Logistic Regression

Logistic regression is used for dichotomous dependent variables and any type of independent variables.

The great thing about logistic regression is that for each independent variable, when an exponent is used for the slope coefficient, and estimate of the odds ratio can be obtained, adjusted for all other variables in the regression.

return to top

Measures of Central Tendency

In a normally distributed population, all three measures of central tendency should be very similar. If not the population is skewed.

mean: average
median: middle value (or average of the middle two)
mode: most common value

The mean is the most affected by distribution; it is imortant to compare and report all three.

return to top

Standard Deviation and Standard Error

to show variation, use SD
to show mean, use SE (plus 95% CI)
if variation is biological, SD is more accurate; if variation is experimental, SE is better

Standard deviation describes the normal variation, or scatter, seen in a population and is represented by a bell curve extending from the true mean.

SD measures the average distance of individual values from the mean, while standard error is the standard deviation of a population of sample means, rather than of individual observations.

Standard error therefore refers to the variability of means and measures how accurately you know the mean.

SE = SD/( sqrt N)

SD doesn't get smaller with an increasing number of samples, while standard error should.

To calculate SD, start with the variance.

The variance is calculated by adding all individual values together and dividing by (n-1)

SD = square root of variance.

SD is calculated by adding the variance of observations around the mean, summing them,The SD can be used to describe the individual observations that make up raw data, but is not useful for determining how close a sample mean from data is from the true mean.

In a normal population, 68% of values will be within 1 SD, 95% will be within 2 SD, and 99.7% will be within 3 SD.

Effects Size

Effects size is (x1-x2)/SD

0.2 - minimal clincial benefit
0.5 - moderate clinical benefit
0.8 - substantial clinical benefit

return to top

Resources and References

return to top