These are Glossary entries O-R. For Additional Notes, click on the bar within or at the end of the entry.
O P Q R
Observed Value. See rather Actual Value. ("Observed" does not abbreviate well in formulas).
Ogive. A curve like a tall shallow right-leaning S, rising gradually at the beginning (lower left), more or less steadily in the middle, and tapering off gradually at the end (upper right). Cumulative frequency distributions typically give ogive curves. Some, referring to the S-shape, prefer the term "sigmoid" curves.
Ordinal Data. "Order data:" data points whose labels give us information about the sequence of the things reported, but not how far apart the points are within that sequence. If we replace the names Andy, Bert, Charlie, Dave, Ernie (plus their associated measures of height in inches) with the order in which they finished the sack race (Two, Five, One, Three, Four), then we can immediately arrange them in order of sack-fleetness, though we still don't know whether Four (Ernie) came in just a little bit ahead of Five (Bert), or a long way ahead. We can then try to analyze the associated heights to see if height confers an advantage in sack racing, but we won't be able to quantify any advantage that we find. See Interval.
Outlier. A data point notably further out from the central value than the others. Outliers invite explanation as observational errors, or intrusions from another set, or something of that sort. The handling of outliers is involved in the choice between the mean and the median as a measure of central tendency. If outliers are frequent in data sets, and cannot be explained as experimental slips, they tend to indicate that the data set is not normal (Gaussian). A distribution with a significant number of outliers is said to have "thicker tails" than the normal distribution. This situation is much commoner than conventional statisticians (who are in love with the normal distribution) care to admit.
Parameter. A number determining the shape of an equation. Some equations (such as that for the Poisson distribution) have only one parameter, others two or more. The parameters for the normal distribution, which wholly determine the shape of its curve, are the mean and the standard deviation.
Parametric. Problems for which a distribution graph can be drawn, and the equation of the graph is known; the parameters are the numbers needed to define the equation and thus the graph. Parametric situations are amenable to standard mathematical treatment. Compare Nonparametric.
Pascal Triangle. Common name for the Arithmetical Triangle, a suggestive arrangement of the binomial coefficients (qv) in triangular form. It has many uses, and suggests many truths, in number theory and in statistics.
Pascal's Wager. Pascal's proposition was that even if the chance of the Christian doctrine of salvation being true was extremely small, the benefit of eternal life (the prize to the believer if, in fact, the doctrine is true), so that the expectation of the wager was also infinite, and it was therefore a good risk to believe in the doctrine. This is not a position that will much appeal to modern persons, though it received some attention in its own time (the 17th century).
Permutations. The number of ways a given set of things can be arranged, where order is considered to make a difference. Thus the combination of three elements H, H, and T (the outcomes for tossing a coin three times) may be permuted as HHT, HTH, and THH. Compare Combinations.
p (Greek pi) = 3.14159, the ratio of the circumference of a circle to its diameter. This and the also irrational number e = 2.71828 are connected by a very beautiful if unexpected relationship. Both numbers are common in growth formulas, and for that reason also in statistical formulas. See the Note to entry e.
p(n). The probability of n occurrences of a specified type. If we toss a coin three times, p(3H) is the probability that we will get three heads. Probability in this sense is always a number between 0 (cannot occur) and 1 (must inevitably occur). In this example, assuming that the coin is unbiased, p(3H) would equal 1 out of 8, or 0.125. If the probabilities for all possible outcomes of an experiment of this type are added together, they must always equal 1.
Population. The class of individuals or possibilities from which a given sample is drawn. This term is a holdover from the days when statistical analysis was chiefly applied to literal populations: the inhabitatants of a country. This occurred in part under the leadership of Quetelet.
Population Mean. The mean value for a given population. Usually represented by the Greek letter m (mu). Compare Sample Mean.
Power. A number raised to the nth power is multiplied by itself n times, thus x² means x times x, or (x)(x), and x³ = (x)(x)(x). For the symbolism of powers used in these Internet pages (which cannot represent superscript numbers higher than 3), see Exponent.
Quartile. One of three lines dividing the data set into four equal groups. The middle line of the three is the Median, since by definition it divides the data set into two equal halves. See Interquartile Range.
Robust. Capable of successful misuse; continuing to operate under adverse conditions. Used of a statistical procedure which is relatively resistant to blips and oddities in the data set, or to unmet requirements in the assumptions about the underlying population. If a procedure is proper to normally distributed data, but one can regularly get away with using it on data that is not normally distributed, that procedure is robust. Nonparametric tests, which make no assumptions about the nature of the distribution, are typically more robust than those which do make such assumptions.
Roots. The nth root of x is the number which will give x when multiplied by itself n times. Thus the square root of 4 is 2 (since 2 times 2 = 4) and the square root of 100 is 10 (since 10 times 10 = 100). The square root of x is here written r2(x), or r(x) for short, and the nth root of x as rn(x). In this notation
r2(4) = 2,
r(100) = 10, and
r3(125) = 5
Ruria. A statistically ideal country in which all months have six weeks of five days each for a total of 30 days. There are no weekends, and mail is delivered every day. The year ends in an uncounted five-day national orgy. In which junk mail, which has been saved up throughout the year, is ceremonially burned in huge bonfires, one per block.
Statistics is Copyright © 2001- by E Bruce Brooks
Contact The Project / Exit to Statistics Page