Mathematics {statistics} can be about probability, populations, distributions, and testing.
After rank ordering, proximities can show approximate distances {analysis of proximities}.
Statistics {order statistics} can be about sequences.
Two population-sample measurements can have relation {correlation, statistics}| {regression coefficient}. Sample members are measurement pairs. Both measurement values can increase {positive correlation}, decrease {negative correlation}, or have no correlation.
For paired measurements, such as individual weight and height, sum of products of z scores, divided by number of individuals, makes a number {correlation coefficient}| between -1 and +1. Correlation coefficient R measures relations between two variables: R^2 = SSR/SST = 1 - SSE/SST, where SSR is regression sum of squares, SST is total sum of squares, and SSE is error (residual) sum of squares.
z scores
Sum from i = 1 to i = N of z1(i) * z2(i) / N, where z is z score and N is population size.
means
( (sum from i = 1 to i = N of n1(i) * n2(i)) / N - x1 * x2 ) / (s1 * s2), where n is value, x is mean, s is standard deviation, and N is population size.
values
(N * (sum from i = 1 to i = N of n1(i) * n2(i)) - (sum from i = 1 to i = N of n1(i)) * (sum from i = 1 to i = N of n2(i))) / ( (N * (sum from i = 1 to i = N of (n1(i))^2) - (sum from i = 1 to i = N of n1(i))^2)^0.5 * (N * (sum from i = 1 to i = N of (n2(i))^2) - (sum from i = 1 to i = N of n2(i))^2)^0.5 ).
If the two variables are approximately normally distributed, testing correlation coefficient {correlation coefficient test} can show if the two variables relate. Hypothesize that correlation coefficient is zero. Choose significance level. Degrees of freedom are sample size minus two for the two variables. Convert correlation coefficient to t value: t = r * (N - 2)^0.5 / (1 - r^2)^0.5, where r is correlation coefficient and N is number of individuals. If t value is less than t-distribution value at that significance level and degrees of freedom, do not reject hypothesis.
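As a minimal sketch, the Python example below, with made-up paired data, computes the correlation coefficient from z scores and converts it to a t value; the numbers and variable names are illustrative only.
import math
# Hypothetical paired measurements, for example weight and height.
x1 = [60, 65, 70, 75, 80, 85]
x2 = [160, 165, 172, 174, 180, 186]
N = len(x1)
m1, m2 = sum(x1) / N, sum(x2) / N
s1 = math.sqrt(sum((v - m1) ** 2 for v in x1) / N)
s2 = math.sqrt(sum((v - m2) ** 2 for v in x2) / N)
# Correlation coefficient: mean of products of z scores.
r = sum(((a - m1) / s1) * ((b - m2) / s2) for a, b in zip(x1, x2)) / N
# t value for testing r = 0, with N - 2 degrees of freedom.
t = r * math.sqrt(N - 2) / math.sqrt(1 - r ** 2)
print(r, t)  # compare t to a t-table value at the chosen significance level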
Lines {regression line}| {regression curve} closest to all points can go through correlation graphs.
Regression curve can be straight line {linear regression}|.
Regression lines pass as close as possible {best fit}| to all points and so minimize sum of squared vertical distances from points to regression line. Lines pass closest to points if sum of squares of differences from line to points is minimum. Best-fit lines must pass through the point of the means: x2 = m * x1 + b, where x1 and x2 are means, m is slope, and b is y-intercept. m = (N * (sum from i = 1 to i = N of n1(i) * n2(i)) - (sum from i = 1 to i = N of n1(i)) * (sum from i = 1 to i = N of n2(i))) / (N * (sum from i = 1 to i = N of n1(i)^2) - (sum from i = 1 to i = N of n1(i))^2). b = x2 - m * x1.
Regression curves can predict {prediction} second-variable amount from first-variable amount. Second-variable value y equals regression-curve slope m times first variable x plus intercept b: y = m*x + b.
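A minimal Python sketch with hypothetical data shows the best-fit slope, intercept, and a prediction; the numbers are illustrative only.
# Least-squares line for hypothetical paired data (x predicts y).
x = [1, 2, 3, 4, 5]
y = [2.1, 4.3, 5.9, 8.2, 9.8]
N = len(x)
sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
m = (N * sxy - sx * sy) / (N * sxx - sx ** 2)   # slope
b = sy / N - m * (sx / N)                       # intercept: line passes through the means
print(m, b, m * 6 + b)                          # predicted y at x = 6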
Measurement values can have frequencies {distribution, population}| {frequency distribution} {theoretical distribution}, such as frequencies at ages.
Distributions {binomial distribution} can reflect probability of event series that have two possible outcomes. Terms equal (n! / (r! * (n - r)!)) * p^r * (1 - p)^(n - r), where n is event number, r is favorable-outcome number, and p is favorable-outcome probability. Mean equals n*p. Variance equals n * p * (1 - p).
ratio
Small p with caret (^) {p-hat} denotes ratio of subset X to sample size N: p-hat = X/N. p-hat standard deviation = (p * (1 - p) / N)^0.5 or (N * p * (1 - p))^0.5 / N, because p = X/N and 1 - p = (N - X) / N.
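A short Python sketch, with hypothetical n, p, and counts, computes binomial terms, mean, variance, and the p-hat standard deviation.
import math
# Binomial distribution: n events, probability p of a favorable outcome.
n, p = 10, 0.3
terms = [math.comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]
mean = n * p                   # n*p
variance = n * p * (1 - p)     # n*p*(1-p)
# p-hat: sample proportion X/N and its standard deviation.
X, N = 27, 100                 # hypothetical counts
p_hat = X / N
sd_p_hat = math.sqrt(p_hat * (1 - p_hat) / N)
print(sum(terms), mean, variance, p_hat, sd_p_hat)  # terms sum to 1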
Distributions {bivariate distribution} can have two random variables. Average value of (x1 - u1) * (x2 - u2), where x is value and u is mean, measures covariance.
N things can have x of one kind and N - x of another kind {hypergeometric distribution}, with R things drawn without replacement. Mean equals R*p, where p = x/N is favorable-outcome probability and R is number of draws. Variance equals R * p * (1 - p) * ((N - R) / (N - 1)). Terms equal (x! / (x1! * (x - x1)!)) * ((N - x)! / ((R - x1)! * (N - x - R + x1)!)) / (N! / (R! * (N - R)!)), where x1 is number of drawn things of the first kind.
comparisons
Hypergeometric distributions are approximately binomial if number of things is large and number of draws is a small fraction (less than about 0.1) of number of things. Hypergeometric distributions are approximately Poisson if number of things is large and favorable-thing number divided by thing number is less than 0.1. Hypergeometric distributions are approximately normal if mean is greater than or equal to about four.
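A Python sketch with hypothetical N, x, and R values computes hypergeometric terms, mean, and variance as defined above.
import math
# Hypergeometric: N things, x of one kind, R drawn without replacement.
N, x, R = 50, 10, 12
p = x / N
def hypergeom_term(x1):
    # probability that the sample of R contains exactly x1 things of the favorable kind
    return math.comb(x, x1) * math.comb(N - x, R - x1) / math.comb(N, R)
mean = R * p
variance = R * p * (1 - p) * (N - R) / (N - 1)
print(sum(hypergeom_term(x1) for x1 in range(0, min(R, x) + 1)), mean, variance)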
Symmetrical distributions {normal distribution} {normal curve} {Gaussian curve} {Gaussian distribution} over continuous domain can have highest frequencies near mean and lowest frequencies farthest from mean. y = (1 / (s * (2 * pi)^0.5)) * (e^(-(x - u)^2 / (2 * s^2))), where x is domain value, y is frequency, u is mean, and s is standard deviation.
median
In normal distributions, mean equals mode equals median.
approximations
Non-normal distributions can transform to normal distributions using square root of x or logarithm of x.
purposes
Normal distribution models random errors {error curve}. Normal distributions result from measurements that have many factors or random errors. For example, height results from genetics, diet, exercise, and so on, and has normal distribution.
Passing inputs through sigmoidal functions with different thresholds, and then taking the difference of the outputs, gives an approximately Gaussian, bell-shaped curve.
theorem
If many random same-size samples come from a large population, whatever its distribution, sums of samples make an approximately normal distribution {central limit theorem, normal distribution}, as do sample means.
mean
Sample-mean mean is an unbiased population-mean estimate.
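A Python sketch illustrates the central limit theorem by drawing many same-size samples from a uniform (non-normal) population; the sample means cluster normally around the population mean. Sample size and repetition count are arbitrary choices.
import random
import statistics
random.seed(1)
# Population: uniform(0, 1), mean 0.5, standard deviation about 0.2887.
sample_means = [statistics.mean(random.random() for _ in range(30)) for _ in range(2000)]
print(statistics.mean(sample_means))   # close to 0.5, the population mean
print(statistics.stdev(sample_means))  # close to 0.2887 / 30**0.5, about 0.053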
If binomial coefficients are in a triangle {Pascal's triangle} {Pascal triangle}, so that the nth row has the coefficients for (a + b)^n, coefficients are sums of the two nearest preceding-row coefficients. Pascal's triangle is 1 / 1 1 / 1 2 1 / 1 3 3 1 / 1 4 6 4 1 / ...
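A short Python sketch builds Pascal's triangle row by row, each entry the sum of the two nearest entries above.
def pascal_rows(count):
    # yield the first `count` rows of Pascal's triangle
    row = [1]
    for _ in range(count):
        yield row
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
for r in pascal_rows(5):
    print(r)   # [1], [1, 1], [1, 2, 1], [1, 3, 3, 1], [1, 4, 6, 4, 1]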
Asymmetrical distributions {Poisson distribution, statistics} can have most numbers near mean. Domain starts at zero and consists of the non-negative integers (discrete). Mean and variance both equal u = n*p. Terms equal (u^r / r!) * e^-u, where r is favorable-outcome number, from zero to infinity, u equals n*p, p is favorable-outcome probability, and n is event number. Poisson distribution is the binomial-distribution limit as p goes to zero and n goes to infinity with n*p held fixed.
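A Python sketch compares Poisson terms to binomial terms with large n, small p, and n*p held fixed; the particular u and n are arbitrary.
import math
u = 2.0
poisson = [u ** r / math.factorial(r) * math.exp(-u) for r in range(10)]
n = 10000
p = u / n
binomial = [math.comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(10)]
print(max(abs(a - b) for a, b in zip(poisson, binomial)))   # very small difference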
z scores can convert to whole numbers {T score} {standard score} between 0 and 100: T = 50 + z*10. T score has mean 50 and standard deviation 10.
Normal-distribution variable can change from x to z = (x - u) / s {standard measure} {z distribution} {z score}, where u equals mean and s equals standard deviation. New mean equals zero, and new standard deviation equals one. z score measures number of standard deviations from mean.
Outcome or state probability can depend on outcome or state sequence {Markov process}. Events can depend on previous-event order. Transition graphs show event orders needed to reach all events.
Taking samples can find transition matrices, or using transition matrices can generate data points {Markov chain Monte Carlo method} (MCMC). MCMC includes jump MCMC, reversible-jump MCMC, and birth-death sampling.
Hidden Markov models {Markov-modulated Poisson process} can have continuous time.
Functions {autocorrelation function} (ACF) measure correlation between a process and time-lagged copies of itself and can show whether a process description needs extra dimensions or parameters.
Time-series models {autoregressive integrated moving average} (ARIMA) can describe values that vary around means set by hidden Markov chains.
Minus two times Schwarz criterion {Bayesian information criterion} (BIC) can estimate hidden-Markov-model number of states.
Models {Bayesian model} can estimate finite-state Markov chains.
Methods {direct Gibbs sampler, sampling} can sample states in hidden Markov chains.
Algorithms {EM algorithm} alternate expectation (E) and maximization (M) steps to estimate parameters when data are missing or hidden.
Hidden Markov models {finite mixture model} can have equal transition-matrix rows.
Stochastic forward recursions and stochastic and non-stochastic backward recursions {forward-backward Gibbs sampler, sampling} can sample distribution.
Recursion {forward-backward recursion, sampling} can adjust distribution sampling. It is similar to Kalman-filter prediction and smoothing steps in state-space model.
Finite-state Markov chains {hidden Markov model} (HMM) can have hidden states, each emitting observations from a state-dependent distribution.
model
Hidden Markov models are graphical models. Bayesian models are finite-state Markov chains.
purposes
Hidden Markov chains model signal processing, biology, genetics, ecology, image analysis, economics, and network security. Applications often compare an error-free or non-criminal distribution to an error or criminal distribution.
transition
Hidden Markov chain has initial state distribution and time-constant transition matrix.
calculations
Calculations include estimating parameters by recursion {forward-backward recursion, Markov} {forward-backward Gibbs sampler, Markov} {direct Gibbs sampler, Markov}, filling missing data, finding state-space size, preventing switching, assessing validity, and testing convergence by likelihood recursion.
Models {hidden semi-Markov model} can maintain distribution states over times.
In Monte Carlo simulations, if smaller particles surround a particle, where will the particle be at a later time? Answer uses white noise {Langevin equation} {Langevin diffusion}. At long times, mean squared displacement grows linearly with time: 6 * (k * T / (m * lambda)) * t, where lambda is friction coefficient per unit mass, m is particle mass, T is absolute temperature, k is Boltzmann constant, and t is time. The diffusion coefficient is D = (k * T) / (m * lambda).
Recursion {likelihood recursion} can calculate the joint likelihood, typically its logarithm, averaged over time.
Methods {marginal estimation} can estimate hidden Markov distribution and probability.
Methods {maximum a posteriori estimation} (MAP) can estimate hidden Markov distribution and probability.
Methods {Metropolis-Hastings sampler} can sample a target distribution by proposing new points, often from a multivariate normal distribution centered on the current data point, and accepting or rejecting them with the Hastings acceptance probability.
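A minimal Python sketch of a random-walk Metropolis(-Hastings) sampler; the standard-normal target and proposal width are hypothetical choices, and the symmetric proposal makes the Hastings ratio a simple ratio of target densities.
import math
import random
def target_density(x):
    return math.exp(-0.5 * x * x)        # unnormalized N(0, 1) density (illustrative target)
random.seed(0)
x, samples = 0.0, []
for _ in range(10000):
    proposal = random.gauss(x, 1.0)      # normal proposal centered on the current point
    accept_prob = min(1.0, target_density(proposal) / target_density(x))
    if random.random() < accept_prob:    # Hastings acceptance step
        x = proposal
    samples.append(x)
print(sum(samples) / len(samples))       # near 0 for the standard-normal target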
Distributions {predictive distribution} can measure how well models predict actual data.
Asymptotic approximations {Schwarz criterion} to logarithm of state-number probability can depend on likelihood maximized over transition matrix, sample size, and free-parameter number.
Models {state-space model} can use Gaussian distributions for hidden Markov models.
Algorithms {Viterbi algorithm} can find the most likely hidden-state trajectory (maximum a posteriori path), using forward-backward recursion with maximization, rather than averaging, and can estimate hidden Markov distribution.
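A Python sketch of the Viterbi recursion on a small hypothetical hidden Markov model; the transition, emission, and observation values are made up for illustration.
# Two hidden states, two observable symbols (0 and 1).
states = [0, 1]
start = [0.6, 0.4]                                # initial state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]                  # transition matrix
emit = [[0.9, 0.1], [0.2, 0.8]]                   # emission probabilities for symbols 0, 1
observations = [0, 0, 1, 1, 0]
best = [[start[s] * emit[s][observations[0]] for s in states]]
back = []
for obs in observations[1:]:
    probs, pointers = [], []
    for s in states:
        # maximize over previous states instead of averaging (Viterbi step)
        p, prev = max((best[-1][q] * trans[q][s], q) for q in states)
        probs.append(p * emit[s][obs])
        pointers.append(prev)
    best.append(probs)
    back.append(pointers)
# Backtrack from the most likely final state.
path = [max(states, key=lambda s: best[-1][s])]
for pointers in reversed(back):
    path.append(pointers[path[-1]])
path.reverse()
print(path)    # most likely hidden-state trajectory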
Systems can permute things in sequences of positions. Systems can have things in sequences of positions, but sequence does not matter or some things are the same, so some permutations are the same {combination, mathematics}. Combinations count the distinct selections, so number of combinations is less than or equal to number of permutations.
permutations
N positions and N things have number of permutations N! = N * (N - 1) * (N - 2) * ... * 1. N things and R positions, with R < N, have number of permutations N! / (N - R)! = N * (N - 1) * (N - 2) * ... * (N - (R - 1)). N things and R positions, with N <= R, have number of permutations N! = N * (N - 1) * (N - 2) * ... * 1.
combinations
Number of combinations C equals number of permutations P divided by factorial of number of positions R: C = P / R!. Number of combinations of n things taken r at a time is n! / (r! * (n - r)!).
Number of combinations of n + 1 things taken r at a time equals number of combinations of n things taken r - 1 at a time, plus number of combinations of n things taken r at a time: (n + 1)! / (r! * (n + 1 - r)!) = n! / ((r - 1)! * (n - r + 1)!) + n! / (r! * (n - r)!).
Number of combinations of n things taken n - r at a time equals number of combinations of n things taken r at a time: n! / ((n - r)! * r!) = n! / (r! * (n - r)!).
Number of combinations of n things taken r + 1 at a time equals number of combinations of n things taken r at a time, times quantity (n - r) divided by (r + 1): n! / ((r + 1)! * (n - r - 1)!) = (n! / (r! * (n - r)!)) * (n - r) / (r + 1).
binomial
Systems can have things that have states. Two things, each with possible states a and b, have four permutations and three combinations: 1 a*a, 2 a*b, and 1 b*b. Three things with states a and b have eight permutations and four combinations: 1 a*a*a, 3 a*a*b, 3 a*b*b, or 1 b*b*b, and so on.
Combination-term coefficients derive from binomial powers. (a + b)^2 = a^2 + 2*a*b + b^2, (a + b)^3 = a^3 + 3*a^2*b + 3*a*b^2 + b^3, (a + b)^4 = a^4 + 4*a^3*b + 6*a^2*b^2 + 4*a*b^3 + b^4, and so on.
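A Python sketch checks the permutation and combination formulas and identities above with the standard library; n and r are arbitrary example values.
import math
n, r = 5, 2
print(math.perm(n, r))            # permutations: n! / (n - r)! = 20
print(math.comb(n, r))            # combinations: n! / (r! * (n - r)!) = 10
print(math.perm(n, r) // math.factorial(r) == math.comb(n, r))            # C = P / r!
print(math.comb(n + 1, r) == math.comb(n, r - 1) + math.comb(n, r))       # addition identity
print(math.comb(n, n - r) == math.comb(n, r))                             # symmetry identity
print(math.comb(n, r + 1) == math.comb(n, r) * (n - r) // (r + 1))        # ratio identity
print([math.comb(4, k) for k in range(5)])   # binomial coefficients of (a + b)^4: 1 4 6 4 1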
First-event outcome can have no effect on second-event outcome {independent event}. For independent events, to find probability of outcome series, multiply event probabilities: P = p1 * p2 * p3 * ... * pN, where pi are probabilities, and N is event number.
Event outcome can mean that other outcomes cannot happen {disjoint events} {mutually exclusive events}|. For equally probable outcomes, outcome probability is 1/N, where N is number of mutually exclusive outcomes. To find probability of at least one of mutually exclusive outcomes, add all probabilities: P = p1 + p2 + p3 + ... + pN, where pi are probabilities, and N is outcome number.
Systems can have things that can be at sequence positions, or a thing can have a succession of states {permutation}| {arrangement}.
states
Number P of permutations for succession of states is number N of states raised to power of number R of events: P = N^R.
things
For sequences, number of permutations is product of number N of things times number of things minus one, and so on, until number R of sequence positions: N * (N - 1) * (N - 2) * ... * (N - R + 1). Using an outcome makes it unavailable for succeeding events.
Event series {stochastic process}| can have different results with different probabilities. Events can be independent {Bernoulli sequence}. Events can depend on preceding events in a Markov process.
Graphs {transition graph} can show event orders needed to reach all events.
Two-dimensional displays {graph} can show distributions.
Values or value ranges can have bars whose height or length indicates frequency {bar graph}. For example, yellow has three items, orange has four items, and green has two items. Straight lines can connect bar-top midpoints {frequency polygon}.
Bar graphs {histogram}| can display frequency versus property value over property-value range. For example, 0 to 5 has 15 items, 5 to 10 has 20 items, and 10 to 15 has 12 items.
Coordinate systems {scatter diagram} {scattergram}| can display data dispersion.
Event-outcome chance {probability} {risk} is between zero and one. To find probability, count how many times outcome happens compared to how many times event repeats: p = outcomes / events. Probability can be from theory {a priori probability} or from experiment {empirical probability}.
First-event outcome can influence second-event outcome {conditional probability}| {prior probability}. For conditional probability, to find probability that outcome happens in first event and outcome happens in second event, multiply first-outcome probability times modified second-outcome probability {conditional probability law} {law of conditional probability}: P = p1 * p2(p1).
Systematic probability {Kolmogorov probability} {Kolmogorov axioms} can use three axioms. Outcome probability is zero or positive real number. Probability that event has some outcome is one. For disjoint subsets, probability of union of subsets is sum of subset probabilities {sigma-additivity} {additivity}.
The more times an event repeats, the closer the observed outcome frequency comes to the actual probability {large numbers law} {law of large numbers}.
People can believe that an improbable situation that has not happened recently is more likely to happen now, or that the past affects next outcome of random process {Monte Carlo fallacy} {gambler's fallacy}.
Expected outcome divided by outcome value measures risk {risk, outcome}. Expected outcome value is worth or gain multiplied by probability.
After many independent events, relative frequency approaches outcome probability {weak law of large numbers}.
Probability is about 0.5 that at least two people out of 23 have same birthday {birthday, probability}.
What is probability that a needle dropped on parallel lines falls on a line {Buffon's problem} {Buffon problem}? For needle length l no greater than line spacing d, probability that needle touches a line is 2 * l / (pi * d).
First, contestant chooses one of three outcomes. Then contestant learns that one of the other two outcomes is incorrect. Should contestant switch to the remaining outcome or keep current choice {Monty Hall problem}? Switch, because the first choice is correct with probability only 1/3, so the remaining outcome is correct with probability 2/3, now that the revealed outcome has probability zero.
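A Python sketch simulates the Monty Hall problem; switching wins about two thirds of the time. Trial count and labels are arbitrary.
import random
random.seed(0)
trials, switch_wins, stay_wins = 10000, 0, 0
for _ in range(trials):
    prize = random.randrange(3)
    choice = random.randrange(3)
    # Host reveals an outcome that is neither the choice nor the prize.
    opened = next(d for d in range(3) if d != choice and d != prize)
    switched = next(d for d in range(3) if d != choice and d != opened)
    stay_wins += (choice == prize)
    switch_wins += (switched == prize)
print(stay_wins / trials, switch_wins / trials)   # roughly 0.33 and 0.67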
To generate pseudorandom number sequence, square the number and use the middle digits of the result as the next number, then repeat {middle-square method}.
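A Python sketch of one middle-square variant: square a 4-digit number and keep the middle four digits of the square; the seed is arbitrary.
def middle_square(seed, count):
    x = seed
    for _ in range(count):
        x = (x * x) // 100 % 10000   # drop the last two digits, keep the middle four
        yield x
print(list(middle_square(5735, 5)))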
Sampling from tables of random digits {Monte Carlo method} can determine empirical probability.
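A Monte Carlo Python sketch estimates the Buffon's-needle probability by random sampling and compares it to 2 * l / (pi * d); needle length, spacing, and drop count are arbitrary.
import math
import random
random.seed(0)
l, d, drops, hits = 1.0, 2.0, 100000, 0
for _ in range(drops):
    center = random.uniform(0, d / 2)        # distance from needle center to nearest line
    angle = random.uniform(0, math.pi / 2)   # needle angle to the lines
    if center <= (l / 2) * math.sin(angle):
        hits += 1
print(hits / drops, 2 * l / (math.pi * d))   # empirical and theoretical probability, both near 0.318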
Data-collection methods can cause errors {systematic error}.
Normal distribution indicates error frequencies {Gauss-Laplace law}.
Statistics find information about group {population, statistics} {statistical population}. Population has number of people or objects, which have measurable properties {statistic, data} {descriptive statistic} {datum}. Experimenters can check population sample or collect information from all population members {census, population}.
If many random same-size samples come from a population, whether or not its measurements are normally distributed, sample sums and sample means both have approximately normal distributions {central limit theorem, population}.
Samples have numbers {degrees of freedom, statistics}| of independent values, free to change. Degrees of freedom equal value number minus one, because last value is total minus other values and so is dependent. If situation has factors, factor degrees of freedom are factor number minus one, because last factor depends on the others.
Taking {sampling}| one value {sample, population} from a population can be random {random sample}. Sample sets {aggregate, population} have number of samples {sample size}.
attributes
Samples have properties {parameter, population} {sampling statistic}, such as weight. Samples have property values {attribute data} {measurement data}, such as high weight. Sampling can infer {statistical inference} population mean, variance, mode, and median.
types
Random samples can have same subgroup proportions as population {stratified random sample}, such as same age distribution.
Sampling can return sampled member to population {sampling with replacement} or not {sampling without replacement}.
subsets
Sample subsets {sample class} share an attribute, such as high weight.
Data can have linear changes over time {trend}| {trend line}.
Numbers can have coefficients {weight, multiplier} based on frequency or importance {weighting}|. Averages {weighted average} can use weights.
Means can have an estimate {estimate} {estimation}|.
Intervals {confidence interval}| around estimated means have a confidence level that population mean is in the interval. Unbiased estimate can have 95% confidence if low estimate equals x - (1.96 * s) / N^0.5 and high estimate equals x + (1.96 * s) / N^0.5, where x is mean, s is standard deviation, and N is sample size.
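A Python sketch computes a 95% confidence interval for a hypothetical sample, using the 1.96 value above.
import math
import statistics
sample = [12.1, 11.8, 12.6, 12.0, 11.5, 12.3, 12.4, 11.9, 12.2, 12.0]   # hypothetical data
N = len(sample)
x = statistics.mean(sample)
s = statistics.stdev(sample)           # sample standard deviation
half_width = 1.96 * s / math.sqrt(N)
print(x - half_width, x + half_width)  # interval estimated to contain the population mean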
On average, situations have sets of expected outcomes {expectation, statistics} {expected value}|. Weighting outcome values by their probabilities and summing finds expected value: sum from i = 1 to i = N of w(i) * p(i), where N is number of outcomes, w(i) is outcome worth, and p(i) is outcome probability.
Average is not the best estimate {Stein's paradox} {Stein paradox}. Best estimate shrinks average toward grand average: e = U + f * (u - U), where u is average, U is grand average, and shrinkage factor f is between 0 and 1. Factor depends on standard deviation, as in empirical Bayes estimation.
Estimates have a population, which sets the correct confidence interval {unbiased interval}. Sample means are unbiased population-mean estimates. Sample variances, calculated with the N - 1 denominator, are unbiased population-variance estimates.
Percent of numbers that are less than a number plus half of percent of numbers equal to the number is a statistic {percentile rank} {percentile}|. For example, if n is at 50th percentile {second quartile, percentile} {fifth decile}, half of all values are less than or equal to n.
Dispersion {inter-quartile range} can be between lower quartile 25% and upper quartile 75%. Median splits inter-quartile range.
Values have cumulative probability {quantile} {inverse density function} {percent point function}, the probability for that value and all lower values: the integral of the probability density from minimum to value. The quantile function inverts this, giving the value at which a given cumulative probability is reached.
For distributions, one quarter {quartile} goes from 0th to 25th percentile {first quartile}. One quarter goes from 25th to 50th percentile {second quartile, distribution}. One quarter goes from 50th to 75th percentile {third quartile}.
Deviation {quartile deviation} (Q) can be half the difference between first and third quartiles.
Population measurement, such as weight, has calculated numbers {statistic, population}, such as median, mode, mean, and range.
Standard deviation divided by mean {coefficient of variation} {coefficient of variability} {variation coefficient} can measure population variation.
Sum from i = 1 to i = N of n(i) /N, where N equals number of numbers, and n(i) equals number, is a statistic {mean, population}| {average}. For example, the numbers 1, 2, 2, 3, 4, 5, and 6 have mean equal to (1 + 2 + 2 + 3 + 4 + 5 + 6) / 7. Average is number-group balance point, because sum of differences between numbers and mean equals zero.
If numbers are in sequence, middle number of odd number of numbers, or average of two middle numbers of even number of numbers, is a statistic {median, population}|. For example, the numbers 1, 2, 2, 3, 4, 5, and 6 have median equal 3.
Number with greatest frequency is a statistic {mode, population}|. For example, the numbers 1, 2, 2, 4, 5, and 6 have mode equal 2.
Difference between lowest and highest number is a statistic {range, number}. For example, the numbers 1, 2, 2, 3, 4, 5, and 6 have range equal 5.
Error divided by mean is a quotient {relative error, statistics}.
Number sets have variance spread {dispersion, statistics}|. Dispersion is like moment of inertia of numbers around balance point: sum from i = 1 to i = N of (n(i) - x)^2 / N, or (sum from i = 1 to i = N of (n(i))^2) / N - x^2, where n(i) are numbers, and x equals mean.
Fourth moment {kurtosis, distribution} measures how fat or thin the distribution tails are.
Square root of mean of squares of differences between numbers and mean {root mean square} (RMS) can equal standard deviation.
Third moment {skewness, distribution} measures distribution asymmetry, whether it is more to right or left of mean. Skew distribution is not symmetric. To find skewness, calculate median and compare to mean.
Sample-mean distribution standard deviation {standard error of the mean}| is smaller than population standard deviation: s / N^0.5, where s is population standard deviation, and N is sample size.
Variance has a square root {standard deviation}|.
Second moments of numbers around balance point, like moments of inertia, measure dispersion {variance, distribution}|: sum from i = 1 to i = N of (n(i) - x)^2 / N, or (sum from i = 1 to i = N of (n(i))^2) / N - x^2, where n(i) are numbers, and x equals mean.
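A Python sketch computes mean, median, mode, range, variance, and standard deviation for the example numbers used above.
import statistics
data = [1, 2, 2, 3, 4, 5, 6]
print(statistics.mean(data))       # 23 / 7
print(statistics.median(data))     # 3
print(statistics.mode(data))       # 2
print(max(data) - min(data))       # range: 5
print(statistics.pvariance(data))  # population variance: mean squared deviation from the mean
print(statistics.pstdev(data))     # population standard deviation: square root of the variance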
Using a parameter {statistic, test} can test {test, statistics} {statistical tests} whether two sample groups from population are similar.
statistic
Statistic can be mean or variance.
hypothesis
Hypothesis is false if it leads to contradiction, and then hypothesis opposite is true. Assume groups have no difference for statistic {null hypothesis, statistical test}. Try to reject null hypothesis.
significance
Choose allowable probability {significance level, test}, usually 5%, that rejected hypothesis is actually true {type-one error, test}. Accepted hypothesis can instead be actually false {type-two error, test}; significance level does not directly set that probability.
calculation
Calculate probability that sample group is from population.
comparison
Reject hypothesis if probability is less than 5%. The statistic {two-tailed test, error} can be either too high or too low, the usual case. The statistic {one-tailed test} can be tested in only one direction, too low or too high, but not both. It is hard to prove hypotheses true, but hypotheses are false if they lead to contradiction, thus proving hypothesis opposite.
Statistics can test statement {hypothesis, statement} about population {hypothesis testing}.
To test, take sample. Sample can be from hypothesized population or not. Sample can represent hypothesized population or not. If sample is from hypothesized population and represents it, hypothesis is true. If sample is not from hypothesized population and does not represent it, hypothesis is false.
errors
If sample is from hypothesized population but does not represent it, hypothesis is true but seems false. Errors {Type I error} can happen, with probability {alpha risk}.
If sample is not from hypothesized population and represents it, hypothesis is false but seems true. Errors {Type II error} can happen, with probability {beta risk}.
independence
Perhaps, events, such as political party, job type, sex, or age category, do not affect each other {independence, statistics}. Tests can see if they do affect each other {dependence, statistics}.
Testing variable found in mutually exclusive groups from same population, normally distributed or not, can show if groups are equivalent {analysis of variance} {variance analysis} (ANOVA).
measurements
In ANOVA, measurements are sums of four parts: mean, class or treatment effect, sampling or measurement effect, and normally distributed random error. ANOVA checks if sampling error or class error is great enough compared to random error to make samples or classes actually different.
process
Assume groups are equivalent and so have same mean. Set significance level to 5%.
For two or more groups, calculate variance ratio {variance ratio} {F value, ANOVA}. Group degrees of freedom are group number minus one: C - 1. Sample (error) degrees of freedom df are total sample size minus group number C: df = N1 + N2 + ... - C.
Numerator is between-group variance: ( ( (sum from i = 1 to i = N1 of n1(i))^2 / N1 + (sum from i = 1 to i = N2 of n2(i))^2 / N2 + ... ) - (sum from i = 1 to i = N1 + N2 + ... of n(i))^2 / (N1 + N2 + ...) ) / (C - 1). Denominator is within-group (error) variance: ( (sum from i = 1 to i = N1 + N2 + ... of (n(i))^2) - ( (sum from i = 1 to i = N1 of n1(i))^2 / N1 + (sum from i = 1 to i = N2 of n2(i))^2 / N2 + ... ) ) / (N1 + N2 + ... - C), where n is measurement value, N is sample size, and C is sample number.
F values form distributions {F distribution, ANOVA} that vary with degrees of freedom and significance.
If calculated F value is less than F value in F distribution for same degrees of freedom and significance level, accept that samples have same mean. If calculated F value is more, reject hypothesis, so at least one sample is not random, or samples are from different populations.
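A Python sketch computes the one-way ANOVA F value for hypothetical groups, using the between-group and within-group mean squares above.
# Hypothetical measurements for three groups (classes or treatments).
groups = [[6.1, 5.8, 6.4, 6.0], [6.9, 7.2, 6.8, 7.0], [5.5, 5.9, 5.7, 5.6]]
C = len(groups)
N = sum(len(g) for g in groups)
grand_total = sum(sum(g) for g in groups)
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - grand_total ** 2 / N
ss_within = sum(v ** 2 for g in groups for v in g) - sum(sum(g) ** 2 / len(g) for g in groups)
ms_between = ss_between / (C - 1)   # between-group mean square, C - 1 degrees of freedom
ms_within = ss_within / (N - C)     # within-group (error) mean square, N - C degrees of freedom
print(ms_between / ms_within)       # F value; compare to an F-table value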
mean square
For samples or treatments, sum of squares of differences between values and mean, divided by degrees of freedom, is the mean square {mean square}. Mean square estimates population variance. F value is sample or treatment mean square divided by error mean square.
missing data
Least-squares method can estimate missing data.
types
Replications are like classes {randomized blocks plan}.
Testing interactions between treatments can be at same time as testing interactions between samples {two-way classification}.
comparison to t test
With two samples, F test and t test are similar.
In populations, tests {chi square} can show whether one property, such as treatments, affects another property, such as recovery level. Properties have mutually exclusive outcomes, such as cured or not.
process
Hypothesize that events are independent. Select significance level, typically 5%. Make contingency table.
Calculate degrees of freedom: (R - 1) * (C - 1), where R is number of table rows and C is number of table columns.
Calculate chi square value: X = sum from i = 1 to i = R and from j = 1 to j = C of (x(i,j) - f(i,j))^2 / f(i,j), where x(i,j) is observed frequency and f(i,j) is expected frequency. f(i,j) = (sum from i = 1 to i = R of x(i,j)) * (sum from j = 1 to j = C of x(i,j)) / (sum from i = 1 to i = R and from j = 1 to j = C of x(i,j)), where x(i,j) is observed frequency.
result
If calculated chi square value is less than actual chi square value for degrees of freedom at significance level, do not reject hypothesis.
Table rows can be possible outcomes of one variable, and table columns can be possible outcomes of other variable {contingency table}. Table cells are numbers {observed frequency} having both outcomes. For example, table rows can be Wide and Narrow people, and table columns can be Tall and Short people, so table 0 1 / 2 3 has 0 of Wide and Tall and 3 of Narrow and Short.
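A Python sketch computes expected frequencies and the chi square value for a hypothetical contingency table.
# Hypothetical 2 x 2 contingency table (rows: Wide, Narrow; columns: Tall, Short).
observed = [[10, 20], [30, 40]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
chi_square = 0.0
for i, row in enumerate(observed):
    for j, x in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total   # row total * column total / grand total
        chi_square += (x - expected) ** 2 / expected
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(chi_square, df)   # compare to a chi-square table value at the significance level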
Testing two same-size samples, each from different and not necessarily normally distributed populations, can show if populations are the same {F test}. Hypothesize that samples are the same. Set significance level to 5%. Sample degrees of freedom are sample size minus one. Calculate variance ratios between two samples {F distribution, test} {F value, test}: v1 / v2, where v is sample variance. If calculated F value is less than actual F value for degrees of freedom at significance level, do not reject hypothesis. F distribution measures variance distribution.
Experiments {factorial experiment} can use factors. Several variables {factor, variable} can affect dependent variable. Set up ANOVA. Calculate factor effects, by finding average of differences between variable measurements, while holding other factors at their constant means. Calculate factor interactions, by finding differences between variable-measurement changes, at varying other-variable levels. Small differences mean little interaction. For no interactions, factor effects can add.
ANOVA {nested classification} {hierarchical classification} can have sample treatments and classes.
Randomizing treatments over replications {Latin square} can control two variation sources.
ANOVA {mixed model ANOVA} can have no treatments or classes, with sample subsamples.
ANOVA {fixed effect model} {Model I analysis of variance} can have no treatments or classes and only replicate samples.
ANOVA {random effect model} {Model II analysis of variance} can have one random sample from any number of treatments or classes.
Tests {distribution-free test} {non-parametric test}, such as sign tests, can be for unknown distributions. Sign and difference value can make a non-parametric test {sign rank test}.
Assume groups have no difference for parameter {null hypothesis, test}|.
Methods {sequential analysis} can test paired attribute data to decide between classes/treatments. For all pairs, check treatment differences. Count only significant differences. Observe until number of accepted counted differences exceeds number required by significance level and total number of observations, or until total number of observations exceeds threshold. If difference number is greater than threshold, accept that one treatment is better.
Testing two samples with matched pairs {sign test} can show if they are from same, not necessarily normally distributed, population.
matched pairs
Before and after comparison has matched pairs.
hypothesis
Hypothesize that first and second samples show no change.
significance
Set significance level to 5%.
degrees of freedom
Degrees of freedom are sample size minus one.
calculation
For all matched pairs, subtract before-value from after-value to get plus, minus, or no sign. Add positive signs.
Use probability of 1/2 for getting positive sign, because samples are the same by hypothesis. Calculate binomial-distribution z score: (P - N * 0.5) / (N * 0.5 * 0.5)^0.5, where P is positive-sign number, N is sample size, mean equals N * 0.5, and standard deviation equals (N * p * (1 - p))^0.5.
test
If z score is less than normal-distribution value at significance level, do not reject hypothesis.
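A Python sketch of the sign test on hypothetical before/after pairs, using the z score above.
import math
before = [72, 68, 75, 80, 66, 71, 69, 74, 77, 70]   # hypothetical matched pairs
after  = [70, 69, 72, 78, 66, 68, 67, 73, 74, 69]
signs = [a - b for a, b in zip(after, before) if a != b]   # drop ties
N = len(signs)
P = sum(1 for s in signs if s > 0)                          # number of positive signs
# Under the null hypothesis p = 0.5, so mean = N * 0.5 and sd = (N * 0.25)^0.5.
z = (P - N * 0.5) / math.sqrt(N * 0.25)
print(P, N, z)   # compare |z| to the normal value at the significance level (1.96 for 5%)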
Choose allowable probability {significance level, statistics}|, usually 5%, that rejected hypothesis is actually true {type-one error, significance}. Accepted hypothesis can instead be actually false {type-two error, significance}; significance level does not directly set that probability.
Sample can test the hypothetical mean of normally distributed population {t test} {one-sample t test}. Hypothesize that sample and population means are equal. Set significance level to 5%. Sample size less one gives independent-value number {degrees of freedom, t test}. Calculate distribution of same-size-sample means with same degrees of freedom. Result is similar to normal distribution, except distribution includes degrees of freedom {t value} {t distribution}: t = (x - u)/e, where x is sample mean, u is hypothetical population mean, and e is sample-mean standard error. If calculated t value is less than actual t value for significance level and degrees of freedom, do not reject hypothesis.
two samples
Testing two independent samples from population can show if samples are from same population. Hypothesize that first and second sample means are equal. Set significance level to 5%. Degrees of freedom involve both sample sizes: (N1 - 1) + (N2 - 1) = N1 + N2 - 2. Calculate t value: t = (x1 - x2)/e, where x is sample mean. e is standard error of difference, which equals ( ( (v1 * (N1 - 1) + v2 * (N2 - 1)) / (N1 + N2 - 2) )^0.5) * ((1 / N1 + 1 / N2)^0.5), where v is sample variance and N is sample size. If t value is less than t-distribution value with same degrees of freedom at significance level, do not reject hypothesis.
paired samples
Testing two paired samples, or matched pair samples, can show if they are from same population. Hypothesize that first and second sample means are equal. Set significance level to 5%. Degrees of freedom are sample size minus one. Calculate t value: t = (sum from i = 1 to i = N of (n1(i) - n2(i))) / e, where N is sample size and n is sample value. e equals ( (N * (sum from i = 1 to i = N of (n1(i) - n2(i))^2) - (sum from i = 1 to i = N of (n1(i) - n2(i)))^2 ) / (N - 1) )^0.5, which is the difference standard deviation times N^0.5. If t value is less than t-distribution value with same degrees of freedom at significance level, do not reject hypothesis.
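A Python sketch computes one-sample, two-sample, and paired t values for hypothetical samples; compare each to a t table at the chosen significance level and degrees of freedom.
import math
import statistics
a = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2]   # hypothetical samples
b = [5.6, 5.4, 5.7, 5.5, 5.8, 5.3]
# One-sample t: does sample a have hypothetical population mean 5.0? (n - 1 degrees of freedom)
n = len(a)
t_one = (statistics.mean(a) - 5.0) / (statistics.stdev(a) / math.sqrt(n))
# Two-sample (pooled) t, with n + m - 2 degrees of freedom.
m = len(b)
pooled_var = (statistics.variance(a) * (n - 1) + statistics.variance(b) * (m - 1)) / (n + m - 2)
t_two = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / n + 1 / m))
# Paired t on the pairwise differences, with n - 1 degrees of freedom.
d = [x - y for x, y in zip(a, b)]
t_paired = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))
print(t_one, t_two, t_paired)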
Statistic can be too high and/or too low {two-tailed test, statistics}, the usual case.
Rejected hypothesis can be actually true {type-one error, statistics}.
Accepted hypothesis can be actually false {type-two error, statistics}.
Testing {z test} an outcome that recurs in successive events can show if it is a true outcome.
set up
Assume outcome is true outcome. Set significance level to 5%. Degrees of freedom are sample size minus one. Add number of times outcome happens. Use probability, usually 0.5, suggested by null hypothesis for outcome.
test
Calculate z score for binomial distribution: (P - N * p) / (N * p * (1 - p))^0.5, where P is outcome number, p is probability, N is event number, mean equals N * p, and standard deviation equals (N * p * (1 - p))^0.5. If z score is less than normal-distribution value at significance level, do not reject hypothesis.
Theories {Fisher theory} can be about experiment design.
Theories {Neyman-Pearson theory} can be about hypothesis testing.