Forest ErrAna StatDist


Parent Distribution

Let [math]x_i[/math] represent our ith attempt to measure the quantity [math]x[/math].

Due to the random errors present in any experiment we should not expect [math]x_i = x[/math].

If we neglect systematic errors, then we should expect [math] x_i[/math] to, on average, follow some probability distribution around the correct value [math]x[/math].

This probability distribution can be referred to as the "parent population".


Average and Variance

Average

The word "average" is used to describe a property of a "parent" probability distribution or a set of observations/measurements made in an experiment which gives an indication of a likely outcome of an experiment.

The symbol

[math]\mu[/math]

is usually used to represent the "mean" of a known probability (parent) distribution (parent mean) while the "average" of a set of observations/measurements is denoted as

[math]\bar{x}[/math]

and is commonly referred to as the "sample" average or "sample mean".



Definition of the mean

[math]\mu \equiv \lim_{N\rightarrow \infty} \frac{\sum x_i}{N}[/math]


Here the mean of a parent distribution is defined in terms of an infinite sum of observations [math](x_i)[/math] of an observable [math]x[/math], divided by the number of observations.

[math]\bar{x}[/math] is a calculation of the mean using a finite number of observations

[math] \bar{x} \equiv \frac{\sum x_i}{N}[/math]


This definition relies on the assumption that the sample average [math]\bar{x}[/math] measured in an experiment asymptotically approaches the "true" average [math]\mu[/math] of the parent distribution.
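Below is a minimal Python sketch (the Gaussian parent, its parameters, and the numpy library are all assumptions made for illustration, not part of this page) showing the sample mean approaching the parent mean [math]\mu[/math] as the number of observations grows.

<pre>
# Minimal sketch: the sample mean of draws from an assumed parent distribution
# approaches the parent mean mu as the number of observations N grows.
import numpy as np

rng = np.random.default_rng(0)
mu_parent = 3.0          # assumed "true" mean of a Gaussian parent distribution
sigma_parent = 0.5       # assumed parent standard deviation

for N in (10, 1000, 100000):
    x = rng.normal(mu_parent, sigma_parent, size=N)   # N observations x_i
    xbar = x.sum() / N                                 # sample mean = sum(x_i)/N
    print(f"N = {N:6d}   sample mean = {xbar:.4f}   parent mean = {mu_parent}")
</pre>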

Variance

The word "variance" is used to describe a property of a probability distribution or a set of observations/measurements made in an experiment which gives an indication how much an observation will deviate from and average value.

A deviation [math](d_i)[/math] of any measurement [math](x_i)[/math] from a parent distribution with a mean [math]\mu[/math] can be defined as

[math]d_i\equiv x_i - \mu[/math]

The deviations average to ZERO for an infinite number of observations, by the definition of the mean.

Using the definition of the mean

[math]\mu \equiv \lim_{N\rightarrow \infty} \frac{\sum x_i}{N}[/math]

the average of the deviations is

[math]\lim_{N\rightarrow \infty} \frac{\sum (x_i - \mu)}{N}[/math]
[math]= \left ( \lim_{N\rightarrow \infty} \frac{\sum x_i}{N}\right ) - \mu[/math]
[math]= \left ( \lim_{N\rightarrow \infty} \frac{\sum x_i}{N}\right ) - \lim_{N\rightarrow \infty} \frac{\sum x_i}{N} = 0[/math]


But the AVERAGE DEVIATION [math](\bar{d})[/math] is given by an average of the magnitude of the deviations given by

[math]\bar{d} = \lim_{N\rightarrow \infty} \frac{\sum \left | (x_i - \mu)\right |}{N}[/math] = a measure of the dispersion of the expected observations about the mean

Taking the absolute value though is cumbersome when performing a statistical analysis so one may express this dispersion in terms of the variance

A typical variable used to denote the variance is

[math]\sigma^2[/math]

and is defined as

[math]\sigma^2 = \lim_{N\rightarrow \infty}\left [ \frac{\sum (x_i-\mu)^2 }{N}\right ][/math]


Standard Deviation

The standard deviation is defined as the square root of the variance

S.D. = [math]\sigma = \sqrt{\sigma^2}[/math]


The mean should be thought of as a parameter which characterizes the observations we are making in an experiment. In general the mean specifies the probability distribution that is representative of the observable we are trying to measure through experimentation.


The variance characterizes the uncertainty associated with our experimental attempts to determine the "true" value. Although the mean and true value may not be equal, their difference should be less than the uncertainty given by the governing probability distribution.

Another Expression for Variance

Using the definition of variance (omitting the limit as [math]N \rightarrow \infty[/math])

Evaluating the definition of variance
[math]\sigma^2 \equiv \frac{\sum(x_i-\mu)^2}{N} = \frac{\sum (x_i^2 -2x_i \mu + \mu^2)}{N} = \frac{\sum x_i^2}{N} - 2 \mu \frac{\sum x_i}{N} + \frac{N \mu^2}{N} [/math]
[math] = \frac{\sum x_i^2}{N} -2 \mu^2 + \mu^2 =\frac{\sum x_i^2}{N} - \mu^2[/math]


[math]\frac{\sum(x_i-\mu)^2}{N} =\frac{\sum x_i^2}{N} - \mu^2[/math]
[math]\Rightarrow \sigma^2 = E[(x-\mu)^2] = \sum_{i=1}^n (x_i - \mu)^2 P(x_i)[/math]
[math]= E[x^2] - \left ( E[x]\right )^2 = \sum_{i=1}^n x_i^2 P(x_i) - \left ( \sum_{i=1}^n x_i P(x_i)\right )^2[/math]
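A quick numerical check of this identity for an assumed discrete distribution (the values, probabilities, and numpy library below are illustrative only, not part of this page):

<pre>
# Check that sum (x_i - mu)^2 P(x_i) equals sum x_i^2 P(x_i) - (sum x_i P(x_i))^2
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])          # assumed possible observable values
P = np.array([0.1, 0.2, 0.3, 0.4])          # assumed probabilities (sum to 1)

mu = np.sum(x * P)                            # E[x]
var_direct = np.sum((x - mu) ** 2 * P)        # E[(x - mu)^2]
var_moments = np.sum(x ** 2 * P) - mu ** 2    # E[x^2] - (E[x])^2
print(var_direct, var_moments)                # both print the same number
</pre>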

Average for an unknown probability distribution (parent population)

If the "Parent Population" is not known, you are just given a list of numbers with no indication of the probability distribution that they were drawn from, then the average and variance may be calculate as shown below.

Arithmetic Mean and variance

If [math]N[/math] observations are made in an experiment then the arithmetic mean of those observations is defined as

[math]\bar{x} = \frac{\sum_{i=1}^{i=N} x_i}{N}[/math]


The "unbiased" variance of the above sample is defined as

[math]s^2 = \frac{\sum_{i=1}^{i=N} (x_i - \bar{x})^2}{N-1}[/math]
If you were told that the average is [math]\bar{x}[/math], then you can calculate the "true" variance of the above sample as

[math]\sigma^2 = \frac{\sum_{i=1}^{i=N} (x_i - \bar{x})^2}{N}[/math] = Mean Squared Error (its square root is the Root Mean Squared, or RMS, error)
Note
RMS = Root Mean Square = [math]\sqrt{\frac{\sum_i^n x_i^2}{n}}[/math]
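A minimal Python sketch of these sample statistics, using an assumed list of measurements (the numbers and the numpy calls are illustrative only):

<pre>
# Sample mean, "unbiased" variance (divide by N-1), variance relative to the
# sample mean (divide by N), and the RMS of a plain list of numbers.
import numpy as np

x = np.array([4.8, 5.1, 5.0, 4.9, 5.3, 4.7])    # assumed measurements
N = len(x)

xbar = x.sum() / N                               # arithmetic mean
s2_unbiased = ((x - xbar) ** 2).sum() / (N - 1)  # unbiased sample variance
s2_biased = ((x - xbar) ** 2).sum() / N          # divide by N instead
rms = np.sqrt((x ** 2).sum() / N)                # root mean square of the x_i

print(xbar, s2_unbiased, s2_biased, rms)
# numpy shortcuts: x.mean(), x.var(ddof=1), x.var(ddof=0)
</pre>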

Statistical Variance decreases with N

Where does this idea of an unbiased variance come from?


Consider the following:

The average value of the mean of a sample of n observations drawn from the parent population is the same as the average value of each observation. (The average of the averages is the same as one of the averages)

[math]\bar{x} = \frac{\sum x_i}{N} =[/math] sample mean
[math]\overline{\left ( \bar{x} \right ) } = \frac{\sum{\bar{x}_i}}{N} =\frac{1}{N} N \bar{x_i} = \bar{x}[/math] if all means are the same

This is the reason why the sample mean is a measure of the population average ( [math]\bar{x} \sim \mu[/math])

Now consider the variance of the average of the averages (this is not the variance of the individual measurements but the variance of their means)

[math]\sigma^2_{\bar{x}} = \frac{\sum \left (\bar{x} -\overline{\left ( \bar{x} \right ) } \right )^2}{N} =\frac{\sum \bar{x_i}^2}{N} -\left( \overline{\left ( \bar{x} \right ) } \right )^2[/math]
[math]=\frac{\sum \bar{x_i}^2}{N} -\left( \bar{x} \right )^2[/math]
[math]=\frac{\sum \left( \sum \frac{x_i}{N}\right)^2}{N} -\left( \bar{x} \right )^2[/math]
[math]=\frac{1}{N^2}\frac{\sum \left( \sum x_i\right)^2}{N} -\left( \bar{x} \right )^2[/math]
[math]=\frac{1}{N^2}\frac{\sum \left (\sum x_i^2 + \sum_{i \ne j} x_ix_j \right )}{N} -\left( \bar{x} \right )^2[/math]
[math]=\frac{1}{N^2} \left [ \frac{\sum \left(\sum x_i^2 \right)}{N} + \frac{ \sum \left (\sum_{i \ne j} x_ix_j \right )}{N} \right ] -\left( \bar{x} \right )^2[/math]


If the measurements are all independent
Then [math] \frac{\sum x_i x_j}{N} = \frac{\sum x_i}{N} \frac{ \sum x_j}{N}[/math] : if [math]x_i[/math] is independent of [math]x_j[/math]
[math]= \left ( \frac{\sum x_i}{N} \right)^2 = \bar{x}^2[/math]


[math]\sigma^2_{\bar{x}}=\frac{1}{N^2} \left [ \frac{\sum \left(\sum x_i^2 \right)}{N} + \sum_{i \ne j} \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math]

I use the expression [math]\sigma^2 = E[x^2] - \left ( E[x] \right)^2[/math] again, except for [math]x_i[/math] and not [math]\bar{x}[/math], and turn it around so

[math]\frac{\left(\sum x_i^2 \right)}{N} = \sigma^2 + \left ( \frac{\sum x_i}{N}\right)^2[/math]

Now I have

[math]\sigma^2_{\bar{x}}=\frac{1}{N^2} \left [ \sum \left (\sigma^2 + \left ( \frac{\sum x_i}{N} \right )^2 \right )+ \sum_{i \ne j} \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math]
[math]=\frac{1}{N^2} \left [ N\sigma^2 + N\left ( \frac{\sum x_i}{N} \right )^2 + \sum_{i \ne j} \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math]
[math]=\frac{1}{N^2} \left [ N\sigma^2 + N\left ( \frac{\sum x_i}{N} \right )^2 + N(N-1) \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math] : the number of cross terms is [math]N(N-1)[/math]
[math]= \left [ \frac{\sigma^2}{N} + \frac{1}{N}\left ( \frac{\sum x_i}{N} \right )^2 + \frac{N-1}{N} \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math]
[math]= \left [ \frac{\sigma^2}{N} + \frac{1}{N}\bar{x}^2 + \frac{N-1}{N} \bar{x}^2 \right ] -\left( \bar{x} \right )^2[/math] : since [math]\left ( \frac{\sum x_i}{N} \right )^2 = \bar{x}^2[/math]
[math]= \frac{\sigma^2}{N} + \bar{x}^2 -\bar{x}^2[/math]
[math]= \frac{\sigma^2}{N} [/math]


The above is the essence of counting statistics.

It says that the STATISTICAL error (the standard deviation of the sample mean) in an experiment decreases as [math]\frac{1}{\sqrt N}[/math].
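This [math]\frac{1}{\sqrt N}[/math] behavior can be seen in a simple Monte Carlo exercise; the sketch below (parent distribution, parameters, and numpy are all assumed for illustration) repeats an experiment of N observations many times and looks at the spread of the resulting sample means.

<pre>
# The spread (standard deviation) of the sample mean shrinks like sigma/sqrt(N).
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0                      # assumed parent standard deviation
n_experiments = 5000             # number of repeated "experiments"

for N in (4, 16, 64, 256):
    # each row is one experiment of N observations from the parent population
    samples = rng.normal(0.0, sigma, size=(n_experiments, N))
    means = samples.mean(axis=1)             # one sample mean per experiment
    print(f"N = {N:4d}   std of means = {means.std():.4f}   sigma/sqrt(N) = {sigma/np.sqrt(N):.4f}")
</pre>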

Biased and Unbiased variance

Where does this idea of an unbiased variance come from?


Using the same procedure as the previous section let's look at the average variance of the variances.

A sample variance of [math]n[/math] measurements of [math]x_i[/math] is

[math]\sigma_n^2 = \frac{\sum(x_i-\bar{x})^2}{n} = E[x^2] - \left ( E[x] \right)^2 = \frac{\sum x_i^2}{n} -\left ( \bar{x} \right)^2[/math]


The expectation value (the average over many repeated samples) of this sample variance can be evaluated using [math]E[x_i^2] = \sigma^2 + \mu^2[/math] and, from the previous section, [math]E[\bar{x}^2] = \sigma^2_{\bar{x}} + \mu^2 = \frac{\sigma^2}{n} + \mu^2[/math]:

[math]E[\sigma_n^2] = E\left [ \frac{\sum x_i^2}{n} \right ] - E\left [ \bar{x}^2 \right ] = \left ( \sigma^2 + \mu^2 \right ) - \left ( \frac{\sigma^2}{n} + \mu^2 \right ) = \frac{n-1}{n} \sigma^2[/math]

The sample variance computed with [math]n[/math] in the denominator is therefore biased low by the factor [math]\frac{n-1}{n}[/math]. Dividing the sum of squared deviations by [math]n-1[/math] instead of [math]n[/math] removes this bias, which is why [math]s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}[/math] is called the "unbiased" variance.
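The bias can also be seen numerically; the sketch below (parent distribution, sample size, and numpy are assumed for illustration) averages many sample variances computed both ways.

<pre>
# Averaging many sample variances: dividing by n gives ~ (n-1)/n * sigma^2,
# while dividing by n-1 recovers ~ sigma^2.
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.5                      # assumed parent standard deviation
n = 5                            # observations per sample
n_samples = 200000               # number of repeated samples

x = rng.normal(0.0, sigma, size=(n_samples, n))
biased = x.var(axis=1, ddof=0)       # divide by n
unbiased = x.var(axis=1, ddof=1)     # divide by n-1

print("sigma^2             =", sigma ** 2)
print("mean biased  s_n^2  =", biased.mean())     # ~ (n-1)/n * sigma^2
print("mean unbiased s^2   =", unbiased.mean())   # ~ sigma^2
</pre>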

Probability Distributions

Mean(Expectation value) and variance

Mean of Discrete Probability Distribution

In the case that you know the probability distribution you can calculate the mean[math] (\mu)[/math] or expectation value E(x) and standard deviation as

For a Discrete probability distribution

[math]\mu = E[x]=\lim_{N \rightarrow \infty} \frac{\sum_{i=1}^n x_i N P(x_i)}{N} = \sum_{i=1}^n x_i P(x_i)[/math]

where

[math]N=[/math] number of observations

[math]n=[/math] number of different possible observable variables

[math]x_i =[/math] ith observable quantity

[math]P(x_i) =[/math] probability of observing [math]x_i[/math] (the probability mass function of a discrete probability distribution)

Mean of a continuous probability distribution

The average (mean) of a sample drawn from any probability distribution is defined in terms of the expectation value E(x) such that

The expectation value for a continuous probability distribution is calculated as

[math]\mu = E(x) = \int_{-\infty}^{\infty} x P(x)dx[/math]

Variance

Variance of a discrete PDF

[math]\sigma^2 = \sum_{i=1}^n \left [ (x_i - \mu)^2 P(x_i)\right ][/math]

Variance of a Continuous PDF

[math]\sigma^2 = \int_{-\infty}^{\infty} \left [ (x - \mu)^2 P(x)\right ]dx[/math]

Expectation of Arbitrary function

If [math]f(x)[/math] is an arbitrary function of a variable [math]x[/math] governed by a probability distribution [math]P(x)[/math]

then the expectation value of [math]f(x)[/math] is

[math]E[f(x)] = \sum_{i=1}^n f(x_i) P(x_i) [/math]

or, for a continuous distribution,

[math]E[f(x)] = \int_{-\infty}^{\infty} f(x) P(x)dx[/math]
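As a concrete illustration (the function [math]f(x) = x^2[/math], the distributions, and numpy below are assumptions, not part of this page), the expectation value can be evaluated as a probability-weighted sum in the discrete case and as a numerical integral in the continuous case:

<pre>
import numpy as np

# discrete case: f(x) = x^2 with assumed values and probabilities
x = np.array([1.0, 2.0, 3.0])
P = np.array([0.2, 0.5, 0.3])
E_f_discrete = np.sum(x ** 2 * P)

# continuous case: f(x) = x^2 with a uniform P(x) = 1/(b-a) on an assumed [a, b]
a, b = 0.0, 2.0
xs = np.linspace(a, b, 200001)
dx = xs[1] - xs[0]
E_f_continuous = np.sum(xs ** 2 * (1.0 / (b - a))) * dx   # approximates integral of f(x) P(x) dx

print(E_f_discrete)      # 0.2*1 + 0.5*4 + 0.3*9 = 4.9
print(E_f_continuous)    # ~ (b^3 - a^3)/(3(b-a)) = 4/3
</pre>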

Uniform

The Uniform probability distribution function is a continuous probability function over a specified interval in which any value within the interval has the same probability of occurring.

Mathematically the uniform distribution over an interval from a to b is given by


[math]P(x) =\left \{ {\frac{1}{b-a} \;\;\;\; x \gt a \mbox{ and } x \lt b \atop 0 \;\;\;\; x\gt b \mbox{ or } x \lt a} \right .[/math]

Mean of Uniform PDF

[math]\mu = \int_{-\infty}^{\infty} xP(x)dx = \int_{a}^{b} \frac{x}{b-a} dx = \left . \frac{x^2}{2(b-a)} \right |_a^b = \frac{1}{2}\frac{b^2 - a^2}{b-a} = \frac{1}{2}(b+a)[/math]

Variance of Uniform PDF

[math]\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 P(x)dx = \int_{a}^{b} \frac{\left (x-\frac{b+a}{2}\right )^2}{b-a} dx = \left . \frac{(x -\frac{b+a}{2})^3}{3(b-a)} \right |_a^b [/math]
[math]=\frac{1}{3(b-a)}\left [ \left (b -\frac{b+a}{2} \right )^3 - \left (a -\frac{b+a}{2} \right)^3\right ][/math]
[math]=\frac{1}{3(b-a)}\left [ \left (\frac{b-a}{2} \right )^3 - \left (\frac{a-b}{2} \right)^3\right ][/math]
[math]=\frac{1}{24(b-a)}\left [ (b-a)^3 - (-1)^3 (b-a)^3\right ][/math]
[math]=\frac{1}{12}(b-a)^2[/math]
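These two results are easy to check by sampling; the sketch below (the interval and numpy are assumed for illustration) compares the sample mean and variance of uniform draws with [math]\frac{a+b}{2}[/math] and [math]\frac{(b-a)^2}{12}[/math].

<pre>
# Check mu = (a+b)/2 and sigma^2 = (b-a)^2/12 for the uniform distribution.
import numpy as np

rng = np.random.default_rng(3)
a, b = 2.0, 7.0                          # assumed interval
x = rng.uniform(a, b, size=1_000_000)

print("sample mean     =", x.mean(), "   (a+b)/2    =", (a + b) / 2)
print("sample variance =", x.var(),  "   (b-a)^2/12 =", (b - a) ** 2 / 12)
</pre>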


Now use ROOT to generate uniform distributions. http://wiki.iac.isu.edu/index.php/TF_ErrAna_InClassLab#Day_3

Binomial Distribution

A binomial random variable describes experiments in which the outcome has only 2 possibilities. The two possible outcomes can be labeled as "success" or "failure". The probabilities may be defined as

p
the probability of a success

and

q
the probability of a failure [math](q = 1-p)[/math].


If we let [math]X[/math] represent the number of successes after repeating the experiment [math]n[/math] times

Experiments with [math]n=1[/math] are also known as Bernoulli trials.

Then [math]X[/math] is the Binomial random variable with parameters [math]n[/math] and [math]p[/math].

The number of ways in which the [math]x[/math] successful outcomes can be organized in [math]n[/math] repeated trials is

[math]\frac{n !}{ \left [ (n-x) ! x !\right ]}[/math] where the [math] ![/math] denotes a factorial such that [math]5! = 5\times4\times3\times2\times1[/math].

The expression is known as the binomial coefficient and is represented as

[math]{n\choose x}=\frac{n!}{x!(n-x)!}[/math]


The probability of any one ordering of the successes and failures is given by

[math]P( \mbox{experimental ordering}) = p^{x}q^{n-x}[/math]


This means the probability of getting exactly [math]x[/math] successes after [math]n[/math] trials is

[math]P(x) = {n\choose x}p^{x}q^{n-x} [/math]
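A short Python sketch evaluating this probability (the values of n and p are assumed for illustration; math.comb supplies the binomial coefficient):

<pre>
# P(x) = C(n,x) p^x q^(n-x): evaluate the binomial probabilities and check
# that they sum to 1.
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    q = 1.0 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 10, 0.3                                 # assumed example values
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(sum(probs))                              # ~1.0
print(binomial_pmf(3, n, p))                   # probability of exactly 3 successes
</pre>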

Mean

It can be shown that the Expectation Value of the distribution is

[math]\mu = n p[/math]


[math]\mu = \sum_{x=0}^n x P(x) = \sum_{x=0}^n x \frac{n!}{x!(n-x)!} p^{x}q^{n-x}[/math]
[math] = \sum_{x=1}^n \frac{n!}{(x-1)!(n-x)!} p^{x}q^{n-x}[/math] :summation starts from x=1 and not x=0 now
[math] = np \sum_{x=1}^n \frac{(n-1)!}{(x-1)!(n-x)!} p^{x-1}q^{n-x}[/math] :factor out [math]np[/math]
[math] = np \sum_{y=0}^{n-1} \frac{(n-1)!}{(y)!(n-y-1)!} p^{y}q^{n-y-1}[/math] :change the summation index to y = x-1, so the upper limit becomes n-1
[math] = np \sum_{y=0}^{n-1} \frac{(n-1)!}{(y)!(n-1-y)!} p^{y}q^{n-1-y}[/math] :
[math] = np (q+p)^{n-1}[/math] :definition of binomial expansion
[math] = np 1^{n-1}[/math] :q+p =1
[math] = np [/math]


Variance

[math]\sigma^2 = npq[/math]
Remember
[math]\frac{\sum(x_i-\mu)^2}{N} = \frac{\sum (x_i^2 -2x_i \mu + \mu^2)}{N} = \frac{\sum x_i^2}{N} - 2 \mu \frac{\sum x_i}{N} + \frac{N \mu^2}{N} [/math]
[math] = \frac{\sum x_i^2}{N} -2 \mu^2 + \mu^2 =\frac{\sum x_i^2}{N} - \mu^2[/math]


[math]\frac{\sum(x_i-\mu)^2}{N} =\frac{\sum x_i^2}{N} - \mu^2[/math]
[math]\Rightarrow \sigma^2 = E[(x-\mu)^2] = \sum_{i=1}^n (x_i - \mu)^2 P(x_i)[/math]
[math]= E[x^2] - \left ( E[x]\right )^2 = \sum_{i=1}^n x_i^2 P(x_i) - \left ( \sum_{i=1}^n x_i P(x_i)\right )^2[/math]


To calculate the variance of the Binomial distribution I will just calculate [math]E[x^2][/math] and then subtract off [math]\left ( E[x]\right )^2[/math].

[math]E[x^2] = \sum_{x=0}^n x^2 P(x)[/math]
[math]= \sum_{x=1}^n x^2 P(x)[/math] : x=0 term is zero so no contribution
[math]=\sum_{x=1}^n x^2 \frac{n!}{x!(n-x)!} p^{x}q^{n-x}[/math]
[math]= np \sum_{x=1}^n x \frac{(n-1)!}{(x-1)!(n-x)!} p^{x-1}q^{n-x}[/math]

Let m=n-1 and y=x-1

[math]= np \sum_{y=0}^{m} (y+1) \frac{m!}{y!(m-y)!} p^{y}q^{m-y}[/math]
[math]= np \sum_{y=0}^{m} (y+1) P(y)[/math]
[math]= np \left ( \sum_{y=0}^{m} y P(y) + \sum_{y=0}^{m} P(y) \right)[/math]
[math]= np \left ( mp + 1 \right)[/math]
[math]= np \left ( (n-1)p + 1 \right)[/math]


[math]\sigma^2 = E[x^2] - \left ( E[x] \right)^2 = np \left ( (n-1)p + 1 \right) - (np)^2 = np(1-p) = npq[/math]
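Both results can be checked by summing directly over the binomial probabilities; the values of n and p below are assumed for illustration.

<pre>
# Compute the mean and variance of the binomial distribution from its
# probabilities and compare with np and npq.
from math import comb

n, p = 12, 0.25
q = 1.0 - p
P = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

mean = sum(x * P[x] for x in range(n + 1))
var = sum(x ** 2 * P[x] for x in range(n + 1)) - mean ** 2   # E[x^2] - (E[x])^2

print(mean, n * p)        # both 3.0
print(var, n * p * q)     # both 2.25
</pre>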

Examples

The number of times a coin toss is heads.

The probability of a coin landing with the head of the coin facing up is

[math]P = \frac{\mbox{number of desired outcomes}}{\mbox{number of possible outcomes}} = \frac{1}{2}[/math] , i.e. a discrete uniform distribution over the two outcomes, tails (0) and heads (1).

Suppose you toss a coin 4 times. Here are the possible outcomes


{| class="wikitable"
! Order Number !! Trial 1 !! Trial 2 !! Trial 3 !! Trial 4 !! # of Heads
|-
| 1 || t || t || t || t || 0
|-
| 2 || h || t || t || t || 1
|-
| 3 || t || h || t || t || 1
|-
| 4 || t || t || h || t || 1
|-
| 5 || t || t || t || h || 1
|-
| 6 || h || h || t || t || 2
|-
| 7 || h || t || h || t || 2
|-
| 8 || h || t || t || h || 2
|-
| 9 || t || h || h || t || 2
|-
| 10 || t || h || t || h || 2
|-
| 11 || t || t || h || h || 2
|-
| 12 || t || h || h || h || 3
|-
| 13 || h || t || h || h || 3
|-
| 14 || h || h || t || h || 3
|-
| 15 || h || h || h || t || 3
|-
| 16 || h || h || h || h || 4
|}


The probability of order #1 happening is

P( order #1) = [math]\left ( \frac{1}{2} \right )^0\left ( \frac{1}{2} \right )^4 = \frac{1}{16}[/math]

P( order #2) = [math]\left ( \frac{1}{2} \right )^1\left ( \frac{1}{2} \right )^3 = \frac{1}{16}[/math]

The probability of observing the coin land on heads 3 times out of 4 trials is.

[math]P(x=3) = \frac{4}{16} = \frac{1}{4} = {n\choose x}p^{x}q^{n-x} = \frac{4 !}{ \left [ (4-3) ! 3 !\right ]} \left ( \frac{1}{2}\right )^{3}\left ( \frac{1}{2}\right )^{4-3} = \frac{24}{1 \times 6} \frac{1}{16} = \frac{1}{4}[/math]
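The table above can be reproduced by brute force; the sketch below (plain Python, written for illustration) enumerates all [math]2^4[/math] orderings, counts heads, and compares with the binomial formula.

<pre>
# Enumerate all 16 orderings of 4 coin tosses and compare with C(n,k) p^k q^(n-k).
from itertools import product
from math import comb

n = 4
counts = {k: 0 for k in range(n + 1)}
for outcome in product("ht", repeat=n):          # all 2^4 orderings
    counts[outcome.count("h")] += 1

for k in range(n + 1):
    enumerated = counts[k] / 2 ** n                          # fraction of orderings
    formula = comb(n, k) * (0.5) ** k * (0.5) ** (n - k)      # binomial formula
    print(k, enumerated, formula)                             # the k=3 row gives 0.25
</pre>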

A 6 sided die

A die is a 6 sided cube with dots on each side. Each side has a unique number of dots with at most 6 dots on any one side.

P=1/6 = probability of landing on any side of the cube.

Expectation value :

The expected (average) value from a single roll of the die is
[math]E({\rm Roll\ With\ 6\ Sided\ Die}) =\sum_i x_i P(x_i) = 1 \left ( \frac{1}{6} \right) + 2\left ( \frac{1}{6} \right)+ 3\left ( \frac{1}{6} \right)+ 4\left ( \frac{1}{6} \right)+ 5\left ( \frac{1}{6} \right)+ 6\left ( \frac{1}{6} \right)=\frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5[/math]

The variance of a single roll is

[math]\sigma^2 = E[x^2] - \left ( E[x] \right )^2 = \frac{1^2+2^2+3^2+4^2+5^2+6^2}{6} - (3.5)^2 = \frac{91}{6} - \frac{49}{4} = \frac{35}{12} \approx 2.92[/math]

Count number of times a 6 is observed when rolling a die
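A minimal simulation of this counting exercise (the number of rolls, the number of experiments, and numpy are assumed for illustration): the number of 6s in n rolls follows a binomial distribution with [math]p = \frac{1}{6}[/math], so its mean and variance should be close to [math]np[/math] and [math]npq[/math].

<pre>
# Count how many times a 6 shows up in n rolls of a fair die, many times over.
import numpy as np

rng = np.random.default_rng(4)
n = 60                                    # rolls per experiment (assumed)
n_experiments = 100000

rolls = rng.integers(1, 7, size=(n_experiments, n))   # die faces 1..6
counts = (rolls == 6).sum(axis=1)                      # number of 6s per experiment

p = 1.0 / 6.0
print("mean count =", counts.mean(), "   np  =", n * p)
print("variance   =", counts.var(),  "   npq =", n * p * (1 - p))
</pre>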

Poisson Distribution

[math]P(x) = \frac{\left ( \lambda s \right)^x e^{-\lambda s}}{x!}[/math]

where

[math]\lambda[/math] = probability for the occurrence of an event per unit interval [math]s[/math]


Homework Problem (Bevington pg 38)
Derive the Poisson distribution assuming a small sample size

1.) Assume that the average rate of an event is constant over a given time interval and that the events are randomly distributed over that time interval.

2.) The probability of NO events occurring over the time interval t is exponential such that

[math]P(0,t,\tau) = e^{-t/\tau}[/math]

where [math]\tau[/math] is a constant of proportionality associated with the mean time between events.

The change in the probability as a function of time is given by

[math]dP(0,t,\tau) = - P(0,t,\tau) \frac{dt}{\tau}[/math]
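As a numerical illustration of the distribution defined above (the rate [math]\lambda[/math] and interval [math]s[/math] below are assumed values; this sketch is separate from the homework derivation): the probabilities sum to 1, the mean is [math]\lambda s[/math], and [math]P(0)[/math] reproduces the exponential no-event probability.

<pre>
# Evaluate P(x) = (lambda*s)^x e^(-lambda*s) / x! and check its basic properties.
from math import exp, factorial

lam = 2.5        # assumed event rate per unit interval
s = 1.2          # assumed interval length
mu = lam * s

P = [mu ** x * exp(-mu) / factorial(x) for x in range(50)]

print(sum(P))                                   # ~1.0
print(sum(x * P[x] for x in range(50)), mu)     # mean ~= lambda*s
print(P[0], exp(-mu))                           # P(no events) = e^(-lambda*s)
</pre>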

Gaussian

Lorentzian

Gamma

Beta

Breit Wigner

Cauchy

Chi-squared

Exponential

F-distribution

Landau

Log Normal

t-Distribution

[1] Forest_Error_Analysis_for_the_Physical_Sciences