Difference between revisions of "Forest Error Analysis for the Physical Sciences"
Revision as of 19:19, 4 November 2009
Class Admin
Forest_ErrorAnalysis_Syllabus
Homework
Homework is due at the beginning of class on the assigned day. If you have a documented excuse for your absence, then you will have 24 hours to hand in the homework after being released by your doctor.
Class Policies
http://wiki.iac.isu.edu/index.php/Forest_Class_Policies
Instructional Objectives
- Course Catalog Description
- Error Analysis for the Physical Sciences, 3 credits. Lecture course with computation requirements. Topics include: error propagation, probability distributions, least squares fit, multiple regression, goodness of fit, covariance and correlations.
Prerequisites: Math 360.
- Course Description
- The course assumes that the student has very limited experience with the UNIX environment and C/C++ programming. Homework problems involve modifying and compiling example programs written in C++.
Systematic and Random Uncertainties
Although the name of the class is "Error Analysis" for historical reasons, a more accurate description would be "Uncertainty Analysis". "Error" usually means that a mistake was made, while "Uncertainty" is a measure of how confident you are in a measurement.
Accuracy -vs- Precision
- Accuracy
- How close an experiment comes to the correct result.
- Precision
- A measure of how exactly the result is determined. No reference is made to what the result means.
Reporting Uncertainties
Notation
:<math>X \pm Y = X(Y)</math>
Statistical Distributions
Average and Variance
Average
The word "average" is used to describe a property of a probability distribution, or of a set of observations/measurements made in an experiment, which gives an indication of a likely outcome of an experiment.
Arithmetic Mean and variance
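If <math>n</math> observables are made in an experiment then the arithmetic mean of those observables is defined as
:<math>\bar{x} = \frac{\sum_{i=1}^{i=n} x_i}{n}</math>
The variance of the above sample is defined as
:<math>s^2 = \frac{\sum_{i=1}^{i=n} (x_i - \bar{x})^2}{n-1}</math>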
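The two sample statistics can be sketched in C++ as follows. This is only an illustration: the function names `mean` and `sampleVariance` are made up for this page, and the variance is assumed to use the n-1 normalization (the usual choice when the mean is estimated from the same sample).

```cpp
#include <cassert>
#include <vector>

// Arithmetic mean: the sum of the observations divided by their number.
double mean(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (double xi : x) sum += xi;
    return sum / x.size();
}

// Sample variance, ASSUMING the n-1 normalization.
double sampleVariance(const std::vector<double>& x) {
    assert(x.size() > 1);
    const double xbar = mean(x);
    double sumSq = 0.0;
    for (double xi : x) sumSq += (xi - xbar) * (xi - xbar);
    return sumSq / (x.size() - 1);
}
```

The square root of `sampleVariance` is the sample standard deviation, which carries the same units as the observables.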
Weighted Mean and variance
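If each observable (<math>x_i</math>) is accompanied by an estimate of the uncertainty in that observable (<math>\delta x_i</math>) then the weighted mean is defined as
:<math>\bar{x} = \frac{ \sum_{i=1}^{i=n} \frac{x_i}{(\delta x_i)^2}}{\sum_{i=1}^{i=n} \frac{1}{(\delta x_i)^2}}</math>
The variance of the weighted mean is defined by
:<math>\frac{1}{\sigma^2} = \sum_{i=1}^{i=n} \frac{1}{(\delta x_i)^2}</math>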
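A minimal C++ sketch of a weighted mean is given below. It assumes the conventional inverse-variance weights, w_i = 1/(δx_i)^2, so that precise measurements count more than imprecise ones. The struct `WeightedResult` and the function `weightedMean` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct WeightedResult {
    double mean;         // weighted mean of the observables
    double uncertainty;  // uncertainty (sigma) of the weighted mean
};

// ASSUMPTION: inverse-variance weights w_i = 1 / dx_i^2.
WeightedResult weightedMean(const std::vector<double>& x,
                            const std::vector<double>& dx) {
    assert(!x.empty() && x.size() == dx.size());
    double sumW = 0.0;
    double sumWX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(dx[i] > 0.0);
        const double w = 1.0 / (dx[i] * dx[i]);
        sumW += w;
        sumWX += w * x[i];
    }
    // 1/sigma^2 is the sum of the weights.
    return {sumWX / sumW, std::sqrt(1.0 / sumW)};
}
```

For two measurements 10 ± 1 and 12 ± 2 the weights are 1 and 0.25, so the result is pulled toward the more precise value: 13/1.25 = 10.4.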
The average of a sample drawn from any probability distribution is defined in terms of the expectation value E(x).
The expectation value for a discrete probability distribution is given by
:<math>E(x) = \sum_i x_i P(x_i)</math>
The expectation value for a continuous probability distribution is calculated as
:<math>E(x) = \int x P(x) dx</math>
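Variance
The variance of a sample drawn from a probability distribution is also defined in terms of expectation values:
:<math>\sigma^2 = E\left[(x - E(x))^2\right] = E(x^2) - \left[E(x)\right]^2</math>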
Binomial
A binomial random variable describes experiments in which the outcome has only 2 possibilities. The two possible outcomes can be labeled as "success" or "failure". The probabilities may be defined as
- p
- the probability of a success
and
- q
- the probability of a failure.
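If we let <math>X</math> represent the number of successes after repeating the experiment <math>n</math> times, then <math>X</math> is the binomial random variable with parameters <math>n</math> and <math>p</math>. Experiments with <math>n=1</math> are also known as Bernoulli trials.
The number of ways in which the <math>k</math> successful outcomes can be organized in <math>n</math> repeated trials is
:<math>\frac{n!}{k!(n-k)!}</math>
- where the <math>!</math> denotes a factorial such that <math>5! = 5 \times 4 \times 3 \times 2 \times 1</math>.
The expression is known as the binomial coefficient and is represented as
:<math>{n \choose k} = \frac{n!}{k!(n-k)!}</math>
The probability of any one ordering of the successes and failures is given by
:<math>p^k q^{n-k}</math>
This means the probability of getting exactly k successes after n trials is
:<math>P(k) = {n \choose k} p^k q^{n-k}</math>
It can be shown that the mean of the distribution is
:<math>\mu = np</math>
and the variance is
:<math>\sigma^2 = npq</math>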
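The binomial probability of exactly k successes in n trials can be evaluated with a few lines of C++. The sketch below is one possible way to do it; `binomialCoefficient` and `binomialProbability` are illustrative names, and q is taken to be 1 - p.

```cpp
#include <cassert>
#include <cmath>

// Binomial coefficient n! / (k! (n-k)!), built up one factor at a time
// so that large factorials are never formed explicitly.
double binomialCoefficient(int n, int k) {
    assert(n >= 0 && k >= 0 && k <= n);
    double c = 1.0;
    for (int i = 1; i <= k; ++i) c = c * (n - k + i) / i;
    return c;
}

// Probability of exactly k successes in n trials, with q = 1 - p.
double binomialProbability(int k, int n, double p) {
    assert(p >= 0.0 && p <= 1.0);
    return binomialCoefficient(n, k) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

Summing k times `binomialProbability(k, n, p)` over k = 0..n is a quick numerical check that the mean equals np.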
Examples
The number of times a coin toss is heads.
For a fair coin, the probability of the coin landing with the head of the coin facing up is
:<math>p = \frac{1}{2}</math>
Suppose you toss a coin 4 times. Here are the possible outcomes
Order Number | Trial 1 | Trial 2 | Trial 3 | Trial 4 | # of Heads |
1 | t | t | t | t | 0 |
2 | h | t | t | t | 1 |
3 | t | h | t | t | 1 |
4 | t | t | h | t | 1 |
5 | t | t | t | h | 1 |
6 | h | h | t | t | 2 |
7 | h | t | h | t | 2 |
8 | h | t | t | h | 2 |
9 | t | h | h | t | 2 |
10 | t | h | t | h | 2 |
11 | t | t | h | h | 2 |
12 | t | h | h | h | 3 |
13 | h | t | h | h | 3 |
14 | h | h | t | h | 3 |
15 | h | h | h | t | 3 |
16 | h | h | h | h | 4 |
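The probability of order #1 happening is
P( order #1) = <math>\left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^4 = \frac{1}{16}</math>
P( order #2) = <math>\left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^3 = \frac{1}{16}</math>
The probability of observing the coin land on heads 3 times out of 4 trials is
:<math>P(k=3) = \frac{4}{16} = {4 \choose 3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^1 = \frac{1}{4}</math>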
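The table of 16 orderings can be reproduced by brute force. The C++ sketch below treats each ordering as a bit pattern (bit set = heads) and counts the patterns with the requested number of heads. It assumes a fair coin, so every ordering is equally likely; the function names are made up for this example.

```cpp
#include <cassert>

// Count the orderings of nTosses coin tosses that contain exactly kHeads heads.
// Each ordering is encoded as an unsigned integer; bit "toss" set means heads.
int countOrderings(int nTosses, int kHeads) {
    assert(nTosses >= 0 && nTosses < 31);
    int count = 0;
    for (unsigned outcome = 0; outcome < (1u << nTosses); ++outcome) {
        int heads = 0;
        for (int toss = 0; toss < nTosses; ++toss) {
            if (outcome & (1u << toss)) ++heads;
        }
        if (heads == kHeads) ++count;
    }
    return count;
}

// ASSUMPTION: fair coin, so each of the 2^nTosses orderings is equally likely.
double probabilityOfHeads(int nTosses, int kHeads) {
    return static_cast<double>(countOrderings(nTosses, kHeads)) / (1u << nTosses);
}
```

For four tosses, `countOrderings(4, 3)` finds the 4 orderings numbered 12 to 15 in the table, so `probabilityOfHeads(4, 3)` is 4/16 = 0.25.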
Count number of times a 6 is observed when rolling a die
p=1/6
Expectation value :
- The expected (average) value from a single roll of the die is
:<math>E(x) = \sum_{i=1}^{6} x_i P(x_i) = \frac{1+2+3+4+5+6}{6} = 3.5</math>
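The discrete expectation value is just a probability-weighted sum, which can be sketched in C++ as below. The function name `expectationValue` is an assumption made for this illustration, and the caller is trusted to supply probabilities that sum to one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expectation value of a discrete distribution: sum of x_i * P(x_i).
// The probabilities are assumed to be normalized by the caller.
double expectationValue(const std::vector<double>& values,
                        const std::vector<double>& probabilities) {
    assert(!values.empty() && values.size() == probabilities.size());
    double expectation = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        expectation += values[i] * probabilities[i];
    }
    return expectation;
}
```

For a six-sided die with faces 1 to 6, each with probability 1/6, the sum is 21/6 = 3.5.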
Poisson
Gaussian
Lorentzian
Propagation of Uncertainties
Statistical inference
For this class we shall define a hypothesis test as a test used to
There are two schools of thought on this
frequentist statistical inference
- Statistical inference is made using a null-hypothesis test; that is, one that answers the question: "Assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed?"
The relative frequency of occurrence of an event, in a number of repetitions of the experiment, is a measure of the probability of that event.
Thus, if <math>n_t</math> is the total number of trials and <math>n_x</math> is the number of trials where the event x occurred, the probability P(x) of the event occurring will be approximated by the relative frequency as follows:
:<math>P(x) \approx \frac{n_x}{n_t}</math>
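The relative-frequency estimate amounts to counting. A minimal C++ sketch follows, with the hypothetical function `relativeFrequency` taking a recorded list of outcomes and the event of interest.

```cpp
#include <cassert>
#include <vector>

// Relative frequency n_x / n_t of one event in a list of recorded outcomes.
double relativeFrequency(const std::vector<int>& outcomes, int event) {
    assert(!outcomes.empty());
    int nx = 0;  // number of trials in which the event occurred
    for (int outcome : outcomes) {
        if (outcome == event) ++nx;
    }
    return static_cast<double>(nx) / outcomes.size();
}
```

With a made-up record of ten die rolls containing four sixes, the estimate of P(6) is 4/10 = 0.4, well away from 1/6. The approximation only becomes reliable as the number of trials grows.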
Bayesian inference.
- Statistical inference is made by using evidence or observations to update or to newly infer the probability that a hypothesis may be true. The name "Bayesian" comes from the frequent use of Bayes' theorem in the inference process.
Bayes gave a special case involving continuous prior and posterior probability distributions and discrete probability distributions of data, but in its simplest setting involving only discrete distributions, Bayes' theorem relates the conditional and marginal probabilities of events A and B, where B has a non-vanishing probability:
:<math>P(A|B) = \frac{P(B|A) P(A)}{P(B)}</math>
Each term in Bayes' theorem has a conventional name:
- P(A) is the prior probability or marginal probability of A. It is "prior" in the sense that it does not take into account any information about B.
- P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.
- P(B|A) is the conditional probability of B given A.
- P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
Example
Suppose there is a school having 60% boys and 40% girls as students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance; all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem.
The event A is that the student observed is a girl, and the event B is that the student observed is wearing trousers. To compute P(A|B), we first need to know:
- P(A), or the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.
- P(B|A), or the probability of the student wearing trousers given that the student is a girl. As they are as likely to wear skirts as trousers, this is 0.5.
- P(B), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since P(B) = P(B|A)P(A) + P(B|A')P(A'), where A' is the event that the student is a boy, this is 0.5×0.4 + 1×0.6 = 0.8.
Given all this information, the probability of the observer having spotted a girl given that the observed student is wearing trousers can be computed by substituting these values in the formula:
:<math>P(A|B) = \frac{P(B|A) P(A)}{P(B)} = \frac{0.5 \times 0.4}{0.8} = 0.25</math>
Another, essentially equivalent way of obtaining the same result is as follows. Assume, for concreteness, that there are 100 students, 60 boys and 40 girls. Among these, 60 boys and 20 girls wear trousers. All together there are 80 trouser-wearers, of which 20 are girls. Therefore the chance that a random trouser-wearer is a girl equals 20/80 = 0.25. Put in terms of Bayes' theorem, the probability of a student being a girl is 40/100, and the probability that any given girl will wear trousers is 1/2. The product of the two is 20/100, but you know the student is wearing trousers, so you remove the 20 students who are not wearing trousers and obtain a probability of (20/100)/(80/100), or 20/80.
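The school example can be checked numerically. The C++ sketch below computes the posterior P(A|B) from the prior P(A) and the two conditional probabilities, using the law of total probability for P(B). The function name `posterior` and its parameter names are chosen here for illustration only.

```cpp
#include <cassert>

// Bayes' theorem for an event A and its complement A':
//   P(A|B) = P(B|A) P(A) / P(B),  with  P(B) = P(B|A) P(A) + P(B|A') P(A').
double posterior(double pA, double pBgivenA, double pBgivenNotA) {
    assert(pA >= 0.0 && pA <= 1.0);
    const double pB = pBgivenA * pA + pBgivenNotA * (1.0 - pA);
    assert(pB > 0.0);  // B must have a non-vanishing probability
    return pBgivenA * pA / pB;
}
```

With P(A) = 0.4 (girl), P(B|A) = 0.5 (a girl wears trousers) and P(B|A') = 1 (a boy wears trousers), `posterior(0.4, 0.5, 1.0)` evaluates 0.2/0.8 = 0.25, in agreement with both derivations above.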
Chi-Square
comparing experiment with theory/function
Comparing 2 experiments
P-value
ROOT function to evaluate the meaning of Chi-square
PDG: "Rather, the p-value is the probability, under the assumption of a hypothesis H, of obtaining data at least as incompatible with H as the data actually observed."
From http://en.wikipedia.org/wiki/P-value and http://en.wikipedia.org/wiki/Statistical_significance
the p-value is the frequency or probability with which the observed event would occur, if the null hypothesis were true. If the obtained p-value is smaller than the significance level, then the null hypothesis is rejected.
In some fields, for example nuclear and particle physics, it is common to express statistical significance in units of "σ" (sigma), the standard deviation of a Gaussian distribution. A statistical significance of "<math>n\sigma</math>" can be converted into a value of α via use of the error function:
:<math>\alpha = 1 - \operatorname{erf}\left(\frac{n}{\sqrt{2}}\right)</math>
The use of σ is motivated by the ubiquitous emergence of the Gaussian distribution in measurement uncertainties. For example, if a theory predicts a parameter to have a value of, say, 100, and one measures the parameter to be 109 ± 3, then one might report the measurement as a "3σ deviation" from the theoretical prediction. In terms of α, this statement is equivalent to saying that "assuming the theory is true, the likelihood of obtaining the experimental result by coincidence is 0.27%" (since 1 − erf(3/√2) = 0.0027).
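The conversion from a number of standard deviations to a two-sided α is a one-liner with the C++ standard library. In the sketch below the function name `sigmaToAlpha` is an assumption, and `std::erfc` (the complementary error function, equal to 1 − erf) is used instead of subtracting from one, which keeps precision when α is very small.

```cpp
#include <cassert>
#include <cmath>

// Two-sided Gaussian tail probability for a deviation of nSigma
// standard deviations: alpha = 1 - erf(nSigma / sqrt(2)).
double sigmaToAlpha(double nSigma) {
    assert(nSigma >= 0.0);
    return std::erfc(nSigma / std::sqrt(2.0));
}
```

For the 109 ± 3 example, the deviation from 100 is (109 − 100)/3 = 3, and `sigmaToAlpha(3.0)` gives approximately 0.0027.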
What is a p-value?
A p-value is a measure of how much evidence we have against the null hypothesis. The null hypothesis, traditionally represented by the symbol H0, represents the hypothesis of no change or no effect.
The smaller the p-value, the more evidence we have against H0. It is also a measure of how likely we are to get a certain sample result or a result “more extreme,” assuming H0 is true. The type of hypothesis (right tailed, left tailed or two tailed) will determine what “more extreme” means.
Much research involves making a hypothesis and then collecting data to test that hypothesis. In particular, researchers will set up a null hypothesis, a hypothesis that presumes no change or no effect of a treatment. Then these researchers will collect data and measure the consistency of this data with the null hypothesis.
The p-value measures consistency by calculating the probability of observing the results from your sample of data or a sample with results more extreme, assuming the null hypothesis is true. The smaller the p-value, the greater the inconsistency.
Traditionally, researchers will reject a hypothesis if the p-value is less than 0.05. Sometimes, though, researchers will use a stricter cut-off (e.g., 0.01) or a more liberal cut-off (e.g., 0.10). The general rule is that a small p-value is evidence against the null hypothesis while a large p-value means little or no evidence against the null hypothesis. Please note that little or no evidence against the null hypothesis is not the same as a lot of evidence for the null hypothesis.
It is easiest to understand the p-value in a data set that is already at an extreme. Suppose that a drug company alleges that only 50% of all patients who take a certain drug will have an adverse event of some kind. You believe that the adverse event rate is much higher. In a sample of 12 patients, all twelve have an adverse event.
The data supports your belief because it is inconsistent with the assumption of a 50% adverse event rate. It would be like flipping a coin 12 times and getting heads each time.
The p-value, the probability of getting a sample result of 12 adverse events in 12 patients assuming that the adverse event rate is 50%, is a measure of this inconsistency. The p-value, 0.000244, is small enough that we would reject the hypothesis that the adverse event rate was only 50%.
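The quoted p-value can be reproduced with a one-sided binomial tail sum. The C++ sketch below is an illustration under the assumption that "more extreme" means "at least as many adverse events as observed"; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability of exactly k successes in n trials with success probability p.
double binomialProbability(int k, int n, double p) {
    double coefficient = 1.0;  // binomial coefficient n choose k
    for (int i = 1; i <= k; ++i) coefficient = coefficient * (n - k + i) / i;
    return coefficient * std::pow(p, k) * std::pow(1.0 - p, n - k);
}

// One-sided (right-tailed) p-value: probability of observing kObserved
// or more successes in n trials if the null-hypothesis rate p0 is true.
double upperTailPValue(int kObserved, int n, double p0) {
    assert(n >= 0 && kObserved >= 0 && kObserved <= n);
    assert(p0 >= 0.0 && p0 <= 1.0);
    double tail = 0.0;
    for (int k = kObserved; k <= n; ++k) tail += binomialProbability(k, n, p0);
    return tail;
}
```

For 12 adverse events in 12 patients with a null rate of 50%, only the k = 12 term contributes, giving (1/2)^12 = 1/4096 ≈ 0.000244.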
A large p-value should not automatically be construed as evidence in support of the null hypothesis. Perhaps the failure to reject the null hypothesis was caused by an inadequate sample size. When you see a large p-value in a research study, you should also look for one of two things:
- a power calculation that confirms that the sample size in that study was adequate for detecting a clinically relevant difference; and/or
- a confidence interval that lies entirely within the range of clinical indifference.
You should also be cautious about a small p-value, but for different reasons. In some situations, the sample size is so large that even differences that are trivial from a medical perspective can still achieve statistical significance.
As a statistician, I am not in a good position to advise you on whether a difference is trivial or not. As a medical expert, you need to balance the cost and side effects of a treatment against the benefits that the therapy provides.
The authors of the research paper should inform you what size difference is clinically relevant and what sized difference is trivial. But if they don't, you should. Ask yourself how much of a difference would be large enough to cause you to change your practice. Then compare this to the confidence interval in the research paper. If both limits of the confidence interval are smaller than a clinically relevant difference, then you should not change your practice, no matter what the p-value tells you.
You should not interpret the p-value as the probability that the null hypothesis is true. Such an interpretation is problematic because a hypothesis is not a random event that can have a probability.
Bayesian statistics provides an alternative framework that allows you to assign probabilities to hypotheses and to modify these probabilities on the basis of the data that you collect.
Example
A large number of p-values appear in a publication by Churchill et al 2000:
Consultation Patterns and Provision of Contraception in General Practice Before Teenage Pregnancy: Case-Control Study. Churchill D, Allen J, Pringle M, Hippisley-Cox J, Ebdon D, Macpherson M, Bradley S. British Medical Journal 2000: 321(7259); 486-9.
This was a study of consultation practices among teenagers who become pregnant. The researchers selected 240 patients (cases) with a recorded conception before the age of 20. Three controls were selected for each case and were matched on age and practice.
The not too surprising finding is that the cases were more likely to have consulted certain health professionals in the year before conception and were more likely to request contraceptive protection. This demonstrates that teenagers are not reluctant to seek advice about contraception.
For example, 91% of the cases (219/240) sought the advice of a general practitioner in the year before conception compared to 82% of the controls (586/719) during a similar time frame. This is a large difference. The odds ratio is 2.37. The p-value is 0.001, which indicates that this ratio is statistically significantly different from 1.0. The 95% confidence interval for the odds ratio is 1.45 to 3.86.
In contrast, 23% of the cases (56/240) sought advice from a practice nurse while 24% of the controls (170/719) sought advice. This is a small difference and the odds ratio is 0.98. The p-value is 0.905, which indicates that this odds ratio does not differ significantly from 1. As with any negative finding, you should be concerned about whether the result is due to an inadequate sample size. The confidence interval, however, is 0.69 to 1.39. This indicates that the research study had a good amount of precision and that the sample size was reasonable.
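The odds ratios quoted above can be recomputed from the raw counts. The C++ sketch below calculates the crude (unmatched) odds ratio; whether the paper used exactly this calculation or a matched analysis is not stated here, so treat the function `oddsRatio` as an illustrative assumption that happens to reproduce the quoted values to two decimal places.

```cpp
#include <cassert>

// Crude odds ratio: (odds of the exposure among cases) divided by
// (odds of the exposure among controls).
double oddsRatio(int caseYes, int caseTotal, int controlYes, int controlTotal) {
    const int caseNo = caseTotal - caseYes;
    const int controlNo = controlTotal - controlYes;
    assert(caseYes > 0 && caseNo > 0 && controlYes > 0 && controlNo > 0);
    const double caseOdds = static_cast<double>(caseYes) / caseNo;
    const double controlOdds = static_cast<double>(controlYes) / controlNo;
    return caseOdds / controlOdds;
}
```

For the general practitioner consultations, `oddsRatio(219, 240, 586, 719)` is (219/21)/(586/133) ≈ 2.37; for the practice nurse consultations, `oddsRatio(56, 240, 170, 719)` is (56/184)/(170/549) ≈ 0.98.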
root [3] TMath::Prob(1.31,11)
Double_t Prob(Double_t chi2, Int_t ndf)
Computation of the probability for a certain Chi-squared (chi2) and number of degrees of freedom (ndf). Calculations are based on the incomplete gamma function P(a,x), where a=ndf/2 and x=chi2/2. P(a,x) represents the probability that the observed Chi-squared for a correct model should be less than the value chi2. The returned probability corresponds to 1-P(a,x), which denotes the probability that an observed Chi-squared exceeds the value chi2 by chance, even for a correct model.
--- NvE 14-nov-1998 UU-SAP Utrecht
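The same quantity, 1 − P(a,x) with a = ndf/2 and x = chi2/2, can be sketched in standalone C++ without ROOT. The code below is not ROOT's implementation: it evaluates P(a,x) from its power series, which is simple but converges slowly for large x and loses precision when the result is very close to zero. The names `lowerIncompleteGammaP` and `chiSquareProb` are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Regularized lower incomplete gamma function P(a,x) from the series
//   P(a,x) = exp(-x) x^a / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
double lowerIncompleteGammaP(double a, double x) {
    assert(a > 0.0 && x >= 0.0);
    if (x == 0.0) return 0.0;
    double term = 1.0 / a;  // n = 0 term of the series
    double sum = term;
    for (int n = 1; n < 10000; ++n) {
        term *= x / (a + n);
        sum += term;
        if (term < sum * 1e-15) break;  // series has converged
    }
    return sum * std::exp(-x + a * std::log(x) - std::lgamma(a));
}

// Probability that chi-squared exceeds chi2 by chance for ndf degrees of freedom.
double chiSquareProb(double chi2, int ndf) {
    assert(chi2 >= 0.0 && ndf > 0);
    return 1.0 - lowerIncompleteGammaP(0.5 * ndf, 0.5 * chi2);
}
```

A useful check is ndf = 2, where the result reduces to exp(−chi2/2). For chi2 = 1.31 and ndf = 11 the probability is very close to 1: a chi-squared that small for 11 degrees of freedom is almost always exceeded by chance.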
References
1.) "Data Reduction and Error Analysis for the Physical Sciences", Philip R. Bevington, ISBN-10: 0079112439, ISBN-13: 9780079112439
CPP programs for Bevington
2.) "An Introduction to Error Analysis", John R. Taylor, ISBN-13: 978-0-935702-75-0