Probability and Statistics

Statistics is one of the key pillars of our effort to understand reality in quantifiable terms. It is founded on the notions of probability, and all of its results are stated in terms of which conclusions are reasonable. Statistics is largely designed to reject hypotheses, but it is often used to embrace them when the data fail to reject them.

We start with descriptive statistics, which is about generating a narrative out of the data. The truth of that narrative is handled later; first we must figure out which stories we even want to tell and how to tell them.

We build on the simple narratives of descriptive statistics by fitting functions to data. Again, this is just about finding the story that works best with the data; we do not yet consider whether it accurately reflects a larger truth.

We delve into probability in a single chapter. The key concept of probability as it relates to statistics is that of the random variable. It is a function of the outcomes under consideration and, as the outcomes have different probabilities of happening, the values of the random variable reflect those probabilities. It is this simplification and abstraction of the messy details of reality into a mathematical object that enables statistics to start asking questions about likelihoods. We conclude the chapter with the Central Limit Theorem and some probability puzzlers.
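As a small illustration of a random variable as a function of outcomes, the sketch below uses an example chosen here rather than taken from the chapter: two fair dice, with the sum of the faces as the random variable. The helper name `distribution` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def distribution(outcomes, prob, random_variable):
    """Collect the probability of each value the random variable takes."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for outcome in outcomes:
        dist[random_variable(outcome)] += prob(outcome)
    return dict(dist)


# Assumed example: all 36 ordered rolls of two fair dice, equally likely.
outcomes = list(product(range(1, 7), repeat=2))


def uniform(outcome):
    return Fraction(1, 36)


# The random variable is the function `sum` applied to each outcome.
dist_of_sum = distribution(outcomes, uniform, sum)

for value in sorted(dist_of_sum):
    print(value, dist_of_sum[value])
```

The 36 outcomes collapse to 11 values, and each value carries the total probability of the outcomes that map to it. That collapse is the abstraction the paragraph above describes.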

Frequentist statistics is a discipline in which answers are relatively easy to compute. It often asks questions whose answers we find useful, but they are not exactly the questions we want to ask. It is a bit like using peripheral vision to scan for trouble. We go over the classic tests.

In the next chapter, we explore methods that have become more widely used as computing power has made them practical. Bayesian statistics is the idea of updating our probabilities of something being true as we accumulate more data. Monte Carlo methods are a way of augmenting the collected data to simulate a more robust sampling of the population.
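To make the idea of updating concrete, here is a minimal sketch under assumptions chosen for this illustration: a coin with unknown probability of heads, a Beta(1, 1) prior, and a made-up sequence of flips. The function names are hypothetical, and the Beta prior is used only because it keeps the update to simple counting.

```python
def update(alpha, beta, flips):
    """Beta-binomial update: each head adds 1 to alpha, each tail adds 1 to beta."""
    for flip in flips:
        if flip:
            alpha += 1
        else:
            beta += 1
    return alpha, beta


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed data for illustration: 1 = heads, 0 = tails.
flips = [1, 1, 0, 1, 1, 1, 0, 1]

alpha, beta = 1, 1  # uniform prior over the probability of heads
for count, flip in enumerate(flips, start=1):
    alpha, beta = update(alpha, beta, [flip])
    print(count, round(posterior_mean(alpha, beta), 3))
```

Each flip nudges the estimate, and updating one flip at a time gives the same posterior as processing the whole batch at once.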

The simulations chapter explores how we can create scenarios on which to test these methods. As we introduce noisier data collection, biased samples, or different underlying probability distributions, how do these approaches perform when we know the actual truth of the matter? That is, just how often can we be wrong if we use these methods when the situation is not as nice as we would like it to be?
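One way such a scenario might be set up, as a sketch rather than the chapter's own method: draw many small samples from a distribution whose mean we know, build a nominal 95% t-interval from each, and count how often the interval contains the true mean. The sample size, trial count, seed, and the choice of a normal versus an exponential distribution are all assumptions made for this example.

```python
import random
import statistics

N = 10          # sample size (assumed for illustration)
T_CRIT = 2.262  # two-sided 95% critical value of Student's t with 9 degrees of freedom


def coverage(draw, true_mean, trials=2000, seed=0):
    """Fraction of nominal 95% t-intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(N)]
        mean = statistics.fmean(sample)
        sem = statistics.stdev(sample) / N ** 0.5  # standard error of the mean
        if abs(mean - true_mean) <= T_CRIT * sem:
            hits += 1
    return hits / trials


# Well-behaved case: standard normal data, true mean 0.
normal_cov = coverage(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)

# Less well-behaved case: skewed exponential data, true mean 1.
skewed_cov = coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)

print("normal:", normal_cov)
print("exponential:", skewed_cov)
```

Because the truth is known by construction, the gap between the nominal 95% and the observed coverage is a direct measure of how often the method misleads us. For the skewed case one would expect the coverage to fall somewhat short of nominal at this sample size.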

We finish with a look towards statistics involving many dimensions, concluding with a discussion of the appropriately named general linear model, which provides a framework covering much of what we have discussed previously.

Descriptive Statistics

Fitting Functions

Probability

Frequentist Statistics

Bayesian Statistics

Simulations

Multivariate Statistics
