A Case for Bayesian Statistics

Samantha Knee
6 min read · May 24, 2021

Can a Bayesian framework help us solve real world problems?

Photo by Riho Kroll on Unsplash

When I think back to my high school and college statistics classes, z-tests, t-tests, p-values and alpha values are the main words that come to mind. Little did I know there was a completely different side to statistics, one that in my opinion can be much more intuitive. In this post, I’ll explore the difference between the frequentist and Bayesian frameworks of inferential statistics, explain why I find the Bayesian philosophy easier to understand, and lastly, apply the Bayesian framework to some real-world events.

https://xkcd.com/1132/

Frequentists

Frequentist statistics is the approach you will be more familiar with if you’ve ever taken an entry-level statistics class. For a frequentist, the probability of an event is the rate at which that event occurs if the experiment were repeated infinitely. For example, if you flipped a coin an infinite number of times, the proportion of tails outcomes would trend towards 50% as the number of flips approached infinity. Frequentists do not include prior beliefs in statistical experiments and will only use data from the current experiment in drawing conclusions. In our coin example, there is no expected outcome before the coin has been flipped, because the experiment has not been performed yet.
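To make that idea concrete, here is a minimal simulation sketch (my own illustration in Python with NumPy, not something from the original post) showing the running proportion of tails settling toward 50% as the number of flips grows:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Simulate a fair coin: 1 = tails, 0 = heads
flips = rng.integers(0, 2, size=100_000)

# Running proportion of tails after each flip
running_proportion = np.cumsum(flips) / np.arange(1, len(flips) + 1)

for n in [10, 100, 1_000, 10_000, 100_000]:
    print(f"After {n:>6} flips, proportion of tails = {running_proportion[n - 1]:.4f}")
```

The early estimates bounce around, but the proportion drifts toward 0.5 as the number of flips grows, which is exactly the frequentist notion of probability.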

Bayesians

Bayesians define probability as simply a degree of belief in an event, or the level of confidence you have in a particular event occurring. In contrast to frequentists, Bayesians allow you to incorporate prior knowledge into your experiment. The past knowledge that can be incorporated into an experiment is known as a prior, and this prior can be combined with current experiment data to draw a conclusion. Therefore, taking into account what you know about flipping coins, you already believe the tails event will happen 50% of the time, and you can update this belief as you receive more results during the experiment. This framework can make it much easier to interpret rare events that cannot be repeated in an experiment many times.
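Here is a rough sketch of what updating a belief can look like in practice. The Beta prior and the specific flip counts below are my own hypothetical example: we start with a belief centered on 50% tails and update it after observing some data.

```python
from scipy import stats

# Prior belief: tails comes up about 50% of the time.
# A Beta(10, 10) prior is centered at 0.5 but leaves room for doubt.
alpha_prior, beta_prior = 10, 10

# Suppose we then observe 100 flips: 62 tails and 38 heads.
tails, heads = 62, 38

# Conjugate update: prior + data gives another Beta distribution.
posterior = stats.beta(alpha_prior + tails, beta_prior + heads)

print(f"Posterior mean for P(tails): {posterior.mean():.3f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```

The posterior shifts away from 50% toward what the data suggests, while the prior keeps it from swinging wildly on a small sample.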

Why Bayesians?

I was skeptical of the Bayesian philosophy at first because it was unfamiliar to me, and I appreciated the strict, objective nature of the frequentist principles I had learned. However, as I gained more experience using real data, I realized the real world is not set up for objectivity, and there is rarely a binary right or wrong solution to a problem. Bayesian statistics lets us include in our experiment design the subjective starting point that is often necessary.

Frequentists do not provide actual probabilities that a hypothesis is true or false, and this is where p-values are often misinterpreted. A p-value is the probability of obtaining an effect at least as extreme as the one you observed, assuming the null hypothesis is actually true. In other words, a small p-value means your results would be very unlikely to arise from random chance alone. People sometimes interpret p-values as the probability that the hypothesis is true or false, when in reality frequentists don’t apply probabilities to hypotheses or any unknown values.
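As a concrete (and entirely hypothetical) illustration, suppose we flip a coin 100 times and see 60 tails. The p-value is the probability of a result at least that far from 50/50 if the coin really were fair:

```python
from math import comb

n, observed = 100, 60  # 100 flips, 60 tails observed
p_null = 0.5           # null hypothesis: the coin is fair

def binom_pmf(k, n, p):
    """Probability of exactly k tails in n flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two-sided p-value: probability of a result at least as far from 50
# as the one we observed, assuming the null hypothesis is true.
p_value = sum(binom_pmf(k, n, p_null) for k in range(n + 1)
              if abs(k - n * p_null) >= abs(observed - n * p_null))

print(f"p-value = {p_value:.4f}")  # roughly 0.057
```

Notice what this number is and is not: it tells us how surprising the data would be under a fair coin, not the probability that the coin is fair.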

In the real world, the probability of hypotheses is what we need to make decisions. Bayesians provide results that can be communicated in plain English and understood logically. Although it is of course a more subjective framework, the subjective beliefs rest on assumptions you can state explicitly and use to defend your prior. Bayesian statistics is built on a theorem you have probably heard of, Bayes’ Theorem, and that theorem is the reason I think this framework is so intuitive.

Bayes’ Theorem!

Anyone who’s needed to calculate probabilities in a technical interview knows what a lifesaver Bayes’ Theorem can be. Once you understand the probabilities involved, the theorem doesn’t need to be memorized; it can be reconstructed using logic.

The first step to remembering Bayes’ is understanding that the probability of event A given event B is true is equal to the probability that both events are true divided by the probability B is true. This makes sense to me — if you multiply the probability of event B by the probability of event A given event B, this would give you the probability that both events are true.
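In symbols:

P(A|B) = P(A and B) / P(B)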

The next step is understanding that the probability both events are true is also equal to the probability of B given A is true, multiplied by the probability that A is true: P(A and B) = P(B|A)*P(A). We already understood this to be true in step 1, just with the A and B events reversed. Substituting this into the numerator gives us the final, wonderful theorem of:

P(A|B) = P(B|A)*P(A) / P(B)

So…how does this work with real probabilities?

My favorite example to apply the theorem to is the famous Monty Hall problem. This question always tripped me up, and the correct answer feels counterintuitive to what my brain wants it to be. However, once I applied a Bayesian framework to the problem, the solution made much more sense.

https://theuijunkie.com/monty-hall-problem-explained/

In the Monty Hall problem, game show host Monty Hall gives you the option to pick from three doors. Behind one of the doors is a car while the other two doors have goats behind them. First, you pick one of the three doors. Next, Monty opens another door, revealing one of the goats. Now is where it gets tricky — you have the option of either sticking with your original door or switching to the other remaining door. What should you do?

Lucky for us, we have Bayes’ Theorem to help us solve the problem! In this case, we will solve for the probability that the first door we picked has the car behind it (call this event H), given that Monty opens a door with a goat behind it (call this event M).

P(H): probability the door we picked has a car behind it, before knowing what door Monty opens. P(H)=1/3

P(not H): probability we did not pick the door with the car behind it. P(not H)=2/3

P(M|H): the probability that Monty shows a door with a goat behind it, given we picked the door with the car. He always picks a door with a goat, so P(M|H) = 1.

P(M|not H): the probability that Monty shows a door with a goat behind it, given we did not pick the door with the car. Again, since this is always true, P(M|not H) = 1.

A helpful part of Bayes’ Theorem I didn’t include in my example is that there is an alternative way to calculate the denominator, P(B): by the law of total probability, P(B) also equals P(B|A)*P(A) + P(B|not A)*P(not A). Understanding this, we now have everything we need to calculate the probability of H given M.
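Written out for our problem, the theorem becomes:

P(H|M) = P(M|H)*P(H) / (P(M|H)*P(H) + P(M|not H)*P(not H))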

P(H|M) = (1*1/3) / (1*1/3 + 1*2/3) = (1/3) / 1 = 1/3

The probability that we picked the correct door on our first attempt is unchanged by Monty’s reveal; it is still 1/3. Since the car must be behind either the door we picked or the one remaining unopened door, there is a 2/3 probability the car is behind the door we neither picked nor saw revealed. Therefore, according to Bayes’ Theorem, we should always switch doors!
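If you would rather trust a computer than a formula, here is a quick Monte Carlo sketch (my own addition, not part of the original solution) that plays the game many times and compares the two strategies:

```python
import random

def play_monty_hall(switch, trials=100_000):
    """Return the fraction of games won with the given strategy."""
    wins = 0
    for _ in range(trials):
        doors = [0, 1, 2]
        car = random.choice(doors)
        pick = random.choice(doors)
        # Monty opens a door that is neither our pick nor the car.
        monty = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            # Switch to the one remaining unopened door.
            pick = next(d for d in doors if d != pick and d != monty)
        wins += (pick == car)
    return wins / trials

print(f"Win rate when staying:   {play_monty_hall(switch=False):.3f}")  # ~0.333
print(f"Win rate when switching: {play_monty_hall(switch=True):.3f}")   # ~0.667
```

Staying wins about a third of the time and switching wins about two thirds, matching the result from Bayes’ Theorem.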

I hope you found this information helpful. In summary, Bayesians are all about updating our beliefs in the presence of new information. Maybe this is a mindset we could all incorporate into our own thoughts and beliefs!
