An insight from Ole Peters that can help kids see some pitfalls in summary statistics

Yesterday we studied problem #7 from Mosteller’s 50 Challenging Problems in Probability. That project is here:

Walking through problem 7 from Mosteller’s 50 Challenging Problems in Probability

The problem is about a game of roulette and has the surprising result that you have a better than 50% chance of being ahead after 36 bets. Last night I realized that the project might have accidentally left the kids with the impression that the game had a positive expected value – whoops!

So, today I wanted to be sure that they did not have this impression. In thinking about how to talk through this topic last night, I realized that some of the ideas that Ole Peters has shared recently are somewhat similar, so I decided to share those ideas with the kids today, too.

We started by reviewing the results of yesterday’s project:

In the initial conversation the boys thought that we should look at how much you won when you were ahead and how much you lost when you were behind. Off camera we modified our program from yesterday to address these questions.

Here we talk about the program and then see the results.
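
For anyone who wants to try this at home, here is a minimal Python sketch of that kind of simulation. It is not the program from the video. It assumes $1 bets on a single number of a 38-space wheel, with a win netting +35 and a loss netting -1, and the function name, seed, and number of trials are made up for illustration.

```python
import random

def simulate_sessions(trials=20_000, bets=36, seed=1):
    """Net result of many sessions of `bets` $1 bets on one number (38 spaces)."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        # Space 0 stands in for "our number"; any fixed space would do.
        wins = sum(1 for _ in range(bets) if rng.randrange(38) == 0)
        # Each win nets +35, each loss nets -1.
        results.append(35 * wins - (bets - wins))
    return results

results = simulate_sessions()
ahead = [r for r in results if r > 0]
behind = [r for r in results if r < 0]

print("share of sessions not behind:", 1 - len(behind) / len(results))
print("average result when ahead:", sum(ahead) / len(ahead))
print("average result when behind:", sum(behind) / len(behind))
print("average result overall:", sum(results) / len(results))
```

Under these assumptions a session with exactly one win breaks even, so "behind" means no wins at all and every losing session loses the full $36. The overall average is the number to compare against zero.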

Now I introduced the boys to Ole Peters’ coin flipping game from the talk below. We watched the 5 min segment from roughly 4:00 to 9:00 where Peters explains the game and shows a pretty surprising result:

I’d originally intended to play around with a computer program to simulate Peters’ game, but we were running a bit long so I decided to just talk through it.

The boys were a little surprised by the results, but I think they were able to understand why the outcome for the individuals was different from the outcome for the group.
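
Since we skipped the computer program, here is a rough Python sketch of what it might have looked like. The rules of the game are not spelled out above, so the sketch assumes the version Peters usually presents: a fair coin, with heads multiplying your wealth by 1.5 and tails multiplying it by 0.6. The function names are made up for illustration.

```python
from math import comb, sqrt

# Assumed rules (not stated in the post): fair coin, heads multiplies
# wealth by 1.5, tails multiplies it by 0.6.
HEADS_FACTOR = 1.5
TAILS_FACTOR = 0.6

def average_factor_per_flip():
    """Growth factor of the group average (the expected value) per flip."""
    return 0.5 * HEADS_FACTOR + 0.5 * TAILS_FACTOR

def typical_factor_per_flip():
    """Per-flip growth factor one player sees over a long run of flips."""
    return sqrt(HEADS_FACTOR * TAILS_FACTOR)

def share_below_start(n_flips):
    """Exact fraction of players who end below their starting wealth."""
    losing = sum(
        comb(n_flips, heads)
        for heads in range(n_flips + 1)
        if HEADS_FACTOR ** heads * TAILS_FACTOR ** (n_flips - heads) < 1.0
    )
    return losing / 2 ** n_flips

print("group average grows by a factor of", average_factor_per_flip(), "per flip")
print("a typical player changes by a factor of", typical_factor_per_flip(), "per flip")
for n in (10, 100, 1000):
    print(n, "flips: share of players below where they started =", share_below_start(n))
```

The group average goes up every flip while the typical player's factor is below 1, and the share of players who are behind keeps growing as the number of flips goes up.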

I really enjoyed this project with the kids today. Hopefully the two simple, but somewhat surprising, ideas from today stick with them:

(i) Having a high chance of winning doesn’t mean a bet is a good bet, and

(ii) Even if the result of a game is positive for a large group, it can still have a negative outcome for almost everyone in that group.


Walking through problem #7 from Mosteller’s 50 Challenging Problems in Probability

We’ve been going through Mosteller’s 50 Challenging Problems in Probability this school year. Today we looked at problem #7.

The problem is about gambling on a roulette wheel: specifically, a wheel with 38 spaces on which you get 35 times your money (plus your bet) back if you guess the number correctly.

Here’s the problem and some initial thoughts from the boys:

At the end of the last video the boys had a plan for how to solve the problem. That plan and the solution to the problem are here:

To end the project I had the boys spend some time writing a computer program to simulate the game. Writing this code was a nice project for them and running it gave us a nice chance to talk about the problem from a different angle:

This was a fun project, but I think I need to add a second section to it tomorrow to clarify that just because you have a decent chance of being ahead, doesn’t mean the game has a positive expected value.
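
The two numbers in question can be computed exactly. This is a small sketch assuming $1 bets on a single number of the 38-space wheel described above; the constant names are mine. Note that with these stakes a single win in 36 bets exactly breaks even, so the sketch reports the chance of not being behind.

```python
from fractions import Fraction

SPACES = 38   # spaces on the wheel
PAYOUT = 35   # net winnings on a $1 bet that hits
BETS = 36     # number of bets in one session

p_win = Fraction(1, SPACES)

# Expected value of one $1 bet: win 35 with probability 1/38, lose 1 otherwise.
ev_per_bet = p_win * PAYOUT - (1 - p_win)

# You are behind after 36 bets only if you never win.
p_no_wins = (1 - p_win) ** BETS
p_not_behind = 1 - p_no_wins

print("expected value per bet:", ev_per_bet, "=", float(ev_per_bet))
print("expected value over", BETS, "bets:", float(BETS * ev_per_bet))
print("chance of not being behind:", float(p_not_behind))
```

The chance of not being behind comes out to roughly 62%, while the expected value per bet is -1/19 of a dollar, so the two ideas really do point in opposite directions.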

Stumbling through problem #4 in Mosteller’s 50 Challenging Problems in Probability

Sometimes I think a project is going to go really smoothly and I’m just plain wrong. Today was one of those days, unfortunately, as I completely misjudged how difficult this problem would be for my younger son.

He and I ended up spending another 20 min on the problem after the project was over and that time was much more productive. I’m kicking myself a little – and wish that I’d approached the problem differently – but you can’t win them all 🙂

With that disclaimer out of the way, problem #4 from Mosteller’s probability book is a classic:

How many rolls, on average, does it take to roll a 6 on a fair, 6-sided die?

Here’s the introduction to the problem and the initial thoughts from both kids:

Next I had the boys roll dice off camera and record how long it took to roll a 6. Here are the results of those experiments:
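
The same experiment is easy to repeat many more times on a computer. Here is a minimal Python sketch; the seed and number of trials are arbitrary choices of mine, not anything from the project.

```python
import random

def rolls_until_six(rng):
    """Roll a fair die until a 6 shows up and return how many rolls it took."""
    count = 1
    while rng.randint(1, 6) != 6:
        count += 1
    return count

rng = random.Random(4)
trials = [rolls_until_six(rng) for _ in range(50_000)]
average = sum(trials) / len(trials)

print("average number of rolls:", average)
print("longest wait for a 6:", max(trials))
```

With a handful of hand-rolled trials the average can land well away from the true answer, so it is worth comparing the computer's average with what the boys found by hand.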

Now we moved from experiments to diving into the math – this is where I probably should have realized that my younger son was struggling a bit to see the math, but I failed to see his struggle:
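
For reference, here is one standard way to set up the math (not necessarily the approach we took on camera). Let E be the average number of rolls needed. The first roll is a 6 with probability 1/6, in which case we are done after one roll. Otherwise we have used one roll and are right back where we started:

```latex
E = \tfrac{1}{6}\cdot 1 + \tfrac{5}{6}\,(1 + E)
\quad\Longrightarrow\quad
\tfrac{1}{6}\,E = 1
\quad\Longrightarrow\quad
E = 6
```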

And then we get lost, unfortunately. I turned the camera off around 6 min and we spent 10 more min talking about the problem off camera. I share this video only so that I can go back and learn from it later and see what I could have done better. Not everything goes well all the time . . . .

While we spoke off camera my older son found a very clever way to solve the problem. Here he explains that solution and my younger son was able to chime in on one little mistake at the end. It was a nice silver lining to a project that went off the rails a little bit:

Explaining some statistical ideas from Nassim Taleb to kids

I saw a fascinating tweet from Nassim Taleb last night:

It reminded me of this lecture Taleb posted a few years ago (particularly the part starting around 8:45):

So, after seeing the tweet last night I decided to take a shot at sharing some of Taleb’s ideas with my kids. The obvious problem is that the details are pretty advanced. The point I thought I could communicate, though, was the idea of (to borrow Taleb’s terms) Mediocristan vs. Extremistan – the worlds where one observation shouldn’t change things that much vs one in which one observation can change your world view completely. After talking through those ideas, I thought it would be fun to show the boys how different some probability distributions from Mediocristan and Extremistan look.

We started by talking about distributions that stay close to the mean (like height) and ones where one observation can be far from the mean (like wealth or damage from volcanic eruptions).

Now we took a close look at a bunch of random draws from a normal distribution. The idea here (and in the following two videos) is a high level introduction to the “probabilistic veil” concept from Taleb’s tweet:

Next we moved to the Cauchy distribution. I was hoping that the kids would be able to see that the draws from this distribution were so different from the draws from the normal distribution that they could say with some certainty that these draws definitely did not come from the normal distribution. It was fun to hear them take some guesses at the maxes and mins as I increased the number of data points.
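
We did this in Mathematica, but the same comparison can be sketched in plain Python. This is an illustrative sketch, not the code from the video: the function names and seed are mine, and the Cauchy draws use the standard trick of taking the tangent of a uniformly random angle.

```python
import math
import random

def normal_draw(rng):
    """One draw from a standard normal distribution."""
    return rng.gauss(0.0, 1.0)

def cauchy_draw(rng):
    """One draw from a standard Cauchy distribution (inverse-CDF method)."""
    return math.tan(math.pi * (rng.random() - 0.5))

def largest_abs(draw, n, seed=0):
    """Largest absolute value seen among n draws."""
    rng = random.Random(seed)
    return max(abs(draw(rng)) for _ in range(n))

for n in (100, 10_000, 100_000):
    print(n, "draws:",
          "normal max |x| =", round(largest_abs(normal_draw, n), 2),
          " Cauchy max |x| =", round(largest_abs(cauchy_draw, n), 2))
```

The thing to watch is how slowly the normal extremes grow as the number of draws goes up compared to the Cauchy extremes.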

Finally, we looked at the class of stable distributions. I didn’t try to describe these distributions in any detail, but rather just said that there was a parameter that we could vary between 1 and 2. Here the goal was to see if we could say if a distribution looked like a normal distribution or the Cauchy distribution. We were also able to see that varying the parameter changed the distribution, but that it would be pretty difficult to tell from the data if the distribution came from the parameter being 1.2 or 1.5 – this is one of the points in the Taleb video lecture from above.
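
Here is a rough Python sketch of the same experiment for readers without Mathematica. It assumes the symmetric stable distributions and generates draws with the Chambers-Mallows-Stuck method, which is my choice for the sketch and not necessarily what Mathematica does internally. With the parameter at 1 this gives Cauchy draws, and at 2 it gives normal draws (with variance 2 in this parameterization).

```python
import math
import random

def symmetric_stable_draw(alpha, rng):
    """One symmetric stable draw via the Chambers-Mallows-Stuck method."""
    u = math.pi * (rng.random() - 0.5)   # uniform angle in [-pi/2, pi/2)
    w = rng.expovariate(1.0)             # exponential with mean 1
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def sample(alpha, n, seed=0):
    """n symmetric stable draws for a tail parameter alpha between 1 and 2."""
    rng = random.Random(seed)
    return [symmetric_stable_draw(alpha, rng) for _ in range(n)]

for alpha in (1.0, 1.2, 1.5, 2.0):
    xs = sorted(abs(x) for x in sample(alpha, 20_000))
    print("alpha =", alpha,
          " median |x| =", round(xs[len(xs) // 2], 2),
          " max |x| =", round(xs[-1], 2))
```

The middle of each sample looks fairly similar across the parameter values. The differences are in the extremes, and those bounce around a lot from sample to sample, which is why telling 1.2 from 1.5 from the data is so hard.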

I’m reasonably happy with how this discussion went today – these are pretty advanced ideas to be sharing with kids. Fortunately Mathematica makes it somewhat easy to see how different the various distributions look. Hopefully this conversation helps the boys get a little peek at the ideas of probability distributions and also helps them understand that the process of going from data to a probability distribution can be extremely difficult (even with millions of data points).

A great intro stats question for kids that I learned thanks to Ole Peters

Yesterday I saw a great introductory stats question thanks to a tweet from Ole Peters. The question is here:

In case it doesn’t come through in the tweet, here’s the problem:

You flip a fair coin 20 times. If this sequence contains at least one HHHH, I pay you $100. If it contains at least one HHHT, you pay me $100. If it contains neither, nobody wins.

The question, essentially, is this -> Would you like to play this game?

I introduced the game to my son and asked him what he thought:

So my son thought that the sequence HHHH would appear more than HHHT. Now we went to a short Mathematica program that I wrote to explore the game:

Next we talked about the surprise – HHHT was much more likely than HHHH, and more than 10x more likely to occur alone. The idea here was a little hard for him to see, but eventually he was able to figure out why HHHH was so unlikely to occur alone.

Finally, we went back to the whiteboard to talk through the details one more time. What I was trying to talk about here – and unfortunately not doing a great job of articulating – was:

(i) Why does HHHH occur alone so infrequently,
(ii) Why do the sequences HHHH and HHHT occur together so much, and
(iii) Why does HHHT occur alone much more frequently than HHHH?
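
With only 20 flips there are about a million possible sequences, so the three questions above can be checked by brute force. This is a Python sketch of that count, not the Mathematica program from the video; the function name and labels are made up.

```python
def classify_sequences(n_flips=20):
    """Count all 2**n_flips sequences by which of HHHH / HHHT they contain."""
    counts = {"HHHH only": 0, "HHHT only": 0, "both": 0, "neither": 0}
    for i in range(2 ** n_flips):
        # Binary digits stand in for flips: 1 = heads, 0 = tails.
        flips = format(i, f"0{n_flips}b")
        has_hhhh = "1111" in flips
        has_hhht = "1110" in flips
        if has_hhhh and has_hhht:
            counts["both"] += 1
        elif has_hhhh:
            counts["HHHH only"] += 1
        elif has_hhht:
            counts["HHHT only"] += 1
        else:
            counts["neither"] += 1
    return counts

counts = classify_sequences(20)
total = sum(counts.values())
for label, count in counts.items():
    print(label, count, round(count / total, 4))
```

The key to (i) is that any run of three or more heads that is followed by a tail produces HHHT, so for HHHH to show up alone, the only long run of heads has to sit at the very end of the sequence.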

I think this is an absolutely amazing introductory statistics problem for kids to think through. It is a really neat problem all by itself, but it also helps kids see that analyzing a time series of data – even a simple one – can be surprisingly subtle!

Follow up #2 to John Shonder’s US weather data visualization

Two weeks ago I saw an amazing piece of work by John Shonder shared on Twitter:

I’ve already done two projects with the boys using Shonder’s ideas. The first was just walking through his code and showing them that the underlying ideas weren’t that complicated:

Using John Shonder’s Amazing US Temperature visualization with kids

At the end of that project I asked the boys for follow up ideas. My younger son (in 7th grade) thought it would be interesting to look at percent change rather than raw temperature change. We did that follow up yesterday:

Follow up #1 to John Shonder’s US temperature change visualization

My older son (in 9th grade) thought it would be interesting to see if we could use the data to make predictions about future temperatures. We looked at that idea today.

Since even a cursory discussion of predictions is way more complicated than I’d like a 15 min talk with a 7th grader and a 9th grader to be, I decided to focus more on best fit curves rather than on actual predictions.

A funny side note to this discussion is that when I told my older son about this change he said – “That sounds pretty hard.” I told him not to worry, that there was a Mathematica command that does the fitting. His response was “of course there is” – ha ha.

So, we started today’s project by looking at plots of some of the county average temperature data. One thing I did here was have the boys estimate what a best fit line would look like by placing a ruler on the computer screen:

Next we used Mathematica to find the best fit line to the data and used Shonder’s code to do a county by county visualization of the slope of that best fit line.
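
For anyone curious what the fitting command is doing under the hood, here is a small Python sketch of an ordinary least-squares line fit. It is not Shonder's code or the Mathematica command we used, and the temperatures are made-up numbers for one hypothetical county.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up yearly average temperatures (degrees F) for a hypothetical county.
years_since_start = [0, 1, 2, 3, 4, 5, 6, 7]
avg_temps = [51.2, 50.8, 51.9, 51.5, 52.3, 51.8, 52.6, 52.9]

slope, intercept = fit_line(years_since_start, avg_temps)
print("slope (degrees F per year):", slope)
print("intercept (degrees F):", intercept)
```

The slope is the single number per county that gets colored on the map.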

Not too surprisingly, this visualization looked a lot like Shonder’s original one and the percent change one we looked at yesterday. The fact that all three of these visualizations looked pretty similar led to a nice discussion about why that wasn’t so surprising:

Next we fit with a quadratic function rather than a line. As with the fit to the line, we looked at several counties first to get a feel for what was going on:

Finally, we did a county by county visualization of the x^2 coefficient of the quadratic polynomial. Here we got a visual that looked very different from the ones we’d seen before:
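
As with the line, the quadratic fit can be sketched in a few lines of Python. This is an illustration of the least-squares idea rather than what Mathematica does internally, and the data is made up. It measures x in years since the start of the data, which keeps the numbers small.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients (a, b, c) for y = a*x^2 + b*x + c."""
    # Power sums that make up the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix, with the unknowns ordered (c, b, a).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c

# Made-up yearly average temperatures (degrees F) for a hypothetical county.
years_since_start = [0, 1, 2, 3, 4, 5, 6, 7]
avg_temps = [51.2, 50.8, 51.9, 51.5, 52.3, 51.8, 52.6, 52.9]

a, b, c = fit_quadratic(years_since_start, avg_temps)
print("x^2 coefficient:", a)
```

The x^2 coefficient measures whether the warming is speeding up or slowing down, which is a different question from how fast it is going, so it makes sense that this map looks different from the others.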

I’ve really enjoyed the discussions that we’ve had using Shonder’s project. It is amazing to me how Mathematica (and Shonder’s terrific code!) makes a pretty difficult data analysis project accessible to kids.

Follow up #1 to John Shonder’s US temperature change visualization

Last weekend we did a project inspired by this incredible data visualization project from John Shonder:

That project is here:

At the end of last week’s project I asked the boys to think of some follow up projects. My younger son thought it would be interesting to see the percent change in temperature rather than the absolute difference. We did that project today.

The boys have been hiking in the White Mountains for about a week and just got home last night. So, to start today’s project we took a quick look at last week’s project and talked about what changes we’d need to make to implement my younger son’s idea:

Off camera the boys looked up how to convert Fahrenheit to Kelvin so that we could talk about percent change. We started the second part of today’s project by looking at the code where Shonder takes the difference between 10 year averages and changing that code to compute the percent increase.
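
The conversion matters because a percent change only makes sense on a scale with a true zero, which Kelvin has and Fahrenheit does not. Here is a small Python sketch of the calculation. It is not Shonder's code, and the two counties and their temperatures are made up.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert a temperature from degrees Fahrenheit to Kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15

def percent_change(old_f, new_f):
    """Percent change between two Fahrenheit temperatures, measured in Kelvin."""
    old_k = fahrenheit_to_kelvin(old_f)
    new_k = fahrenheit_to_kelvin(new_f)
    return 100.0 * (new_k - old_k) / old_k

# Made-up 10 year averages (degrees F) for a cold and a warm hypothetical county.
# Both warm by the same 2 degrees F.
print("cold county:", percent_change(40.0, 42.0), "percent")
print("warm county:", percent_change(70.0, 72.0), "percent")
```

Since every county sits within a fairly narrow band of Kelvin temperatures, the same raw change gives nearly the same percent change everywhere, so the percent change map ends up very close to the raw difference map.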

It is great that Shonder’s code is so accessible that we can make this simple change and spend time talking about math that is easily accessible to a 7th grader.

To finish, we took a careful look at the new visualization. For clarity, below the video are the pictures from last week and this week. I should have prepared both of these for the boys to see in the video, but even though I didn’t, their thoughts on the change are really interesting:

Here’s last week’s visual:

Screen Shot 2019-06-16 at 1.21.46 PM

And here’s this week’s – you have to look pretty carefully to see the differences, but I still think today’s project was worthwhile:

Screen Shot 2019-06-22 at 9.16.13 AM