A “new to me” demonstration of the difference between Nassim Taleb’s “mediocristan” and “extremistan” thanks to Steve Phelps

Yesterday I saw a really neat tweet from Steve Phelps:

The idea he is studying goes like this:

Select three points uniformly at random inside of a unit square. What is the expected area of the circle passing through those three points?

This question turns out to have a lot of nice surprises. The first is that exploring the idea of how to find the circle is a great project for kids. The second is that the distribution of circle areas is fascinating.

I started the project today by having the kids explore how to find the inscribed and circumscribed circles of a triangle using paper folding techniques.

My younger son went first showing how to find the incircle:

My older son went next showing how to find the circumcircle:

With that introduction we went to the whiteboard to talk through the problem that Steve Phelps shared yesterday. I asked the boys to give me their guess about the average area of the circle passing through three random points in the unit square. Their guesses – and reasoning – were really interesting:

Now that we’d talked through some of the introductory ideas in the problem, we talked about how to find the area of a circle passing through three specific points. The fun surprise here is that finding this circle isn’t as hard as it seems initially:

Following the sketch of how to find the circle in the last video, I thought I’d show them a way to find the area of this circle using ideas from coordinate geometry and linear algebra – topics that my younger son and older son have been studying recently. Not everything came to mind right away for the boys, but that’s fine – I wasn’t trying to put them on the spot, but just show them how ideas they are learning about now come into play on this problem:
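For readers who want to try the coordinate geometry and linear algebra idea on a computer, here is a minimal Python sketch. It is an illustration rather than the whiteboard work from the video, and the function name circle_through is my own. It uses the fact that the center of the circle is the same distance from all three points, which gives two linear equations in the center's coordinates:

```python
import math

def circle_through(p1, p2, p3):
    """Return (center, radius) of the circle through three points.

    Setting |c - p1|^2 = |c - p2|^2 and |c - p1|^2 = |c - p3|^2 gives
    two linear equations in the center c = (cx, cy).
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Coefficients of the 2x2 linear system A * (cx, cy) = b.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = x2**2 + y2**2 - x1**2 - y1**2
    b2 = x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the points are collinear, so no circle passes through them")
    # Cramer's rule for the center.
    cx = (b1 * a22 - a12 * b2) / det
    cy = (a11 * b2 - a21 * b1) / det
    radius = math.hypot(x1 - cx, y1 - cy)
    return (cx, cy), radius

center, radius = circle_through((0, 0), (1, 0), (0, 1))
area = math.pi * radius**2
print(center, radius, area)
```

For the three corners (0, 0), (1, 0), and (0, 1) the center is the midpoint of the hypotenuse, so the area works out to π/2.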

Finally, we went to the computer to look at some simulations. The kids noticed almost immediately that the mean of the results was heavily influenced by the maximum area – that’s exactly the idea of “extremistan” that Nassim Taleb talks about!
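A simulation along these lines can be sketched in a few lines of Python. This is not the program from the video. The function names, the trial counts, and the seeds are all my own choices, and I use the formula R = abc / (4K) for the circumradius, where a, b, and c are the side lengths and K is the area of the triangle:

```python
import math
import random

def random_circle_area(rng):
    """Area of the circle through three uniform random points in the unit square."""
    while True:
        pts = [(rng.random(), rng.random()) for _ in range(3)]
        (x1, y1), (x2, y2), (x3, y3) = pts
        # Twice the triangle area, from the 2x2 determinant.
        twice_k = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
        if twice_k > 0:  # redraw in the (essentially impossible) collinear case
            break
    a = math.dist(pts[1], pts[2])
    b = math.dist(pts[0], pts[2])
    c = math.dist(pts[0], pts[1])
    r = a * b * c / (2 * twice_k)  # R = abc / (4K)
    return math.pi * r * r

def summarize(n_trials, seed):
    """Mean, median, and maximum area, plus the maximum's share of the total."""
    rng = random.Random(seed)
    areas = sorted(random_circle_area(rng) for _ in range(n_trials))
    total = sum(areas)
    return {
        "mean": total / n_trials,
        "median": areas[n_trials // 2],
        "max": areas[-1],
        "max_share_of_total": areas[-1] / total,
    }

# Different seeds stand in for re-running the experiment.
for seed in range(5):
    print(seed, summarize(10_000, seed))
```

Comparing the mean, the median, and the maximum's share of the total across a few seeds is one way to see what the kids noticed: the median barely moves, while the mean follows whatever the largest circle happened to be.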

This project is a great way for kids to explore a statistical sampling problem that doesn’t obey the central limit theorem!

I really love the problem that Phelps posted! It is such a great way to combine fascinating and fundamental ideas from geometry and statistics.

Sharing an interesting (and famous) population sampling problem with kids

I saw a great thread on twitter last week – though I saw the tweets in the reverse of the order in which they appeared. First I saw this one from Vincent Pantaloni:

Which led me to this amazing tweet from Andrew Webb:

I thought it would be fun to do a project on this idea with the boys. Unlike a few (or maybe most) of our introductory statistics exercises, the program here was likely going to be too hard for the kids to write themselves, so I just wrote it myself and the boys played with it at the end.

To start I had them look at Pantaloni’s tweet:

Next we looked at Webb’s tweet – this one requires a bit more explanation, but the boys were able to understand what Webb’s animation was showing:

Now I spent 5 min explaining how the program I wrote worked. Since my simulation was quite a bit more simplified than the prior two (and also didn’t have any animation), I wanted to be sure they understood what I was doing before we dove in:

For the first run of my simulation, we looked at 5000 trials of a pond with 1,500 fish and sampling from 4% of the pond. From the conversation here you can hear that the boys are gaining a pretty good understanding of the process and are also able to make sense of the distribution of outcomes:

Finally, we looked at 5000 trials of a pond with 750 fish and sampling from 16% of the pond. Again the boys did a nice job explaining the results.
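If you want to experiment with the two setups above, here is a rough Python sketch. It is not the program from the videos, and it rests on a simplifying assumption of mine: each fish independently ends up in the sampled part of the pond with probability equal to the fraction sampled, and the population estimate is the number of fish found divided by that fraction. The function names and the seed are also my own:

```python
import random
import statistics

def estimate_population(n_fish, fraction, rng):
    """One trial: count the fish in the sampled region and scale up.

    ASSUMPTION: each fish lands in the sampled region independently with
    probability `fraction`. A real pond (or the original program) may differ.
    """
    found = sum(1 for _ in range(n_fish) if rng.random() < fraction)
    return found / fraction

def run_trials(n_fish, fraction, n_trials=5000, seed=0):
    """Repeat the sampling experiment and return every estimate."""
    rng = random.Random(seed)
    return [estimate_population(n_fish, fraction, rng) for _ in range(n_trials)]

# The two setups from the videos: (fish in pond, fraction of pond sampled).
for n_fish, fraction in [(1500, 0.04), (750, 0.16)]:
    estimates = run_trials(n_fish, fraction)
    print(n_fish, fraction,
          round(statistics.mean(estimates)),
          round(statistics.stdev(estimates)))
```

Under this assumption the number of fish found follows a binomial distribution, so the estimates should center on the true population with a standard deviation of roughly 190 fish in the first setup and roughly 63 fish in the second.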

At the end we talked about why this sort of sampling problem can be really difficult.

Exploring some of the recent Betelgeuse observations with kids

A lot of people have been talking about recent observations of the star Betelgeuse this week. Here’s one great thread I happened to see:

After seeing this thread I thought it would be fun to share some of the ideas about the recent observations of Betelgeuse with the boys. Although I’m way out of my league here, there were some great resources I found that I thought would help the boys understand what was going on. Two of those resources were:

A discussion on Astroblog about Betelgeuse

The Light Curve Generator from the American Association of Variable Star Observers

I started today’s project by showing the boys the article on Astroblog and then the graph in Eric Mamajek’s tweet:

Next we looked at a graph from the Light Curve generator showing how the brightness of Betelgeuse has varied going back about 6 months. Sorry for the glare on the computer screen 😦

The boys had different ideas about how to interpret the data – which was fun to hear:

Next I had each of my sons create a new graph. My younger son went first and he wanted to look at the observations from a single astronomer. We did this by using the green dots since there were only two people who collected that data. The astronomer whose data we looked at was Wolfgang Volmann:

My older son went second – he wanted to look at the observations of Betelgeuse going back a long time. We were able to zoom in on a time period in the 70s and 80s in which many observations showed that Betelgeuse was pretty dim.

This was a really fun project to work through with the kids. It really highlights the difficulty of collecting data in astronomy, and in the real world in general! It was fun to hear their ideas about how to think through the

Two probability problems that seem similar but have different answers

Earlier in the week we looked at the game Ox Blocks which uses a 6-sided die with 2 sides each having O’s, X’s, and a blank. The game is a really fun version of tic tac toe:

Playing with Ox Blocks thanks to the Mathematical Objects podcast

Playing this game reminded me of an old project we’d done on a fun probability problem from Elchanan Mossel:

Exploring Elchanan Mossel’s fantastic probability problem.

For today’s project we looked at two problems inspired by these two projects. The problems seem pretty similar:

(1) If you have a fair 6-sided die with sides marked 2, 2, 4, 4, 6, and 6, how many rolls on average will it take for you to roll a 6?

(2) If you have a fair 6-sided die with sides marked 1, 2, 3, 4, 5, and 6, how many rolls on average will it take to roll a 6 if any sequence of rolls containing an odd number prior to seeing a 6 doesn’t count? So, 2, 4, 4, 6 would count, for example, and 2, 4, 5, 6 would not count.

I started the project today looking at the first problem, which is inspired by the Ox Blocks project:

Now we moved to the 2nd problem. To introduce the problem I had the boys play the game a few times and we found that lots of sequences of rolls were thrown out:

To help the boys understand this second game a bit more I moved to a slightly different question -> for valid sequences of rolls in the 2nd game, how often do you see a 6 on the first roll?

This question was slightly difficult for the kids to understand, but we made pretty good progress:

Finally, we went to the computer to run a simulation for the 2nd game. This video runs a little long as I asked my younger son to explain the program. But once we get through the explanation we see that their guesses for the expected number of rolls and also the percentage of 6’s on the first roll were roughly right!
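Here is a short Python sketch of both games for anyone who wants to try the simulation themselves. It is not the program from the video, and the function names, trial count, and seed are my own choices. For reference, the exact answers are 3 rolls on average for the first game and 3/2 rolls on average for the second, with a 6 on the first roll in 2/3 of the valid sequences:

```python
import random

EVEN_DIE = [2, 2, 4, 4, 6, 6]  # the die from the first problem

def rolls_game_one(rng):
    """Roll the 2, 2, 4, 4, 6, 6 die until a 6 appears; return the roll count."""
    count = 0
    while True:
        count += 1
        if rng.choice(EVEN_DIE) == 6:
            return count

def rolls_game_two(rng):
    """Roll a standard die until a 6 appears.

    Returns the roll count, or None if an odd number shows up first
    (those sequences are thrown out).
    """
    count = 0
    while True:
        count += 1
        roll = rng.randint(1, 6)
        if roll == 6:
            return count
        if roll % 2 == 1:
            return None

def simulate(n_trials=100_000, seed=0):
    rng = random.Random(seed)
    game_one = [rolls_game_one(rng) for _ in range(n_trials)]
    attempts = [rolls_game_two(rng) for _ in range(n_trials)]
    valid = [r for r in attempts if r is not None]
    return {
        "game_one_mean": sum(game_one) / len(game_one),
        "game_two_mean": sum(valid) / len(valid),
        "game_two_valid_fraction": len(valid) / n_trials,
        "game_two_six_on_first_roll": sum(1 for r in valid if r == 1) / len(valid),
    }

print(simulate())
```

The valid fraction is worth watching too: only about a quarter of the attempts in the second game survive, which matches how many sequences got thrown out when the boys played by hand.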

Writing a program to see how long it takes to get HHHH when flipping a coin

Yesterday we did a fun project exploring how long it takes, on average, to create certain words like COVFEFE or ABRACADABRA when selecting letters at random. We also simplified the problem a bit by looking at sequences of H’s and T’s for coin flips. That project is here:

Talking Markov chains and Martingales with kids

Today’s project was writing a computer program to simulate flipping a coin until we saw HHHH. In yesterday’s project we found that it would take 30 coin flips on average. We started today’s project by talking about how to write a program to do this simulation. Following this discussion the boys wrote their program off camera:

When the boys finished their program we talked through it and looked at the shape of the distribution of the number of flips it took to get to HHHH. They were pretty surprised by this shape:

To wrap up the project we spent 5 min talking about how the program would need to change to look at a general sequence of 4 flips – HTHT, for example. We didn’t actually make the changes, though, as we’d already spent enough time working through the ideas this morning:
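As a rough sketch of what the more general program could look like (this is my own illustration, not the program the boys wrote), the Python below flips a simulated coin until the most recent flips spell out whatever pattern you pass in. The function names, trial count, and seed are assumptions of mine:

```python
import random
from collections import Counter

def flips_until(pattern, rng):
    """Flip a fair coin until the last len(pattern) flips spell out pattern."""
    recent = ""
    count = 0
    while recent != pattern:
        # Keep only the most recent len(pattern) flips.
        recent = (recent + rng.choice("HT"))[-len(pattern):]
        count += 1
    return count

def simulate(pattern, n_trials=20_000, seed=0):
    """Return the average number of flips and a tally of how often each count occurred."""
    rng = random.Random(seed)
    counts = [flips_until(pattern, rng) for _ in range(n_trials)]
    return sum(counts) / n_trials, Counter(counts)

for pattern in ["HHHH", "HTHT"]:
    mean, tally = simulate(pattern)
    print(pattern, round(mean, 2), sorted(tally.items())[:5])
```

Because the only thing that changes between HHHH and HTHT is the string passed in, the same tally can be used to compare the shapes of the two distributions.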

This was a fun statistics / programming project that has a pretty surprising result. We’ll definitely have to follow up with the program for a generic sequence of flips soon!

Exploring Markov chains and Martingales with kids

Earlier this week I saw a neat tweet from Greg Egan:

It reminded me of an old project we did back in 2017 using Markov chains and Martingales:

The Most Interesting Piece of Math I Learned in 2017 -> The COVFEFE Problem

The Martingale take on the COVFEFE problem (and the amazing ABRACADABRA problem) came from this paper:

Martingales and the ABRACADABRA problem by Di Ai

This week I was using some of the Markov chain ideas for a few fun projects with my older son who is studying linear algebra. Today I thought it would be fun to revisit the COVFEFE problem and then look at some coin flipping examples inspired by Greg’s tweet.

We started with a brief discussion of the COVFEFE problem and then switched to coin flipping at the end. It took me a bit to get my brain going on this project – sorry for a few obvious mistakes at the beginning of the discussion . . . .

Next we went to the computer to look at the approach to the word typing / coin flipping problems via Markov chains. Instead of the COVFEFE problem, we are looking at the expected number of coin flips required to see the sequence HTHT. The ideas here and the code are things I learned from Nassim Taleb – see the references in the project linked above:
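The code in the video isn't reproduced here, but the Markov chain idea can be sketched in Python. The version below is my own and makes its own choices: the state is the number of flips of the pattern matched so far, each state gets the equation E[state] = 1 + (1/2) E[state after H] + (1/2) E[state after T], and the resulting linear system is solved exactly with fractions. The names next_state and expected_flips are hypothetical:

```python
from fractions import Fraction

def next_state(pattern, state, flip):
    """New number of matched flips after seeing `flip` with `state` flips matched."""
    text = pattern[:state] + flip
    # Longest prefix of the pattern that is also a suffix of what we have seen.
    for k in range(min(len(text), len(pattern)), -1, -1):
        if text.endswith(pattern[:k]):
            return k

def expected_flips(pattern):
    """Expected number of fair coin flips until `pattern` first appears."""
    n = len(pattern)
    # One row per state 0..n-1; the last column holds the right-hand side.
    rows = []
    for state in range(n):
        row = [Fraction(0)] * (n + 1)
        row[state] += 1
        for flip in "HT":
            target = next_state(pattern, state, flip)
            if target < n:  # the finished state needs 0 more flips
                row[target] -= Fraction(1, 2)
        row[n] = Fraction(1)
        rows.append(row)
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return rows[0][n]  # expected flips starting with nothing matched

print(expected_flips("HTHT"), expected_flips("HHHH"))
```

For HTHT, for example, the state after seeing HTH and then another H drops back to 1 rather than 0, because that last H can still start a new HTHT.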

Next we returned to the whiteboard to talk about the Martingale approach to the problem. The ideas here are things that I learned from Christopher Long and also from the paper linked above.

Here we take a quick look at the COVFEFE problem, the ABRACADABRA problem, and see why the HTHT problem takes on average 20 flips.

Finally, we computed the expected number of flips required to see each of the 16 different sequences of 4 coin flips -> HHHH, HHHT, HHTH, and so on. The calculation for all 16 cases took a little longer than I usually want one of these videos to run, but I wanted to do all of the cases to help the boys understand the ideas that go into the calculations:
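The rule that comes out of the Martingale argument is short enough to check by computer: for every length k at which the first k symbols of the sequence match its last k symbols, add (alphabet size)^k. Here is a small Python sketch of that rule. The function name expected_trials and the alphabet_size parameter are my own, and a 26-letter alphabet is assumed for the word problems:

```python
from itertools import product

def expected_trials(pattern, alphabet_size=2):
    """Expected number of random symbols until `pattern` first appears.

    Adds alphabet_size**k for every k where the first k symbols of the
    pattern equal its last k symbols (k = len(pattern) always counts).
    """
    n = len(pattern)
    return sum(alphabet_size**k
               for k in range(1, n + 1)
               if pattern[:k] == pattern[-k:])

# All 16 sequences of 4 coin flips.
for flips in product("HT", repeat=4):
    pattern = "".join(flips)
    print(pattern, expected_trials(pattern))

# The word problems, ASSUMING a 26-letter alphabet.
print(expected_trials("COVFEFE", 26))
print(expected_trials("ABRACADABRA", 26))
```

HTHT matches itself at lengths 2 and 4, which gives 4 + 16 = 20, while HHHH matches at every length, which gives 2 + 4 + 8 + 16 = 30.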

A detailed discussion of the concepts of both Markov chains and Martingales is beyond what anyone could reasonably expect a 10th grader and an 8th grader to understand. But the ideas are so neat that I thought showing these fun examples would make a great project for kids.

Sharing some of Nassim Taleb’s ideas about probability distributions with kids

Yesterday Nassim Taleb shared a short paper looking at the tails of various probability distributions.

The paper is not for kids – the math is advanced – but I thought there might be a way to connect some of Taleb’s ideas with the project that we did on the coupon collector problem yesterday. By coincidence, in that project we’d spent some time talking about maximum values in a bunch of repeated trials.

Here’s that project:

Sharing the coupon collector problem with kids

So, we started by talking about the distributions we saw in yesterday’s project – especially the distribution of the number of trials required to find all 5 coupons (or, maybe even more simply – the number of rolls required to see all 5 numbers on a 5-sided die).

Also in this video I’m trying to introduce the idea that Taleb was studying – can we say anything about the tail of a distribution having seen only 100, 200, or even 1,000 samples?

Now we moved to the computer and looked more carefully at our 100, 200, and 1000 sample trials versus a 1 million sample trial. The boys were able to see at a high level how the amount of unseen area in the tail declines roughly like 1/n where n is the number of trials. This was one of the results in Taleb’s paper that I thought they would be able to understand visually.
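To play with this idea yourself, here is a minimal Python sketch. It is my own illustration rather than the program from the video, and the function names and seed are assumptions. It rolls a 5-sided die until all 5 numbers have appeared, records the largest roll count seen in n samples, and then uses the exact inclusion-exclusion formula to compute how much probability sits beyond that largest value:

```python
import random
from math import comb

def rolls_to_see_all(rng, sides=5):
    """Roll a fair die until every side has appeared; return the roll count."""
    seen = set()
    rolls = 0
    while len(seen) < sides:
        seen.add(rng.randrange(sides))
        rolls += 1
    return rolls

def tail_probability(t, sides=5):
    """Exact probability that more than t rolls are needed (inclusion-exclusion)."""
    return sum((-1) ** (k + 1) * comb(sides, k) * (1 - k / sides) ** t
               for k in range(1, sides + 1))

def unseen_tail(n_samples, rng):
    """Largest roll count in n_samples trials and the probability beyond it."""
    largest = max(rolls_to_see_all(rng) for _ in range(n_samples))
    return largest, tail_probability(largest)

rng = random.Random(0)
for n in [100, 200, 1000]:
    largest, tail = unseen_tail(n, rng)
    print(n, largest, tail, 1 / n)
```

The last two printed columns are the ones to compare. Because the roll counts are whole numbers and each run has only one maximum, a single run will only roughly track 1/n, so it helps to try several seeds.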

Now I switched to a distribution that you really can’t say much about even if you have millions of samples. The problem is the so-called “archer” problem that we’ve explored before:

Helping kids understand when the central limit theorem applies and when it doesn’t

First I introduced the problem and let my younger son notice that we were really studying the distribution of tan(θ) since he’s learning trig now.

Finally, we returned to the computer to see the strange distributions that come from the archer problem. Even though the boys had seen some of the ideas here before they were still surprised. Try to guess some of the numbers along with them as you watch the video!
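Here is a small Python sketch for trying the archer problem yourself. It is not the program from the video, and the setup is an assumption on my part: the archer stands one unit from a wall and shoots at an angle chosen uniformly between -90 and 90 degrees, so the arrow lands at tan(θ) along the wall. The function names, shot counts, and seeds are my own:

```python
import math
import random
import statistics

def archer_hits(n_shots, rng):
    """Landing points tan(theta) for angles drawn uniformly from (-pi/2, pi/2)."""
    return [math.tan(rng.uniform(-math.pi / 2, math.pi / 2))
            for _ in range(n_shots)]

def summarize(n_shots, seed):
    """Mean, median, and the single most extreme landing point."""
    hits = archer_hits(n_shots, random.Random(seed))
    return {
        "mean": statistics.fmean(hits),
        "median": statistics.median(hits),
        "largest_miss": max(hits, key=abs),
    }

# Different seeds stand in for repeating the whole experiment.
for seed in range(5):
    print(summarize(100_000, seed))
```

The landing points follow what is known as the Cauchy distribution, which has no mean. Across seeds the median should stay close to 0, while the sample mean jumps around depending on the largest miss, no matter how many shots you take.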