This question from a math exam in India was flying around math twitter last week:

I mistakenly thought the question was just sort of a fun joke and not all that interesting, but then I saw a series of tweets beginning with:

Assuming a monkey is typing letters uniformly at random, the expected waiting time for "covfefe" to appear is indeed exactly 26^7. These types of questions can sometimes be tricky, depending on which states lead to which states, but not this one.

Here's a classic result in martingale theory. Same setting as the "covfefe" problem, but you're now typing "abracadabra". The answer here is not 26^11, but 26^11+26^4+26 (!). Note 11*26^11 is still a valid upper bound.

That last tweet was definitely a “wait . . . what??” moment for me.

Thinking about ABRACADABRA right off the bat was too hard, so I simplified the problem drastically too see what was going on. Suppose you are flipping a fair coin, what is the expected number of flips until you see the sequence H H? What about H T? These two problems are also, I think, great ways to introduce ideas about stopping time problems to kids. (the answers are 5 flips and 4 flips respectively).

Playing around with the easier problems showed me why the ABRACADABRA problem could have a longer stopping time than I would have guessed, but I couldn’t solve the problem exactly. Then I found this paper (written as part of an undergraduate research program at the University of Chicago!) which gave a wonderful explanation of the ABRACADABRA problem and (almost incredibly) a way to think about the problem that allows you to solve it in your head!

After going through that paper I was happy to have learned some new ideas about stopping time problems and more or less moved on. But then one more nice surprise came from the COVFEFE problem when Nassim Taleb shared his Markov chain solution:

I’d never played with Markov chains in Mathematica (or, basically anywhere really) so I thought it would be fun to use what I learned from Taleb’s code to explore the ABRACADABRA problem. Working through that code gave me a much better understanding of Long’s “which states lead to which states” comment above. It took me a bit of time to realize, for example, that the state ABRA can move to the state AB, for example.

Again, copying Taleb’s code, here’s the transition matrix:

and the graph of the states plus the stopping time which matches 26^11 + 26^4 + 26:

It was fun to learn that the original COVFEFE problem was part of a class of problems that are much more subtle than they might seem at first glance. Learning about the connections to martingales and learning how to implement Markov chains in Mathematica was a really nice surprise, too.

4 thoughts on “The most interesting piece of math I learned in 2017 -> the COVFEFE problem”

Funny thing about the HH/HT problem is that if throw a coin and stop when any of those sequences appear, the probability of being HH or HT is the same.

I’m glad you asked as it helped me remember how the process worked. The transition matrix is a little hard to see, so I zoomed in on it a bit. There are 4 possible transitions from ABRA.

(from right to left in the transition matrix):

(1) ABRAC will happen 1/26 of the time when the 5th letter is a C.

(2) AB will happen 1/26 of the time when the 5th letter is a B

(3) A will happen 1/26 of the time when the 5th letter is an A.

(4) FAIL will happen 23/26 of the time when the 5 letter is anything else.

Funny thing about the HH/HT problem is that if throw a coin and stop when any of those sequences appear, the probability of being HH or HT is the same.

Why is there a transition from ABRA -> A; wouldn’t that be a fail condition?

I’m glad you asked as it helped me remember how the process worked. The transition matrix is a little hard to see, so I zoomed in on it a bit. There are 4 possible transitions from ABRA.

(from right to left in the transition matrix):

(1) ABRAC will happen 1/26 of the time when the 5th letter is a C.

(2) AB will happen 1/26 of the time when the 5th letter is a B

(3) A will happen 1/26 of the time when the 5th letter is an A.

(4) FAIL will happen 23/26 of the time when the 5 letter is anything else.

I get it now. thank you!