Ride the bus, a scholar's game

Last week, my buddies and I had a few hours between the last football game of the day and that evening’s entertainment. So we decided to make some entertainment for ourselves in the form of a drinking game called ride the bus. A quick Google search suggests that there are maybe 500 different games called “ride the bus”, but I’ll explain the version we played. There are a few initial rounds that produce a loser, who then has to ride the bus. Four rows of four cards each are dealt face down, and there’s a task for each row: guess the color (red or black) for R1, whether the card is higher or lower than your R1 card for R2, whether it falls between or outside your first two cards for R3, and the suit for R4.

Here’s why it’s a game. If your guess is wrong on R1, the card you turned over is discarded, a new one is dealt in its place, you have to take 2 drinks, and then you start over. If you’re wrong on R2, you replace both cards you turned over, take 4 drinks, and start over. If you’re wrong on R3, the same thing happens but you take 6 drinks. If you’re wrong on R4, the same thing happens but you take 8 drinks. So if you get all four rows right on the first round, you don’t have to drink anything. In the worst case, you never get four in a row right, and you take 2 drinks per card for a nice total of 104 sips of awful beer. The mercy rule usually gets called before then.
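The 104 figure follows from the penalty structure: a miss on row r costs 2r drinks but also replaces the r face-up cards, so every replaced card costs exactly 2 drinks no matter which row you miss on. A quick sanity check (my own throwaway snippet, not anything from the game code):

```python
# A miss on row r costs 2*r drinks and forces the r face-up cards to be
# replaced, so the cost works out to 2 drinks per replaced card for every row.
for r in range(1, 5):
    assert (2 * r) / r == 2

# Worst case: all 52 cards end up replaced after misses.
worst_case = 52 * 2
print(worst_case)  # 104
```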

Statistics of beer drinking

So the first question that came to mind was, “how many drinks can I expect to take?” You’d start by computing the probability of succeeding in each row on a random guess: 0.5 for R1, a little better than 0.5 for R2 and R3, and 0.25 for R4. But then I stopped, because that’s a very homework-y way of doing this.

First off, I’m not randomly guessing. If I’ve seen a bunch of red cards, then I’ll start guessing black for R1. If I draw a Jack in R1 and a 9 in R2, then I’ll guess outside for R3. Second off, even if I wrote down my decision rules and could do this with pencil and paper, I’d get bored and stop because it’s a hard problem and just not that fun. But one of the nice things about doing statistics these days is that we don’t need to do stuff on paper. Trying to compute a p-value table for a sine-transformed sum of two non-central F-distributed random variables by hand? Stop, simulate it in R or Python, and use the three years you just saved to build that motorcycle from scratch you’ve always dreamed of (“of which you’ve always dreamed”?).

Coding up the simulator

We could do this more numerically, but the OO approach I coded up is more legible. First, we need classes for cards and a deck. For the game, it’ll be helpful to quickly get the color and suit of a card, and to have cards be comparable by face value. The deck needs to initialize to a full deck, be shufflable, be able to deal cards, and say whether it’s empty.
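Here’s roughly what those two classes could look like - a sketch of mine, not the post’s actual code, so the names (`Card.color`, `Deck.deal`, `Deck.is_empty`) are assumptions:

```python
import random
from functools import total_ordering

SUITS = ("clubs", "diamonds", "hearts", "spades")

@total_ordering
class Card:
    """A playing card, comparable by face value (2 low, ace high)."""

    def __init__(self, rank, suit):
        self.rank = rank  # 2..14, where 11-14 are J, Q, K, A
        self.suit = suit

    @property
    def color(self):
        return "red" if self.suit in ("diamonds", "hearts") else "black"

    def __eq__(self, other):
        return self.rank == other.rank

    def __lt__(self, other):
        return self.rank < other.rank

class Deck:
    """A full 52-card deck that can shuffle, deal, and report emptiness."""

    def __init__(self):
        self.cards = [Card(rank, suit) for suit in SUITS for rank in range(2, 15)]

    def shuffle(self):
        random.shuffle(self.cards)

    def deal(self):
        return self.cards.pop()

    def is_empty(self):
        return len(self.cards) == 0
```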

Modeling a player is a little trickier. I gave two examples above of how a human isn’t random. The second example - guessing inside when the range of the first two cards is small - is easy to handle. The first - trying to implement memory - isn’t straightforward. My implementation gives every player a skill level between 0 and 1. If they have skill level p, then for every card they see, a random number between 0 and 1 is drawn, and if the number is smaller than p, they record the suit and face value of the card. So if p = 0.5, they remember around half the cards they see; if it’s zero, they remember none; and if it’s one, they remember everything. The player always knows the cards they turned over in R1 and R2 when making guesses for R2 and R3 (they’re face up in the real version); the skill level determines whether they record them for use in later rounds. The player class has methods to see a card, forget everything it has remembered, and submit guesses for each of the four rows. The guessing methods count the number of cards remaining for each option (e.g. red vs. black, higher vs. lower) based on what the player remembers and the previous rows if applicable, and guess the option with the most remaining cards. For kicks, we can define a skill level p = -1 player (also known as an “idiot”, or “Paul”), who always guesses black, higher, a die roll for between or outside*, and spades.
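A minimal sketch of the memory mechanic and two of the guessing methods (again my own stand-in code, not the original - cards are `(rank, suit)` tuples here, and the method names are made up):

```python
import random
from collections import Counter

RED_SUITS = ("diamonds", "hearts")

def color(card):
    rank, suit = card
    return "red" if suit in RED_SUITS else "black"

class Player:
    """Remembers each card it sees with probability `skill`."""

    def __init__(self, skill):
        self.skill = skill
        self.seen = []  # cards this player managed to remember

    def see(self, card):
        # Record the card only if a uniform draw lands below the skill level.
        if random.random() < self.skill:
            self.seen.append(card)

    def forget(self):
        self.seen = []

    def guess_color(self):
        # Bet on whichever color has fewer remembered copies,
        # i.e. more cards presumably still in the deck (ties go to black).
        counts = Counter(color(c) for c in self.seen)
        return "black" if counts["black"] <= counts["red"] else "red"

    def guess_between(self, low, high):
        # With two face-up ranks (assumed distinct), count how many of the
        # 13 ranks fall strictly between them vs. outside them.
        lo, hi = sorted((low, high))
        inside = hi - lo - 1
        outside = 13 - inside - 2  # all other ranks, excluding the two shown
        return "between" if inside > outside else "outside"
```

With a Jack (rank 11) and a 9 face up, `guess_between(11, 9)` counts one rank inside and ten outside, reproducing the "guess outside" call from earlier.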

The final class is the game class. The game class takes a player and allows it to play however many games it wants. A call to play_game() wipes the player’s memory, shuffles a new deck, and sets up the board. At each row, it has the player guess and determines if they’re right or not. Based on this, it resets and assigns the player to drink, or lets the player continue. We can make a simplification to the human version without affecting any probabilities: use a one by four grid instead of a four by four grid.

*R3 is the only weird one, since always guessing outside gives you two thirds odds and always guessing between gives you one third odds.
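The core of play_game() can be sketched as a single loop, with the player and deck abstracted into stand-ins (the names and signatures here are mine, not the original code):

```python
PENALTY = {1: 2, 2: 4, 3: 6, 4: 8}  # drinks owed for a miss on each row

def play_game(guess, check, deck):
    """One full ride of the bus.

    `guess(row, face_up)` returns the player's call for that row,
    `check(row, call, card)` says whether the call was right, and
    `deck` is a list of cards to pop from. Returns total drinks taken.
    (All three are stand-ins for the Player/Deck objects described above.)
    """
    drinks = 0
    row, face_up = 1, []
    while row <= 4 and deck:
        card = deck.pop()
        if check(row, guess(row, face_up), card):
            face_up.append(card)
            row += 1               # advance to the next row
        else:
            drinks += PENALTY[row]
            row, face_up = 1, []   # discard the row's cards and start over
    return drinks
```

Plugging in a `check` that always fails drains a 52-card deck at 2 drinks per card, reproducing the 104-sip worst case; one that always succeeds wins with zero drinks.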

My computer is drunk: simulation results

Back to the first question: how many drinks can I expect to take? I created five players: one idiot and players with skills 0, 0.25, 0.5, and 1. I then had each of them play 2,500 games of ride the bus and monitored each one’s convergence towards an estimate of the expected number of drinks they have to take per game. See the plot below.
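The convergence estimate itself is just a running mean of drinks per game. A tiny sketch, independent of the simulator:

```python
def running_means(samples):
    """Running average after each game: the quantity the convergence plot tracks."""
    means, total = [], 0.0
    for n, drinks in enumerate(samples, start=1):
        total += drinks
        means.append(total / n)
    return means

# e.g. drinks taken over five hypothetical games:
print(running_means([10, 2, 0, 6, 2]))  # [10.0, 6.0, 4.0, 4.5, 4.0]
```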

Expected drinks for ride the bus.
Average number of drinks taken as the number of games played increases, for five differently skilled players. The legend gives the average number of drinks taken after 2,500 games.

So the big gain happens between random guessing and skill zero. The only real change here is that in R2 and R3, we use information from the previous rows we turned over; R1 and R4 are effectively random guesses since the player can’t remember how many of each suit they’ve seen. Between remembering no cards and remembering all cards, it’s only a 7.2 drink difference (less than one unlucky round). And between remembering no cards and remembering around 25% of the cards you see, it’s only 3.1 drinks. And honestly, most real humans probably have skill less than 0.25. So the dude thinking real hard about the cards he’s seen so that he can crush ride the bus like a boss probably hasn’t done enough statistical simulations. Just play the game and have some fun.

The one cool thing you can do is win on your first round. We’ll use a perfectly skilled player, since we only care about the first round and the previous cards are face up at this point. That said, a zero skill player would probably fare about the same. Bottom line: you have around an 8.25% chance of winning in the first round and not having to drink. Better odds than I thought, since random guessing would give around 3.125% from the reasoning a few paragraphs ago.

The worst thing that can happen is getting to R4, guessing the suit wrong, then taking 8 drinks. For a player who doesn’t remember any cards, this happens 2.46 times per game on average. For a player who remembers every card, this happens 2.17 times per game on average. For an idiot, this happens (gasp) 1.47 times per game on average. Of course, this is because they rarely ever reach R4. The last row is hard to reach for an idiot because R2 and R3 are really hard for them: that’s when having even a simple strategy (e.g. if the R1 card is low, guess higher) helps a lot. When they do reach R4, they have just as much success as a skill zero player.

There’s a lot more you could look at here. How many times can you expect to have your heart broken on R4 twice in a row? What do the histograms of number of drinks taken for each skill level look like (not tryna mess with matplotlib that much)? That’s the great thing about statistical simulation. You don’t have to get out a new piece of paper and start each calculation from scratch. You just add a counter to your existing code or write a simple plot function.

One last thought on probabilistic decision making. A player who can’t remember any cards and one who can remember every card are both doing the best they can, in a sense. In the first case, they have no information other than the composition of a full deck and the cards from previous rows. In the latter, they know the distribution of the remaining cards precisely (plus the cards from previous rows). But a player with a non-zero, non-perfect skill level isn’t doing the best they can: they aren’t quantifying their uncertainty. They have some idea of what the distribution of the remaining cards is, but they also know that they don’t know it entirely. This is the most interesting case too, since it’s how most real humans play. One approach might be to put a prior on the card distribution being the original, full-deck distribution and then use the cards they can remember to construct the posterior. I want to revisit this later, either to implement that strategy (probably to minimal effect) or to compare predefined decision rules against a learned model (either tree-based or softmax regression). Ride on.
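Postscript: here’s one crude way to cash out that prior-plus-memory idea for the R1 color guess. This is my own sketch of the concept, not an implementation from the post - the function name and the even-split treatment of forgotten cards are assumptions:

```python
def p_red(remembered_colors, cards_dealt):
    """Rough posterior estimate of P(next card is red).

    Prior: the deck started with 26 red and 26 black cards. The player
    remembers only some of the `cards_dealt` cards that have left the
    deck; under a symmetric prior, the forgotten ones are split evenly
    between the two colors.
    """
    red_seen = remembered_colors.count("red")
    black_seen = remembered_colors.count("black")
    forgotten = cards_dealt - red_seen - black_seen
    # Expected cards of each color still in the deck.
    red_left = 26 - red_seen - forgotten / 2
    black_left = 26 - black_seen - forgotten / 2
    return red_left / (red_left + black_left)
```

With nothing remembered the estimate stays at 0.5; after remembering two red cards out of two dealt, it drops to 24/50 = 0.48, nudging the player toward guessing black.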