Earlier this week, I attended the DotNetNotts user group meeting. The second talk of the night was by Simon Painter and was about some of the algorithms used in Machine Learning.
As part of his explanation of decision trees, he used an example based on the game Guess Who?
|
Here's a screenshot, from YouTube, where the hybrid part of the meetup was relayed. |
If you're not familiar with the game, you have to ask yes or no questions to identify one of the characters:
Alex, Alfred, Anita, Anne, Bernard, Bill, Charles, Claire
David, Eric, Frans, George, Herman, Joe, Maria, Max
Paul, Peter, Philip, Richard, Robert, Sam, Susan, Tom
As part of his talk, Simon stated that the best strategy and ideal scenario for any decision tree is to divide all the available options in half (a binary split). However, for this game, there are no characteristics of the characters that make this possible (and hence the challenge of the game).
Simon did however point out that there is the possibility of using compound questions to have a greater chance of success by more evenly dividing the groups in half each time.
So, instead of limiting questions to the form of "is the person wearing a hat?" you use questions like "does the person present as female OR have facial hair?" or "does the person have blue eyes, a hat, OR red hair?"
Such questions were relevant for the rest of the talk, but it got me wondering.
I looked at all those people and their names and thought I saw something...
About half the people seem to have names containing two vowels...
It turns out that 15 people have names containing two vowels. This is better than any visual differentiator but is still far from perfect.
Ideally, you want to divide the group in half each time.
So, we'd go from 24 > 12 > 6 > 3
When you get to only having three options left, there are myriad ways (questions) to differentiate any of the options in this version of the game, but (without having done the math), it's just as quick to guess each person/option in turn.
What we need to maximize our chance of winning, and probably remove all fun from the game, is a set of questions that will divide the group in half until it gets to a group of 3.
It was only on my way home that I realized that if I'm going to look at the letters in the names of the people there are probably more and better questions that can be used to play a perfect game of Guess Who without using compound questions?
And so, (because I'm "fun") I tried.
It actually turned out to be really easy to find such questions. And by "really easy", I mean it took me less time than it's so far taken me to type this.
Here are the questions:
Question 1: Does the person's name start with a letter that comes before 'H' in the alphabet?
This is a simple split.
If they answer yes, you get the people Alex - George.
If they answer no, you are left with Herman - Tom.
If the first answer was Yes, question 2 is: Does the person's name start with a letter that comes before 'C' in the alphabet?
Another simple split.
If they answer yes, you get the people Alex - Bill.
If they answer no, you are left with Charles - George.
If the answers so far are Yes & Yes, the 3rd question is: Does the person have facial hair?
If the answer to this question is Yes, you're left with Alex, Alfred & Bernard
If the answer to this question is No, you're left with Anita, Anne & Bill.
If the answers so far are Yes & No, the 3rd question is: Does the person's name start with a letter that comes before 'E' in the alphabet?
If the answer to this question is Yes, you're left with Charles, Claire & David
If the answer to this question is No, you're left with Eric, Frans & George.
If the first answer was No, the next question to ask was the hardest to identify. Question 2 is: Does the person's name contain fewer than 3 consonants?
Another simple split.
If they answer yes, you get the people Joe, Maria, Max, Paul, Sam & Tom.
If they answer no, you are left with Herman, Peter, Philip, Richard, Robert & Susan.
If the answers so far are No & Yes, the 3rd question is: Does the person's name start with a letter that comes before 'P' in the alphabet?
If the answer to this question is Yes, you're left with Joe, Maria & Max
If the answer to this question is No, you're left with Paul, Sam & Tom.
If the answers so far are No & No, the 3rd question is: Does the person's name start with a letter that comes before 'R' in the alphabet?
If the answer to this question is Yes, you're left with Herman, Peter & Philip
If the answer to this question is No, you're left with Richard, Robert & Susan.
Yes, this was a questionable use of my time ;)
If you don't think questions about the letters in the person's name are appropriate or allowed, please keep that opinion to yourself.
Anyway, if you want to know more about machine learning the whole event is available online.
Further silly posts about over-analyzing games are unlikely to follow. "Normal" service will resume shortly.