As some of you may already know, the second Man vs. Machine Poker Challenge is set to begin at the Rio in a week. Online pros will be taking on Polaris 2 this time around. I’ve already covered this challenge in a couple of previous posts: one where I talked about the event and another where I responded to some comments by a Polaris programmer named Mike Johanson.
Well, I had some questions about the actual machine known as Polaris that Phil Laak and Ali Eslami played last year, and Mike was kind enough to answer all of them. He had some very interesting things to say about Polaris and how it is able to compete against human players such as the online pros it will be taking on July 3rd.
Below are my questions and his answers:
Jeremy: I know you’re pretty busy getting things together with the Polaris competition only about a week away, but I was wondering if you would have time to answer a few quick questions about it.
Mike: No problem! We’re in pretty good shape this year. Last year, we were still figuring out which bots to put in the seat up until the day of the competition. This time, we’ve had a pretty good idea for the last month.
Part of why we’re well-prepared right now is because this is our second poker competition this summer. There’s also the AAAI Computer Poker Competition, which is an open, research-oriented poker tournament for bots. The results are announced at the AAAI (Association for the Advancement of Artificial Intelligence) conference each year (mid-July this year), but the bots play for a solid month, starting June 15th. This year, the bots are playing two Heads-Up Limit events, one Heads-Up No-Limit event, and a 6-player Limit event. The nice thing about the computer matches is that the bots play millions of hands, so you know down to the millibet (0.001 small bets/hand) how well they do against each other.
The bots we submitted to AAAI are close cousins to the bots we’ll use in the Man-Machine competition, so we’ve been ready to play since the 15th.
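To make the millibet figure concrete, here is a minimal sketch of how a win rate in millibets per hand, and the uncertainty around it, could be computed. The function names and all of the numbers (total winnings, hand counts, a per-hand standard deviation of 6 small bets) are hypothetical and chosen only for illustration; they are not results from the competition.

```python
import math

def winrate_millibets(total_small_bets_won, hands):
    """Average winnings per hand, in millibets (0.001 small bets)."""
    return 1000.0 * total_small_bets_won / hands

def standard_error_millibets(per_hand_std_small_bets, hands):
    """Standard error of the average win rate, in millibets per hand."""
    return 1000.0 * per_hand_std_small_bets / math.sqrt(hands)

# Hypothetical bot-vs-bot match: 1,000,000 hands, 25,000 small bets won.
bot_rate = winrate_millibets(25_000.0, 1_000_000)      # 25.0 millibets/hand
bot_error = standard_error_millibets(6.0, 1_000_000)   # 6.0 millibets/hand

# The same assumed per-hand spread over a 500-hand session is far noisier.
short_error = standard_error_millibets(6.0, 500)

print(f"bots: {bot_rate:.1f} +/- {bot_error:.1f} millibets/hand")
print(f"500 hands: +/- {short_error:.1f} millibets/hand")
```

Under these assumed numbers, the uncertainty shrinks with the square root of the number of hands, which is why millions of bot-vs-bot hands can separate two programs far more precisely than a few hundred hands against a human.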
Jeremy: I’ve heard Polaris has been upgraded and improved upon from last year when it played Phil Laak and Ali Eslami. What were the upgrades that your team made to the machine?
Mike: We’ve made a lot of progress in two areas – board texture and adapting.
Last year’s Polaris bot didn’t have a good grasp of the board texture. It always had a good idea of how likely it was to win a hand, but it had a murky idea of how much potential its hand had to improve (like flush or straight draws) or how easy it was to bluff (or be bluffed) on certain boards. We’ve made some great progress on that this year.
In matches 1, 2 and 4 of last year’s competition, Polaris didn’t adapt to its opponent at all. It used exactly the same strategy on hand 1 as on hand 500. Our bot plays close to a Nash equilibrium strategy, which means it’s tough to beat, but it isn’t able to exploit opponent weaknesses. If you want to try to win instead of trying to not lose, you need to be able to change what you’re doing to exploit your opponent.
In match 3 last year, we tried an experimental learning bot, but it had some issues. This year, we have a great system for adapting safely to increase the amount we win from an opponent. This system is actually the subject of a paper we’re presenting at a conference on July 8th, right after the competition. A lot of the work we’ve done on poker has led to discoveries that apply outside of poker and outside of games. We’re doing a lot of good science through this work.
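The difference between "trying to not lose" and "trying to win" can be seen in a much smaller game than poker. The sketch below uses rock-paper-scissors as a stand-in; it is not Polaris code, and the opponent's rock-heavy tendencies are invented for the example. The equilibrium strategy breaks even against anyone, while a best response to a known weakness wins but only works if the weakness is really there.

```python
# Payoff to the player choosing the row move against the column move.
# Move order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def expected_value(mine, theirs):
    """Expected payoff per game for mixed strategy `mine` against `theirs`."""
    return sum(mine[i] * theirs[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

def best_response(theirs):
    """Pure strategy that maximizes expected payoff against a known opponent."""
    values = [sum(theirs[j] * PAYOFF[i][j] for j in range(3)) for i in range(3)]
    best = max(range(3), key=lambda i: values[i])
    return [1.0 if i == best else 0.0 for i in range(3)]

equilibrium = [1 / 3, 1 / 3, 1 / 3]   # Nash equilibrium of rock-paper-scissors
rock_heavy = [0.5, 0.25, 0.25]        # hypothetical opponent who overplays rock

print(expected_value(equilibrium, rock_heavy))                # break-even
print(expected_value(best_response(rock_heavy), rock_heavy))  # profitable
```

The catch with the exploiting strategy is that always playing paper is itself easy to beat, which is the kind of risk a system for adapting safely has to manage.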
Jeremy: What exactly is Polaris? Is it an actual machine or computer?
Mike: Polaris is the name we use for the collection of programs we have that play poker. For example, the bot that’ll play at the Man-Machine match just plays heads-up limit, but when we branch into No-Limit or work on Ring again, we’ll call those bots Polaris, too.
Most of the work in making Polaris is done well before the match on one of the University of Alberta’s clusters. The program teaches itself how to play poker by playing billions of hands against itself – there’s very little human knowledge that goes into designing its strategy. To do that, we use 8 CPUs with 8 gigs of RAM each, and we run it for two to three weeks. That winds up making a 30 gigabyte program that describes the strategy that bot will use to play poker.
At the match, since the bot has already taught itself how to play, it doesn’t need a very powerful computer to actually play. We can run it on a single off-the-shelf laptop.
So, if anything was the machine behind Polaris, it would be the cluster we use to train the bot, and not the Mac laptop on stage sitting across from the pro.
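For readers curious what "playing against itself" can look like in code, here is a toy sketch. Mike does not say which algorithm the team uses, so this swaps in regret matching, a standard self-play technique, on rock-paper-scissors rather than poker. Everything here (the function names, the starting bias, the iteration count) is an assumption made for illustration. The split it shows matches the one described above: the expensive loop runs ahead of time, and what comes out is just a table of probabilities that is cheap to look up at the table.

```python
# Payoff to the player choosing the row move against the column move.
# Move order: rock, paper, scissors. The game is symmetric, so both
# players can use the same table from their own point of view.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def strategy_from_regrets(regrets):
    """Play each move in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1 / 3, 1 / 3, 1 / 3]
    return [p / total for p in positive]

def self_play(iterations):
    """Train two copies against each other; return their average strategies."""
    # Player 0 starts with an arbitrary rock bias so both sides must adjust.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            opponent = strategies[1 - player]
            # Expected payoff of each pure move against the opponent's mix.
            action_values = [sum(opponent[j] * PAYOFF[i][j] for j in range(3))
                             for i in range(3)]
            current_value = sum(strategies[player][i] * action_values[i]
                                for i in range(3))
            for i in range(3):
                # Regret: how much better move i would have done.
                regrets[player][i] += action_values[i] - current_value
                strategy_sums[player][i] += strategies[player][i]
    return [[s / iterations for s in sums] for sums in strategy_sums]

average_strategies = self_play(20_000)
print(average_strategies[0])  # drifts toward one third for each move
```

No rock-paper-scissors knowledge is written into the loop beyond the payoff table; the near-uniform strategy emerges from the two copies correcting each other, which is the same idea at a vastly smaller scale than a multi-week cluster run.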
Jeremy: This is now the second year that Polaris has played professional poker players. Do you think that this will become a long-standing competition between pro poker players and Polaris?
Mike: We certainly hope so – we learn a lot from the matches, we think the pros do too, and it’s a lot of fun all around. Eventually, when we’ve gotten as far as we can with Heads-Up Limit, we’d like to branch out to other games. No-Limit and Ring present their own challenges, and we can learn a lot about the science of AI from asking questions like “What’s different about this game? What can we reuse, and what’s totally new?”
If we have a convincing win this year, we might try to play No-Limit next year and will probably lose badly. We can build on that and do better the following year, and so on.
Jeremy: Will you personally be making the trip to the Rio to watch the competition?
Mike: Yep, we’re taking the whole research group this year. There’s 8 of us graduate students that do the programming, four professors, and hopefully a couple of our past members that still contribute will join us, too. It’s a big team effort.
Jeremy: Do you have any predictions on the outcome of the competition?
Mike: I think our chances of winning are pretty good. It was a tight match last year, and while our bot is a lot stronger, we’re also playing against Heads-Up Limit specialists this time. We’ve been doing a lot of pre-match testing, though, and the experts we’ve been playing against seem really impressed. When poker pros say after a match that “I’d let it play my chips any time”, we know we’re on the right track.