Slide1
Learning How to Play Black Jack Through Reinforcement Learning
By: Jonathan Quenzer
Slide2
Objectives
To have a computer learn how to play Blackjack through reinforcement learning
Computer starts off with no memory. After each hand is played, the computer learns more.
Goal is to have the computer make the best possible decision of how much to bet and when to hit/stay
Splitting hands and doubling down will not be included. Leaving out these options decreases the player's odds of winning.
Slide3
The Odds of Winning
Without card counting, the dealer has a 5-8% advantage, depending on the specific rules
Through correct strategy and card counting, the player can obtain at most a 2% advantage over the dealer
Slide4
Experimental Setup
I wrote a Matlab program to simulate Black Jack. Feature vectors were generated by running the program and analyzing each hand played.
All of the features were scaled to have a mean of ½, minimum of 0, and maximum of 1.
Slide5
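The feature scaling mentioned in the experimental setup can be illustrated with a short sketch. This is not the author's Matlab code. It is a minimal Python example of plain min-max scaling, which only reproduces the stated minimum of 0 and maximum of 1. The slides do not say how the mean of ½ was obtained, so that part is not modeled here. The function name `min_max_scale` and the handling of a constant feature are assumptions made for illustration.

```python
def min_max_scale(column):
    """Rescale one feature column linearly so its minimum is 0 and its maximum is 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a feature that never varies is mapped to the midpoint.
        return [0.5 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Example: three player totals (PT) rescaled to the range [0, 1].
print(min_max_scale([4, 12, 20]))  # [0.0, 0.5, 1.0]
```

Scaling every feature to the same range keeps a feature with large raw values, such as the player total, from dominating a distance-based comparison over features with small raw values, such as the number of aces.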
Feature Set Generation

Hand History
Ace, 2, …, 10: The number of cards remaining in the deck for each value of card
PT: Player Total Without Aces
PA: Number of Aces in Player's Hand
DSC: The Dealer's Showing Card
Hit/Stay: +1 = Hit, -1 = Stay
Win: 1.5 = Black Jack, 1 = Win, 0 = Lose
Lose: 1 = Lose, 0 = Win
Push: 1 = Push, 0 = Win

Feature Set Hit/Stay
Ace, 2, …, 10: The number of cards remaining in the deck for each value of card
CC: Calculated Card Count (Hi-Opt II used)
PT: Player Total Without Aces
PA: Number of Aces in Player's Hand
DSC: The Dealer's Showing Card
Hit/Stay: +1 = Hit, -1 = Stay. Opposite of Hit/Stay from Hand History if the player lost

Feature Set Bet Amount
Ace, 2, …, 10: The number of cards remaining in the deck for each value of card
CC: Calculated Card Count (Hi-Opt II used)
Bet Min/Max: +1.5 if hand was Black Jack, +1 if won, -1 if lost

Legend: highlighted feature = Classification
Slide6
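The CC feature in the feature sets is described only as a Hi-Opt II card count. As a hedged sketch of how such a count could be derived from the per-value "cards remaining" features, the Python below uses the standard Hi-Opt II tag values (2, 3, 6, 7 = +1; 4, 5 = +2; 8, 9 = 0; ten-valued cards = -2; Ace = 0). The single-deck default, the choice to compute a running count from the cards already dealt, and the names `full_deck_counts` and `running_count` are assumptions, not details taken from the slides.

```python
# Hi-Opt II tag per card value. Key 1 is the Ace; key 10 covers 10, J, Q, K.
HI_OPT_II_TAGS = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 1, 7: 1, 8: 0, 9: 0, 10: -2}


def full_deck_counts(num_decks=1):
    """Cards per value in a fresh shoe (assumption: standard 52-card decks)."""
    return {value: (16 if value == 10 else 4) * num_decks for value in range(1, 11)}


def running_count(remaining, num_decks=1):
    """Sum the Hi-Opt II tags of every card dealt so far.

    `remaining` maps each card value (1..10) to the number of cards of that
    value still in the deck, matching the Ace, 2, ..., 10 features.
    """
    full = full_deck_counts(num_decks)
    return sum(HI_OPT_II_TAGS[v] * (full[v] - remaining[v]) for v in full)


# Example: two 4s and one King have been dealt from a single deck.
remaining = full_deck_counts()
remaining[4] -= 2
remaining[10] -= 1
print(running_count(remaining))  # 2
```

A positive count means more low cards than high cards have left the deck, which is the situation in which a counting player would normally raise the bet.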
KNN classifier
Example of 5 nearest neighbors
Neighbors sum to +3, so decide to Hit
Slide7
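The nearest-neighbor vote from the KNN classifier slide can be sketched as follows. This is an illustrative Python version, not the author's Matlab program. It assumes Euclidean distance on the scaled feature vectors and treats a tied vote (a sum of 0) as Stay. Neither choice is stated in the slides, and the function name `knn_decide` and the two-feature example data are hypothetical.

```python
import math


def knn_decide(history, query, k=5):
    """Decide Hit or Stay by summing the labels of the k nearest past hands.

    `history` is a list of (feature_vector, label) pairs, where the label is
    +1 for Hit and -1 for Stay. Returns the decision and the vote total.
    """
    # Assumption: closeness is measured with Euclidean distance.
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    vote = sum(label for _, label in nearest)
    # Assumption: a tie (vote == 0) falls back to Stay.
    return ("Hit" if vote > 0 else "Stay"), vote


# Example: four of the five nearest neighbors say Hit and one says Stay.
history = [
    ((0.50, 0.50), +1),
    ((0.52, 0.48), +1),
    ((0.47, 0.51), +1),
    ((0.55, 0.55), +1),
    ((0.45, 0.45), -1),
    ((0.90, 0.90), -1),  # far from the query, not among the 5 nearest
    ((0.10, 0.95), -1),  # far from the query, not among the 5 nearest
]
print(knn_decide(history, (0.5, 0.5), k=5))  # ('Hit', 3)
```

With labels of +1 and -1, summing the neighbors' labels is the same as a majority vote, which is why a total of +3 among 5 neighbors leads to Hit.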
Results
Computer started with no knowledge
The player gained an advantage over the dealer using 10 nearest neighbors
Slide8
Results
Computer simulated three players playing 1000 hands
Computer started with a large feature set from 5000 hands