Honest

Why does the return rate of a game appear to vary when calculated by t

9 posts in this topic

I have written two C# programs that use two different approaches to evaluate the outcome of a certain casino-style game (casino-style in the sense that the user pays points to take a turn, and sometimes receives points as a reward depending on that turn's result). Program 1 calculates the average profitability of the best game play decision for each possible game state, starting at a round's end and working back to the beginning. The average profitability of the starting game state is the average profitability of a round, so it can be used to infer the profitability of the game as a whole. Program 1 also outputs the best strategy it finds for the game.

 

Program 2 accepts a strategy as input (I use the same strategy generated by Program 1), and simulates actual beginning-to-end game play using that strategy, cycling through many iterations of the game to gather statistics. This program outputs the return rate of the game based on the simulated trials (100% being breakeven).

 

Desired behavior: To produce correct and non-contradictory results in Program 1’s gamePlay.value variable for the starting game state (representing the game’s profitability in points), and Program 2’s returnRate variable (representing the game’s return rate).

 

Specific problem: Program 1’s gamePlay.value variable for the starting game state (Colorless, Colorless, Colorless) outputs 51.025 when the user inputs the same starting parameters as those which are hard-coded into Program 2 (namely, cost = 51 and baseBet = 50). A secondary task of Program 1 is to calculate the average number of turns remaining in the round, for each possible game state. Again, by noting this value for the starting state, the average number of turns in the round as a whole is known. There are, on average, 4.246 turns per round. By multiplying this number by the cost per turn, 51, we see that the average cost per round is 216.546. Adding the 51.025 profit yields 267.571, and dividing this number by the cost per round reveals a 123.563% return rate for the game.
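The arithmetic above can be reproduced in a few lines. This is a minimal sketch in Python (not the original C#), and it assumes, as the post states, that the 51.025 figure is net profit per round after turn costs. The variable names are illustrative and do not come from either program.

```python
# Figures quoted in the post.
cost_per_turn = 51
turns_per_round = 4.246      # average turns per round, from Program 1
profit_per_round = 51.025    # gamePlay.value for the starting state

# Assumption: profit_per_round is NET of turn costs, so the gross
# payout per round is cost + profit.
cost_per_round = turns_per_round * cost_per_turn
return_rate = (cost_per_round + profit_per_round) / cost_per_round

print(f"cost per round: {cost_per_round:.3f}")
print(f"return rate:    {return_rate:.5f}")
```

If the value were instead a gross payout, or a per-turn rather than per-round figure, this formula would give a different return rate, so that assumption is worth checking first.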

 

This is not what Program 2 calculates, even with an extremely large number of game play samples. Some sample outputs, each of which is the result of one million game play turns, include:

 

1.00978242220553 1.00976014965254 1.00977590536083 1.0098289475708 1.00979315220468

 

123.563% and 100.98% are very far from each other.
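To put a number on "very far", the five sample outputs can be compared with the predicted rate. A small Python sketch (not the original C#), using only the figures quoted in this post:

```python
from statistics import mean, stdev

# The five Program 2 outputs quoted above.
samples = [
    1.00978242220553,
    1.00976014965254,
    1.00977590536083,
    1.0098289475708,
    1.00979315220468,
]
predicted = 1.23563  # return rate derived from Program 1's figures

avg = mean(samples)
spread = stdev(samples)   # sample standard deviation across the runs
gap = predicted - avg

print(f"mean of runs: {avg:.6f}")
print(f"spread:       {spread:.2e}")
print(f"gap:          {gap:.4f} ({gap / spread:.0f} standard deviations)")
```

The runs agree with each other to about four decimal places, so the gap is not sampling noise from the simulation.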

 

Code to reproduce problem:

 

PROGRAM 1

 

http://pastebin.com/Ar3AptcK

 

PROGRAM 2

 

http://pastebin.com/GrH17kEZ

 

What I Have Considered: The possible types of problems I have tried to categorize this as are programming logic errors, snowballing rounding errors, and casting errors. It is difficult to isolate the error(s) because Program 1 solves the game using backward induction, while Program 2 gathers information by playing the game start to finish many times, two fundamentally different approaches. I have spent a fairly substantial amount of time with Program 1 in the debugger, including working out the results for the last few game states on paper, and it seems to be functioning properly and to a very respectable precision, as far as I can tell. Program 2 is more difficult to do this with due to the inclusion of randomness in the game, but I have stepped through a small number of iterations and the calculations seem to me to be on point. Can anyone clarify the reason that these two approaches to return rate calculation produce conflicting results?

 

The game (this is detailed info about the game being studied, feel free to skip this section): This game consists of three objects called panels, which the user paints. Each panel starts out Colorless. At first, the user must select 1, which paints one random panel. A Colorless panel will turn Purple when painted. A Purple panel will turn Blue when painted again. A Blue panel, when painted again, will immediately cause all panels to return to Colorless, and a new round begins. Once the user has painted a panel, he/she can, on the next turn, select 1 or 2, which will paint one random panel, or two random panels, respectively. Once two units of paint have been applied, the user can select 1, 2, or 3, and the appropriate result is applied. When a new round begins, only option 1 is available until 2 and 3 are unlocked again.

 

Each turn costs 51 points, regardless of whether 1, 2, or 3 is selected. There is a payout table, as laid out in the code, which may award points back to the user based on the resulting state of the panels. The user can also decide to start a new round at any time, and revert all panels to Colorless, which does not have a cost. The goal of the game is to earn rather than lose points, on average.
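The rules above are enough to sketch both approaches side by side. The Python below (not the original C#) is only an illustration under stated assumptions: the payout function is made up, since the real table exists only in the pastebin code; selecting k is assumed to paint k distinct panels chosen uniformly at random; and painting a Blue panel is assumed to end the round with no payout for that turn. The names `payout`, `options`, `outcomes`, `solve` and `simulate` are hypothetical and do not come from either program.

```python
import random
from functools import lru_cache
from itertools import combinations

COST = 51            # points per turn (from the post)
START = (0, 0, 0)    # 0 = Colorless, 1 = Purple, 2 = Blue

def payout(state):
    """HYPOTHETICAL payout: 20 points per unit of paint showing."""
    return 20 * sum(state)

def options(state):
    """1 is always allowed; 2 and 3 unlock after 1 and 2 units of paint."""
    return range(1, min(3, sum(state) + 1) + 1)

def outcomes(state, k):
    """Equally likely results of painting k distinct panels (assumption).
    None means a Blue panel was painted and the round ended."""
    results = []
    for chosen in combinations(range(3), k):
        if any(state[i] == 2 for i in chosen):
            results.append(None)
        else:
            results.append(tuple(c + (i in chosen) for i, c in enumerate(state)))
    return results

@lru_cache(maxsize=None)
def solve(state):
    """Backward induction: (net value, expected turns, best option) from here.
    Option 0 means stop and start a new round, which is free."""
    best = None if state == START else (0.0, 0.0, 0)
    for k in options(state):
        results = outcomes(state, k)
        value, turns = -COST, 1.0          # this turn's cost and count
        for nxt in results:
            if nxt is not None:            # a bust contributes nothing further
                v, t, _ = solve(nxt)
                value += (payout(nxt) + v) / len(results)
                turns += t / len(results)
        if best is None or value > best[0]:
            best = (value, turns, k)
    return best

def simulate(rounds, rng):
    """Play whole rounds with solve()'s strategy; total payout / total cost."""
    paid = won = 0
    for _ in range(rounds):
        state = START
        while state is not None:
            k = solve(state)[2]
            if k == 0:
                break
            paid += COST
            state = rng.choice(outcomes(state, k))
            if state is not None:
                won += payout(state)
    return won / paid

value, turns, _ = solve(START)
analytic = (turns * COST + value) / (turns * COST)
simulated = simulate(50_000, random.Random(1))
print(f"analytic {analytic:.4f}  simulated {simulated:.4f}")
```

Because the simulated figure is a ratio of totals, it converges to expected payout per round divided by expected cost per round, which is the same quantity the analytic line computes from the net value and the expected number of turns.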


If it's okay with the admins I'd like to offer a $30 reward (PayPal) for the single person who provides a satisfactory answer to this question.

 

A satisfactory answer is one that:

 

- Explains why the two approaches produce conflicting results, in a way clear enough for me to understand how to harmonize them.  If fixing one error uncovers another, leaving the results still conflicting, the answer is satisfactory if and only if I can easily debug the new error and achieve correct results.

 

- Is not a contrived solution that simply produces two identical percentages, but actually fixes the problem, so that if the game is changed (different number of colors/panels/payouts/etc.), both programs could be written to correctly evaluate the resulting game, as well.


Can you fix the text in the op? It hurts my eyes


It looks like a mod has changed the font back to default.  Anything else troubling about it?


It's fine. It'd just be much nicer to read if you leave the font at the default in future. The second button from the top left, the pink and white eraser, resets it.


Before I read through the code and look at the logic, can you please identify where all the calculations take place? I'd rather not read the entire thing unless it's necessary.


I mean, most of the code is calculation, so I'm not exactly sure what you mean.  The "core" calculation for Program 1 takes place in checkProfitability() starting on line 110, while Program 2 is outputting the result of line 76.


After many weeks of pain, I finally figured out the problem. The reward offer will be cancelled in 24 hours (in case someone was preparing to post the answer). The answer has two parts: one eliminates the major disparity, and the other explains a minor leftover difference.

 

Thanks to everyone who thought about helping with this.
