SIMULATION SETTINGS
Tip: Try running with default parameters first to see how each algorithm performs, then experiment with different settings.

Environment

Number of arms: 2-10 slots
Number of pulls

Algorithms

UCB Parameters

Exploration constant c: 1.0
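
As a rough illustration of what the UCB parameter controls, the sketch below scores each arm by its average reward plus an exploration bonus scaled by a constant. The function name `ucb_select`, the argument names, and the exact bonus form `c * sqrt(ln t / n)` are assumptions made for this example. The simulator may use a different variant.

```python
import math

def ucb_select(counts, values, c=1.0):
    """Pick an arm index with a UCB-style rule (illustrative sketch).

    counts[i] is how many times arm i has been pulled,
    values[i] is its average observed reward,
    c scales the exploration bonus (assumed meaning of the 1.0 setting).
    """
    # Pull every arm once before applying the formula.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    # Highest upper confidence bound wins; ties go to the lowest index.
    return max(range(len(scores)), key=lambda i: scores[i])
```

A larger `c` favors rarely pulled arms, while `c = 0` reduces the rule to picking the arm with the best average so far.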

ε-Greedy Parameters

Exploration probability ε: 0.10
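
A minimal sketch of ε-greedy selection, assuming the 0.10 setting is the probability of picking a random arm on each pull. The function name `epsilon_greedy_select` and its arguments are hypothetical and chosen only for illustration.

```python
import random

def epsilon_greedy_select(values, epsilon=0.10, rng=random):
    """Pick an arm index with the epsilon-greedy rule (illustrative sketch).

    values[i] is the average observed reward of arm i.
    With probability epsilon a uniformly random arm is chosen,
    otherwise the arm with the highest average is chosen.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(values))  # explore
    return max(range(len(values)), key=lambda i: values[i])  # exploit
```

With the default of 0.10, roughly one pull in ten is exploratory under this reading.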

Thompson Parameters

Prior α: 1.0
Prior β: 1.0
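
One common reading of two Thompson parameters that both default to 1.0 is a Beta(α, β) prior on each arm's reward probability. The sketch below assumes that reading together with 0/1 rewards. The function name `thompson_select` and its arguments are hypothetical.

```python
import random

def thompson_select(successes, failures, alpha=1.0, beta=1.0, rng=random):
    """Pick an arm index by Thompson sampling (illustrative sketch).

    successes[i] and failures[i] count the 1 and 0 rewards seen on arm i.
    alpha and beta are the assumed Beta prior parameters.
    """
    # Draw one plausible reward probability per arm from its posterior.
    samples = [
        rng.betavariate(alpha + s, beta + f)
        for s, f in zip(successes, failures)
    ]
    # Play the arm whose sampled probability is highest.
    return max(range(len(samples)), key=lambda i: samples[i])
```

With α = β = 1.0 the prior is uniform, so every arm starts out equally plausible.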

Slot Machines (Arms)

Each arm has a fixed but initially unknown reward probability
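
To make the arm description concrete, here is a small sketch that models each arm as paying 1 with a fixed probability and 0 otherwise. The Bernoulli reward model, the class name `BernoulliArm`, and the helper `make_arms` are assumptions for illustration. The 2-10 limit mirrors the Environment setting.

```python
import random

class BernoulliArm:
    """A slot machine that pays 1 with fixed probability p, else 0."""

    def __init__(self, p, rng=None):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.p = p  # hidden from the algorithms
        self.rng = rng or random.Random()

    def pull(self):
        return 1 if self.rng.random() < self.p else 0

def make_arms(n_arms, seed=0):
    """Create n_arms arms with randomly drawn reward probabilities."""
    if not 2 <= n_arms <= 10:
        raise ValueError("n_arms must be between 2 and 10")
    rng = random.Random(seed)
    return [BernoulliArm(rng.random(), rng) for _ in range(n_arms)]
```

An algorithm only ever sees the results of `pull()`, never `p` itself, which is what makes exploration necessary.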


Cumulative Reward

Total rewards accumulated over time
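
The curve described here is a running sum of per-pull rewards. A minimal sketch, with the hypothetical helper name `cumulative_reward`:

```python
from itertools import accumulate

def cumulative_reward(rewards):
    """Return the running total of rewards after each pull."""
    return list(accumulate(rewards))
```

For example, the rewards `[1, 0, 1, 1]` give the running totals `[1, 1, 2, 3]`.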

Cumulative Regret

Gap between the reward an optimal strategy (always pulling the best arm) would have earned and the reward actually collected
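
One way to compute this curve is expected regret: at every pull, add the gap between the best arm's reward probability and the probability of the arm that was actually chosen. The sketch below assumes that definition. The simulator might instead compare realized rewards. The name `cumulative_regret` is hypothetical.

```python
from itertools import accumulate

def cumulative_regret(arm_probs, chosen_arms):
    """Running expected regret for a sequence of chosen arm indices.

    arm_probs[i] is the true reward probability of arm i.
    """
    best = max(arm_probs)
    # Each pull costs the gap between the best arm and the chosen arm.
    return list(accumulate(best - arm_probs[a] for a in chosen_arms))
```

A flat stretch in the curve means the algorithm was pulling the best arm during those steps.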

Exploration Rate

Fraction of non-greedy actions taken over time
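
A sketch of how such a fraction could be tracked, assuming a running average from the first pull (a sliding window would be another reasonable choice). Here a pull counts as greedy when the chosen arm had the highest estimated reward at that moment. The name `exploration_rate` is hypothetical.

```python
def exploration_rate(was_greedy):
    """Running fraction of non-greedy pulls.

    was_greedy[t] is True when pull t chose the arm with the best
    current estimate, False when it explored.
    """
    rates = []
    non_greedy = 0
    for t, greedy in enumerate(was_greedy, start=1):
        if not greedy:
            non_greedy += 1
        rates.append(non_greedy / t)
    return rates
```

Under this definition the rate starts high while little is known and falls as the algorithm settles on an arm.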

Arm Selection Distribution

Frequency of pulls per arm for each algorithm
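
The per-arm frequencies can be derived from the list of chosen arm indices. A minimal sketch that returns each arm's share of all pulls, with the hypothetical helper name `arm_frequencies`:

```python
from collections import Counter

def arm_frequencies(chosen_arms, n_arms):
    """Return the share of pulls that went to each arm index."""
    counts = Counter(chosen_arms)
    total = len(chosen_arms)
    # Arms that were never pulled get a frequency of 0.0.
    return [counts[a] / total if total else 0.0 for a in range(n_arms)]
```

Running this once per algorithm on the same set of arms gives the values to compare side by side.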