Here is a game you can play to figure out how you model people.
Play each agent for as many turns as you like.
On every turn the agent puts a reward in one of the boxes and a penalty in the other.
Your job is then to guess which box has the reward, and so collect as much reward as you can. Hint: for each agent, try to come up with a strategy that achieves a big average reward.
[Interactive demo: pick an agent (Agent 1, 2, or 3), select a box, and enter your predicted probs. I strongly recommend filling in the prediction box before clicking Reveal.]
Agent 1 is Random: it puts the reward in either box with equal probability.
Agent 2 is Evil or, as the cool kids say, Adversarial: it tries to guess which box you're going to click
and then puts the reward in the other box.
Agent 3 is Good: it tries to put the reward in the box it thinks you're going to click.
So how do Agents 2 and 3 predict which box you're going to click?
This implementation closely follows the paper
AI Safety Gridworlds, in which the authors use simple
exponential smoothing to assign a probability
to each box.
Try adjusting the learning rate here:
You'll see that the closer the learning rate is to 1, the more the prediction is affected by what happened in the recent past.
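To make that concrete, here is a minimal sketch of such an exponential-smoothing predictor in Python. The names (`probs`, `update`, `alpha`) are mine, not taken from the demo's source; the rule is the standard update p ← (1 - α) · p + α · onehot(click).

```python
# Illustrative exponential-smoothing predictor (hypothetical names;
# alpha plays the role of the learning rate from the slider above).
probs = [0.5, 0.5]  # initial belief: you're equally likely to click either box

def update(probs, clicked, alpha):
    """Blend the old belief with a one-hot vector of the box that was clicked."""
    return [
        (1 - alpha) * p + alpha * (1.0 if i == clicked else 0.0)
        for i, p in enumerate(probs)
    ]

probs = update(probs, clicked=0, alpha=0.3)  # you clicked the left box
print(probs)  # ~[0.65, 0.35]: belief shifts toward box 0
```

At a learning rate of 1 the predictor simply mirrors your last click; at 0 it never updates at all.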
After calculating the probabilities, the Good agent simply selects the box with the higher probability, and the Evil agent does the opposite.
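In code, that selection step could look something like the following sketch (again with made-up names, building on `update` above):

```python
def place_reward(probs, kind):
    """Return the index of the box the agent puts the reward in."""
    predicted_click = max(range(len(probs)), key=lambda i: probs[i])
    if kind == "good":
        return predicted_click   # reward where it thinks you'll click
    return 1 - predicted_click   # "evil": reward in the other box
```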
This simple environment is meant to test how different RL algorithms model different kinds of agents.
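As a worked example of the hint above: because the Evil agent's placement is a deterministic function of the predicted probabilities, which the demo displays, you can replicate its reasoning and win every turn. A sketch, reusing the hypothetical helpers from the snippets above:

```python
# Hypothetical 100-turn game against the Evil agent.
probs, alpha, total = [0.5, 0.5], 0.5, 0
for _ in range(100):
    reward_box = place_reward(probs, kind="evil")
    # Exploit: compute the same deterministic placement yourself and click it.
    your_click = place_reward(probs, kind="evil")
    total += 1 if your_click == reward_box else -1
    probs = update(probs, your_click, alpha)
print(total)  # 100 out of 100: a deterministic predictor is fully exploitable
```

The mirror-image trick (clicking the box it predicts) beats the Good agent just as reliably, while against the Random agent no strategy can do better than a zero average.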