# OpenSpiel Environment (`openspiel_env`)
Integration of OpenSpiel games with the OpenEnv framework. OpenSpiel (https://github.com/google-deepmind/open_spiel) is DeepMind's collection of 70+ game environments for RL research.

## Supported Games

This environment supports six games across two categories:

### Single-Player Games (No Opponent)
1. Catch - Move horizontally to catch a falling ball
2. Cliff Walking - Navigate grid without falling off cliff (Sutton & Barto benchmark)
3. 2048 - Classic tile-merging puzzle game
4. Blackjack - Simplified blackjack (HIT/STAND only)

### Multi-Player Games (with Bot Opponent)
5. Tic-Tac-Toe - Classic 3x3 game
6. Kuhn Poker - 2-player simplified poker (game theory benchmark)

## Architecture

```
┌────────────────────────────────────┐
│  RL Training Code (Client)         │
│  OpenSpielEnv.step(action)         │
└──────────────┬─────────────────────┘
               │ HTTP
┌──────────────▼─────────────────────┐
│  FastAPI Server (Docker)           │
│  OpenSpielEnvironment              │
│   ├─ Wraps rl_environment.Env      │
│   ├─ Agent controls player 0       │
│   └─ Opponent: Random/Fixed        │
└────────────────────────────────────┘
```
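The client hides the HTTP hop shown in the diagram, but it can be sketched roughly as follows. Note that the `/step` path and the payload shape here are illustrative assumptions, not the server's documented API; in practice `OpenSpielEnv` handles this round trip for you.

```python
import json
import urllib.request

# Illustrative sketch of the client -> server HTTP hop in the diagram above.
# The /step path and payload shape are assumptions for illustration only.
def build_step_request(base_url: str, action_id: int) -> urllib.request.Request:
    body = json.dumps({"action": {"action_id": action_id}}).encode()
    return urllib.request.Request(
        f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_step_request("http://localhost:8000", 2)
print(req.full_url)  # → http://localhost:8000/step
```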


## Installation & Usage

### Option 1: Local Development (without Docker)
Requirements:

- OpenSpiel must be installed (see https://github.com/google-deepmind/open_spiel)
- Python 3.11+

```python
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Start the local server manually (in a separate terminal):
#   python -m envs.openspiel_env.server.app

# Connect to the local server
env = OpenSpielEnv(base_url="http://localhost:8000")

# Reset the environment
result = env.reset()
print(f"Initial state: {result.observation.info_state}")
print(f"Legal actions: {result.observation.legal_actions}")

# Take actions
for _ in range(10):
    action_id = result.observation.legal_actions[0]  # Choose the first legal action
    result = env.step(OpenSpielAction(action_id=action_id))
    print(f"Reward: {result.reward}, Done: {result.done}")
    if result.done:
        break

# Cleanup
env.close()
```


### Option 2: Docker (Recommended)

Build the Docker image:

```bash
cd OpenEnv
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
```

Run specific games:

```bash
# Catch (default)
docker run -p 8000:8000 openspiel-env:latest

# Tic-Tac-Toe with random opponent
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest

# Kuhn Poker
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest

# 2048
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
```

Use with `from_docker_image()`:

```python
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Automatically starts the container
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")

result = env.reset()
result = env.step(OpenSpielAction(action_id=0))

env.close()  # Stops the container
```


## Game-Specific Information

### 1. Catch

- Type: Single-player
- Action Space: 3 actions (left, stay, right)
- Observation: 5x5 grid, flattened (25 dimensions)
- Reward: +1 for catching the ball, 0 otherwise
- Episode Length: ~10 steps

```python
# Catch is the default game
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
# Or set OPENSPIEL_GAME=catch explicitly
```
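To make the reward structure concrete, here is a toy re-implementation of Catch's dynamics for intuition only, assuming a 5x5 grid where the ball falls one row per step; it is not the OpenSpiel implementation.

```python
import random

# Toy sketch of Catch: the ball starts in a random column of the top row and
# falls one row per step; the paddle starts centered on the bottom row.
# Reward is +1 only if the paddle ends up under the ball (assumed dynamics).
def play_catch(policy, width=5, height=5, seed=0):
    rng = random.Random(seed)
    ball_col = rng.randrange(width)
    paddle = width // 2
    for _ in range(height - 1):          # ball falls until it reaches the bottom row
        move = policy(paddle, ball_col)  # -1 = left, 0 = stay, +1 = right
        paddle = max(0, min(width - 1, paddle + move))
    return 1 if paddle == ball_col else 0

# Greedy policy: always step toward the ball's column
greedy = lambda paddle, ball: (ball > paddle) - (ball < paddle)
print(play_catch(greedy))  # → 1 (greedy always catches on a 5x5 grid)
```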



### 2. Tic-Tac-Toe

- Type: 2-player turn-based, perfect information
- Players: Agent (X) vs Random Bot (O)
- Action Space: 9 positions
- Observation: 27 dimensions (3x3 board + game state)
- Reward: +1 win, -1 loss, 0 draw/mid-game

```bash
# Set the environment variable and run:
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
```


### 3. Kuhn Poker

- Type: 2-player turn-based, imperfect information
- Players: Agent vs Random Bot
- Action Space: 2 actions (pass/fold, bet/call)
- Observation: 6 dimensions (card + betting history)
- Reward: Pot winnings (-2, -1, +1, or +2)
- Notes: A standard benchmark for imperfect-information RL

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
```


### 4. Cliff Walking

- Type: Single-player grid world
- Action Space: 4 actions (up, down, left, right)
- Observation: Position encoding
- Reward: -1 per step, -100 for falling off the cliff
- Notes: Classic RL benchmark from Sutton & Barto

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking openspiel-env:latest
```
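The reward arithmetic is easy to sanity-check. Assuming the classic 4x12 grid from Sutton & Barto, the shortest safe path from start to goal takes 13 steps:

```python
# Return under the rewards listed above: -1 per step, an extra -100 if the
# agent steps off the cliff. The 4x12 grid size is the classic benchmark layout.
def episode_return(steps_taken: int, fell_off_cliff: bool) -> int:
    return -steps_taken - (100 if fell_off_cliff else 0)

print(episode_return(13, False))  # → -13 (optimal safe path on a 4x12 grid)
print(episode_return(5, True))    # → -105 (fell off the cliff after 5 steps)
```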


### 5. 2048

- Type: Single-player puzzle
- Action Space: 4 actions (up, down, left, right)
- Observation: 4x4 grid with tile values
- Reward: Points from merging tiles
- Notes: Stochastic tile spawning

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
```


### 6. Blackjack

- Type: Single-player vs dealer
- Action Space: 2 actions (HIT, STAND)
- Observation: Player hand + dealer's visible card
- Reward: +1 win, -1 loss, 0 draw
- Notes: Simplified version, no double/split

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack openspiel-env:latest
```


## Configuration

### Environment Variables

- `OPENSPIEL_GAME`: Game name (default: `catch`)
- `OPENSPIEL_AGENT_PLAYER`: Player ID for the agent (default: `0`)
- `OPENSPIEL_OPPONENT_POLICY`: Opponent policy for multi-player games
  - `random`: Uniform random (default)
  - `first`: Always picks the first legal action
  - `last`: Always picks the last legal action
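Going by the policy names, the opponent's choice presumably amounts to the following; this is a sketch of the assumed behavior, not the server's actual code:

```python
import random

# Assumed behavior of the three opponent policies, inferred from their names.
def choose_opponent_action(policy: str, legal_actions: list) -> int:
    if policy == "random":
        return random.choice(legal_actions)  # uniform over legal actions
    if policy == "first":
        return legal_actions[0]
    if policy == "last":
        return legal_actions[-1]
    raise ValueError(f"unknown opponent policy: {policy!r}")

print(choose_opponent_action("first", [3, 5, 8]))  # → 3
print(choose_opponent_action("last", [3, 5, 8]))   # → 8
```

Fixed opponents (`first`/`last`) are deterministic, which makes them useful for debugging a training loop before switching to the `random` policy.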

### Example: Tic-Tac-Toe with Fixed Opponent

```bash
docker run -p 8000:8000 \
  -e OPENSPIEL_GAME=tic_tac_toe \
  -e OPENSPIEL_OPPONENT_POLICY=first \
  openspiel-env:latest
```


## API Reference

### OpenSpielAction

```python
@dataclass
class OpenSpielAction(Action):
    action_id: int                                             # Action to take
    game_name: str = "catch"                                   # Game name
    game_params: Dict[str, Any] = field(default_factory=dict)  # Optional game parameters
```

### OpenSpielObservation

```python
@dataclass
class OpenSpielObservation(Observation):
    info_state: List[float]              # Agent's information state
    legal_actions: List[int]             # Legal action IDs
    game_phase: str                      # "initial", "playing", "terminal"
    current_player_id: int               # Current player (-1 for simultaneous)
    opponent_last_action: Optional[int]  # Last opponent action (if available)
    done: bool                           # Episode finished
    reward: Optional[float]              # Reward for last action
```

### OpenSpielState

```python
@dataclass
class OpenSpielState(State):
    episode_id: str        # Unique episode ID
    step_count: int        # Number of steps taken
    game_name: str         # Game name
    agent_player: int      # Agent's player ID
    opponent_policy: str   # Opponent policy name
    num_players: int       # Total number of players
```

## Testing

### Automated Testing (All 6 Games)

Quick test of all games in Docker:

```bash
./test_docker_all_games.sh
```

This automated script will:

- Build and run a Docker container for each game
- Test the reset, step, and state APIs
- Verify episode completion
- Report pass/fail for all 6 games

Expected output:

```
========================================
OpenSpiel Docker Integration Test
========================================

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Testing: catch
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🐳 Starting Docker container...
⏳ Waiting for server to be ready...
✓ Server ready (2s)
🎮 Running Python client test...
✓ PASSED - Episode completed successfully

[... tests all 6 games ...]

========================================
Test Summary
========================================

✓ catch
✓ tic_tac_toe
✓ kuhn_poker
✓ cliff_walking
✓ 2048
✓ blackjack

Total: 6 passed, 0 failed out of 6 games

========================================
All tests PASSED! 🎉
========================================
```


### Manual Testing

```bash
# Local tests (requires OpenSpiel installed)
python -m pytest src/envs/openspiel_env/

# Docker build
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .

# Run a specific game
docker run -p 8000:8000 openspiel-env:latest

# Test from another terminal
python3 examples/openspiel_simple.py
```


## Development

### Adding New Games

To add support for more OpenSpiel games:

1. Verify the game works with `rl_environment.Environment`
2. Test with different opponent policies if the game is multi-player
3. Document game-specific configuration
4. Add an example script
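One way to keep step 3 honest is to record each verified game's settings in a small table. The registry below is purely hypothetical: the names and fields are invented for illustration and are not part of the actual codebase.

```python
# Hypothetical per-game configuration table; names and fields are invented
# for illustration and do not exist in the real codebase.
SUPPORTED_GAMES = {
    "catch":       {"players": 1, "opponent_policy": None},
    "tic_tac_toe": {"players": 2, "opponent_policy": "random"},
    "kuhn_poker":  {"players": 2, "opponent_policy": "random"},
}

def register_game(name: str, players: int, opponent_policy=None):
    """Record a newly verified game and its default opponent policy."""
    SUPPORTED_GAMES[name] = {"players": players, "opponent_policy": opponent_policy}

register_game("connect_four", players=2, opponent_policy="random")
print(sorted(SUPPORTED_GAMES))  # now includes "connect_four"
```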

## Limitations

- Simultaneous-move games: Only `agent_player=0` is supported
- Multi-agent training: Single agent only (no self-play yet)
- Opponent policies: Random and fixed only (no MCTS yet)
- Build time: The Docker image takes ~5-10 minutes to build (compiles C++)


## Future Work

- MCTS opponent policies
- Self-play support (multiple agents)
- More games (Chess, Go, Texas Hold'em)
- Faster builds with a pre-built OpenSpiel base image
- Game-specific reward-shaping options


## References

- [OpenSpiel Paper (2019)](https://arxiv.org/abs/1908.09453)
- [OpenSpiel GitHub](https://github.com/google-deepmind/open_spiel)
- [OpenSpiel Documentation](https://openspiel.readthedocs.io/)
