# OpenSpiel Environment

Integration of OpenSpiel games with the OpenEnv framework. [OpenSpiel](https://github.com/google-deepmind/open_spiel) is DeepMind's collection of 70+ game environments for RL research.
## Supported Games

This environment supports 6 games across different categories:

### Single-Player Games (No Opponent)

1. **Catch** - Move horizontally to catch a falling ball
2. **Cliff Walking** - Navigate a grid without falling off the cliff (Sutton & Barto benchmark)
3. **2048** - Classic tile-merging puzzle game
4. **Blackjack** - Simplified blackjack (HIT/STAND only)

### Multi-Player Games (with Bot Opponent)

5. **Tic-Tac-Toe** - Classic 3x3 game
6. **Kuhn Poker** - 2-player simplified poker (game theory benchmark)
## Architecture

```
┌────────────────────────────────────┐
│ RL Training Code (Client)          │
│   OpenSpielEnv.step(action)        │
└──────────────┬─────────────────────┘
               │ HTTP
┌──────────────▼─────────────────────┐
│ FastAPI Server (Docker)            │
│   OpenSpielEnvironment             │
│     ├─ Wraps rl_environment.Env    │
│     ├─ Agent controls player 0     │
│     └─ Opponent: Random/Fixed      │
└────────────────────────────────────┘
```
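Conceptually, the client is a thin HTTP wrapper: each `step()` call serializes the action, POSTs it to the server, and deserializes the response into an observation. The sketch below is illustrative only; the `/step` route and payload shape are assumptions, not the actual OpenEnv wire format:

```python
import requests

def step_over_http(base_url: str, action_id: int) -> dict:
    # Illustrative sketch: the real OpenSpielEnv client handles serialization,
    # episode bookkeeping, and errors; the route and payload here are assumptions.
    response = requests.post(f"{base_url}/step", json={"action_id": action_id})
    response.raise_for_status()
    return response.json()
```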
## Installation & Usage

### Option 1: Local Development (without Docker)

Requirements:

- OpenSpiel must be installed (see https://github.com/google-deepmind/open_spiel)
- Python 3.11+
```python
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Start the local server manually (in a separate terminal):
#   python -m envs.openspiel_env.server.app

# Connect to the local server
env = OpenSpielEnv(base_url="http://localhost:8000")

# Reset the environment
result = env.reset()
print(f"Initial state: {result.observation.info_state}")
print(f"Legal actions: {result.observation.legal_actions}")

# Take actions
for _ in range(10):
    action_id = result.observation.legal_actions[0]  # Choose first legal action
    result = env.step(OpenSpielAction(action_id=action_id))
    print(f"Reward: {result.reward}, Done: {result.done}")
    if result.done:
        break

# Cleanup
env.close()
```
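As a slightly stronger baseline, you can sample uniformly from the legal actions instead of always taking the first one; a minimal sketch using the same client API as above:

```python
import random

# One episode with a uniform-random policy over the legal actions.
result = env.reset()
while not result.done:
    action_id = random.choice(result.observation.legal_actions)
    result = env.step(OpenSpielAction(action_id=action_id))
print(f"Episode finished with reward {result.reward}")
```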
### Option 2: Docker (Recommended)

Build the Docker image:

```bash
cd OpenEnv
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
```

Run specific games:

```bash
# Catch (default)
docker run -p 8000:8000 openspiel-env:latest

# Tic-Tac-Toe with random opponent
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest

# Kuhn Poker
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest

# 2048
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
```
Use with `from_docker_image()`:

```python
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Automatically starts a container
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
result = env.reset()
result = env.step(OpenSpielAction(action_id=0))
env.close()  # Stops the container
```
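Because `from_docker_image()` launches a container, it is worth guarding the rollout so the container is stopped even if your episode code raises; a sketch using the same API as above:

```python
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
try:
    result = env.reset()
    while not result.done:
        action_id = result.observation.legal_actions[0]
        result = env.step(OpenSpielAction(action_id=action_id))
finally:
    env.close()  # Always stop the container
```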
## Game-Specific Information

### 1. Catch

- **Type:** Single-player
- **Action Space:** 3 actions (left, stay, right)
- **Observation:** 5x5 grid, flattened (25 dimensions)
- **Reward:** +1 for catching the ball, 0 otherwise
- **Episode Length:** ~10 steps

```python
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
# Or set OPENSPIEL_GAME=catch
```
### 2. Tic-Tac-Toe

- **Type:** 2-player turn-based, perfect information
- **Players:** Agent (X) vs Random Bot (O)
- **Action Space:** 9 positions
- **Observation:** 27 dimensions (3x3 board + game state)
- **Reward:** +1 win, -1 loss, 0 draw/mid-game

```bash
# Set the environment variable or run directly
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
```
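For reference, OpenSpiel's `tic_tac_toe` numbers its 9 actions over the board in row-major order, so a (row, col) cell maps to an action ID as in this small sketch (the helper name is ours, not part of the API):

```python
def cell_to_action(row: int, col: int) -> int:
    # OpenSpiel tic_tac_toe actions 0..8 index the 3x3 board row-major.
    return row * 3 + col

assert cell_to_action(1, 1) == 4  # center square
```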
### 3. Kuhn Poker

- **Type:** 2-player turn-based, imperfect information
- **Players:** Agent vs Random Bot
- **Action Space:** 2 actions (pass/fold, bet/call)
- **Observation:** 6 dimensions (card + betting history)
- **Reward:** Pot winnings (typically -1, 0, +1, +2)
- **Notes:** *The* benchmark for imperfect-information RL

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
```
### 4. Cliff Walking

- **Type:** Single-player grid world
- **Action Space:** 4 actions (up, down, left, right)
- **Observation:** Position encoding
- **Reward:** -1 per step, -100 for falling off the cliff
- **Notes:** Classic RL benchmark from Sutton & Barto

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking openspiel-env:latest
```
### 5. 2048

- **Type:** Single-player puzzle
- **Action Space:** 4 actions (up, down, left, right)
- **Observation:** 4x4 grid with tile values
- **Reward:** Points from merging tiles
- **Notes:** Stochastic tile spawning

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
```
### 6. Blackjack

- **Type:** Single-player vs dealer
- **Action Space:** 2 actions (HIT, STAND)
- **Observation:** Player hand + dealer's visible card
- **Reward:** +1 win, -1 loss, 0 draw
- **Notes:** Simplified version, no double/split

```bash
docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack openspiel-env:latest
```
## Configuration

### Environment Variables

- `OPENSPIEL_GAME`: Game name (default: `catch`)
- `OPENSPIEL_AGENT_PLAYER`: Player ID for the agent (default: `0`)
- `OPENSPIEL_OPPONENT_POLICY`: Opponent policy for multi-player games
  - `random`: Uniform random (default)
  - `first`: Always picks the first legal action
  - `last`: Always picks the last legal action
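The three opponent policies amount to the following selection rule (a sketch of the semantics described above; the helper name is ours, not the server's internal function):

```python
import random

def opponent_action(legal_actions: list[int], policy: str = "random") -> int:
    # Mirrors the OPENSPIEL_OPPONENT_POLICY options listed above.
    if policy == "first":
        return legal_actions[0]
    if policy == "last":
        return legal_actions[-1]
    return random.choice(legal_actions)  # "random" (default)
```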
### Example: Tic-Tac-Toe with Fixed Opponent

```bash
docker run -p 8000:8000 \
  -e OPENSPIEL_GAME=tic_tac_toe \
  -e OPENSPIEL_OPPONENT_POLICY=first \
  openspiel-env:latest
```
## API Reference

### OpenSpielAction

```python
@dataclass
class OpenSpielAction(Action):
    action_id: int                      # Action to take
    game_name: str = "catch"            # Game name
    game_params: Dict[str, Any] = field(default_factory=dict)  # Optional game parameters
```
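For example, to place a mark in a specific cell (action IDs are game-specific; see the per-game sections above):

```python
# Action 4 is the center square in tic_tac_toe's row-major numbering.
action = OpenSpielAction(action_id=4, game_name="tic_tac_toe")
```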
### OpenSpielObservation

```python
@dataclass
class OpenSpielObservation(Observation):
    info_state: List[float]             # Agent's information state
    legal_actions: List[int]            # Legal action IDs
    game_phase: str                     # "initial", "playing", "terminal"
    current_player_id: int              # Current player (-1 for simultaneous)
    opponent_last_action: Optional[int] # Last opponent action (if available)
    done: bool                          # Episode finished
    reward: Optional[float]             # Reward for last action
```
### OpenSpielState

```python
@dataclass
class OpenSpielState(State):
    episode_id: str                     # Unique episode ID
    step_count: int                     # Number of steps
    game_name: str                      # Game name
    agent_player: int                   # Agent's player ID
    opponent_policy: str                # Opponent policy name
    num_players: int                    # Total players
```
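In training code, `legal_actions` is typically turned into a mask over the policy's logits. A minimal sketch (NumPy assumed; `num_actions` comes from the game's action space):

```python
import numpy as np

def legal_action_mask(legal_actions: list[int], num_actions: int) -> np.ndarray:
    # 1.0 for legal action IDs, 0.0 elsewhere; apply to policy logits
    # (e.g., add log(mask) so illegal actions get -inf).
    mask = np.zeros(num_actions, dtype=np.float32)
    mask[legal_actions] = 1.0
    return mask
```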
## Testing

### Automated Testing (All 6 Games)

Quick test of all games in Docker:

```bash
./test_docker_all_games.sh
```

This automated script will:

1. Build and run a Docker container for each game
2. Test the reset, step, and state APIs
3. Verify episode completion
4. Report pass/fail for all 6 games

Expected output:

```
========================================
OpenSpiel Docker Integration Test
========================================
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Testing: catch
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  🐳 Starting Docker container...
  ⏳ Waiting for server to be ready...
  ✓ Server ready (2s)
  🎮 Running Python client test...
  ✓ PASSED - Episode completed successfully

[... tests all 6 games ...]

========================================
Test Summary
========================================
  ✓ catch
  ✓ tic_tac_toe
  ✓ kuhn_poker
  ✓ cliff_walking
  ✓ 2048
  ✓ blackjack

Total: 6 passed, 0 failed out of 6 games
========================================
All tests PASSED! 🎉
========================================
```
### Manual Testing

```bash
# Local (requires OpenSpiel installed)
python -m pytest src/envs/openspiel_env/

# Docker build
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .

# Run a specific game
docker run -p 8000:8000 openspiel-env:latest

# Test from another terminal
python3 examples/openspiel_simple.py
```
## Development

### Adding New Games

To add support for more OpenSpiel games (see the sketch below for step 1):

1. Verify the game works with `rl_environment.Environment`
2. Test with different opponent policies if multi-player
3. Document game-specific configuration
4. Add an example script
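A quick way to do step 1 is a random rollout directly against OpenSpiel's Python API (this uses `open_spiel.python.rl_environment`, which the server wraps; substitute your game's name):

```python
import random
from open_spiel.python import rl_environment

# Random rollout to confirm the game runs under rl_environment.Environment.
env = rl_environment.Environment("tic_tac_toe")
time_step = env.reset()
while not time_step.last():
    player = time_step.observations["current_player"]
    action = random.choice(time_step.observations["legal_actions"][player])
    time_step = env.step([action])
print("Final rewards:", time_step.rewards)
```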
Limitations
Simultaneous-move games: Only agent_player=0 supportedMulti-agent training: Single agent only (no self-play yet)Opponent policies: Random and fixed only (no MCTS yet)Build time: Docker image takes ~5-10 minutes to build (compiles C++)Future Work
MCTS opponent policiesSelf-play support (multiple agents)More games (Chess, Go, Poker Hold'em)Faster build with pre-built OpenSpiel base imageGame-specific reward shaping optionsReferences
[OpenSpiel Paper (2019)](https://arxiv.org/abs/1908.09453)[OpenSpiel GitHub](https://github.com/google-deepmind/open_spiel)[OpenSpiel Documentation](https://openspiel.readthedocs.io/)