OpenSpiel Environment
Integration of OpenSpiel games with the OpenEnv framework. OpenSpiel (https://github.com/google-deepmind/open_spiel) is DeepMind's collection of 70+ game environments for RL research.
Supported Games
This environment supports 6 games across different categories:
Single-Player Games (No Opponent)
1. Catch - Move horizontally to catch a falling ball
2. Cliff Walking - Navigate a grid without falling off the cliff (Sutton & Barto benchmark)
3. 2048 - Classic tile-merging puzzle game
4. Blackjack - Simplified blackjack (HIT/STAND only)
Multi-Player Games (with Bot Opponent)
5. Tic-Tac-Toe - Classic 3x3 game
6. Kuhn Poker - 2-player simplified poker (game theory benchmark)
Architecture
┌────────────────────────────────────┐
│   RL Training Code (Client)        │
│   OpenSpielEnv.step(action)        │
└──────────────┬─────────────────────┘
               │ HTTP
┌──────────────▼─────────────────────┐
│   FastAPI Server (Docker)          │
│   OpenSpielEnvironment             │
│     ├─ Wraps rl_environment.Env    │
│     ├─ Agent controls player 0     │
│     └─ Opponent: Random/Fixed      │
└────────────────────────────────────┘
Installation & Usage
Option 1: Local Development (without Docker)
Requirements:
- OpenSpiel must be installed (see https://github.com/google-deepmind/open_spiel)
- Python 3.11+
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Start local server manually (shell command, run in a separate terminal):
#   python -m envs.openspiel_env.server.app

# Connect to local server
env = OpenSpielEnv(base_url="http://localhost:8000")

# Reset environment
result = env.reset()
print(f"Initial state: {result.observation.info_state}")
print(f"Legal actions: {result.observation.legal_actions}")

# Take actions
for _ in range(10):
    action_id = result.observation.legal_actions[0]  # Choose first legal action
    result = env.step(OpenSpielAction(action_id=action_id))
    print(f"Reward: {result.reward}, Done: {result.done}")
    if result.done:
        break

# Cleanup
env.close()
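The loop above always picks the first legal action. A minimal sketch of a uniformly random rollout follows. `StubEnv`, `StubResult`, `StubObservation`, and `run_random_episode` are hypothetical names introduced here so the snippet runs without a server. With the real client you would pass the `OpenSpielEnv` instance and `lambda a: OpenSpielAction(action_id=a)`, assuming results expose `observation.legal_actions`, `reward`, and `done` as shown above.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StubObservation:
    legal_actions: List[int] = field(default_factory=lambda: [0, 1, 2])


@dataclass
class StubResult:
    observation: StubObservation
    reward: Optional[float] = None
    done: bool = False


class StubEnv:
    """Hypothetical stand-in for the client: ends after `horizon` steps."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.steps = 0

    def reset(self) -> StubResult:
        self.steps = 0
        return StubResult(StubObservation())

    def step(self, action: int) -> StubResult:
        self.steps += 1
        done = self.steps >= self.horizon
        # Reward of 1.0 only on the final step, 0.0 otherwise.
        return StubResult(StubObservation(), reward=1.0 if done else 0.0, done=done)


def run_random_episode(env, make_action: Callable[[int], object],
                       rng: random.Random, max_steps: int = 100) -> float:
    """Play one episode with uniformly random legal actions; return total reward."""
    result = env.reset()
    total = 0.0
    for _ in range(max_steps):
        if result.done or not result.observation.legal_actions:
            break
        action_id = rng.choice(result.observation.legal_actions)
        result = env.step(make_action(action_id))
        total += result.reward or 0.0  # Treat a missing reward as zero.
    return total


episode_return = run_random_episode(StubEnv(), lambda a: a, random.Random(0))
print(f"Episode return: {episode_return}")
```

Passing the random generator explicitly keeps rollouts reproducible, which helps when comparing runs across games.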
Option 2: Docker (Recommended)
Build Docker image:
cd OpenEnv
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
Run specific games:
# Catch (default)
docker run -p 8000:8000 openspiel-env:latest

# Tic-Tac-Toe with random opponent
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest

# Kuhn Poker
docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest

# 2048
docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
Use with from_docker_image():
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Automatically starts container
env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
result = env.reset()
result = env.step(OpenSpielAction(action_id=0))
env.close()  # Stops container
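Because `close()` stops the container, it helps to guarantee the call even when a step raises. A sketch using `contextlib.closing` follows. `FakeEnv` is a hypothetical stand-in that only records whether `close()` ran; the same pattern should apply to any object exposing `close()`, which the client above does.

```python
from contextlib import closing


class FakeEnv:
    """Hypothetical stand-in for a container-backed client."""

    def __init__(self) -> None:
        self.closed = False

    def step(self, action_id: int) -> None:
        if action_id < 0:
            raise ValueError("illegal action")

    def close(self) -> None:
        self.closed = True  # The real client would stop the container here.


env = FakeEnv()
try:
    with closing(env) as e:
        e.step(-1)  # Raises, but close() still runs on the way out.
except ValueError:
    pass

print(f"Closed after error: {env.closed}")
```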
Game-Specific Information
1. Catch
- Type: Single-player
- Action Space: 3 actions (left, stay, right)
- Observation: 5x5 grid flattened (25 dimensions)
- Reward: +1 for catching ball, 0 otherwise
- Episode Length: ~10 steps

env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
# Or set OPENSPIEL_GAME=catch
2. Tic-Tac-Toe
- Type: 2-player turn-based, perfect information
- Players: Agent (X) vs Random Bot (O)
- Action Space: 9 positions
- Observation: 27 dimensions (3x3 board + game state)
- Reward: +1 win, -1 loss, 0 draw/mid-game

# Set environment variable or run directly
docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
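A 27-dimensional observation is consistent with a one-hot encoding of the nine cells over three states. The sketch below decodes such a vector into a printable board. It assumes three planes of nine cells in the order empty, O, X; that layout is an assumption about OpenSpiel's encoding and is worth verifying against your installed version. `decode_board` is a hypothetical helper, not part of the client.

```python
from typing import List, Sequence

SYMBOLS = (".", "O", "X")  # Assumed plane order: empty, O, X.


def decode_board(info_state: Sequence[float]) -> List[str]:
    """Turn a 27-value one-hot vector into three 3-character rows."""
    if len(info_state) != 27:
        raise ValueError(f"expected 27 values, got {len(info_state)}")
    cells = []
    for cell in range(9):
        # Plane p occupies indices [9 * p, 9 * p + 9); pick the active plane.
        plane = max(range(3), key=lambda p: info_state[9 * p + cell])
        cells.append(SYMBOLS[plane])
    return ["".join(cells[r * 3:(r + 1) * 3]) for r in range(3)]


# Example: O in the top-left corner (cell 0), X in the centre (cell 4).
state = [0.0] * 27
for cell in range(9):
    state[cell] = 1.0                # Mark every cell empty first.
state[0], state[9 + 0] = 0.0, 1.0    # Cell 0 holds O.
state[4], state[18 + 4] = 0.0, 1.0   # Cell 4 holds X.

board = decode_board(state)
print("\n".join(board))
```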
3. Kuhn Poker
- Type: 2-player turn-based, imperfect information
- Players: Agent vs Random Bot
- Action Space: 2 actions (pass/fold, bet/call)
- Observation: 6 dimensions (card + betting history)
- Reward: Pot winnings (typically -1, 0, +1, +2)
- Notes: A standard benchmark for imperfect-information RL

docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
4. Cliff Walking
- Type: Single-player grid world
- Action Space: 4 actions (up, down, left, right)
- Observation: Position encoding
- Reward: -1 per step, -100 for falling off cliff
- Notes: Classic RL benchmark from Sutton & Barto

docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking openspiel-env:latest
5. 2048
- Type: Single-player puzzle
- Action Space: 4 actions (up, down, left, right)
- Observation: 4x4 grid with tile values
- Reward: Points from merging tiles
- Notes: Stochastic tile spawning

docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
6. Blackjack
- Type: Single-player vs dealer
- Action Space: 2 actions (HIT, STAND)
- Observation: Player hand + dealer's visible card
- Reward: +1 win, -1 loss, 0 draw
- Notes: Simplified version, no double/split

docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack openspiel-env:latest
Configuration
Environment Variables
- OPENSPIEL_GAME: Game name (default: "catch")
- OPENSPIEL_AGENT_PLAYER: Player ID for agent (default: 0)
- OPENSPIEL_OPPONENT_POLICY: Opponent policy for multi-player games
  - random: Uniform random (default)
  - first: Always picks first legal action
  - last: Always picks last legal action
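A minimal sketch of how a server might read and validate these variables. The `ServerConfig` class and `load_config` function are hypothetical and not the project's actual code; only the variable names, defaults, and policy names come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_POLICIES = ("random", "first", "last")


@dataclass(frozen=True)
class ServerConfig:
    game: str
    agent_player: int
    opponent_policy: str


def load_config(environ: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the three OPENSPIEL_* variables, applying the documented defaults."""
    policy = environ.get("OPENSPIEL_OPPONENT_POLICY", "random")
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown opponent policy: {policy!r}")
    return ServerConfig(
        game=environ.get("OPENSPIEL_GAME", "catch"),
        agent_player=int(environ.get("OPENSPIEL_AGENT_PLAYER", "0")),
        opponent_policy=policy,
    )


# Pass a plain dict instead of os.environ to make the behaviour easy to inspect.
config = load_config({"OPENSPIEL_GAME": "tic_tac_toe",
                      "OPENSPIEL_OPPONENT_POLICY": "first"})
print(config)
```

Rejecting unknown policy names at startup surfaces a typo in a `docker run -e ...` flag immediately instead of partway through an episode.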
Example: Tic-Tac-Toe with Fixed Opponent
docker run -p 8000:8000 \
-e OPENSPIEL_GAME=tic_tac_toe \
-e OPENSPIEL_OPPONENT_POLICY=first \
openspiel-env:latest
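The three opponent policies can be sketched as a single function over the list of legal actions. `choose_opponent_action` is a hypothetical name, and the actual server may implement these policies differently.

```python
import random
from typing import Optional, Sequence


def choose_opponent_action(policy: str, legal_actions: Sequence[int],
                           rng: Optional[random.Random] = None) -> int:
    """Pick an opponent move according to the named policy."""
    if not legal_actions:
        raise ValueError("no legal actions available")
    if policy == "first":
        return legal_actions[0]
    if policy == "last":
        return legal_actions[-1]
    if policy == "random":
        # Fall back to the module-level generator when no rng is supplied.
        return (rng or random).choice(legal_actions)
    raise ValueError(f"unknown opponent policy: {policy!r}")


legal = [2, 5, 7]
first_move = choose_opponent_action("first", legal)
last_move = choose_opponent_action("last", legal)
random_move = choose_opponent_action("random", legal, random.Random(0))
print(first_move, last_move, random_move)
```

The fixed policies are deterministic, which makes them convenient for debugging an agent before switching to the random opponent.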
API Reference
OpenSpielAction
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class OpenSpielAction(Action):
    action_id: int                                             # Action to take
    game_name: str = "catch"                                   # Game name
    game_params: Dict[str, Any] = field(default_factory=dict)  # Optional game parameters
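Python's `dataclasses` module rejects a plain `{}` as a field default, so a dictionary-valued field such as `game_params` needs `field(default_factory=dict)`. The sketch below illustrates this with a hypothetical `DemoAction` class that mirrors the fields above without depending on the `Action` base class; the `"example"` parameter key is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DemoAction:
    action_id: int
    game_name: str = "catch"
    # default_factory gives every instance its own dictionary.
    game_params: Dict[str, Any] = field(default_factory=dict)


a = DemoAction(action_id=0)
a.game_params["example"] = 1  # Does not leak into other instances.
fresh = DemoAction(action_id=1)

try:
    @dataclass
    class BrokenAction:
        game_params: Dict[str, Any] = {}  # Mutable default: rejected.
    rejected = False
except ValueError:
    rejected = True

print(a.game_params, fresh.game_params, rejected)
```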
OpenSpielObservation
@dataclass
class OpenSpielObservation(Observation):
    info_state: List[float]              # Agent's information state
    legal_actions: List[int]             # Legal action IDs
    game_phase: str                      # "initial", "playing", "terminal"
    current_player_id: int               # Current player (-1 for simultaneous)
    opponent_last_action: Optional[int]  # Last opponent action (if available)
    done: bool                           # Episode finished
    reward: Optional[float]              # Reward for last action
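Because `legal_actions` lists the only valid action IDs, a policy that scores every action should restrict its choice to that list. A minimal sketch of greedy selection under this constraint follows; `greedy_legal_action` and the score vector are hypothetical.

```python
from typing import List, Sequence


def greedy_legal_action(scores: Sequence[float], legal_actions: List[int]) -> int:
    """Return the legal action ID with the highest score."""
    if not legal_actions:
        raise ValueError("no legal actions (episode may be terminal)")
    return max(legal_actions, key=lambda a: scores[a])


# Scores for a 9-action game; action 4 scores highest but is not legal.
scores = [0.1, 0.3, 0.2, 0.0, 0.9, 0.4, 0.0, 0.5, 0.1]
legal_actions = [0, 1, 7]
best = greedy_legal_action(scores, legal_actions)
print(f"Chosen action: {best}")
```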
OpenSpielState
@dataclass
class OpenSpielState(State):
    episode_id: str       # Unique episode ID
    step_count: int       # Number of steps
    game_name: str        # Game name
    agent_player: int     # Agent's player ID
    opponent_policy: str  # Opponent policy name
    num_players: int      # Total players
Testing
Automated Testing (All 6 Games)
Quick test of all games in Docker:
./test_docker_all_games.sh
This automated script will:
- Build and run Docker containers for each game
- Test reset, step, and state APIs
- Verify episode completion
- Report pass/fail for all 6 games

Expected output:
========================================
OpenSpiel Docker Integration Test
========================================
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Testing: catch
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🐳 Starting Docker container...
⏳ Waiting for server to be ready...
✓ Server ready (2s)
🎮 Running Python client test...
✓ PASSED - Episode completed successfully
[... tests all 6 games ...]
========================================
Test Summary
========================================
✓ catch
✓ tic_tac_toe
✓ kuhn_poker
✓ cliff_walking
✓ 2048
✓ blackjack
Total: 6 passed, 0 failed out of 6 games
========================================
All tests PASSED! 🎉
========================================
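The "Waiting for server to be ready" step in the output above amounts to polling until the server answers. A sketch of such a loop follows. `wait_until_ready` and `http_probe` are hypothetical helpers rather than the script's actual code, and which URL to probe is an assumption, since this document does not name a health endpoint.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # The server is up even if the path returns an error code.
    except (urllib.error.URLError, OSError):
        return False  # Connection refused, timed out, or not yet listening.


def wait_until_ready(probe: Callable[[], bool], timeout: float = 60.0,
                     interval: float = 1.0) -> bool:
    """Call `probe` until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)


# Demonstration with a fake probe that succeeds on the third attempt.
attempts = []

def fake_probe() -> bool:
    attempts.append(1)
    return len(attempts) >= 3

ready = wait_until_ready(fake_probe, timeout=5.0, interval=0.01)
print(f"Ready: {ready} after {len(attempts)} attempts")
```

Against a running container, the call would look like `wait_until_ready(lambda: http_probe("http://localhost:8000"))`, using the port from the `docker run` commands in this document.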
Manual Testing
# Local (requires OpenSpiel installed)
python -m pytest src/envs/openspiel_env/

# Docker build
docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .

# Run specific game
docker run -p 8000:8000 openspiel-env:latest

# Test from another terminal
python3 examples/openspiel_simple.py
Development
Adding New Games
To add support for more OpenSpiel games:
1. Verify the game works with rl_environment.Environment
2. Test with different opponent policies if multi-player
3. Document game-specific configuration
4. Add example script
Limitations
- Simultaneous-move games: Only agent_player=0 supported
- Multi-agent training: Single agent only (no self-play yet)
- Opponent policies: Random and fixed only (no MCTS yet)
- Build time: Docker image takes ~5-10 minutes to build (compiles C++)

Future Work
- MCTS opponent policies
- Self-play support (multiple agents)
- More games (Chess, Go, Poker Hold'em)
- Faster build with pre-built OpenSpiel base image
- Game-specific reward shaping options

References
- [OpenSpiel Paper (2019)](https://arxiv.org/abs/1908.09453)
- [OpenSpiel GitHub](https://github.com/google-deepmind/open_spiel)
- [OpenSpiel Documentation](https://openspiel.readthedocs.io/)