zkwentz committed on
Commit 9d8bf2a · verified
1 Parent(s): 21695fa

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,74 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ # Use the pre-built OpenSpiel base image
+ ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-openspiel-base:latest
+ FROM ${BASE_IMAGE} AS builder
+
+ # Copy OpenEnv core (base image already set WORKDIR=/app)
+ WORKDIR /app
+
+ ARG BUILD_MODE=in-repo
+
+ # Copy OpenSpiel environment
+ COPY . /app/env
+
+ WORKDIR /app/env
+
+ # Ensure uv is available (for local builds where base image lacks it)
+ RUN if ! command -v uv >/dev/null 2>&1; then \
+         curl -LsSf https://astral.sh/uv/install.sh | sh && \
+         mv /root/.local/bin/uv /usr/local/bin/uv && \
+         mv /root/.local/bin/uvx /usr/local/bin/uvx; \
+     fi
+
+ # Install dependencies using uv sync
+ # If uv.lock exists, use it; otherwise resolve on the fly
+ RUN --mount=type=cache,target=/root/.cache/uv \
+     if [ -f uv.lock ]; then \
+         uv sync --frozen --no-install-project --no-editable; \
+     else \
+         uv sync --no-install-project --no-editable; \
+     fi
+
+ RUN --mount=type=cache,target=/root/.cache/uv \
+     if [ -f uv.lock ]; then \
+         uv sync --frozen --no-editable; \
+     else \
+         uv sync --no-editable; \
+     fi
+
+ # Final runtime stage
+ FROM ${BASE_IMAGE}
+
+ WORKDIR /app
+
+ # Copy the virtual environment from builder
+ COPY --from=builder /app/env/.venv /app/.venv
+
+ # Copy the environment code
+ COPY --from=builder /app/env /app/env
+
+ # Set PATH to use the virtual environment
+ ENV PATH="/app/.venv/bin:$PATH"
+
+ # Extend Python path for OpenEnv (base image set PYTHONPATH=/app/src)
+ # We prepend OpenSpiel paths
+ ENV PYTHONPATH="/repo:/repo/build/python:/app/env:$PYTHONPATH"
+
+ # OpenSpiel-specific environment variables (can be overridden at runtime)
+ ENV OPENSPIEL_GAME=catch
+ ENV OPENSPIEL_AGENT_PLAYER=0
+ ENV OPENSPIEL_OPPONENT_POLICY=random
+
+ # Health check (curl is provided by openenv-base)
+ HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:8000/health || exit 1
+
+ # Note: EXPOSE 8000 already set by openenv-base
+ # Run the FastAPI server (uvicorn installed by openenv-base)
+ ENV ENABLE_WEB_INTERFACE=true
+ CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
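Reviewer note: the `HEALTHCHECK` in this Dockerfile probes `http://localhost:8000/health` with a 3 s timeout and 3 retries. A client that starts the container itself may want a similar wait loop before calling `reset()`. The following is only a sketch under assumptions: `probe_health` and `wait_until_healthy` are hypothetical helper names that are not part of this commit, and the loop is a client-side approximation, not Docker's own health-state logic.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(url: str = "http://localhost:8000/health", timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(
    probe: Callable[[], bool], retries: int = 3, interval: float = 30.0
) -> bool:
    """Call `probe` up to `retries` times, sleeping `interval` seconds between attempts."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            time.sleep(interval)
    return False
```

Passing the probe in as a callable keeps the retry loop testable without a running server; in practice one might call `wait_until_healthy(probe_health, retries=10, interval=1.0)`.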
README.md CHANGED
@@ -1,10 +1,348 @@
  ---
- title: Openspiel
- emoji: 😻
- colorFrom: yellow
- colorTo: indigo
  sdk: docker
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: OpenSpiel Environment Server
+ emoji: 🎮
+ colorFrom: red
+ colorTo: purple
  sdk: docker
  pinned: false
+ app_port: 8000
+ base_path: /web
+ tags:
+   - openenv
  ---

+ # OpenSpiel Environment
+
+ Integration of OpenSpiel games with the OpenEnv framework. OpenSpiel (https://github.com/google-deepmind/open_spiel) is DeepMind's collection of 70+ game environments for RL research.
+
+ ## Supported Games
+
+ This environment supports 6 games across different categories:
+
+ ### Single-Player Games (No Opponent)
+ 1. **Catch** - Move horizontally to catch a falling ball
+ 2. **Cliff Walking** - Navigate grid without falling off cliff (Sutton & Barto benchmark)
+ 3. **2048** - Classic tile-merging puzzle game
+ 4. **Blackjack** - Simplified blackjack (HIT/STAND only)
+
+ ### Multi-Player Games (with Bot Opponent)
+ 5. **Tic-Tac-Toe** - Classic 3x3 game
+ 6. **Kuhn Poker** - 2-player simplified poker (game theory benchmark)
+
+ ## Architecture
+
+ ```
+ ┌────────────────────────────────────┐
+ │ RL Training Code (Client)          │
+ │   OpenSpielEnv.step(action)        │
+ └──────────────┬─────────────────────┘
+                │ HTTP
+ ┌──────────────▼─────────────────────┐
+ │ FastAPI Server (Docker)            │
+ │   OpenSpielEnvironment             │
+ │     ├─ Wraps rl_environment.Env    │
+ │     ├─ Agent controls player 0     │
+ │     └─ Opponent: Random/Fixed      │
+ └────────────────────────────────────┘
+ ```
+
+ ## Installation & Usage
+
+ ### Option 1: Local Development (without Docker)
+
+ **Requirements:**
+ - OpenSpiel must be installed (see https://github.com/google-deepmind/open_spiel)
+ - Python 3.11+
+
+ ```python
+ from envs.openspiel_env import OpenSpielEnv, OpenSpielAction
+
+ # Start local server manually
+ # python -m envs.openspiel_env.server.app
+
+ # Connect to local server
+ env = OpenSpielEnv(base_url="http://localhost:8000")
+
+ # Reset environment
+ result = env.reset()
+ print(f"Initial state: {result.observation.info_state}")
+ print(f"Legal actions: {result.observation.legal_actions}")
+
+ # Take actions
+ for _ in range(10):
+     action_id = result.observation.legal_actions[0]  # Choose first legal action
+     result = env.step(OpenSpielAction(action_id=action_id))
+     print(f"Reward: {result.reward}, Done: {result.done}")
+     if result.done:
+         break
+
+ # Cleanup
+ env.close()
+ ```
+
+ ### Option 2: Docker (Recommended)
+
+ **Build Docker image:**
+
+ ```bash
+ cd OpenEnv
+ docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
+ ```
+
+ **Run specific games:**
+
+ ```bash
+ # Catch (default)
+ docker run -p 8000:8000 openspiel-env:latest
+
+ # Tic-Tac-Toe with random opponent
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
+
+ # Kuhn Poker
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
+
+ # 2048
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
+ ```
+
+ **Use with from_docker_image():**
+
+ ```python
+ from envs.openspiel_env import OpenSpielEnv, OpenSpielAction
+
+ # Automatically starts container
+ env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
+
+ result = env.reset()
+ result = env.step(OpenSpielAction(action_id=0))
+
+ env.close()  # Stops container
+ ```
+
+ ## Game-Specific Information
+
+ ### 1. Catch
+ - **Type**: Single-player
+ - **Action Space**: 3 actions (left, stay, right)
+ - **Observation**: 5x5 grid flattened (25 dimensions)
+ - **Reward**: +1 for catching ball, 0 otherwise
+ - **Episode Length**: ~10 steps
+
+ ```python
+ env = OpenSpielEnv.from_docker_image("openspiel-env:latest")
+ # Or set OPENSPIEL_GAME=catch
+ ```
+
+ ### 2. Tic-Tac-Toe
+ - **Type**: 2-player turn-based, perfect information
+ - **Players**: Agent (X) vs Random Bot (O)
+ - **Action Space**: 9 positions
+ - **Observation**: 27 dimensions (3x3 board + game state)
+ - **Reward**: +1 win, -1 loss, 0 draw/mid-game
+
+ ```bash
+ # Set environment variable or run directly
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe openspiel-env:latest
+ ```
+
+ ### 3. Kuhn Poker
+ - **Type**: 2-player turn-based, imperfect information
+ - **Players**: Agent vs Random Bot
+ - **Action Space**: 2 actions (pass/fold, bet/call)
+ - **Observation**: 6 dimensions (card + betting history)
+ - **Reward**: Pot winnings (typically -1, 0, +1, +2)
+ - **Notes**: A standard benchmark for imperfect-information RL
+
+ ```bash
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker openspiel-env:latest
+ ```
+
+ ### 4. Cliff Walking
+ - **Type**: Single-player grid world
+ - **Action Space**: 4 actions (up, down, left, right)
+ - **Observation**: Position encoding
+ - **Reward**: -1 per step, -100 for falling off cliff
+ - **Notes**: Classic RL benchmark from Sutton & Barto
+
+ ```bash
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking openspiel-env:latest
+ ```
+
+ ### 5. 2048
+ - **Type**: Single-player puzzle
+ - **Action Space**: 4 actions (up, down, left, right)
+ - **Observation**: 4x4 grid with tile values
+ - **Reward**: Points from merging tiles
+ - **Notes**: Stochastic tile spawning
+
+ ```bash
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 openspiel-env:latest
+ ```
+
+ ### 6. Blackjack
+ - **Type**: Single-player vs dealer
+ - **Action Space**: 2 actions (HIT, STAND)
+ - **Observation**: Player hand + dealer's visible card
+ - **Reward**: +1 win, -1 loss, 0 draw
+ - **Notes**: Simplified version, no double/split
+
+ ```bash
+ docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack openspiel-env:latest
+ ```
+
+ ## Configuration
+
+ ### Environment Variables
+
+ - `OPENSPIEL_GAME`: Game name (default: "catch")
+ - `OPENSPIEL_AGENT_PLAYER`: Player ID for agent (default: 0)
+ - `OPENSPIEL_OPPONENT_POLICY`: Opponent policy for multi-player games
+   - `random`: Uniform random (default)
+   - `first`: Always picks first legal action
+   - `last`: Always picks last legal action
+
+ ### Example: Tic-Tac-Toe with Fixed Opponent
+
+ ```bash
+ docker run -p 8000:8000 \
+   -e OPENSPIEL_GAME=tic_tac_toe \
+   -e OPENSPIEL_OPPONENT_POLICY=first \
+   openspiel-env:latest
+ ```
+
+ ## API Reference
+
+ ### OpenSpielAction
+
+ ```python
+ @dataclass
+ class OpenSpielAction(Action):
+     action_id: int                                             # Action to take
+     game_name: str = "catch"                                   # Game name
+     game_params: Dict[str, Any] = field(default_factory=dict)  # Optional game parameters
+ ```
+
+ ### OpenSpielObservation
+
+ ```python
+ @dataclass
+ class OpenSpielObservation(Observation):
+     info_state: List[float]              # Agent's information state
+     legal_actions: List[int]             # Legal action IDs
+     game_phase: str                      # "initial", "playing", "terminal"
+     current_player_id: int               # Current player (-1 for simultaneous)
+     opponent_last_action: Optional[int]  # Last opponent action (if available)
+     done: bool                           # Episode finished
+     reward: Optional[float]              # Reward for last action
+ ```
+
+ ### OpenSpielState
+
+ ```python
+ @dataclass
+ class OpenSpielState(State):
+     episode_id: str       # Unique episode ID
+     step_count: int       # Number of steps
+     game_name: str        # Game name
+     agent_player: int     # Agent's player ID
+     opponent_policy: str  # Opponent policy name
+     num_players: int      # Total players
+ ```
+
+ ## Testing
+
+ ### Automated Testing (All 6 Games)
+
+ **Quick test of all games in Docker:**
+ ```bash
+ ./test_docker_all_games.sh
+ ```
+
+ This automated script will:
+ - Build and run Docker containers for each game
+ - Test reset, step, and state APIs
+ - Verify episode completion
+ - Report pass/fail for all 6 games
+
+ **Expected output:**
+ ```
+ ========================================
+ OpenSpiel Docker Integration Test
+ ========================================
+
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ Testing: catch
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ 🐳 Starting Docker container...
+ ⏳ Waiting for server to be ready...
+ ✓ Server ready (2s)
+ 🎮 Running Python client test...
+ ✓ PASSED - Episode completed successfully
+
+ [... tests all 6 games ...]
+
+ ========================================
+ Test Summary
+ ========================================
+
+ ✓ catch
+ ✓ tic_tac_toe
+ ✓ kuhn_poker
+ ✓ cliff_walking
+ ✓ 2048
+ ✓ blackjack
+
+ Total: 6 passed, 0 failed out of 6 games
+
+ ========================================
+ All tests PASSED! 🎉
+ ========================================
+ ```
+
+ ### Manual Testing
+
+ ```bash
+ # Local (requires OpenSpiel installed)
+ python -m pytest src/envs/openspiel_env/
+
+ # Docker build
+ docker build -f src/envs/openspiel_env/server/Dockerfile -t openspiel-env:latest .
+
+ # Run specific game
+ docker run -p 8000:8000 openspiel-env:latest
+
+ # Test from another terminal
+ python3 examples/openspiel_simple.py
+ ```
+
+ ## Development
+
+ ### Adding New Games
+
+ To add support for more OpenSpiel games:
+
+ 1. Verify the game works with `rl_environment.Environment`
+ 2. Test with different opponent policies if multi-player
+ 3. Document game-specific configuration
+ 4. Add example script
+
+ ## Limitations
+
+ - **Simultaneous-move games**: Only agent_player=0 supported
+ - **Multi-agent training**: Single agent only (no self-play yet)
+ - **Opponent policies**: Random and fixed only (no MCTS yet)
+ - **Build time**: Docker image takes ~5-10 minutes to build (compiles C++)
+
+ ## Future Work
+
+ - MCTS opponent policies
+ - Self-play support (multiple agents)
+ - More games (Chess, Go, Poker Hold'em)
+ - Faster build with pre-built OpenSpiel base image
+ - Game-specific reward shaping options
+
+ ## References
+
+ - [OpenSpiel Paper (2019)](https://arxiv.org/abs/1908.09453)
+ - [OpenSpiel GitHub](https://github.com/google-deepmind/open_spiel)
+ - [OpenSpiel Documentation](https://openspiel.readthedocs.io/)
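Reviewer note: the README's Configuration section names three values for `OPENSPIEL_OPPONENT_POLICY` (`random`, `first`, `last`), but the server-side implementation is not visible in this view. The selection rule it describes could look roughly like the sketch below. The function name `choose_opponent_action` and its signature are assumptions for illustration, not the code in `server/openspiel_environment.py`.

```python
import random
from typing import Optional, Sequence


def choose_opponent_action(
    policy: str,
    legal_actions: Sequence[int],
    rng: Optional[random.Random] = None,
) -> int:
    """Pick an opponent move from `legal_actions` according to the named policy."""
    if not legal_actions:
        raise ValueError("no legal actions available")
    if policy == "random":
        # Uniform choice; a seeded random.Random can be passed for reproducibility.
        chooser = rng if rng is not None else random
        return chooser.choice(list(legal_actions))
    if policy == "first":
        return legal_actions[0]
    if policy == "last":
        return legal_actions[-1]
    raise ValueError(f"unknown opponent policy: {policy!r}")
```

The two fixed policies are deterministic, which makes them convenient for regression tests, while `random` gives a weak but unbiased baseline opponent.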
__init__.py ADDED
@@ -0,0 +1,26 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ OpenSpiel Environment Integration.
+
+ This module provides integration between OpenSpiel games and the OpenEnv framework.
+ OpenSpiel (https://github.com/google-deepmind/open_spiel) is DeepMind's collection
+ of environments and algorithms for research in RL in games.
+
+ Supported games:
+ - Catch (1P)
+ - Tic-Tac-Toe (2P)
+ - Kuhn Poker (2P, imperfect info)
+ - Cliff Walking (1P)
+ - 2048 (1P)
+ - Blackjack (1P)
+ """
+
+ from .client import OpenSpielEnv
+ from .models import OpenSpielAction, OpenSpielObservation, OpenSpielState
+
+ __all__ = ["OpenSpielEnv", "OpenSpielAction", "OpenSpielObservation", "OpenSpielState"]
client.py ADDED
@@ -0,0 +1,117 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ OpenSpielEnv HTTP Client.
+
+ This module provides the client for connecting to an OpenSpiel Environment server
+ over HTTP.
+ """
+
+ from __future__ import annotations
+
+ from typing import Any, Dict, Optional, TYPE_CHECKING
+
+ from core.client_types import StepResult
+
+ from core.http_env_client import HTTPEnvClient
+
+ from .models import OpenSpielAction, OpenSpielObservation, OpenSpielState
+
+ if TYPE_CHECKING:
+     from core.containers.runtime import ContainerProvider
+
+
+ class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):
+     """
+     HTTP client for OpenSpiel Environment.
+
+     This client connects to an OpenSpielEnvironment HTTP server and provides
+     methods to interact with it: reset(), step(), and state access.
+
+     Example:
+         >>> # Connect to a running server
+         >>> client = OpenSpielEnv(base_url="http://localhost:8000")
+         >>> result = client.reset()
+         >>> print(result.observation.info_state)
+         >>>
+         >>> # Take an action
+         >>> result = client.step(OpenSpielAction(action_id=1, game_name="catch"))
+         >>> print(result.observation.reward)
+
+     Example with Docker:
+         >>> # Automatically start container and connect
+         >>> client = OpenSpielEnv.from_docker_image("openspiel-env:latest")
+         >>> result = client.reset()
+         >>> result = client.step(OpenSpielAction(action_id=0))
+     """
+
+     def _step_payload(self, action: OpenSpielAction) -> Dict[str, Any]:
+         """
+         Convert OpenSpielAction to JSON payload for step request.
+
+         Args:
+             action: OpenSpielAction instance.
+
+         Returns:
+             Dictionary representation suitable for JSON encoding.
+         """
+         return {
+             "action_id": action.action_id,
+             "game_name": action.game_name,
+             "game_params": action.game_params,
+         }
+
+     def _parse_result(
+         self, payload: Dict[str, Any]
+     ) -> StepResult[OpenSpielObservation]:
+         """
+         Parse server response into StepResult[OpenSpielObservation].
+
+         Args:
+             payload: JSON response from server.
+
+         Returns:
+             StepResult with OpenSpielObservation.
+         """
+         obs_data = payload.get("observation", {})
+
+         observation = OpenSpielObservation(
+             info_state=obs_data.get("info_state", []),
+             legal_actions=obs_data.get("legal_actions", []),
+             game_phase=obs_data.get("game_phase", "playing"),
+             current_player_id=obs_data.get("current_player_id", 0),
+             opponent_last_action=obs_data.get("opponent_last_action"),
+             done=payload.get("done", False),
+             reward=payload.get("reward"),
+             metadata=obs_data.get("metadata", {}),
+         )
+
+         return StepResult(
+             observation=observation,
+             reward=payload.get("reward"),
+             done=payload.get("done", False),
+         )
+
+     def _parse_state(self, payload: Dict[str, Any]) -> OpenSpielState:
+         """
+         Parse server response into OpenSpielState object.
+
+         Args:
+             payload: JSON response from /state endpoint.
+
+         Returns:
+             OpenSpielState object with environment state information.
+         """
+         return OpenSpielState(
+             episode_id=payload.get("episode_id"),
+             step_count=payload.get("step_count", 0),
+             game_name=payload.get("game_name", "unknown"),
+             agent_player=payload.get("agent_player", 0),
+             opponent_policy=payload.get("opponent_policy", "random"),
+             game_params=payload.get("game_params", {}),
+             num_players=payload.get("num_players", 1),
+         )
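Reviewer note: `_parse_result` reads `done` and `reward` from the top level of the response and everything else from the nested `observation` object, falling back to defaults for missing keys. The standalone sketch below mirrors that mapping so it can be exercised without the `core` package. `SketchObservation` and `parse_step_payload` are hypothetical stand-ins, and the sample payload in the usage note is invented for illustration rather than captured from a real server.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SketchObservation:
    """Hypothetical stand-in for OpenSpielObservation (no `core` dependency)."""

    info_state: List[float]
    legal_actions: List[int]
    game_phase: str = "playing"
    current_player_id: int = 0
    opponent_last_action: Optional[int] = None
    done: bool = False
    reward: Optional[float] = None


def parse_step_payload(payload: Dict[str, Any]) -> SketchObservation:
    """Map a step/reset JSON payload onto the observation, as the client does."""
    obs = payload.get("observation", {})
    return SketchObservation(
        info_state=obs.get("info_state", []),
        legal_actions=obs.get("legal_actions", []),
        game_phase=obs.get("game_phase", "playing"),
        current_player_id=obs.get("current_player_id", 0),
        opponent_last_action=obs.get("opponent_last_action"),
        # `done` and `reward` live at the top level of the payload.
        done=payload.get("done", False),
        reward=payload.get("reward"),
    )
```

For example, `parse_step_payload({"observation": {"legal_actions": [0, 1, 2]}, "done": False})` yields an observation with an empty `info_state`, the three legal actions, and `reward=None`.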
docker_issue.md ADDED
@@ -0,0 +1 @@
+ # port issue? fix proxy?
models.py ADDED
@@ -0,0 +1,76 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ Data models for OpenSpiel Environment.
+
+ This module defines the Action, Observation, and State types for OpenSpiel games.
+ """
+
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from typing import Any, Dict, List, Optional
+
+ from core.env_server import Action, Observation, State
+
+
+ @dataclass
+ class OpenSpielAction(Action):
+     """
+     Action for OpenSpiel environments.
+
+     Attributes:
+         action_id: The integer action ID to take (from legal_actions).
+         game_name: Name of the OpenSpiel game (e.g., "catch", "tic_tac_toe").
+         game_params: Optional game-specific parameters (e.g., {"rows": 8, "columns": 6}).
+     """
+     action_id: int
+     game_name: str = "catch"
+     game_params: Dict[str, Any] = field(default_factory=dict)
+
+
+ @dataclass
+ class OpenSpielObservation(Observation):
+     """
+     Observation from OpenSpiel environment.
+
+     This represents what the agent sees after taking an action.
+     For single-player games, this is straightforward.
+     For multi-player games, this is from the perspective of the agent player.
+
+     Attributes:
+         info_state: Information state tensor (list of floats) for the agent.
+             This contains all information available to the agent.
+         legal_actions: List of legal action IDs the agent can take.
+         game_phase: String describing the current phase (e.g., "playing", "terminal").
+         current_player_id: ID of the current player (-1 for simultaneous, player ID otherwise).
+         opponent_last_action: Last action taken by opponent (if available, None otherwise).
+     """
+     info_state: List[float]
+     legal_actions: List[int]
+     game_phase: str = "playing"
+     current_player_id: int = 0
+     opponent_last_action: Optional[int] = None
+
+
+ @dataclass
+ class OpenSpielState(State):
+     """
+     State for OpenSpiel environment.
+
+     Attributes:
+         game_name: Name of the OpenSpiel game.
+         agent_player: Which player ID the agent controls (0 by default).
+         opponent_policy: Name of the opponent policy ("random", "fixed", etc.).
+         game_params: Game-specific parameters.
+         num_players: Total number of players in the game.
+     """
+     game_name: str = "catch"
+     agent_player: int = 0
+     opponent_policy: str = "random"
+     game_params: Dict[str, Any] = field(default_factory=dict)
+     num_players: int = 1
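Reviewer note: both `OpenSpielAction` and `OpenSpielState` declare `game_params` with `field(default_factory=dict)` rather than a literal `{}`. The sketch below illustrates why. `SketchAction` is a hypothetical stand-in without the `core` base class, and `mutable_default_is_rejected` is a helper written only for this demonstration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SketchAction:
    """Hypothetical stand-in for OpenSpielAction."""

    action_id: int
    game_name: str = "catch"
    # Each instance gets its own fresh dict.
    game_params: Dict[str, Any] = field(default_factory=dict)


def mutable_default_is_rejected() -> bool:
    """Return True if @dataclass refuses a literal dict as a field default."""
    try:

        @dataclass
        class Broken:
            game_params: Dict[str, Any] = {}

    except ValueError:
        return True
    return False
```

With `default_factory`, mutating `game_params` on one action leaves every other action untouched, and the `dataclasses` module raises `ValueError` at class-definition time if a plain `{}` default is attempted.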
openenv.yaml ADDED
@@ -0,0 +1,7 @@
+ spec_version: 1
+ name: openspiel
+ type: space
+ runtime: fastapi
+ app: server.app:app
+ port: 8000
+
pyproject.toml ADDED
@@ -0,0 +1,51 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ [build-system]
+ requires = ["setuptools>=45", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "openenv-openspiel"
+ version = "0.1.0"
+ description = "OpenSpiel environment for OpenEnv"
+ requires-python = ">=3.10"
+ dependencies = [
+     # Core OpenEnv dependencies (required for server functionality)
+     # "openenv-core @ git+https://github.com/meta-pytorch/OpenEnv.git@main#subdirectory=src/core",
+     "openenv-core>=0.1.0",
+     "fastapi>=0.115.0",
+     "pydantic>=2.0.0",
+     "uvicorn>=0.24.0",
+     "requests>=2.31.0",
+     # Environment-specific dependencies
+     # Add all dependencies needed for your environment here
+     # Examples:
+     # "numpy>=1.19.0",
+     # "torch>=2.0.0",
+     # "gymnasium>=0.29.0",
+     # "openspiel>=1.0.0",
+     # "smolagents>=1.22.0,<2",
+ ]
+
+ [project.optional-dependencies]
+ dev = [
+     "pytest>=8.0.0",
+     "pytest-cov>=4.0.0",
+ ]
+
+ [project.scripts]
+ # Server entry point - enables running via: uv run --project . server
+ # or: python -m openspiel.server.app
+ server = "openspiel.server.app:main"
+
+ [tool.setuptools]
+ packages = ["openspiel", "openspiel.server"]
+ package-dir = { "openspiel" = ".", "openspiel.server" = "server" }
+
+ # [tool.setuptools.packages.find]
+ # where = ["."]
+
server/Dockerfile.openspiel-base ADDED
@@ -0,0 +1,65 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ # Pre-built OpenSpiel base image
+ # This image contains OpenSpiel compiled and ready to use
+ # Built from: docker build -t openspiel-base:latest -f src/envs/openspiel_env/server/Dockerfile.openspiel-base .
+ # In GitHub Actions, this is overridden to use the GHCR base image
+ ARG BASE_IMAGE=openenv-base:latest
+ FROM ${BASE_IMAGE}
+
+ # Avoid interactive prompts during build
+ ENV DEBIAN_FRONTEND=noninteractive
+ ENV TZ=UTC
+
+ # Install build dependencies (curl already installed by openenv-base)
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     build-essential \
+     clang \
+     cmake \
+     git \
+     sudo \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Set up OpenSpiel build directory
+ RUN mkdir /repo
+ WORKDIR /repo
+
+ # Clone OpenSpiel
+ RUN git clone https://github.com/google-deepmind/open_spiel.git .
+
+ # Run OpenSpiel's installation script (downloads C++ dependencies)
+ RUN ./install.sh
+
+ # Install Python dependencies
+ RUN pip3 install --no-cache-dir --upgrade setuptools testresources importlib_metadata
+ RUN pip3 install --no-cache-dir --upgrade -r requirements.txt cmake
+
+ # Build OpenSpiel with Python 3.11
+ # Use the exact same Python executable as the base image
+ RUN mkdir -p build
+ WORKDIR /repo/build
+ RUN cmake -DPython3_EXECUTABLE=/usr/local/bin/python3 -DCMAKE_CXX_COMPILER=$(which clang++) ../open_spiel
+ RUN make -j$(nproc) pyspiel
+
+ # Install OpenSpiel Python requirements
+ WORKDIR /repo
+ RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
+
+ # Set Python path for OpenSpiel
+ ENV PYTHONPATH=/repo:/repo/build/python:${PYTHONPATH}
+
+ # Test OpenSpiel import to verify ABI compatibility
+ RUN python3 -c "import pyspiel; print('OpenSpiel import successful')" || echo "OpenSpiel import failed"
+
+ # Clean up build dependencies to reduce image size
+ RUN apt-get remove -y build-essential clang cmake git sudo || true && \
+     apt-get autoremove -y && \
+     apt-get clean && \
+     rm -rf /var/lib/apt/lists/*
+
+ # Set working directory back to /app (standard for openenv-base)
+ WORKDIR /app
server/__init__.py ADDED
@@ -0,0 +1,7 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """Server-side implementation for OpenSpiel environments."""
server/app.py ADDED
@@ -0,0 +1,81 @@
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
+ # All rights reserved.
+ #
+ # This source code is licensed under the BSD-style license found in the
+ # LICENSE file in the root directory of this source tree.
+
+ """
+ FastAPI application for the OpenSpiel Environment.
+
+ This module creates an HTTP server that exposes OpenSpiel games
+ over HTTP endpoints, making them compatible with HTTPEnvClient.
+
+ Usage:
+     # Development (with auto-reload):
+     uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
+
+     # Production:
+     uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
+
+     # Or run directly:
+     python -m server.app
+
+ Environment variables:
+     OPENSPIEL_GAME: Game name to serve (default: "catch")
+     OPENSPIEL_AGENT_PLAYER: Agent player ID (default: 0)
+     OPENSPIEL_OPPONENT_POLICY: Opponent policy (default: "random")
+ """
+
+ import os
+
+ try:
+     from openenv_core.env_server.http_server import create_app
+ except Exception as e:  # pragma: no cover
+     raise ImportError("openenv_core is required for the web interface. Install dependencies with '\n uv sync\n'") from e
+
+ from .openspiel_environment import OpenSpielEnvironment
+ from models import OpenSpielAction, OpenSpielObservation
+
+ # Get game configuration from environment variables
+ game_name = os.getenv("OPENSPIEL_GAME", "catch")
+ agent_player = int(os.getenv("OPENSPIEL_AGENT_PLAYER", "0"))
+ opponent_policy = os.getenv("OPENSPIEL_OPPONENT_POLICY", "random")
+
+ # Create the environment instance
+ env = OpenSpielEnvironment(
+     game_name=game_name,
+     agent_player=agent_player,
+     opponent_policy=opponent_policy,
+ )
+
+ # Create the FastAPI app with web interface and README integration
+ app = create_app(env, OpenSpielAction, OpenSpielObservation, env_name="openspiel")
+
+ def main(host: str = "0.0.0.0", port: int = 8000):
+     """
+     Entry point for direct execution via uv run or python -m.
+
+     This function enables running the server without Docker:
+         uv run --project . server
+         uv run --project . server --port 8001
+         python -m server.app
+
+     Args:
+         host: Host address to bind to (default: "0.0.0.0")
+         port: Port number to listen on (default: 8000)
+
+     For production deployments, consider using uvicorn directly with
+     multiple workers:
+         uvicorn openspiel.server.app:app --workers 4
+     """
+     import uvicorn
+
+     uvicorn.run(app, host=host, port=port)
+
+ if __name__ == "__main__":
+     import argparse
+
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--port", type=int, default=8000)
+     args = parser.parse_args()
+     main(port=args.port)
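Reviewer note: `server/app.py` reads its three settings from the process environment at import time, which makes them awkward to unit test. One possible refactor, shown purely as a sketch, gathers the same variables and defaults into a function that accepts any mapping. `ServerConfig` and `load_config` are hypothetical names and are not part of this commit.

```python
import os
from typing import Mapping, NamedTuple


class ServerConfig(NamedTuple):
    """Settings the server derives from environment variables."""

    game_name: str
    agent_player: int
    opponent_policy: str


def load_config(environ: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the OPENSPIEL_* variables, applying the same defaults as app.py."""
    return ServerConfig(
        game_name=environ.get("OPENSPIEL_GAME", "catch"),
        # int() raises ValueError on a non-numeric player ID, as in app.py.
        agent_player=int(environ.get("OPENSPIEL_AGENT_PLAYER", "0")),
        opponent_policy=environ.get("OPENSPIEL_OPPONENT_POLICY", "random"),
    )
```

Passing a plain dict, as in `load_config({"OPENSPIEL_GAME": "kuhn_poker"})`, exercises the parsing without touching the real environment.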
server/build_docker.sh ADDED
@@ -0,0 +1,69 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/bin/bash
2
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
3
+ # All rights reserved.
4
+ #
5
+ # This source code is licensed under the BSD-style license found in the
6
+ # LICENSE file in the root directory of this source tree.
7
+
8
+ # Script to build the OpenSpiel environment Docker image
9
+ # Usage: ./build_docker.sh [tag]
10
+ #
11
+ # Note: Requires envtorch-base:latest to be built first.
12
+ # See: src/core/containers/images/README.md
13
+
14
+ set -e
15
+
16
+ TAG="${1:-latest}"
17
+ IMAGE_NAME="openspiel-env:${TAG}"
18
+
19
+ echo "๐Ÿณ Building OpenSpiel Environment Docker Image"
20
+ echo "================================================"
21
+ echo "Image: $IMAGE_NAME"
22
+ echo ""
23
+
24
+ # Get script directory
25
+ SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
26
+
27
+ # Navigate to OpenEnv root (4 levels up from server/)
28
+ OPENENV_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)"
29
+
30
+ echo "📁 OpenEnv root: $OPENENV_ROOT"
31
+ echo ""
32
+
33
+ # Build OpenSpiel environment image
34
+ # Note: Docker will automatically pull the base image (Dockerfile default: ghcr.io/meta-pytorch/openenv-openspiel-base:latest) if needed
35
+ echo "⏳ Building (this may take 5-10 minutes due to OpenSpiel compilation)..."
36
+ # Test the build inline: under `set -e` a separate `$?` check would never see a failure
37
+ if docker build \
38
+ -f "$SCRIPT_DIR/Dockerfile" \
39
+ -t "$IMAGE_NAME" \
40
+ "$OPENENV_ROOT"
41
+ then
42
+ echo ""
43
+ echo "✅ Build successful!"
44
+ echo ""
45
+ echo "🚀 Run with different games:"
46
+ echo ""
47
+ echo " # Catch (default)"
48
+ echo " docker run -p 8000:8000 $IMAGE_NAME"
49
+ echo ""
50
+ echo " # Tic-Tac-Toe"
51
+ echo " docker run -p 8000:8000 -e OPENSPIEL_GAME=tic_tac_toe $IMAGE_NAME"
52
+ echo ""
53
+ echo " # Kuhn Poker"
54
+ echo " docker run -p 8000:8000 -e OPENSPIEL_GAME=kuhn_poker $IMAGE_NAME"
55
+ echo ""
56
+ echo " # Cliff Walking"
57
+ echo " docker run -p 8000:8000 -e OPENSPIEL_GAME=cliff_walking $IMAGE_NAME"
58
+ echo ""
59
+ echo " # 2048"
60
+ echo " docker run -p 8000:8000 -e OPENSPIEL_GAME=2048 $IMAGE_NAME"
61
+ echo ""
62
+ echo " # Blackjack"
63
+ echo " docker run -p 8000:8000 -e OPENSPIEL_GAME=blackjack $IMAGE_NAME"
64
+ echo ""
65
+ else
66
+ echo ""
67
+ echo "❌ Build failed!"
68
+ exit 1
69
+ fi
server/openspiel_environment.py ADDED
@@ -0,0 +1,267 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """
8
+ OpenSpiel Environment Server Implementation.
9
+
10
+ This module wraps OpenSpiel's rl_environment.Environment and exposes it
11
+ via the OpenEnv Environment interface.
12
+ """
13
+
14
+ import uuid
15
+ from typing import Any, Dict
16
+
17
+ from openenv_core.env_server.interfaces import Environment
18
+ from openenv_core.env_server.types import Action, Observation, State
19
+
20
+ from ..models import OpenSpielAction, OpenSpielObservation, OpenSpielState
21
+ from .opponent_policies import get_opponent_policy, OpponentPolicy
22
+
23
+ # Import OpenSpiel
24
+ try:
25
+ from open_spiel.python import rl_environment
26
+ import pyspiel
27
+ except ImportError as e:
28
+ raise ImportError(
29
+ "OpenSpiel is not installed. "
30
+ "Please install it following instructions at: "
31
+ "https://github.com/google-deepmind/open_spiel"
32
+ ) from e
33
+
34
+
35
+ class OpenSpielEnvironment(Environment):
36
+ """
37
+ OpenSpiel Environment wrapper for OpenEnv.
38
+
39
+ This environment wraps OpenSpiel games and provides a single-agent interface.
40
+ For multi-player games, the agent controls one player while opponent(s) use
41
+ a fixed policy (e.g., random).
42
+
43
+ Supported games:
44
+ - Single-player: catch, cliff_walking, 2048, blackjack
45
+ - Multi-player: tic_tac_toe, kuhn_poker
46
+
47
+ Args:
48
+ game_name: Name of the OpenSpiel game (e.g., "catch", "tic_tac_toe").
49
+ agent_player: Which player ID the agent controls (default 0).
50
+ opponent_policy: Policy for opponent players ("random", "first", etc.).
51
+ game_params: Optional game-specific parameters.
52
+
53
+ Example:
54
+ >>> env = OpenSpielEnvironment("catch")
55
+ >>> obs = env.reset()
56
+ >>> print(obs.info_state) # Agent's observation
57
+ >>> obs = env.step(OpenSpielAction(action_id=1))
58
+ >>> print(obs.reward)
59
+ """
60
+
61
+ def __init__(
62
+ self,
63
+ game_name: str = "catch",
64
+ agent_player: int = 0,
65
+ opponent_policy: str = "random",
66
+ game_params: Dict[str, Any] | None = None,
67
+ ):
68
+ """Initialize OpenSpiel environment."""
69
+ super().__init__()
70
+
71
+ self.game_name = game_name
72
+ self.agent_player = agent_player
73
+ self.game_params = game_params or {}
74
+
75
+ # Create OpenSpiel environment
76
+ try:
77
+ self._ospiel_env = rl_environment.Environment(
78
+ game_name, **self.game_params
79
+ )
80
+ except Exception as e:
81
+ raise ValueError(
82
+ f"Failed to create OpenSpiel game '{game_name}': {e}"
83
+ ) from e
84
+
85
+ self.num_players = self._ospiel_env.num_players
86
+ self.is_turn_based = self._ospiel_env.is_turn_based
87
+
88
+ # Validate agent_player
89
+ if not 0 <= agent_player < self.num_players:
90
+ raise ValueError(
91
+ f"agent_player={agent_player} must be in [0, {self.num_players})"
92
+ )
93
+
94
+ # Set up opponent policy for multi-player games
95
+ self.opponent_policy_fn: OpponentPolicy | None = None
96
+ if self.num_players > 1:
97
+ self.opponent_policy_fn = get_opponent_policy(opponent_policy)
98
+
99
+ # Initialize state
100
+ self._state = OpenSpielState(
101
+ game_name=game_name,
102
+ agent_player=agent_player,
103
+ opponent_policy=opponent_policy,
104
+ game_params=self.game_params,
105
+ num_players=self.num_players,
106
+ )
107
+
108
+ # Track last opponent action for learning
109
+ self._last_opponent_action: int | None = None
110
+
111
+ def reset(self) -> Observation:
112
+ """
113
+ Reset the environment and return initial observation.
114
+
115
+ For multi-player games, this will autoplay opponent turns until
116
+ it's the agent's turn (or terminal state).
117
+
118
+ Returns:
119
+ Initial observation for the agent.
120
+ """
121
+ # Reset OpenSpiel environment
122
+ time_step = self._ospiel_env.reset()
123
+
124
+ # Reset state tracking
125
+ self._state.episode_id = str(uuid.uuid4())
126
+ self._state.step_count = 0
127
+ self._last_opponent_action = None
128
+
129
+ # Autoplay opponent turns until agent's turn
130
+ time_step = self._auto_play_opponents(time_step)
131
+
132
+ # Convert to OpenEnv observation
133
+ return self._make_observation(time_step)
134
+
135
+ def step(self, action: Action) -> Observation:
136
+ """
137
+ Execute agent's action and return resulting observation.
138
+
139
+ For multi-player games, this will:
140
+ 1. Apply the agent's action
141
+ 2. Autoplay opponent turns until it's the agent's turn again
142
+ 3. Return the observation from the agent's perspective
143
+
144
+ Args:
145
+ action: OpenSpielAction containing the action_id to execute.
146
+
147
+ Returns:
148
+ Observation after action execution (and opponent turns if multi-player).
149
+
150
+ Raises:
151
+ ValueError: If action is not an OpenSpielAction.
152
+ """
153
+ if not isinstance(action, OpenSpielAction):
154
+ raise ValueError(f"Expected OpenSpielAction, got {type(action)}")
155
+
156
+ # Apply agent's action
157
+ if self.is_turn_based:
158
+ # Turn-based: single action
159
+ time_step = self._ospiel_env.step([action.action_id])
160
+ else:
161
+ # Simultaneous-move: need actions for all players
162
+ # For now, only support agent as player 0 in simultaneous games
163
+ if self.agent_player != 0:
164
+ raise NotImplementedError(
165
+ "Simultaneous-move games only support agent_player=0"
166
+ )
167
+ time_step = self._ospiel_env.get_time_step()  # current TimeStep, needed for opponents' legal actions
168
+ opponent_actions = []
169
+ for player_id in range(self.num_players):
170
+ if player_id == self.agent_player:
171
+ opponent_actions.append(action.action_id)
172
+ else:
173
+ legal_actions = time_step.observations["legal_actions"][player_id]
174
+ opp_action = self.opponent_policy_fn.select_action(
175
+ legal_actions, time_step.observations
176
+ )
177
+ opponent_actions.append(opp_action)
178
+ time_step = self._ospiel_env.step(opponent_actions)
179
+
180
+ self._state.step_count += 1
181
+
182
+ # Autoplay opponent turns (for turn-based games)
183
+ if self.is_turn_based:
184
+ time_step = self._auto_play_opponents(time_step)
185
+
186
+ # Convert to OpenEnv observation
187
+ return self._make_observation(time_step)
188
+
189
+ @property
190
+ def state(self) -> OpenSpielState:
191
+ """Get current environment state."""
192
+ return self._state
193
+
194
+ def _auto_play_opponents(self, time_step) -> Any:
195
+ """
196
+ Autoplay opponent turns until it's the agent's turn or game is terminal.
197
+
198
+ Args:
199
+ time_step: Current TimeStep from OpenSpiel environment.
200
+
201
+ Returns:
202
+ Updated TimeStep after opponent moves.
203
+ """
204
+ # Single-player games: nothing to do
205
+ if self.num_players == 1:
206
+ return time_step
207
+
208
+ # Multi-player games: play opponent turns
209
+ while (
210
+ not time_step.last()
211
+ and time_step.observations["current_player"] != self.agent_player
212
+ ):
213
+ current_player = time_step.observations["current_player"]
214
+ legal_actions = time_step.observations["legal_actions"][current_player]
215
+
216
+ # Select opponent action
217
+ opp_action = self.opponent_policy_fn.select_action(
218
+ legal_actions, time_step.observations
219
+ )
220
+ self._last_opponent_action = opp_action
221
+
222
+ # Apply opponent action
223
+ time_step = self._ospiel_env.step([opp_action])
224
+ self._state.step_count += 1
225
+
226
+ return time_step
227
+
228
+ def _make_observation(self, time_step) -> OpenSpielObservation:
229
+ """
230
+ Convert OpenSpiel TimeStep to OpenEnv Observation.
231
+
232
+ Args:
233
+ time_step: OpenSpiel TimeStep object.
234
+
235
+ Returns:
236
+ OpenSpielObservation for the agent.
237
+ """
238
+ # Extract agent's information
239
+ info_state = time_step.observations["info_state"][self.agent_player]
240
+ legal_actions = time_step.observations["legal_actions"][self.agent_player]
241
+ current_player_id = time_step.observations["current_player"]
242
+
243
+ # Determine game phase
244
+ if time_step.last():
245
+ game_phase = "terminal"
246
+ elif time_step.first():
247
+ game_phase = "initial"
248
+ else:
249
+ game_phase = "playing"
250
+
251
+ # Get reward for agent
252
+ reward = None
253
+ if time_step.rewards is not None:
254
+ reward = float(time_step.rewards[self.agent_player])
255
+
256
+ # Create observation
257
+ obs = OpenSpielObservation(
258
+ info_state=info_state.tolist() if hasattr(info_state, "tolist") else list(info_state),
259
+ legal_actions=legal_actions,
260
+ game_phase=game_phase,
261
+ current_player_id=current_player_id,
262
+ opponent_last_action=self._last_opponent_action,
263
+ done=time_step.last(),
264
+ reward=reward,
265
+ )
266
+
267
+ return obs
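The autoplay behaviour described in `reset()`, `step()`, and `_auto_play_opponents()` can be illustrated without OpenSpiel installed. The sketch below is a toy under stated assumptions: `ToyTurnGame` and the free function `auto_play_opponents` are hypothetical stand-ins for illustration only, and they do not use the `rl_environment` API.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyTurnGame:
    """Hypothetical turn-based game: players alternate until max_moves moves are made."""
    num_players: int = 2
    max_moves: int = 5
    current_player: int = 0
    moves: list = field(default_factory=list)

    def is_terminal(self) -> bool:
        return len(self.moves) >= self.max_moves

    def legal_actions(self) -> list:
        return [0, 1, 2]

    def apply(self, action: int) -> None:
        # Record the move, then pass the turn to the next player.
        self.moves.append((self.current_player, action))
        self.current_player = (self.current_player + 1) % self.num_players


def auto_play_opponents(game: ToyTurnGame, agent_player: int, rng: random.Random) -> int:
    """Advance the game until it is the agent's turn or the game is over.

    Returns the number of opponent moves that were played.
    """
    played = 0
    while not game.is_terminal() and game.current_player != agent_player:
        game.apply(rng.choice(game.legal_actions()))
        played += 1
    return played


game = ToyTurnGame()
rng = random.Random(0)
# The agent is player 1, so the opponent (player 0) moves first after a reset.
opening = auto_play_opponents(game, agent_player=1, rng=rng)
game.apply(2)  # the agent's own move
reply = auto_play_opponents(game, agent_player=1, rng=rng)
```

The design choice this mirrors is that the agent only ever sees states in which it is to move (or the terminal state), which is what lets a multi-player game look like a single-agent environment.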
server/opponent_policies.py ADDED
@@ -0,0 +1,90 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """
8
+ Opponent policies for multi-player OpenSpiel games.
9
+
10
+ These policies are used to control non-agent players in multi-player games,
11
+ allowing single-agent RL training against fixed or adaptive opponents.
12
+ """
13
+
14
+ import random
15
+ from typing import Any, Protocol
16
+
17
+
18
+ class OpponentPolicy(Protocol):
19
+ """Protocol for opponent policies."""
20
+
21
+ def select_action(self, legal_actions: list[int], observations: dict[str, Any]) -> int:
22
+ """
23
+ Select an action for the opponent.
24
+
25
+ Args:
26
+ legal_actions: List of legal action IDs.
27
+ observations: Current observations from the environment.
28
+
29
+ Returns:
30
+ Selected action ID.
31
+ """
32
+ ...
33
+
34
+
35
+ class RandomOpponent:
36
+ """Random opponent that selects uniformly from legal actions."""
37
+
38
+ def select_action(self, legal_actions: list[int], observations: dict[str, Any]) -> int:
39
+ """Select a random legal action."""
40
+ if not legal_actions:
41
+ raise ValueError("No legal actions available")
42
+ return random.choice(legal_actions)
43
+
44
+
45
+ class FixedActionOpponent:
46
+ """Opponent that always selects the same action (e.g., first legal action)."""
47
+
48
+ def __init__(self, action_selector: str = "first"):
49
+ """
50
+ Initialize fixed action opponent.
51
+
52
+ Args:
53
+ action_selector: Which action to select ("first", "last", "middle").
54
+ """
55
+ self.action_selector = action_selector
56
+
57
+ def select_action(self, legal_actions: list[int], observations: dict[str, Any]) -> int:
58
+ """Select a fixed legal action based on selector."""
59
+ if not legal_actions:
60
+ raise ValueError("No legal actions available")
61
+
62
+ if self.action_selector == "first":
63
+ return legal_actions[0]
64
+ elif self.action_selector == "last":
65
+ return legal_actions[-1]
66
+ elif self.action_selector == "middle":
67
+ return legal_actions[len(legal_actions) // 2]
68
+ else:
69
+ return legal_actions[0]
70
+
71
+
72
+ def get_opponent_policy(policy_name: str) -> OpponentPolicy:
73
+ """
74
+ Get an opponent policy by name.
75
+
76
+ Args:
77
+ policy_name: Name of the policy ("random", "first", "last", "middle").
78
+
79
+ Returns:
80
+ OpponentPolicy instance.
81
+
82
+ Raises:
83
+ ValueError: If policy_name is not recognized.
84
+ """
85
+ if policy_name == "random":
86
+ return RandomOpponent()
87
+ elif policy_name in ("first", "last", "middle"):
88
+ return FixedActionOpponent(action_selector=policy_name)
89
+ else:
90
+ raise ValueError(f"Unknown opponent policy: {policy_name}")
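Because `OpponentPolicy` is a `Protocol`, any object with a matching `select_action` method can act as an opponent without subclassing. The sketch below redeclares the protocol so that it runs on its own; `SeededRandomOpponent` and `play` are hypothetical names, not part of the committed module, and show one way a reproducible random opponent might look.

```python
import random
from typing import Any, Protocol


class OpponentPolicy(Protocol):
    """Local copy of the protocol so this sketch is self-contained."""

    def select_action(self, legal_actions: list[int], observations: dict[str, Any]) -> int: ...


class SeededRandomOpponent:
    """Hypothetical policy: uniform choice drawn from a private, seeded RNG."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)

    def select_action(self, legal_actions: list[int], observations: dict[str, Any]) -> int:
        if not legal_actions:
            raise ValueError("No legal actions available")
        return self._rng.choice(legal_actions)


def play(policy: OpponentPolicy, legal_actions: list[int], n: int) -> list[int]:
    """Ask the policy for n consecutive actions with an empty observation dict."""
    return [policy.select_action(legal_actions, {}) for _ in range(n)]


# Two opponents built from the same seed pick the same sequence of actions.
first_run = play(SeededRandomOpponent(seed=42), [0, 1, 2], 5)
second_run = play(SeededRandomOpponent(seed=42), [0, 1, 2], 5)
```

Wiring such a policy into `get_opponent_policy` would need an extra branch there; the sketch leaves that function untouched.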
server/prepare_hf.sh ADDED
@@ -0,0 +1,28 @@
1
+ #!/bin/bash
2
+ # Custom HF deployment script for openspiel_env
3
+ # OpenSpiel uses a different base image with C++ compilation
4
+
5
+ set -e
6
+
7
+ DOCKERFILE_PATH="$1"
8
+ BASE_IMAGE_REF="$2"
9
+
10
+ echo "OpenSpiel: Using custom Dockerfile preparation"
11
+
12
+ # Cross-platform sed in-place editing
13
+ sed_inplace() {
14
+ if sed --version >/dev/null 2>&1; then
15
+ # GNU sed (Linux)
16
+ sed -i "$@"
17
+ else
18
+ # BSD sed (macOS)
19
+ sed -i '' "$@"
20
+ fi
21
+ }
22
+
23
+ # Pin the base image by rewriting the default of the Dockerfile's `ARG BASE_IMAGE`
24
+ sed_inplace 's|^ARG BASE_IMAGE=.*|ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-openspiel-base:sha-e622c7e|' "$DOCKERFILE_PATH"
25
+ # Both `FROM ${BASE_IMAGE}` stages pick up the pinned default, so no FROM line is edited
26
+
27
+ echo "OpenSpiel: Modified Dockerfile to use GHCR OpenSpiel base image"
28
+ echo "OpenSpiel builds can take 10-15 minutes due to C++ compilation"
test_docker_all_games.sh ADDED
@@ -0,0 +1,152 @@
1
+ #!/bin/bash
2
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
3
+ # All rights reserved.
4
+ #
5
+ # This source code is licensed under the BSD-style license found in the
6
+ # LICENSE file in the root directory of this source tree.
7
+
8
+ # Automated test script for all OpenSpiel games in Docker
9
+ # Usage: ./test_docker_all_games.sh
10
+
11
+ set -e
12
+
13
+ # Colors for output
14
+ GREEN='\033[0;32m'
15
+ RED='\033[0;31m'
16
+ YELLOW='\033[1;33m'
17
+ BLUE='\033[0;34m'
18
+ NC='\033[0m' # No Color
19
+
20
+ # Configuration
21
+ IMAGE_NAME="openspiel-env:latest"
22
+ CONTAINER_NAME="openspiel-test"
23
+ PORT=8000
24
+ HEALTH_CHECK_URL="http://localhost:${PORT}/health"
25
+ MAX_WAIT=30
26
+
27
+ # Games to test
28
+ GAMES=("catch" "tic_tac_toe" "kuhn_poker" "cliff_walking" "2048" "blackjack")
29
+
30
+ # Results tracking
31
+ declare -a RESULTS
32
+ PASSED=0
33
+ FAILED=0
34
+
35
+ echo -e "${BLUE}========================================${NC}"
36
+ echo -e "${BLUE}OpenSpiel Docker Integration Test${NC}"
37
+ echo -e "${BLUE}========================================${NC}"
38
+ echo ""
39
+
40
+ # Function to cleanup containers
41
+ cleanup() {
42
+ echo -e "${YELLOW}Cleaning up containers...${NC}"
43
+ docker stop ${CONTAINER_NAME} 2>/dev/null || true
44
+ docker rm ${CONTAINER_NAME} 2>/dev/null || true
45
+ }
46
+
47
+ # Function to wait for server health
48
+ wait_for_health() {
49
+ local game=$1
50
+ echo -e " ⏳ Waiting for server to be ready..."
51
+
52
+ for i in $(seq 1 $MAX_WAIT); do
53
+ if curl -s -f ${HEALTH_CHECK_URL} > /dev/null 2>&1; then
54
+ echo -e " ${GREEN}✓${NC} Server ready (${i}s)"
55
+ return 0
56
+ fi
57
+ sleep 1
58
+ done
59
+
60
+ echo -e " ${RED}✗${NC} Server health check failed after ${MAX_WAIT}s"
61
+ return 1
62
+ }
63
+
64
+ # Function to test a game
65
+ test_game() {
66
+ local game=$1
67
+ echo -e "\n${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
68
+ echo -e "${BLUE}Testing: ${game}${NC}"
69
+ echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
70
+
71
+ # Stop any existing container
72
+ cleanup
73
+
74
+ # Start container with game
75
+ echo -e " 🐳 Starting Docker container..."
76
+ docker run -d \
77
+ --name ${CONTAINER_NAME} \
78
+ -p ${PORT}:8000 \
79
+ -e OPENSPIEL_GAME=${game} \
80
+ ${IMAGE_NAME} > /dev/null
81
+
82
+ # Wait for server to be ready
83
+ if ! wait_for_health ${game}; then
84
+ echo -e " ${RED}✗ FAILED${NC} - Server did not start"
85
+ RESULTS+=("${game}:FAILED:Server did not start")
86
+ FAILED=$((FAILED + 1))
87
+ cleanup
88
+ return 1
89
+ fi
90
+
91
+ # Run Python client test
92
+ echo -e " 🎮 Running Python client test..."
93
+ if NO_PROXY=localhost,127.0.0.1 HTTP_PROXY= HTTPS_PROXY= \
94
+ PYTHONPATH=$PWD/src:$PYTHONPATH \
95
+ python3 examples/openspiel_simple.py > /tmp/test_${game}.log 2>&1; then
96
+
97
+ # Check if episode completed successfully
98
+ if grep -q "Episode finished!" /tmp/test_${game}.log; then
99
+ echo -e " ${GREEN}✓ PASSED${NC} - Episode completed successfully"
100
+ RESULTS+=("${game}:PASSED")
101
+ PASSED=$((PASSED + 1))
102
+ else
103
+ echo -e " ${RED}✗ FAILED${NC} - Episode did not complete"
104
+ RESULTS+=("${game}:FAILED:Episode incomplete")
105
+ FAILED=$((FAILED + 1))
106
+ fi
107
+ else
108
+ echo -e " ${RED}✗ FAILED${NC} - Python client error"
109
+ RESULTS+=("${game}:FAILED:Client error")
110
+ FAILED=$((FAILED + 1))
111
+ fi
112
+
113
+ # Cleanup
114
+ cleanup
115
+ }
116
+
117
+ # Run tests for all games
118
+ for game in "${GAMES[@]}"; do
119
+ test_game "${game}" || true  # under set -e, keep going so the summary covers every game
120
+ done
121
+
122
+ # Print summary
123
+ echo -e "\n${BLUE}========================================${NC}"
124
+ echo -e "${BLUE}Test Summary${NC}"
125
+ echo -e "${BLUE}========================================${NC}"
126
+ echo ""
127
+
128
+ for result in "${RESULTS[@]}"; do
129
+ IFS=':' read -r game status message <<< "$result"
130
+ if [ "$status" == "PASSED" ]; then
131
+ echo -e " ${GREEN}✓${NC} ${game}"
132
+ else
133
+ echo -e " ${RED}✗${NC} ${game} - ${message}"
134
+ fi
135
+ done
136
+
137
+ echo ""
138
+ echo -e "Total: ${PASSED} passed, ${FAILED} failed out of ${#GAMES[@]} games"
139
+ echo ""
140
+
141
+ # Exit with appropriate code
142
+ if [ $FAILED -eq 0 ]; then
143
+ echo -e "${GREEN}========================================${NC}"
144
+ echo -e "${GREEN}All tests PASSED! 🎉${NC}"
145
+ echo -e "${GREEN}========================================${NC}"
146
+ exit 0
147
+ else
148
+ echo -e "${RED}========================================${NC}"
149
+ echo -e "${RED}Some tests FAILED${NC}"
150
+ echo -e "${RED}========================================${NC}"
151
+ exit 1
152
+ fi
uv.lock ADDED
The diff for this file is too large to render. See raw diff