Coding-tuned · Anthropic & OpenAI compatible · HuggingFace Spaces
export ANTHROPIC_BASE_URL=https://YOUR-USER-space-name.hf.space
export ANTHROPIC_API_KEY=gemma4-local
claude --model gemma-4-26b
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-SPACE.hf.space/v1",
    api_key="gemma4-local",
)
r = client.chat.completions.create(
    model="gemma-4-26b",
    messages=[{"role": "user", "content": "write binary search"}],
)
print(r.choices[0].message.content)  # text of the first completion
curl https://YOUR-SPACE.hf.space/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-26b",
    "messages": [
      {"role": "user", "content": "hello"}
    ]
  }'
/health returns model_loaded: false until the model is ready. Subsequent restarts load from disk in ~60 s.
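Since requests sent before the model is loaded are unlikely to succeed, a client may want to poll /health first. The following is a minimal sketch, assuming /health returns a JSON object with a boolean `model_loaded` field. The helper names (`fetch_health`, `wait_until_loaded`), the timeout, and the polling interval are illustrative and not part of the Space.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET <base_url>/health and return the parsed JSON body (assumed to be an object)."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_until_loaded(base_url, fetch=fetch_health, timeout_s=900.0,
                      interval_s=5.0, sleep=time.sleep):
    """Poll /health until model_loaded is true; return False if timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if fetch(base_url).get("model_loaded") is True:
                return True
        except (OSError, ValueError):
            # The Space may still be booting (connection error) or may have
            # returned a non-JSON page; treat both as "not ready yet".
            pass
        sleep(interval_s)
    return False


# Usage (performs real network calls):
#   if wait_until_loaded("https://YOUR-SPACE.hf.space"):
#       print("model is ready")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.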
| Method | Path | Notes |
|---|---|---|
| GET | /health | Status + model_loaded |
| GET | /v1/models | Model list (OpenAI) |
| POST | /v1/chat/completions | OpenAI-compatible · streaming supported |
| POST | /v1/messages | Anthropic-compatible · used by Claude Code |
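
The /v1/messages route can also be called directly, without Claude Code. The sketch below builds a request in the shape of the public Anthropic Messages API (`model`, `max_tokens`, `messages`, plus the `x-api-key` and `anthropic-version` headers). Whether this Space requires or checks those headers is an assumption, and the helper names `build_messages_request` and `send` are hypothetical.

```python
import json
import urllib.request


def build_messages_request(base_url, prompt, model="gemma-4-26b",
                           api_key="gemma4-local", max_tokens=256):
    """Build (but do not send) a POST request for <base_url>/v1/messages."""
    payload = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Anthropic Messages API
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,               # assumption: same placeholder key
            "anthropic-version": "2023-06-01",  # assumption: may be ignored by the Space
        },
        method="POST",
    )


def send(req):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)


# Usage (performs a real network call):
#   reply = send(build_messages_request("https://YOUR-SPACE.hf.space", "hello"))
#   print(reply["content"][0]["text"])  # assumes an Anthropic-style response body
```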