Available Models
Returns a list of all Eburon AI models currently deployed on this VPS. Model names are aliased to precious metals for privacy.
/api/tags
Response
models array
Array of model objects
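A minimal sketch of consuming the response, assuming the documented shape (a "models" array of objects, each with at least a "name" field). The sample payload below is illustrative, not real output from this VPS:

```python
# Sample /api/tags-style response; sizes here are made-up placeholders.
sample = {
    "models": [
        {"name": "qwen3.5:latest", "size": 4661211808},
        {"name": "embeddinggemma:latest", "size": 622571713},
    ]
}

def model_names(payload: dict) -> list[str]:
    """Return the name of each deployed model."""
    return [m["name"] for m in payload.get("models", [])]

print(model_names(sample))  # ['qwen3.5:latest', 'embeddinggemma:latest']
```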
Generate Completion
Generate a text completion from a prompt. Use for single-turn tasks like reasoning, writing, or code generation.
/api/generate
Parameters
model required
Selected model name
prompt required
The prompt to generate from
system string
System prompt override
stream boolean
Stream tokens as they're generated
options object
temperature, top_p, top_k, num_predict (max tokens), seed
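A sketch of assembling a request body from the parameters above. The option values are illustrative, not tuning recommendations:

```python
import json

# Build an /api/generate request body. "system", "stream", and "options"
# are all optional; omit any you don't need.
payload = {
    "model": "qwen3.5:latest",
    "prompt": "Explain recursion in one sentence.",
    "system": "You are a concise tutor.",  # system prompt override
    "stream": False,                       # one JSON object instead of a token stream
    "options": {
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "num_predict": 128,  # max tokens to generate
        "seed": 42,          # fixed seed for reproducible sampling
    },
}
body = json.dumps(payload)  # send as the POST body with Content-Type: application/json
```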
Chat Completion
Generate the next message in a conversation. Maintains context across multiple turns for coherent dialogues.
/api/chat
Parameters
model required
Selected model name
messages required
Array of {role: "system"|"user"|"assistant", content: "..."}
stream boolean
Stream tokens as they're generated
options object
temperature, top_p, num_predict, stop (stop sequences)
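Context is maintained by the client: each turn is appended to the messages array before the next request. A minimal sketch of that bookkeeping (the conversation text is invented for illustration):

```python
# Start with an optional system message plus the first user turn.
messages = [
    {"role": "system", "content": "Answer briefly."},
    {"role": "user", "content": "What is a VPS?"},
]

def append_turn(history: list, assistant_reply: str, next_user_msg: str) -> list:
    """Record the model's reply, then queue the user's follow-up."""
    history.append({"role": "assistant", "content": assistant_reply})
    history.append({"role": "user", "content": next_user_msg})
    return history

append_turn(messages, "A virtual private server.", "How is it billed?")
# messages now holds all four turns, ready to send as the "messages" parameter.
```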
Create Embedding
Generate vector embeddings for text. Use with vector databases for RAG, semantic search, and similarity matching.
/api/embeddings
Parameters
model required
Embedding model name
prompt required
Text to embed into a vector
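For semantic search and similarity matching, returned vectors are typically compared with cosine similarity. A self-contained sketch, using tiny hand-made vectors in place of real embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for identical direction, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine([1.0, 0.0], [1.0, 0.0]))  # 1.0 — same direction
print(cosine([1.0, 0.0], [0.0, 1.0]))  # 0.0 — unrelated
```

In practice, `a` and `b` would be the "embedding" arrays returned by two /api/embeddings calls.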
Pull a Model
Download a model from the Ollama library. Progress streams as newline-delimited JSON objects.
/api/pull
Parameters
name required
Model to pull (e.g., llama3:8b)
stream boolean
Stream progress updates
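A sketch of parsing the streamed progress, assuming each update arrives as one JSON object per line with a "status" field and, while downloading, "total"/"completed" byte counts (the shape Ollama's streaming responses use). The raw lines below stand in for a live connection:

```python
import json

raw_lines = [
    '{"status": "pulling manifest"}',
    '{"status": "downloading", "total": 100, "completed": 25}',
    '{"status": "success"}',
]

progress = []
for line in raw_lines:
    update = json.loads(line)
    if "total" in update:
        # Download phase: report percent complete.
        pct = 100 * update["completed"] // update["total"]
        progress.append(f'{update["status"]}: {pct}%')
    else:
        progress.append(update["status"])

print(progress)  # ['pulling manifest', 'downloading: 25%', 'success']
```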
Model Info
Get detailed information about a specific model.
/api/show
Parameters
name required
Full model name
Delete Model
Remove a model to free up disk space.
/api/delete
Parameters
name required
Full model name to delete
Ollama CLI
Direct Ollama commands available on this VPS.
Available Commands
ollama list
List all installed models
ollama pull <model>
Download a model from the library
ollama show <model>
Show model details
ollama run <model>
Run model interactively
ollama create <name> -f <Modelfile>
Create model from Modelfile
ollama cp <src> <dst>
Copy a model
ollama rm <model>
Delete a model
cURL Examples
Using cURL to interact with the API from the command line.
Examples
# List models
curl https://llm.eburon.ai/api/tags
# Chat completion
curl -X POST https://llm.eburon.ai/api/chat \
-H "Content-Type: application/json" \
-d '{"model":"qwen3.5:latest","messages":[{"role":"user","content":"Hello"}]}'
# Generate with streaming
curl -X POST https://llm.eburon.ai/api/generate \
-H "Content-Type: application/json" \
-d '{"model":"minimax-m2.7:cloud","prompt":"Hi","stream":true}'
# Create embedding
curl -X POST https://llm.eburon.ai/api/embeddings \
-H "Content-Type: application/json" \
-d '{"model":"embeddinggemma:latest","prompt":"Hello world"}'
Python SDK
Using Python with httpx to call the API.
Example Code
import httpx

client = httpx.Client()

# Chat
resp = client.post(
    "https://llm.eburon.ai/api/chat",
    json={
        "model": "qwen3.5:latest",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json()["message"]["content"])

# Embedding
resp = client.post(
    "https://llm.eburon.ai/api/embeddings",
    json={"model": "embeddinggemma:latest", "prompt": "Hello world"},
)
print(resp.json()["embedding"][:5])
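With "stream": true, /api/generate emits one JSON object per line, each carrying a "response" fragment and a "done" flag on the final chunk. A sketch of reassembling the full text; the sample chunks below stand in for a live streaming response:

```python
import json

chunks = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo!", "done": false}',
    '{"response": "", "done": true}',
]

# Concatenate the "response" fragments in arrival order.
text = "".join(json.loads(c)["response"] for c in chunks)
print(text)  # Hello!
```

With httpx, the same loop would iterate over `resp.iter_lines()` on a streaming POST instead of a fixed list.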
Node.js SDK
Using Node.js fetch or axios to call the API.
Example Code
const resp = await fetch('https://llm.eburon.ai/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'qwen3.5:latest',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
const data = await resp.json();
console.log(data.message.content);