Integrate large language models into applications using the Ollama REST API for text generation, chat, and model management over HTTP.
The Ollama REST API lets you integrate large language models (LLMs) into your applications over HTTP. Instead of going through the CLI or a chatbot UI, you send requests directly to a running Ollama server for text generation, conversational chat, and model management.
The POST /api/generate endpoint returns a model's completion for your prompt.
curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Compose a poem on LLMs", "stream": false}'
Setting "stream": false delivers the full response at once as a single JSON object. Streaming is the default: omit the field or set "stream": true to receive the output incrementally (word or phrase by phrase), much like the gradual output of web chat interfaces.
Streaming responses can improve perceived latency for long completions. Be sure your client can handle partial chunks.
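To illustrate what handling partial chunks can look like, here is a minimal Python sketch. It assumes the server streams newline-delimited JSON objects, each carrying a partial `response` string, with the last one marked `"done": true`. The helper names (`iter_chunks`, `collect_text`, `generate_streaming`) are hypothetical and not part of Ollama.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Same host and port as the curl example above.
GENERATE_URL = "http://localhost:11434/api/generate"


def iter_chunks(lines: Iterable[bytes]) -> Iterator[dict]:
    """Decode newline-delimited JSON chunks, skipping blank lines."""
    for raw in lines:
        raw = raw.strip()
        if raw:
            yield json.loads(raw)


def collect_text(lines: Iterable[bytes]) -> str:
    """Concatenate partial `response` strings until a chunk reports done."""
    parts = []
    for chunk in iter_chunks(lines):
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def generate_streaming(prompt: str, model: str = "llama3.2") -> str:
    """Send a streaming request to a locally running server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    req = urllib.request.Request(
        GENERATE_URL, data=body, headers={"Content-Type": "application/json"}
    )
    # The HTTP response object can be iterated line by line.
    with urllib.request.urlopen(req) as resp:
        return collect_text(resp)
```

A real client would usually display each chunk as it arrives instead of joining them at the end; `iter_chunks` can be consumed directly for that purpose.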
{ "model": "llama3.2", "created_at": "2025-01-09T06:31:38.309573Z", "response": { "title": "Cosmic Odyssey", "theme": "Self-discovery in Language", "lines": [ "In digital realms, I found my home", "A tapestry woven from words and codes", "Where meaning flows like starlight to the sea", "I danced with syntax, a cosmic rhyme" ] }, "done": true, "done_reason": "stop", "context": [123, 232, 123], "total_duration": 18387332083, "load_duration": 20368125, "prompt_eval_count": 44, "prompt_eval_duration": 501000000, "eval_count": 68, "eval_duration": 13140000000}
In the sample response above:
title and theme are strings.
lines is an array of strings, ideal for rendering multiline content.
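As a rough sketch of consuming such a payload in Python, the helpers below render the poem and derive a throughput figure. They assume the response shape shown in the sample above (with a fallback for a plain-string `response`) and that the duration fields are expressed in nanoseconds. The function names are hypothetical.

```python
def render_response(payload: dict) -> str:
    """Turn the `response` field into printable text."""
    resp = payload["response"]
    if isinstance(resp, str):
        # Plain-text completions need no further processing.
        return resp
    # Structured shape from the sample: title, theme, and a list of lines.
    header = f'{resp["title"]} ({resp["theme"]})'
    return "\n".join([header, *resp["lines"]])


def tokens_per_second(payload: dict) -> float:
    """Generated tokens divided by generation time (assumed nanoseconds)."""
    return payload["eval_count"] / (payload["eval_duration"] / 1e9)
```

With the sample values (`eval_count` of 68 and `eval_duration` of 13140000000), `tokens_per_second` works out to roughly 5.2 tokens per second.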
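Use POST /api/chat to maintain conversational context. Provide an array of messages, each with a role (such as user or assistant) and its content. The server does not remember earlier turns, so every request must include the conversation so far.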
curl http://localhost:11434/api/chat -d '{ "model": "llama3.2", "messages": [ { "role": "user", "content": "Compose a short poem about LLMs." }, { "role": "assistant", "content": "In circuits vast, they find their spark,\nLanguage learned in the digital dark.\nTransforming text with neural art,\nLLMs ignite a brand-new start." }, { "role": "user", "content": "Add alliteration to the poem for more impact." } ], "stream": false}'
{ "model": "llama3.2", "created_at": "2025-01-09T06:47:26.589285Z", "message": { "role": "assistant", "content": "Here's an updated version of the poem:\n\nIn silicon sanctums,\nsparks take flight,\nLanguage learning lattices shine so bright.\nNeural networks navigate nuanced space,\nTransforming text with sophisticated pace.\n\nLet me know if you'd like any further adjustments!" }, "done": true, "done_reason": "stop", "total_duration": 3393490083, "load_duration": 807877958, "prompt_eval_count": 88, "prompt_eval_duration": 1319000000, "eval_count": 53, "eval_duration": 954000000}
Here, repeated initial sounds like s in “silicon sanctums, sparks” and l in “Language learning lattices” provide alliteration.
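Because each chat request carries the full message history, a client typically keeps that list and appends every reply to it. The following Python sketch shows one way to do this, under the assumption that the non-streaming response has the shape shown above. `ChatSession` and `post_json` are hypothetical names, and the transport is injectable so the logic can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable

# Same host and port as the curl example above.
CHAT_URL = "http://localhost:11434/api/chat"


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


class ChatSession:
    """Keeps the running message list so each request includes prior turns."""

    def __init__(self, model: str = "llama3.2",
                 transport: Callable[[str, dict], dict] = post_json):
        self.model = model
        self.transport = transport
        self.messages = []

    def send(self, content: str) -> str:
        # Record the user turn, then send the whole history.
        self.messages.append({"role": "user", "content": content})
        reply = self.transport(
            CHAT_URL,
            {"model": self.model, "messages": self.messages, "stream": False},
        )
        # Store the assistant turn so the next request includes it.
        self.messages.append(reply["message"])
        return reply["message"]["content"]
```

Calling `send("Compose a short poem about LLMs.")` and then `send("Add alliteration to the poem for more impact.")` on the same session would reproduce the three-message request from the curl example.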
# Copy a model locally
curl http://localhost:11434/api/copy -d '{ "source": "llama3.2", "destination": "llama3-copy"}'
# Delete a model (use with caution)
curl -X DELETE http://localhost:11434/api/delete -d '{ "model": "llama3:13b"}'
# Pull a model from the Ollama library
curl http://localhost:11434/api/pull -d '{ "model": "llama3.2"}'
Deleting a model is irreversible. Ensure you specify the correct model name to avoid accidental data loss.
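One way to guard against accidental deletion in application code is to require an explicit confirmation flag before the request is even built. The Python sketch below mirrors the three curl commands above. The function names and the `confirm` parameter are hypothetical conveniences, not part of the Ollama API.

```python
import json
import urllib.request

# Same host and port as the curl examples above.
BASE_URL = "http://localhost:11434"


def _request(path: str, payload: dict, method: str = "POST") -> urllib.request.Request:
    """Build (but do not send) a JSON request for the given endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def copy_request(source: str, destination: str) -> urllib.request.Request:
    return _request("/api/copy", {"source": source, "destination": destination})


def pull_request(model: str) -> urllib.request.Request:
    return _request("/api/pull", {"model": model})


def delete_request(model: str, confirm: bool = False) -> urllib.request.Request:
    # Deletion is irreversible, so refuse unless the caller opts in explicitly.
    if not confirm:
        raise ValueError(f"refusing to delete {model!r} without confirm=True")
    return _request("/api/delete", {"model": model}, method="DELETE")
```

Each function only constructs a request; passing the result to `urllib.request.urlopen` would send it to the server.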