Chat

This page explains how to use our Chat endpoint.

Overview

The Chat endpoint provides conversational AI responses. It supports both streaming and non-streaming responses and works with multiple AI models.


Making a request

Non-Streaming

import requests
import json

# Define the API endpoint and headers
api_url = "http://ai.is-a.dev/v1/chat/completions"
headers = {
    "Content-Type": "application/json"
}

payload = {
    "model": "llama-3.1-70b-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    "tools": False    # enables or disables tool use
}

response = requests.post(api_url, headers=headers, data=json.dumps(payload))
completion = response.json()
print(completion['choices'][0]['message']['content'])

Streaming

We also provide a way to get real-time output while the LLM is still generating your response:
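The streaming snippet is not included above, so here is a minimal sketch. It assumes the endpoint accepts an OpenAI-style `"stream": True` flag and replies with server-sent events (`data: {...}` lines carrying `choices[0].delta.content` fragments); check the actual API behavior before relying on this shape.

```python
import json
import requests

# Hypothetical sketch -- the streaming contract is assumed, not documented here.
api_url = "http://ai.is-a.dev/v1/chat/completions"
headers = {"Content-Type": "application/json"}

payload = {
    "model": "llama-3.1-70b-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    "stream": True  # assumed flag: ask the server for incremental chunks
}

def extract_delta(line: bytes) -> str:
    """Pull the text fragment out of one 'data: {...}' SSE line, if any."""
    text = line.decode("utf-8").strip()
    if not text.startswith("data: ") or text == "data: [DONE]":
        return ""
    chunk = json.loads(text[len("data: "):])
    return chunk["choices"][0].get("delta", {}).get("content") or ""

def stream_chat():
    """Print the reply token by token as the server sends it."""
    with requests.post(api_url, headers=headers, json=payload, stream=True) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if line:  # skip SSE keep-alive blank lines
                print(extract_delta(line), end="", flush=True)
```

Calling `stream_chat()` prints the tokens as they arrive instead of waiting for the full completion.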

You may change the model by replacing the model parameter with any model from the Model list.

Response Examples

If you disabled streaming, the response is returned as a single JSON payload containing the LLM’s full reply.
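The exact response schema is not shown in this page; a typical OpenAI-compatible completion payload (field names assumed) looks roughly like this:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "llama-3.1-70b-turbo",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Function calls itself..."
      },
      "finish_reason": "stop"
    }
  ]
}
```

The reply text printed in the example above lives at `choices[0].message.content`.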

Error Handling

In the event of an error, the server responds with a 500 Internal Server Error. The response body includes a JSON object detailing the error message and status code.

Error Example
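The error body is not reproduced here; assuming a JSON object with the message and status code as described above, it might look like:

```json
{
  "error": {
    "message": "Invalid request: 'messages' field is required.",
    "status": 500
  }
}
```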

This response may indicate various issues, such as missing or malformed input data or server-side processing errors.
