tuneapi.apis package

Submodules

tuneapi.apis.model_anthropic module

Connect to the Anthropic API to use the Claude series of LLMs.

class tuneapi.apis.model_anthropic.Anthropic(id: str | None = 'claude-3-haiku-20240307', base_url: str = 'https://api.anthropic.com/v1/messages', api_token: str | None = None, extra_headers: Dict[str, str] | None = None)

Bases: ModelInterface

chat(chats: Thread | str, model: str | None = None, max_tokens: int = 4096, temperature: float | None = None, token: str | None = None, usage: bool = False, extra_headers: Dict[str, str] | None = None, **kwargs)

This is the blocking function to chat with the model
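
A minimal usage sketch for the blocking call. Only the chat() signature above is taken from this page; the tuneapi.types import path for Thread and Message is an assumption, mirroring the Thread([Message(...)]) construction shown in the distributed_chat example further down.

>>> from tuneapi.apis.model_anthropic import Anthropic
>>> from tuneapi.types import Thread, Message  # assumed import path
>>> model = Anthropic(api_token="<ANTHROPIC_API_KEY>")  # placeholder token
>>> thread = Thread([Message("Summarize the Unix philosophy in one sentence.")])
>>> # blocking call; the reply is returned directly (exact return type not documented here)
>>> reply = model.chat(thread, max_tokens=512, temperature=0.7)
>>> print(reply)
>>> # a plain string is also accepted in place of a Thread
>>> print(model.chat("What is the capital of France?"))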

async chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = 4096, temperature: float | None = None, token: str | None = None, usage: bool = False, extra_headers: Dict[str, str] | None = None, **kwargs)

This is the async function to chat with the model

distributed_chat(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the blocking function to chat with the model in a distributed manner

async distributed_chat_async(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the async function to chat with the model in a distributed manner

set_api_token(token: str) None

This is used to set the API token for the model

stream_chat(chats: Thread | str, model: str | None = None, max_tokens: int = 1024, temperature: float | None = None, token: str | None = None, debug: bool = False, usage: bool = False, extra_headers: Dict[str, str] | None = None, timeout=(5, 30), raw: bool = False, **kwargs) Any

This is the blocking function to stream chat with the model where each token is iteratively generated
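
A short sketch of consuming the stream. Treating each yielded item as a plain text chunk is an assumption; this page does not document the yield type.

>>> from tuneapi.apis.model_anthropic import Anthropic
>>> model = Anthropic(api_token="<ANTHROPIC_API_KEY>")  # placeholder token
>>> # tokens arrive incrementally; print each chunk as soon as it is yielded
>>> for chunk in model.stream_chat("Write a haiku about type checkers.", max_tokens=256):
...     if isinstance(chunk, str):  # assumption: text chunks are plain strings
...         print(chunk, end="", flush=True)
>>> print()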

async stream_chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = 1024, temperature: float | None = None, token: str | None = None, debug: bool = False, usage: bool = False, extra_headers: Dict[str, str] | None = None, timeout=(5, 30), raw: bool = False, **kwargs) Any

This is the async function to stream chat with the model where each token is iteratively generated

tuneapi.apis.model_gemini module

Connect to the Google Gemini API to use their LLMs. See the Gemini documentation for more details.

class tuneapi.apis.model_gemini.Gemini(id: str | None = 'gemini-2.0-flash-exp', base_url: str = 'https://generativelanguage.googleapis.com/v1beta/models/{id}:{rpc}', extra_headers: Dict[str, str] | None = None, api_token: str | None = None, emebdding_url: str | None = None)

Bases: ModelInterface

chat(chats: Thread | str, model: str | None = None, max_tokens: int = 4096, temperature: float = 1, token: str | None = None, timeout=None, extra_headers: Dict[str, str] | None = None, **kwargs) Any

This is the blocking function to chat with the model

async chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = None, temperature: float = 1, token: str | None = None, timeout=None, extra_headers: Dict[str, str] | None = None, **kwargs) Any

This is the async function to chat with the model
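
A sketch of driving the async variant with asyncio; everything except the Gemini class path and the chat_async() signature above is an assumption.

>>> import asyncio
>>> from tuneapi.apis.model_gemini import Gemini
>>> async def main():
...     model = Gemini(api_token="<GEMINI_API_KEY>")  # placeholder token
...     # chat_async mirrors chat() but is awaitable, so several requests can run concurrently
...     reply = await model.chat_async("Name three uses of a hash map.", temperature=0.2)
...     print(reply)
...
>>> asyncio.run(main())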

distributed_chat(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the blocking function to chat with the model in a distributed manner

async distributed_chat_async(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the async function to chat with the model in a distributed manner

embedding(chats: Thread | List[str] | str, model: str = 'text-embedding-004', extra_headers: Dict[str, str] | None = None, token: str | None = None, timeout: Tuple[int, int] = (5, 60), raw: bool = False) EmbeddingGen

This is the blocking function to get embeddings for the chat
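
A sketch of a batched embedding call; the fields of the returned EmbeddingGen are not documented on this page, so only the call itself is shown.

>>> from tuneapi.apis.model_gemini import Gemini
>>> model = Gemini(api_token="<GEMINI_API_KEY>")  # placeholder token
>>> # a list of strings is embedded in one request with the text-embedding-004 model
>>> emb = model.embedding(["first sentence", "second sentence"], model="text-embedding-004")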

async embedding_async(chats: Thread | List[str] | str, model: str = 'text-embedding-004', extra_headers: Dict[str, str] | None = None, token: str | None = None, timeout: Tuple[float, float] = (5.0, 60.0), raw: bool = False) EmbeddingGen

This is the async function to get embeddings for the chat

set_api_token(token: str) None

This is used to set the API token for the model

stream_chat(chats: Thread | str, model: str | None = None, max_tokens: int = 4096, temperature: float = 1, token: str | None = None, debug: bool = False, extra_headers: Dict[str, str] | None = None, raw: bool = False, timeout=(5, 60), **kwargs)

This is the blocking function to stream chat with the model where each token is iteratively generated

async stream_chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = 4096, temperature: float = 1, token: str | None = None, timeout=(5, 60), raw: bool = False, debug: bool = False, extra_headers: Dict[str, str] | None = None, **kwargs)

This is the async function to stream chat with the model where each token is iteratively generated

tuneapi.apis.model_gemini.get_structured_schema(model: type[BaseModel]) Dict[str, Any]

Converts a Pydantic BaseModel to a JSON schema compatible with Gemini API, including anyOf for optional or union types and handling nested structures correctly.

Parameters:

model – The Pydantic BaseModel class to convert.

Returns:

A dictionary representing the JSON schema.
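
A sketch showing the two cases the docstring calls out: an optional field (rendered with anyOf) and a nested model handled inside the parent schema. The Pydantic models themselves are illustrative.

>>> from typing import Optional
>>> from pydantic import BaseModel
>>> from tuneapi.apis.model_gemini import get_structured_schema
>>> class Address(BaseModel):
...     city: str
...     zip_code: Optional[str] = None  # optional field -> anyOf in the schema
>>> class Person(BaseModel):
...     name: str
...     age: int
...     address: Address  # nested model, expanded inside the parent schema
>>> schema = get_structured_schema(Person)
>>> isinstance(schema, dict)
True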

tuneapi.apis.model_openai module

Connect to the OpenAI API and use their LLMs.

class tuneapi.apis.model_openai.Groq(id: str = 'llama3-70b-8192', base_url: str = 'https://api.groq.com/openai/v1/chat/completions', extra_headers: Dict[str, str] | None = None, api_token: str | None = None)

Bases: OpenAIProtocol

A class to interact with Groq’s Large Language Models (LLMs) via their API. Note this class does not contain the embedding method.

id

Identifier for the Groq model.

Type:

str

base_url

The base URL for the Groq API. Defaults to “https://api.groq.com/openai/v1/chat/completions”.

Type:

str

extra_headers

Additional headers to include in API requests.

Type:

Optional[Dict[str, str]]

api_token

API token for authenticating requests. If not provided, it is read from the environment.

Type:

Optional[str]

Note

For more information, visit the Groq API documentation at https://console.groq.com/

embedding(**k)

If you pass a list, the returned items are in insertion order

class tuneapi.apis.model_openai.Mistral(id: str = 'mistral-small-latest', base_url: str = 'https://api.mistral.ai/v1/chat/completions', extra_headers: Dict[str, str] | None = None, api_token: str | None = None)

Bases: OpenAIProtocol

A class to interact with Mistral’s Large Language Models (LLMs) via their API. Note this class does not contain the embedding method.

id

Identifier for the Mistral model.

Type:

str

base_url

The base URL for the Mistral API. Defaults to “https://api.mistral.ai/v1/chat/completions”.

Type:

str

extra_headers

Additional headers to include in API requests.

Type:

Optional[Dict[str, str]]

api_token

API token for authenticating requests. If not provided, it will use the token from the environment variable MISTRAL_TOKEN.

Type:

Optional[str]

embedding(*a, **k)

Raises NotImplementedError as Mistral does not support embeddings.

Note

For more information, visit the Mistral API documentation at https://console.mistral.ai/

class tuneapi.apis.model_openai.OpenAIProtocol(id: str, base_url: str, extra_headers: Dict[str, str] | None, api_token: str | None, emebdding_url: str | None, image_gen_url: str | None, audio_transcribe_url: str | None, audio_gen_url: str | None)

Bases: ModelInterface

chat(chats: Thread | str, model: str | None = None, max_tokens: int = None, temperature: float = 1, parallel_tool_calls: bool = False, token: str | None = None, usage: bool = False, extra_headers: Dict[str, str] | None = None, **kwargs) Any

This is the blocking function to chat with the model

async chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = None, temperature: float = 1, parallel_tool_calls: bool = False, token: str | None = None, usage: bool = False, extra_headers: Dict[str, str] | None = None, **kwargs) Any

This is the async function to chat with the model

distributed_chat(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the blocking function to chat with the model in a distributed manner

async distributed_chat_async(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, **kwargs)

This is the async function to chat with the model in a distributed manner

embedding(chats: Thread | List[str] | str, model: str = 'text-embedding-3-small', token: str | None = None, raw: bool = False, extra_headers: Dict[str, str] | None = None, timeout: Tuple[int, int] = (5, 60)) EmbeddingGen

If you pass a list, the returned items are in insertion order
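
A sketch using the Openai subclass documented below; because a list comes back in insertion order, results can be matched to inputs by position. The placeholder token and sample texts are illustrative.

>>> from tuneapi.apis.model_openai import Openai
>>> model = Openai(api_token="<OPENAI_API_KEY>")  # placeholder token
>>> texts = ["red apples", "green pears"]
>>> # returned items follow the insertion order of `texts`
>>> emb = model.embedding(texts, model="text-embedding-3-small")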

async embedding_async(chats: Thread | List[str] | str, model: str = 'text-embedding-3-small', token: str | None = None, timeout: Tuple[int, int] = (10, 60), raw: bool = False, extra_headers: Dict[str, str] | None = None) EmbeddingGen

If you pass a list, the returned items are in insertion order

image_gen(prompt: str, style: str = 'natural', model: str = 'dall-e-3', n: int = 1, size: str = '1024x1024', extra_headers: Dict[str, str] | None = None, timeout: Tuple[int, int] = (5, 60), **kwargs) ImageGen

This is the blocking function to generate images
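
A sketch of a single dall-e-3 generation; the structure of the returned ImageGen is not documented here, so only the call is shown. The prompt is illustrative.

>>> from tuneapi.apis.model_openai import Openai
>>> model = Openai(api_token="<OPENAI_API_KEY>")  # placeholder token
>>> # one 1024x1024 image in the default "natural" style
>>> img = model.image_gen("a watercolor fox reading a newspaper", n=1, size="1024x1024")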

async image_gen_async(prompt: str, style: str = 'natural', model: str = 'dall-e-3', n: int = 1, size: str = '1024x1024', extra_headers: Dict[str, str] | None = None, timeout: Tuple[int, int] = (5, 60), **kwargs) PIL.Image

This is the async function to generate images

set_api_token(token: str) None

This is used to set the API token for the model

speech_to_text(prompt: str, audio: str, model='whisper-1', timestamp_granularities=['segment'], **kwargs) Transcript

Translates audio using the OpenAI API. Note that this method uses the official openai client library rather than the requests library used by the other methods.

Parameters:
  • prompt (str) – The instruction prompt to guide the translation.

  • audio (str) – The path to the audio file to translate.

  • model (str) – The model to use for translation.

  • response_format (str) – The format of the response. Possible values are “json”, “text”, “srt”, “verbose_json”, or “vtt”. Defaults to “json”.

  • timestamp_granularities (List[str]) – The timestamp granularities to include in the response. Defaults to [“segment”].

Returns:

The translated text as a string, or None if an error occurs.
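
A sketch of a call with segment-level timestamps; meeting.wav is a hypothetical local file and the prompt text is illustrative only.

>>> from tuneapi.apis.model_openai import Openai
>>> model = Openai(api_token="<OPENAI_API_KEY>")  # placeholder token
>>> transcript = model.speech_to_text(
...     prompt="Meeting recording, speakers may overlap.",
...     audio="meeting.wav",  # hypothetical path to a local audio file
...     timestamp_granularities=["segment"],
... )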

stream_chat(chats: Thread | str, model: str | None = None, max_tokens: int = None, temperature: float = 1, parallel_tool_calls: bool = False, token: str | None = None, timeout=(5, 60), usage: bool = False, extra_headers: Dict[str, str] | None = None, debug: bool = False, raw: bool = False, **kwargs)

This is the blocking function to stream chat with the model where each token is iteratively generated

async stream_chat_async(chats: Thread | str, model: str | None = None, max_tokens: int = None, temperature: float = 1, parallel_tool_calls: bool = False, token: str | None = None, timeout=(5, 60), usage: bool = False, extra_headers: Dict[str, str] | None = None, debug: bool = False, raw: bool = False, **kwargs)

This is the async function to stream chat with the model where each token is iteratively generated

text_to_speech(prompt: str, voice: str = 'shimmer', model='tts-1', response_format='wav', extra_headers: Dict[str, str] | None = None, timeout: Tuple[int, int] = (5, 60), **kwargs) bytes

async text_to_speech_async(prompt: str, voice: str = 'shimmer', model='tts-1', response_format='wav', extra_headers: Dict[str, str] | None = None, timeout: Tuple[int, int] = (5, 60), **kwargs) bytes
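
A sketch that writes the returned audio bytes to disk; the output file name and prompt are arbitrary.

>>> from tuneapi.apis.model_openai import Openai
>>> model = Openai(api_token="<OPENAI_API_KEY>")  # placeholder token
>>> # text_to_speech returns raw audio bytes in the requested format
>>> audio = model.text_to_speech("The build finished without errors.", voice="shimmer", response_format="wav")
>>> with open("notification.wav", "wb") as f:  # arbitrary output path
...     f.write(audio)
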
class tuneapi.apis.model_openai.Openai(id: str = 'gpt-4o', base_url: str = 'https://api.openai.com/v1/chat/completions', extra_headers: Dict[str, str] | None = None, api_token: str | None = None, emebdding_url: str | None = None, image_gen_url: str | None = None, audio_transcribe: str | None = None, audio_gen_url: str | None = None)

Bases: OpenAIProtocol

class tuneapi.apis.model_openai.TuneModel(id: str = 'meta/llama-3.1-8b-instruct', base_url: str = 'https://proxy.tune.app/chat/completions', org_id: str | None = None, extra_headers: Dict[str, str] | None = None, api_token: str | None = None)

Bases: OpenAIProtocol

A class to interact with Tune’s hosted Large Language Models (LLMs) via their API.

id

Identifier for the Tune model.

Type:

str

base_url

The base URL for the Tune API. Defaults to “https://proxy.tune.app/chat/completions”.

Type:

str

org_id

Organization ID for the Tune API.

Type:

Optional[str]

extra_headers

Additional headers to include in API requests.

Type:

Optional[Dict[str, str]]

api_token

API token for authenticating requests. If not provided, it is read from the environment.

Type:

Optional[str]

Note

For more information, visit the Tune documentation at https://tune.app/

embedding(chats: Thread | List[str] | str, model: str = 'openai/text-embedding-3-small', token: str | None = None, timeout: Tuple[int, int] = (5, 60), raw: bool = False, extra_headers: Dict[str, str] | None = None)

If you pass a list, the returned items are in insertion order

async embedding_async(chats: Thread | List[str] | str, model: str = 'openai/text-embedding-3-small', token: str | None = None, timeout: Tuple[int, int] = (5, 60), raw: bool = False, extra_headers: Dict[str, str] | None = None)

If you pass a list, the returned items are in insertion order

tuneapi.apis.turbo module

tuneapi.apis.turbo.distributed_chat(model: ModelInterface, prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, usage: bool = False, **kwargs) List | Tuple[List, Usage]

Distributes multiple chat prompts across a thread pool for parallel processing.

This function creates a pool of worker threads to process multiple chat prompts concurrently. It handles retry logic for failed requests and maintains the order of responses corresponding to the input prompts.

Args:
  • model (ModelInterface): The base model instance to clone for each worker thread. Each thread gets its own model instance to ensure thread safety.

  • prompts (List[Thread]): A list of chat prompts to process. The order of responses will match the order of these prompts.

  • post_logic (Optional[callable], default=None): A function to process each chat response before storing. If None, raw responses are stored. Function signature should be: f(chat_response) -> processed_response

  • max_threads (int, default=10): Maximum number of concurrent worker threads. Adjust based on API rate limits and system capabilities.

  • retry (int, default=3): Number of retry attempts for failed requests. Set to 0 to disable retries.

  • pbar (bool, default=True): Whether to display a progress bar.

Returns:
List[Any]: A list of responses or errors, maintaining the same order as input prompts.

Successful responses will be either raw or processed (if post_logic provided). Failed requests (after retries) will contain the last error encountered.

Raises:

  • ValueError: If max_threads < 1 or retry < 0

  • TypeError: If model is not an instance of ModelInterface

Example:
>>> model = ChatModel(api_token="...")
>>> prompts = [
...     Thread([Message("What is 2+2?")]),
...     Thread([Message("What is Python?")])
... ]
>>> responses = distributed_chat(model, prompts, max_threads=5)
>>> for prompt, response in zip(prompts, responses):
...     print(f"Q: {prompt}\nA: {response}")
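
A variant of the example above using the post_logic hook; it assumes each chat response is a plain string, which this page does not guarantee, and reuses the model and prompts defined above.

>>> def first_line(response):
...     # hypothetical hook: keep only the first line of each reply
...     return response.splitlines()[0] if isinstance(response, str) else response
...
>>> short_answers = distributed_chat(model, prompts, post_logic=first_line, max_threads=5, retry=2)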

Note:
  • Each worker thread gets its own model instance to prevent sharing state

  • Progress bar shows both initial processing and retries

  • The function maintains thread safety through message passing channels

async tuneapi.apis.turbo.distributed_chat_async(model: ModelInterface, prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, debug=False, usage: bool = False, **kwargs)

Module contents