tuneapi.types package

Submodules

tuneapi.types.chats module

This file contains all the datatypes relevant for a chat conversation. In general this is the nomenclature that we follow:
  • Message: a unit of information produced by a role

  • Thread: a group of messages is called a thread. There are 2 types of threads: linear and tree based.

  • ThreadsList: a group of linear threads is called a threads list.

  • Dataset: a container for grouping threads lists is called a dataset

Almost all the classes provide to_dict and from_dict methods for serialisation and deserialisation.
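
For example, a minimal sketch of building a linear thread with the convenience aliases documented below, and round-tripping a message through a dictionary:

   from tuneapi.types.chats import Message, Thread, assistant, human, system

   # build a linear thread from the convenience aliases
   thread = Thread(
       system("You are a helpful assistant."),
       human("What is the capital of France?"),
       assistant("The capital of France is Paris."),
   )

   # serialise a single message and rebuild it
   data = human("hello").to_dict()
   msg = Message.from_dict(data)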

class tuneapi.types.chats.Dataset(train: ThreadsList, eval: ThreadsList)

Bases: object

This class is a container for training and evaluation datasets, useful for serialising items to and from disk.

classmethod from_disk(folder: str)

Deserialise and rebuild the container from a folder on the disk

classmethod from_list(items: List[Dataset])
to_disk(folder: str, fmt: str | None = None)

Serialise all the items of the container to a folder on the disk

to_hf_dict() Tuple[datasets.DatasetDict, Dict[str, List]]
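
A sketch of a typical round trip, assuming train_threads and eval_threads already exist as ThreadsList objects (both names are illustrative):

   from tuneapi.types.chats import Dataset

   ds = Dataset(train=train_threads, eval=eval_threads)
   ds.to_disk("./my_dataset")                    # serialise the container to a folder
   restored = Dataset.from_disk("./my_dataset")  # rebuild it from that folder
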
class tuneapi.types.chats.EmbeddingGen(*, embedding: List[List[float]])

Bases: BaseModel

embedding: List[List[float]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class tuneapi.types.chats.ImageGen(*, image: Image)

Bases: BaseModel

class Config

Bases: object

arbitrary_types_allowed = True
image: Image
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class tuneapi.types.chats.Message(value: str | List[Dict[str, Any]], role: str, images: List[str | Image] = [], id: str = None, **kwargs)

Bases: object

A message is the unit element of information in a thread. You should avoid using this class directly and instead use the convenience aliases tuneapi.types.chats.human/assistant/system/....

Parameters:
  • value – this is generally a string, or a list of dictionary objects for more advanced use cases

  • role – the role that produced this information

  • images – a list of PIL images or base64 strings

FUNCTION_CALL = 'function_call'
FUNCTION_RESP = 'function_resp'
GPT = 'gpt'
HUMAN = 'human'
KNOWN_ROLES = {'assistant': 'gpt', 'function-call': 'function_call', 'function-resp': 'function_resp', 'function_call': 'function_call', 'function_resp': 'function_resp', 'gpt': 'gpt', 'human': 'human', 'machine': 'gpt', 'sys': 'system', 'system': 'system', 'tool': 'function_resp', 'user': 'human'}

A map that contains the popularly known mappings to make life simpler

SYSTEM = 'system'
classmethod from_dict(data)

Deserialise and construct a message from a dictionary

to_dict(format: str | None = None, meta: bool = False)
Serialise the Message into a dictionary of different formats:
  • format == ft then export to the following format: {"from": "system/human/gpt", "value": "..."}

  • format == api then {"role": "system/user/assistant", "content": [{"type": "text", "text": {"value": "..."}}]}. This is used with TuneAPI

  • format == full then {"id": 1234421123, "role": "system/user/assistant", "content": [{"type": "text", "text": {"value": "..."}}]}

  • default: {"role": "system/user/assistant", "content": "..."}
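
A sketch of what the formats above look like for the same message (field values are illustrative):

   from tuneapi.types.chats import human

   msg = human("hello")
   msg.to_dict()              # {"role": "user", "content": "hello"}
   msg.to_dict(format="ft")   # {"from": "human", "value": "hello"}
   msg.to_dict(format="api")  # {"role": "user", "content": [{"type": "text",
                              #   "text": {"value": "hello"}}]}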

class tuneapi.types.chats.ModelInterface

Bases: object

This is the generic interface implemented by all the model APIs

api_token: str

This is the API token for the model

base_url: str

This is the default URL that has to be pinged. This may not necessarily be the REST endpoint URL itself.

chat(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the blocking function to chat with the model
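
A minimal sketch of a blocking call; Openai here stands in for any concrete ModelInterface implementation your installation exposes (the class name and import path are assumptions):

   from tuneapi.apis import Openai  # assumed provider class
   from tuneapi.types.chats import Thread, human

   model = Openai()
   thread = Thread(human("What is the capital of France?"))
   reply = model.chat(thread, max_tokens=256, temperature=0.7)
   print(reply)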

async chat_async(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the async function to chat with the model

distributed_chat(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, **kwargs)

This is the blocking function to chat with the model in a distributed manner

async distributed_chat_async(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, **kwargs)

This is the async function to chat with the model in a distributed manner
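
A sketch of fanning out many prompts with the blocking variant, reusing the model object from the chat sketch above:

   from tuneapi.types.chats import Thread, human

   prompts = [Thread(human(f"Summarise document {i}")) for i in range(100)]
   # runs up to 10 workers, retrying each failed prompt up to 3 times
   results = model.distributed_chat(prompts, max_threads=10, retry=3)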

embedding(chats: Thread | List[str] | str, model: str, token: str | None, timeout: Tuple[int, int], raw: bool, extra_headers: Dict[str, str] | None) EmbeddingGen

This is the blocking function to get embeddings for the chat
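
A sketch of a blocking embedding call, again reusing the model object from above (the model name is illustrative, and all arguments after chats are required by the signature above):

   emb = model.embedding(
       "hello world",
       model="some-embedding-model",  # illustrative model name
       token=None,
       timeout=(5, 30),
       raw=False,
       extra_headers=None,
   )
   vectors = emb.embedding  # List[List[float]]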

async embedding_async(chats: Thread | List[str] | str, model: str, token: str | None, timeout: Tuple[int, int], raw: bool, extra_headers: Dict[str, str] | None) EmbeddingGen

This is the async function to get embeddings for the chat

extra_headers: Dict[str, Any]

This is the placeholder for any extra headers to be passed during requests

image_gen(prompt: str, style: str, model: str, n: int, size: str, **kwargs) ImageGen

This is the blocking function to generate images
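
A sketch of generating an image (the style, model, and size values are illustrative):

   gen = model.image_gen(
       prompt="a watercolour fox in the snow",
       style="natural",           # illustrative
       model="some-image-model",  # illustrative
       n=1,
       size="1024x1024",
   )
   gen.image  # a PIL Image, per the ImageGen type above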

async image_gen_async(prompt: str, style: str, model: str, n: int, size: str, **kwargs) ImageGen

This is the async function to generate images

model_id: str

This is the model ID for the model

set_api_token(token: str) None

This is used to set the API token for the model

speech_to_text(prompt: str, audio: str, model: str, timestamp_granularities: List[str], **kwargs) Transcript

This is the blocking function to convert speech to text

async speech_to_text_async(prompt: str, audio: str, model: str, timestamp_granularities=['segment'], **kwargs) Transcript

This is the async function to convert speech to text

stream_chat(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 60), raw: bool = False, debug: bool = False, extra_headers: Dict[str, str] | None = None)

This is the blocking function to stream chat with the model where each token is iteratively generated
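
A sketch of consuming the stream, reusing the model and thread objects from the chat sketch above (each yielded item is typically a text fragment):

   for token in model.stream_chat(thread, max_tokens=256):
       print(token, end="", flush=True)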

async stream_chat_async(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the async function to stream chat with the model where each token is iteratively generated

class tuneapi.types.chats.Thread(*chats: List[Message] | Message, evals: Dict[str, Any] | None = None, model: str | None = None, id: str = '', title: str = '', tools: List[Tool] | None = None, schema: BaseModel | None = None, **kwargs)

Bases: object

This is a container for a list of chat messages. It follows an interface similar to a Python list. See the methods below for more information.

Parameters:

*chats – List of chat Message objects

append(message: Message)
copy() Thread
classmethod from_dict(data: Dict[str, Any]) Thread
pop(message: Message = None)
to_dict(full: bool = False)
to_ft(id: Any = None, drop_last: bool = False) Tuple[Dict[str, Any], Dict[str, Any]]
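
A sketch of the list-like interface:

   from tuneapi.types.chats import Thread, assistant, human

   thread = Thread(human("hi"), assistant("hello!"))
   thread.append(human("how are you?"))

   clone = thread.copy()  # independent copy
   restored = Thread.from_dict(thread.to_dict())
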
class tuneapi.types.chats.ThreadsList

Bases: list

This class implements some basic container methods for a list of Thread objects

add(x: Thread)
append(_ThreadsList__object: Thread) None

Append object to the end of the list.

create_te_split(test_items: int | float = 0.1) Tuple[ThreadsList, ...]
extend(_ThreadsList__iterable: Iterable) None

Extend list by appending elements from the iterable.

classmethod from_dict(data)
classmethod from_disk(folder: str)
shuffle(seed: int | None = None) None

Perform an in-place shuffle

table() str
to_dict()
to_disk(folder: str, fmt: str | None = None, override: bool = False)
to_hf_dataset() Tuple[datasets.Dataset, List]
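
A sketch of collecting threads and cutting a train/eval split (thread here is any Thread object):

   from tuneapi.types.chats import ThreadsList

   tl = ThreadsList()
   tl.add(thread)
   tl.shuffle(seed=42)  # in-place shuffle
   train, test = tl.create_te_split(test_items=0.1)
   tl.to_disk("./threads", override=True)
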
class tuneapi.types.chats.ThreadsTree(*msgs: List[List | Message] | Message, id: str = None)

Bases: object

This is the tree representation of a thread, where each node is a Message object. Useful for regeneration and searching through a tree of conversations. This is a container providing all the necessary APIs.

class ROLLOUT

Bases: object

Continue = 'continue'
OneMoreRanker = 'one_more_ranker'
StopRollout = 'stop_rollout'
add(child: Message, to: Message = 'root') ThreadsTree
property breadth: int
copy() ThreadsTree
property degree_of_tree: int
delete(from_: Message) ThreadsTree
classmethod from_dict(data: Dict[str, Any]) ThreadsTree
property latest_message: Message
property latest_node: Node
pick(to: Message = None, from_: Message = None) Thread

A powerful method to get a Thread from the tree structure by specifying the to and from_ messages in the tree

regenerate(api: ModelInterface, /, from_: Message = None, prompt: str = None, dry: bool = False, **api_kwargs)
regenerate_stream(api: ModelInterface, /, from_: Message = None, prompt: str = None, dry: bool = False, **api_kwargs)
rollout(message_gen_fn: callable = None, value_fn: callable = None, from_: Message = None, max_rollouts: int = 20, depth: int = 5, children: int = 5, retry: int = 1)
property size: int
step(api: ModelInterface, /, from_: Message) Message
step_stream(api: ModelInterface, /, from_: Message) Generator[Message, None, None]
to_dict() Dict[str, Any]
undo() ThreadsTree
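
A sketch of building a small tree with two alternative replies and picking one linear Thread out of it:

   from tuneapi.types.chats import ThreadsTree, assistant, human

   question = human("What should I cook tonight?")
   tree = ThreadsTree(question)
   reply_a = assistant("How about pasta?")
   reply_b = assistant("How about a salad?")
   tree.add(reply_a, to=question)  # first branch
   tree.add(reply_b, to=question)  # sibling branch

   # extract the linear conversation ending at one of the replies
   thread = tree.pick(to=reply_a)
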
class tuneapi.types.chats.Tool(name: str, description: str, parameters: List[Prop])

Bases: object

A tool is a container for telling the LLM what it can do. This is a standard definition.

class Prop(name: str, description: str, type: str, required: bool = False, items: Dict | None = None, enum: List[str] | None = None)

Bases: object

An individual property is called a prop.

classmethod from_dict(x)
to_dict()
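
A sketch of a standard tool definition; Prop is referenced here as Tool.Prop, following the nesting shown above:

   from tuneapi.types.chats import Tool

   get_weather = Tool(
       name="get_weather",
       description="Get the current weather for a city",
       parameters=[
           Tool.Prop(name="city", description="City name",
                     type="string", required=True),
           Tool.Prop(name="unit", description="Temperature unit",
                     type="string", enum=["celsius", "fahrenheit"]),
       ],
   )
   payload = get_weather.to_dict()
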
class tuneapi.types.chats.Transcript(*, segments: list[WebVTTCue])

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

segments: list[WebVTTCue]
property text
to(format: str = 'text')
class tuneapi.types.chats.Usage(input_tokens: int, output_tokens: int, cached_tokens: int | None = 0, **kwargs)

Bases: object

cost(input_token_per_million: float, cache_token_per_million: float, output_token_per_million: float) float
to_json(*a, **k) str
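
A sketch of turning token counts into spend, with illustrative per-million-token prices:

   from tuneapi.types.chats import Usage

   u = Usage(input_tokens=1200, output_tokens=300, cached_tokens=200)
   dollars = u.cost(
       input_token_per_million=5.0,
       cache_token_per_million=2.5,
       output_token_per_million=15.0,
   )
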
class tuneapi.types.chats.WebVTTCue(*, start: str, end: str, text: str)

Bases: BaseModel

end: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

start: str
text: str
tuneapi.types.chats.assistant = functools.partial(<class 'tuneapi.types.chats.Message'>, role='gpt')

Convenience for creating an assistant message

tuneapi.types.chats.function_call = functools.partial(<class 'tuneapi.types.chats.Message'>, role='function_call')

Convenience for creating a function call message

tuneapi.types.chats.function_resp = functools.partial(<class 'tuneapi.types.chats.Message'>, role='function_resp')

Convenience for creating a function response message

tuneapi.types.chats.get_transcript(text: str)

Parses a WebVTT string and returns a list of WebVTTCue objects.
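
A sketch of parsing a small WebVTT payload:

   import textwrap

   from tuneapi.types.chats import get_transcript

   vtt = textwrap.dedent("""\
       WEBVTT

       00:00:00.000 --> 00:00:02.000
       Hello there.

       00:00:02.000 --> 00:00:04.000
       Welcome to the show.
       """)
   cues = get_transcript(vtt)  # list of WebVTTCue objects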

tuneapi.types.chats.human = functools.partial(<class 'tuneapi.types.chats.Message'>, role='human')

Convenience for creating a human message

tuneapi.types.chats.system = functools.partial(<class 'tuneapi.types.chats.Message'>, role='system')

Convenience for creating a system message

tuneapi.types.evals module

class tuneapi.types.evals.Evals

Bases: object

A simple class containing different evaluation metrics. Each function is self-explanatory and returns a JSON logic object.

contains()
exactly()
is_function(**kwargs: dict)

Module contents