tuneapi.types package

Submodules

tuneapi.types.chats module

This file contains all the datatypes relevant for a chat conversation. In general this is the nomenclature that we follow:
  • Message: a unit of information produced by a role

  • Thread: a group of messages is called a thread. There are 2 types of threads: linear and tree based.

  • ThreadsList: a group of linear threads is called a threads list.

  • Dataset: a container for grouping threads lists is called a dataset

Almost all the classes provide to_dict and from_dict methods for serialisation and deserialisation.
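
For example, a minimal sketch of building a linear thread with the convenience aliases documented below, and round-tripping a message through a dictionary:

   from tuneapi.types.chats import Message, Thread, assistant, human, system

   # build a linear thread from the convenience aliases
   thread = Thread(
       system("You are a helpful assistant."),
       human("What is the capital of France?"),
       assistant("The capital of France is Paris."),
   )

   # serialise a single message and rebuild it
   data = human("hello").to_dict()
   msg = Message.from_dict(data)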

class tuneapi.types.chats.Dataset(train: ThreadsList, eval: ThreadsList)

Bases: object

This class is a container for training and evaluation datasets, useful for serialising items to and from disk.

classmethod from_disk(folder: str)

Deserialise and rebuild the container from a folder on the disk

classmethod from_list(items: List[Dataset])
to_disk(folder: str, fmt: str | None = None)

Serialise all the items of the container to a folder on the disk

to_hf_dict() Tuple[datasets.DatasetDict, Dict[str, List]]
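
A sketch of a typical round trip, assuming train_threads and eval_threads already exist as ThreadsList objects (both names are illustrative):

   from tuneapi.types.chats import Dataset

   ds = Dataset(train=train_threads, eval=eval_threads)
   ds.to_disk("./my_dataset")                    # serialise the container to a folder
   restored = Dataset.from_disk("./my_dataset")  # rebuild it from that folder
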
class tuneapi.types.chats.EmbeddingGen(*, embedding: List[List[float]])

Bases: BaseModel

embedding: List[List[float]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class tuneapi.types.chats.ImageGen(*, image: Image)

Bases: BaseModel

class Config

Bases: object

arbitrary_types_allowed = True
image: Image
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class tuneapi.types.chats.Message(value: str | List[Dict[str, Any]], role: str, images: List[str | Image] = [], id: str = None, **kwargs)

Bases: object

A message is the unit element of information in a thread. You should avoid using this class directly and instead use the convenience aliases tuneapi.types.chats.human/assistant/system/....

Parameters:
  • value – this is generally a string, or a list of dictionary objects for more advanced use cases

  • role – the role that produced this information

  • images – a list of PIL images or base64 strings

FUNCTION_CALL = 'function_call'
FUNCTION_RESP = 'function_resp'
GPT = 'gpt'
HUMAN = 'human'
KNOWN_ROLES = {'assistant': 'gpt', 'function-call': 'function_call', 'function-resp': 'function_resp', 'function_call': 'function_call', 'function_resp': 'function_resp', 'gpt': 'gpt', 'human': 'human', 'machine': 'gpt', 'sys': 'system', 'system': 'system', 'tool': 'function_resp', 'user': 'human'}

A map that contains the popularly known mappings to make life simpler

SYSTEM = 'system'
classmethod from_dict(data)

Deserialise and construct a message from a dictionary

to_dict(format: str | None = None, meta: bool = False)
Serialise the Message into a dictionary of different formats:
  • format == ft then export to the following format: {"from": "system/human/gpt", "value": "..."}

  • format == api then {"role": "system/user/assistant", "content": [{"type": "text", "text": {"value": "..."}}]}. This is used with TuneAPI

  • format == full then {"id": 1234421123, "role": "system/user/assistant", "content": [{"type": "text", "text": {"value": "..."}}]}

  • default: {"role": "system/user/assistant", "content": "..."}
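
A sketch of what the formats above look like for the same message (field values are illustrative):

   from tuneapi.types.chats import human

   msg = human("hello")
   msg.to_dict()              # {"role": "user", "content": "hello"}
   msg.to_dict(format="ft")   # {"from": "human", "value": "hello"}
   msg.to_dict(format="api")  # {"role": "user", "content": [{"type": "text",
                              #   "text": {"value": "hello"}}]}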

class tuneapi.types.chats.ModelInterface

Bases: object

This is the generic interface implemented by all the model APIs

api_token: str

This is the API token for the model

base_url: str

This is the default URL that has to be pinged. This may not necessarily be the REST endpoint URL itself.

chat(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the blocking function to chat with the model
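
A minimal sketch of a blocking call; Openai here stands in for any concrete ModelInterface implementation your installation exposes (the class name and import path are assumptions):

   from tuneapi.apis import Openai  # assumed provider class
   from tuneapi.types.chats import Thread, human

   model = Openai()
   thread = Thread(human("What is the capital of France?"))
   reply = model.chat(thread, max_tokens=256, temperature=0.7)
   print(reply)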

async chat_async(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the async function to chat with the model

distributed_chat(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, **kwargs)

This is the blocking function to chat with the model in a distributed manner

async distributed_chat_async(prompts: List[Thread], post_logic: callable | None = None, max_threads: int = 10, retry: int = 3, pbar=True, **kwargs)

This is the async function to chat with the model in a distributed manner
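
A sketch of fanning out many prompts with the blocking variant, reusing the model object from the chat sketch above:

   from tuneapi.types.chats import Thread, human

   prompts = [Thread(human(f"Summarise document {i}")) for i in range(100)]
   # runs up to 10 workers, retrying each failed prompt up to 3 times
   results = model.distributed_chat(prompts, max_threads=10, retry=3)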

embedding(chats: Thread | List[str] | str, model: str, token: str | None, timeout: Tuple[int, int], raw: bool, extra_headers: Dict[str, str] | None) EmbeddingGen

This is the blocking function to get embeddings for the chat
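
A sketch of a blocking embedding call, again reusing the model object from above (the model name is illustrative, and all arguments after chats are required by the signature above):

   emb = model.embedding(
       "hello world",
       model="some-embedding-model",  # illustrative model name
       token=None,
       timeout=(5, 30),
       raw=False,
       extra_headers=None,
   )
   vectors = emb.embedding  # List[List[float]]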

async embedding_async(chats: Thread | List[str] | str, model: str, token: str | None, timeout: Tuple[int, int], raw: bool, extra_headers: Dict[str, str] | None) EmbeddingGen

This is the async function to get embeddings for the chat

extra_headers: Dict[str, Any]

This is the placeholder for any extra headers to be passed during requests

image_gen(prompt: str, style: str, model: str, n: int, size: str, **kwargs) ImageGen

This is the blocking function to generate images
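
A sketch of generating an image (the style, model, and size values are illustrative):

   gen = model.image_gen(
       prompt="a watercolour fox in the snow",
       style="natural",           # illustrative
       model="some-image-model",  # illustrative
       n=1,
       size="1024x1024",
   )
   gen.image  # a PIL Image, per the ImageGen type above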

async image_gen_async(prompt: str, style: str, model: str, n: int, size: str, **kwargs) ImageGen

This is the async function to generate images

model_id: str

This is the model ID for the model

set_api_token(token: str) None

This is used to set the API token for the model

speech_to_text(prompt: str, audio: str, model: str, timestamp_granularities: List[str], **kwargs) Transcript

This is the blocking function to convert speech to text

async speech_to_text_async(prompt: str, audio: str, model: str, timestamp_granularities=['segment'], **kwargs) Transcript

This is the async function to convert speech to text

stream_chat(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 60), raw: bool = False, debug: bool = False, extra_headers: Dict[str, str] | None = None)

This is the blocking function to stream chat with the model where each token is iteratively generated
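
A sketch of consuming the stream, reusing the model and thread objects from the chat sketch above (each yielded item is typically a text fragment):

   for token in model.stream_chat(thread, max_tokens=256):
       print(token, end="", flush=True)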

async stream_chat_async(chats: Thread, model: str | None = None, max_tokens: int = 1024, temperature: float = 1, token: str | None = None, timeout=(5, 30), extra_headers: Dict[str, str] | None = None, **kwargs) str | Dict[str, Any]

This is the async function to stream chat with the model where each token is iteratively generated

class tuneapi.types.chats.Thread(*chats: List[Message] | Message, evals: Dict[str, Any] | None = None, model: str | None = None, id: str = '', title: str = '', tools: List[Tool] | None = None, schema: BaseModel | None = None, **kwargs)

Bases: object

This is a container for a list of chat messages. It follows an interface similar to a Python list. See the methods below for more information.

Parameters:

*chats – List of chat Message objects

append(message: Message)
copy() Thread
classmethod from_dict(data: Dict[str, Any]) Thread
pop(message: Message = None)
to_dict(full: bool = False)
to_ft(id: Any = None, drop_last: bool = False) Tuple[Dict[str, Any], Dict[str, Any]]
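
A sketch of the list-like interface:

   from tuneapi.types.chats import Thread, assistant, human

   thread = Thread(human("hi"), assistant("hello!"))
   thread.append(human("how are you?"))

   clone = thread.copy()  # independent copy
   restored = Thread.from_dict(thread.to_dict())
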
class tuneapi.types.chats.ThreadsList

Bases: list

This class implements some basic container methods for a list of Thread objects

add(x: Thread)
append(_ThreadsList__object: Thread) None

Append object to the end of the list.

create_te_split(test_items: int | float = 0.1) Tuple[ThreadsList, ...]
extend(_ThreadsList__iterable: Iterable) None

Extend list by appending elements from the iterable.

classmethod from_dict(data)
classmethod from_disk(folder: str)
shuffle(seed: int | None = None) None

Perform an in-place shuffle

table() str
to_dict()
to_disk(folder: str, fmt: str | None = None, override: bool = False)
to_hf_dataset() Tuple[datasets.Dataset, List]
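
A sketch of collecting threads and cutting a train/eval split (thread here is any Thread object):

   from tuneapi.types.chats import ThreadsList

   tl = ThreadsList()
   tl.add(thread)
   tl.shuffle(seed=42)  # in-place shuffle
   train, test = tl.create_te_split(test_items=0.1)
   tl.to_disk("./threads", override=True)
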
class tuneapi.types.chats.ThreadsTree(*msgs: List[List | Message] | Message, id: str = None)

Bases: object

This is the tree representation of a thread, where each node is a Message object. Useful for regeneration and searching through a tree of conversations. This is a container providing all the necessary APIs.

class ROLLOUT

Bases: object

Continue = 'continue'
OneMoreRanker = 'one_more_ranker'
StopRollout = 'stop_rollout'
add(child: Message, to: Message = 'root') ThreadsTree
property breadth: int
copy() ThreadsTree
property degree_of_tree: int
delete(from_: Message) ThreadsTree
classmethod from_dict(data: Dict[str, Any]) ThreadsTree
property latest_message: Message
property latest_node: Node
pick(to: Message = None, from_: Message = None) Thread

A powerful method to get a Thread from the tree structure by specifying the to and from_ messages in the tree

regenerate(api: ModelInterface, /, from_: Message = None, prompt: str = None, dry: bool = False, **api_kwargs)
regenerate_stream(api: ModelInterface, /, from_: Message = None, prompt: str = None, dry: bool = False, **api_kwargs)
rollout(message_gen_fn: callable = None, value_fn: callable = None, from_: Message = None, max_rollouts: int = 20, depth: int = 5, children: int = 5, retry: int = 1)
property size: int
step(api: ModelInterface, /, from_: Message) Message
step_stream(api: ModelInterface, /, from_: Message) Generator[Message, None, None]
to_dict() Dict[str, Any]
undo() ThreadsTree
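
A sketch of building a small tree with two alternative replies and picking one linear Thread out of it:

   from tuneapi.types.chats import ThreadsTree, assistant, human

   question = human("What should I cook tonight?")
   tree = ThreadsTree(question)
   reply_a = assistant("How about pasta?")
   reply_b = assistant("How about a salad?")
   tree.add(reply_a, to=question)  # first branch
   tree.add(reply_b, to=question)  # sibling branch

   # extract the linear conversation ending at one of the replies
   thread = tree.pick(to=reply_a)
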
class tuneapi.types.chats.Tool(name: str, description: str, parameters: List[Prop])

Bases: object

A tool is a container for telling the LLM what it can do. This is a standard definition.

class Prop(name: str, description: str, type: str, required: bool = False, items: Dict | None = None, enum: List[str] | None = None)

Bases: object

An individual property is called a prop.

classmethod from_dict(x)
to_dict()
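
A sketch of a standard tool definition; Prop is referenced here as Tool.Prop, following the nesting shown above:

   from tuneapi.types.chats import Tool

   get_weather = Tool(
       name="get_weather",
       description="Get the current weather for a city",
       parameters=[
           Tool.Prop(name="city", description="City name",
                     type="string", required=True),
           Tool.Prop(name="unit", description="Temperature unit",
                     type="string", enum=["celsius", "fahrenheit"]),
       ],
   )
   payload = get_weather.to_dict()
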
class tuneapi.types.chats.Transcript(*, segments: list[WebVTTCue])

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

segments: list[WebVTTCue]
property text
to(format: str = 'text')
class tuneapi.types.chats.Usage(input_tokens: int, output_tokens: int, cached_tokens: int | None = 0, **kwargs)

Bases: object

cost(input_token_per_million: float, cache_token_per_million: float, output_token_per_million: float) float
to_json(*a, **k) str
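
A sketch of turning token counts into spend, with illustrative per-million-token prices:

   from tuneapi.types.chats import Usage

   u = Usage(input_tokens=1200, output_tokens=300, cached_tokens=200)
   dollars = u.cost(
       input_token_per_million=5.0,
       cache_token_per_million=2.5,
       output_token_per_million=15.0,
   )
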
class tuneapi.types.chats.WebVTTCue(*, start: str, end: str, text: str)

Bases: BaseModel

end: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

start: str
text: str
tuneapi.types.chats.assistant = functools.partial(<class 'tuneapi.types.chats.Message'>, role='gpt')

Convenience for creating an assistant message

tuneapi.types.chats.function_call = functools.partial(<class 'tuneapi.types.chats.Message'>, role='function_call')

Convenience for creating a function call message

tuneapi.types.chats.function_resp = functools.partial(<class 'tuneapi.types.chats.Message'>, role='function_resp')

Convenience for creating a function response message

tuneapi.types.chats.get_transcript(text: str)

Parses a WebVTT string and returns a list of WebVTTCue objects.
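
A sketch of parsing a small WebVTT payload:

   import textwrap

   from tuneapi.types.chats import get_transcript

   vtt = textwrap.dedent("""\
       WEBVTT

       00:00:00.000 --> 00:00:02.000
       Hello there.

       00:00:02.000 --> 00:00:04.000
       Welcome to the show.
       """)
   cues = get_transcript(vtt)  # list of WebVTTCue objects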

tuneapi.types.chats.human = functools.partial(<class 'tuneapi.types.chats.Message'>, role='human')

Convenience for creating a human message

tuneapi.types.chats.system = functools.partial(<class 'tuneapi.types.chats.Message'>, role='system')

Convenience for creating a system message

tuneapi.types.evals module

class tuneapi.types.evals.Evals

Bases: object

A simple class containing different evaluation metrics. Each function is self-explanatory and returns a JSON logic object.

contains()
exactly()
is_function(**kwargs: dict)

Module contents