Introduction
Pay for performance, not power: access dozens of LLMs through a single API.
Kolank discovers the best price for each model across dozens of providers, so you can concentrate on improving your product rather than searching for the optimal model.
Quickstart
Authentication
To integrate Kolank, just replace your OpenAI settings as follows:

- Update `BASE_URL` to `https://kolank.com/api/v1`
- Replace `OPENAI_API_KEY` with your `KOLANK_API_KEY`
- Set `MODEL` to any model name from the Supported LLMs table below
Making requests
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://kolank.com/api/v1",
    api_key="<YOUR KOLANK_API_KEY>",  # get one from https://kolank.com/keys
)

completion = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
)

print(completion.choices[0].message.content)
```
```shell
curl https://kolank.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KOLANK_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://kolank.com/api/v1',
  apiKey: '<YOUR KOLANK_API_KEY>', // get one from https://kolank.com/keys
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: 'openai/gpt-4o',
    messages: [{ role: 'user', content: 'What is the capital of France?' }],
  });
  console.log(completion.choices[0].message);
}

main();
```
Each of the requests above returns JSON structured like this:
```json
{
  "id": "chatcmpl-9juedQHAO0LLaqqr1yRDYWg0afZhx",
  "object": "chat.completion",
  "created": 1720729567,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Sure! Let's break down quantum computing into simpler terms ..."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 392,
    "total_tokens": 405
  },
  "system_fingerprint": "fp_d33f7b429e"
}
```
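As a quick sanity check, the response body can be handled with nothing but the standard library; the `usage` object carries the billing-relevant token counts. The literal below is an abridged copy of the sample response above:

```python
import json

# Abridged copy of the sample response above, as a JSON literal.
raw = """
{
  "id": "chatcmpl-9juedQHAO0LLaqqr1yRDYWg0afZhx",
  "object": "chat.completion",
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Sure! Let's break down quantum computing ..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 392, "total_tokens": 405}
}
"""

response = json.loads(raw)
answer = response["choices"][0]["message"]["content"]
usage = response["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(usage["total_tokens"])  # 405
```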
Creates a model response for the given chat conversation.
Endpoint
POST https://kolank.com/api/v1/chat/completions
Request body
Parameter | Type | Required | Description |
---|---|---|---|
messages | array | Required | List of messages in the conversation. |
model | string | Required | ID of the model to use. See the Supported LLMs table below for available model names. |
frequency_penalty | number or null | Optional | Penalizes repeated tokens. Range: -2.0 to 2.0. |
logit_bias | map | Optional | Adjusts token likelihood with a JSON object. Range: -100 to 100. |
logprobs | boolean or null | Optional | Returns log probabilities of output tokens if true. |
top_logprobs | integer or null | Optional | Returns top token probabilities. Range: 0 to 20. |
max_tokens | integer or null | Optional | Maximum tokens in the chat completion. Limited by the model's context length. |
n | integer or null | Optional | Number of completion choices per message. Defaults to 1 to minimize costs. |
presence_penalty | number or null | Optional | Encourages diverse topic exploration. Range: -2.0 to 2.0. |
response_format | object | Optional | Specifies the output format for the model. Supports JSON mode. |
seed | integer or null | Optional | Beta feature. Optional seed for deterministic sampling. |
service_tier | string or null | Optional | Specifies the latency tier for request processing. Relevant for scale-tier service. |
stop | string / array / null | Optional | Up to 4 sequences where the API stops generating tokens. |
stream | boolean or null | Optional | Enables partial message deltas in streaming mode. |
stream_options | object or null | Optional | Options for the streaming response. |
temperature | number or null | Optional | Controls output randomness. Range: 0 to 2. Defaults to 1. |
top_p | number or null | Optional | Uses nucleus sampling to control token diversity. Range: 0 to 1. Defaults to 1. |
tools | array | Optional | List of tools (functions) the model may call. Up to 128 functions supported. |
tool_choice | string or object | Optional | Controls model behavior regarding tool usage. Defaults to "none". |
parallel_tool_calls | boolean | Optional | Whether to enable parallel function calling during tool use. Defaults to true. |
user | string | Optional | Unique identifier for end-user monitoring and abuse detection. |
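To show how the optional parameters combine in practice, here is a request body assembled in Python. The parameter values are illustrative choices for this sketch, not recommendations:

```python
import json

# Illustrative request body combining several optional parameters
# from the table above; the values are examples, not recommendations.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "temperature": 0.2,   # 0 to 2; lower means more deterministic output
    "max_tokens": 100,    # cap on completion length
    "stop": ["\n\n"],     # up to 4 stop sequences
    "stream": False,      # set True to receive partial message deltas
    "user": "user-1234",  # end-user identifier for abuse monitoring
}

# This string is what goes into the POST body (curl's -d argument).
body = json.dumps(payload)
print(body)
```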
Supported LLMs
Model Name | $ per 1M input tokens | $ per 1M output tokens | Context (tokens) |
---|---|---|---|
01-ai/Yi-34B-Chat | $0.2 | $0.2 | 4096 |
alpindale/goliath-120b | $15 | $15 | 6144 |
anthropic/claude-1 | $8 | $24 | 100000 |
anthropic/claude-1.2 | $8 | $24 | 100000 |
anthropic/claude-2 | $8 | $24 | 200000 |
anthropic/claude-2.0 | $8 | $24 | 100000 |
anthropic/claude-2.1 | $8 | $24 | 200000 |
Anthropic/claude-3-haiku | $0.25 | $1.25 | 200000 |
Anthropic/claude-3-opus | $15 | $75 | 200000 |
Anthropic/claude-3-sonnet | $3 | $15 | 200000 |
anthropic/claude-instant-1 | $0.8 | $2.4 | 100000 |
anthropic/claude-instant-1.0 | $0.8 | $2.4 | 100000 |
anthropic/claude-instant-1.1 | $0.8 | $2.4 | 100000 |
anthropic/claude-instant-1.2 | $0.8 | $2.4 | 100000 |
Austism/chronos-hermes-13b-v2 | $0.13 | $0.13 | 4096 |
brucethemoose/yi-34b-200k-capybara | $0.9 | $0.9 | 200000 |
codellama/CodeLlama-34b-Instruct-hf | $0.8 | $0.8 | 32768 |
codellama/CodeLlama-70b-Instruct-hf | $0.9 | $0.9 | 4096 |
cognitivecomputations/dolphin-2.6-mixtral-8x7b | $0.24 | $0.24 | 32768 |
Cohere/command | $1 | $2 | 4096 |
Cohere/command-r | $0.5 | $1.5 | 128000 |
Cohere/command-r-plus | $3 | $15 | 128000 |
Databricks/dbrx-instruct | $1.2 | $1.2 | 32768 |
deepseek-ai/deepseek-coder-33b-instruct | $0.8 | $0.8 | 16384 |
fireworks/firefunction-v2 | $0.9 | $0.9 | 8192 |
fw/firellava-13b | $0.2 | $0.2 | 4096 |
garage-bAInd/Platypus2-70B-instruct | $0.9 | $0.9 | 4096 |
google/gemini-flash-1.5 | $0.25 | $0.75 | 2800000 |
google/gemini-pro | $0.13 | $0.38 | 91000 |
google/gemini-pro-1.5 | $2.6 | $7.6 | 2800000 |
google/gemini-pro-vision | $0.13 | $0.38 | 40000 |
google/gemma-7b-it | $0.07 | $0.07 | 8192 |
google/palm-2-chat-bison | $0.3 | $0.5 | 20000 |
google/palm-2-chat-bison-32k | $0.3 | $0.5 | 32000 |
google/palm-2-codechat-bison | $0.3 | $0.5 | 28000 |
google/palm-2-codechat-bison-32k | $0.3 | $0.5 | 80000 |
Gryphe/MythoMax-L2-13b | $0.13 | $0.13 | 4096 |
Gryphe/MythoMist-7b | $0.6 | $0.6 | 32768 |
jondurbin/airoboros-l2-70b-gpt4-1.4.1 | $0.7 | $0.9 | 4096 |
lizpreciatior/lzlv_70b_fp16_hf | $0.59 | $0.79 | 4096 |
lmsys/vicuna-13b-v1.5 | $0.3 | $0.3 | 4096 |
lmsys/vicuna-7b-v1.5 | $0.13 | $0.13 | 4096 |
Mancer/weaver-alpha | $3.6 | $3.6 | 8000 |
meta-llama/Llama-2-13b-chat-hf | $0.13 | $0.13 | 4096 |
meta-llama/Llama-2-70b-chat-hf | $0.64 | $0.8 | 4096 |
meta-llama/Llama-2-7b-chat-hf | $0.2 | $0.2 | 4096 |
Meta-llama/Meta-Llama-3-70B-Instruct | $0.59 | $0.79 | 8192 |
Meta-llama/Meta-Llama-3-8B-Instruct | $0.08 | $0.08 | 8192 |
Microsoft/WizardLM-2-7B | $0.07 | $0.07 | 32000 |
Microsoft/WizardLM-2-8x22B | $0.65 | $0.65 | 64000 |
mistralai/Mistral-7B-Instruct-v0.1 | $0.07 | $0.07 | 32768 |
mistralai/Mistral-7B-Instruct-v0.2 | $0.07 | $0.07 | 32768 |
mistralai/mistral-7b-instruct-v3 | $0.07 | $0.07 | 32768 |
Mistralai/mistral-large | $8 | $24 | 32000 |
Mistralai/mistral-small | $1 | $3 | 32000 |
Mistralai/Mixtral-8x22B-Instruct-v0.1 | $0.6 | $0.6 | 64000 |
mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.2 | $0.2 | 32768 |
NeverSleep/Noromaid-20b-v0.1.1 | $2.4 | $3.6 | 8192 |
NousResearch/Nous-Capybara-7B-V1p9 | $0.2 | $0.2 | 32768 |
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | $0.5 | $0.5 | 32768 |
NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT | $0.6 | $0.6 | 32768 |
NousResearch/Nous-Hermes-2-Yi-34B | $0.13 | $0.13 | 4096 |
NousResearch/Nous-Hermes-llama-2-7b | $0.2 | $0.2 | 4096 |
NousResearch/Nous-Hermes-Llama2-13b | $0.3 | $0.3 | 4096 |
Open-Orca/Mistral-7B-OpenOrca | $0.2 | $0.2 | 32768 |
openai/gpt-3.5-turbo | $3 | $6 | 16385 |
openai/gpt-3.5-turbo-0125 | $0.13 | $0.13 | 16000 |
openai/gpt-3.5-turbo-0613 | $1.5 | $2 | 4096 |
openai/gpt-3.5-turbo-1106 | $1 | $2 | 16385 |
openai/gpt-3.5-turbo-16k | $3 | $4 | 16385 |
openai/gpt-3.5-turbo-16k-0613 | $3 | $4 | 16385 |
openai/gpt-3.5-turbo-instruct | $1.5 | $2 | 4096 |
openai/gpt-4 | $30 | $60 | 8192 |
openai/gpt-4-0125-preview | $10 | $30 | 128000 |
openai/gpt-4-0613 | $30 | $60 | 8192 |
openai/gpt-4-1106-preview | $10 | $30 | 128000 |
openai/gpt-4-32k | $60 | $120 | 32000 |
openai/gpt-4-32k-0613 | $60 | $120 | 32000 |
openai/gpt-4-turbo-2024-04-09 | $10 | $30 | 128000 |
openai/gpt-4-turbo-preview | $10 | $30 | 128000 |
openai/gpt-4-vision-preview | $10 | $30 | 128000 |
openai/gpt-4o | $5 | $15 | 128000 |
openai/gpt-4o-2024-05-13 | $5 | $15 | 128000 |
openchat/openchat-3.6-8b | $0.08 | $0.08 | 8192 |
openchat/openchat_3.5 | $0.07 | $0.07 | 8192 |
Phind/Phind-CodeLlama-34B-v2 | $0.6 | $0.6 | 16384 |
PygmalionAI/mythalion-13b | $1.8 | $1.8 | 8192 |
Qwen/Qwen1.5-0.5B-Chat | $0.1 | $0.1 | 32768 |
Qwen/Qwen1.5-1.8B-Chat | $0.1 | $0.1 | 32768 |
Qwen/Qwen1.5-14B-Chat | $0.2 | $0.2 | 32000 |
Qwen/Qwen1.5-4B-Chat | $0.1 | $0.1 | 32768 |
Qwen/Qwen1.5-72B-Chat | $0.9 | $0.9 | 32768 |
Qwen/Qwen1.5-7B-Chat | $0.1 | $0.1 | 32768 |
snorkelai/Snorkel-Mistral-PairRM-DPO | $0.2 | $0.2 | 32768 |
teknium/OpenHermes-2-Mistral-7B | $0.2 | $0.2 | 32768 |
teknium/OpenHermes-2p5-Mistral-7B | $0.2 | $0.2 | 32768 |
togethercomputer/StripedHyena-Nous-7B | $0.2 | $0.2 | 32768 |
Undi95/ReMM-SLERP-L2-13B | $0.3 | $0.3 | 4096 |
Undi95/Toppy-M-7B | $0.2 | $0.2 | 4096 |
Xwin-LM/Xwin-LM-70B-V0.1 | $6 | $6 | 8192 |
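Prices in the table are per million tokens, so the cost of a single request is a simple pro-rata calculation over the token counts in the response's `usage` object. A minimal sketch, with the rates for `openai/gpt-4o` hard-coded from the table above:

```python
# Estimate the dollar cost of one request from per-1M-token rates.
# The rates below are for openai/gpt-4o, taken from the table above.
INPUT_RATE = 5.0    # $ per 1M input tokens
OUTPUT_RATE = 15.0  # $ per 1M output tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Pro-rate the per-1M rates over the tokens actually used."""
    return (prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE) / 1_000_000

# Using the usage numbers from the sample response (13 prompt, 392 completion):
# 13 * 5 / 1M + 392 * 15 / 1M = 0.000065 + 0.00588
print(f"${request_cost(13, 392):.6f}")  # $0.005945
```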