Authentication and setup: do this before writing any code
Go to platform.openai.com and create an account (billing is separate from ChatGPT, though the same email works). Navigate to API Keys and click "Create new secret key." Name it descriptively ("Make.com Production"). Copy it immediately — it is shown only once. Store it in a password manager or an environment variable; never hard-code it in source.
Set a spending limit first: go to Settings → Limits and set a monthly budget ($10–$25 to start). When the cap is hit, API calls fail cleanly with a 429 error rather than generating unbounded charges. Configure email notifications at 50% and 90% of the budget. This takes two minutes and prevents the single most common unpleasant surprise.
```python
import openai, os, json

client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Your system prompt."},
        {"role": "user", "content": "The input to process."}
    ],
    temperature=0.2,
    max_tokens=300,
    response_format={"type": "json_object"}
)
result = json.loads(response.choices[0].message.content)
```

Chat Completions API: parameter reference for automation
Key parameters with automation-specific guidance
| Parameter | Recommended values | What it does |
|---|---|---|
| model | gpt-4o-mini (volume), gpt-4o (quality) | Choose the model — the roughly 17x price gap matters at scale |
| temperature | 0.0–0.2 for classification; 0.4–0.7 for generation | Controls output randomness; lower = more consistent |
| max_tokens | 150–300 for classification; 500–1500 for generation | Sets maximum output length — prevents unexpectedly long responses |
| response_format | {"type":"json_object"} for structured tasks | Forces valid JSON; eliminates markdown fences in output |
| seed | Any fixed integer during testing | Reproducible outputs for debugging (not guaranteed in production) |
The model selection decision: use gpt-4o-mini for all classification, extraction, routing, and summarisation tasks where you process more than a few hundred items per month. Use gpt-4o for complex reasoning, nuanced content generation, long documents, and agentic workflows. Test both models on 20 real examples of your specific task before committing — for simple tasks the quality difference is often negligible, while the price difference (roughly 17x at the rates listed below) is not.
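The 20-example bake-off can be automated with a small harness. A sketch — `compare_models` and `fake_classify` are illustrative names, not part of the SDK, and `classify` is a placeholder for your real Chat Completions call:

```python
def compare_models(examples, classify, models=("gpt-4o-mini", "gpt-4o")):
    """Run each example through every model and collect outputs for review."""
    return {ex: {m: classify(ex, m) for m in models} for ex in examples}

# Usage with a stand-in classifier; swap in a real API call before judging quality:
fake_classify = lambda text, model: f"{model}:{len(text)}"
table = compare_models(["Refund request", "Where is my order?"], fake_classify)
```

Review the resulting table side by side: if the two models agree on nearly all 20 examples, the cheaper one wins.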
Other API endpoints for automation
Embeddings: the foundation of semantic search and RAG
The Embeddings API converts text into vectors that capture semantic meaning. Two texts with similar meaning have similar vectors, which enables semantic search, similarity-based classification, and the retrieval layer of RAG pipelines. text-embedding-3-small at $0.02/million tokens handles most use cases.
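"Similar meaning → similar vectors" is usually quantified with cosine similarity. A minimal sketch in plain Python, using toy 3-dimensional vectors in place of real 1536-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for real embeddings:
refund = [0.9, 0.1, 0.0]        # "I want my money back"
chargeback = [0.85, 0.2, 0.1]   # "Please reverse this charge"
weather = [0.0, 0.2, 0.95]      # "Lovely weather today"

close = cosine_similarity(refund, chargeback)  # related texts score near 1.0
far = cosine_similarity(refund, weather)       # unrelated texts score near 0.0
```

With real embeddings the same function works unchanged; only the vector length grows.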
```python
r = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Text to convert to vector"]
)
vector = r.data[0].embedding  # list of 1536 floats
```

Whisper: audio transcription at $0.006/minute
Transcribes audio to text with high accuracy. Used in meeting summarisation pipelines, voicemail processing, and voice-to-task workflows. Accepts MP3, M4A, WAV, and other common formats.
```python
with open("meeting.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-1", file=f, language="en"
    )
transcript = result.text  # cost: $0.006 per minute of audio
```

Vision: document extraction from images
GPT-4o and GPT-4o-mini accept image inputs alongside text. Used for invoice extraction from scanned documents, screenshot analysis, and any document that cannot be easily converted to plain text.
```python
import base64

with open("invoice.jpg", "rb") as f:
    img = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Extract invoice data as JSON."},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img}"}}
    ]}],
    response_format={"type": "json_object"}
)
```

Error handling: production automation error taxonomy
Complete error reference with recommended actions
| HTTP | Error | Cause | Action |
|---|---|---|---|
| 400 | Bad Request | Invalid params, bad model name, wrong format | Log and halt — this is a code bug, not transient |
| 401 | Auth Error | Invalid or revoked API key | Alert immediately — automation cannot proceed |
| 429 (rate) | Rate Limited | Too many requests per minute | Retry with exponential backoff: 1s, 2s, 4s, 8s |
| 429 (quota) | Budget Exceeded | Monthly spending cap reached | Halt and alert — don't retry until next billing cycle |
| 500 | Server Error | Transient OpenAI infrastructure issue | Retry 3 times with backoff; then skip item and log |
| 503 | Unavailable | OpenAI degradation/maintenance | Retry with 60s delay; check status.openai.com |
```python
import time
from openai import RateLimitError, APIStatusError, APIConnectionError

def api_with_retry(messages, model="gpt-4o-mini", max_retries=3):
    for attempt in range(max_retries):
        try:
            r = client.chat.completions.create(
                model=model, messages=messages, temperature=0.2,
                response_format={"type": "json_object"}
            )
            return r.choices[0].message.content
        except RateLimitError as e:
            if "insufficient_quota" in str(e):
                raise  # budget exhausted — don't retry
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            if e.status_code == 400:
                raise  # code bug — don't retry
            time.sleep(2 ** attempt * 2)
        except APIConnectionError:
            time.sleep(2 ** attempt * 5)
    raise Exception(f"API failed after {max_retries} attempts")
```

Cost tracking and the Batch API
Logging token costs per call
```python
response = client.chat.completions.create(...)  # any completed call
usage = response.usage

# Prices Nov 2024 — verify current at platform.openai.com/pricing
PRICES = {
    "gpt-4o-mini-2024-07-18": {"in": 0.15/1e6, "out": 0.60/1e6},
    "gpt-4o-2024-08-06": {"in": 2.50/1e6, "out": 10.0/1e6},
}
m = response.model
p = PRICES.get(m, PRICES["gpt-4o-mini-2024-07-18"])  # fall back if model unlisted
cost = usage.prompt_tokens * p["in"] + usage.completion_tokens * p["out"]
print(f"Tokens: {usage.total_tokens} | Cost: ${cost:.5f}")
```

The Batch API: 50% discount for non-real-time workloads
Submit a JSONL file of requests and receive results within 24 hours at 50% of the standard API price. Perfect for: nightly lead scoring batches, weekly content analysis, monthly report generation, and any automation task without real-time requirements. No change in output quality — same models, same results, half the cost.
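Submission (shown below) returns a batch id, but results arrive asynchronously: you poll the batch until it completes, then download and parse the output file. A sketch — `batches.retrieve`, `files.content`, and the JSONL result shape follow the v1 Python SDK; the two helper names are my own:

```python
import json
import time

def wait_for_batch(client, batch_id, poll_seconds=60):
    """Poll until the batch finishes, then return the raw output JSONL."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status == "completed":
            return client.files.content(batch.output_file_id).text
        if batch.status in ("failed", "expired", "cancelled"):
            raise RuntimeError(f"batch ended with status {batch.status}")
        time.sleep(poll_seconds)

def parse_batch_output(jsonl_text):
    """Map each custom_id to the model's output content."""
    results = {}
    for line in jsonl_text.splitlines():
        row = json.loads(line)
        results[row["custom_id"]] = \
            row["response"]["body"]["choices"][0]["message"]["content"]
    return results
```

The `custom_id` you set at submission time is how you join results back to your source records, since output order is not guaranteed.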
```python
import json

# Build the JSONL request file (items_to_process and SYSTEM_PROMPT are
# defined elsewhere in your pipeline)
requests_data = []
for item in items_to_process:
    requests_data.append(json.dumps({
        "custom_id": item["id"],
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": item["content"]}
            ],
            "response_format": {"type": "json_object"}
        }
    }))

with open("batch.jsonl", "w") as f:
    f.write("\n".join(requests_data))

# Submit
with open("batch.jsonl", "rb") as f:
    bf = client.files.create(file=f, purpose="batch")
batch = client.batches.create(
    input_file_id=bf.id, endpoint="/v1/chat/completions", completion_window="24h"
)
print(f"Batch submitted: {batch.id} — results within 24 hours")
```

Frequently asked questions
Is the OpenAI API the same thing as ChatGPT?
No. ChatGPT is a web application for manual use. The OpenAI API is a programmatic interface that software calls directly — it is what powers Make.com's OpenAI modules, n8n's AI nodes, and all Python-based AI automation. The two have separate billing accounts, though you can use the same email. ChatGPT Plus does not include API access; sign up separately at platform.openai.com.
What if my input is too long for the context window?
GPT-4o has a 128,000-token context window (roughly 96,000 words) — effectively unlimited for most business automation. If you do encounter long inputs: truncate to the relevant sections before the API call (a preprocessing step that extracts only what the AI needs); chunk long documents and process them in overlapping segments; or use RAG to retrieve only the relevant chunks. For most business use cases, truncating to the first N tokens is the simplest reliable approach.
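Truncation to the first N tokens can be approximated without a tokenizer using the rough 4-characters-per-token rule for English text; switch to an exact tokenizer like tiktoken when precision matters. A sketch (the heuristic and helper name are illustrative):

```python
def truncate_to_tokens(text, max_tokens, chars_per_token=4):
    """Rough truncation via the ~4 chars/token heuristic for English text."""
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    # Cut at a word boundary so the model never sees half a word
    return text[:max_chars].rsplit(" ", 1)[0]
```

Because the heuristic undercounts for code and non-English text, leave a safety margin below the real context limit.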
Will OpenAI train on my data?
Not by default: per the API data usage policy, inputs sent to the API are not used to train OpenAI's models. For sensitive data (healthcare, financial, legal), review their data processing agreement and your specific regulatory requirements. Options for higher sensitivity: data minimisation (send only the fields needed); OpenAI's zero data retention option; or self-hosted open-source models via Ollama, where no data leaves your infrastructure.
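Data minimisation can be as simple as whitelisting fields before anything leaves your system. A sketch with a hypothetical ticket schema (field names are illustrative):

```python
ALLOWED_FIELDS = {"subject", "body", "product"}  # hypothetical schema

def minimise(record, allowed=ALLOWED_FIELDS):
    """Keep only the fields the prompt actually needs."""
    return {k: v for k, v in record.items() if k in allowed}

ticket = {"subject": "Refund", "body": "Please refund order 1017",
          "email": "jane@example.com", "card_last4": "4242"}
safe = minimise(ticket)  # email and card number never reach the API
```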
ThinkForAI Editorial Team
All code verified in production. Updated November 2024.
