Authentication and setup: do this before writing any code
Go to platform.openai.com and create an account (billing is separate from ChatGPT, though the same email works). Navigate to API Keys and click "Create new secret key." Name it descriptively ("Make.com Production"). Copy it immediately — it is shown only once. Store it in a password manager or an environment variable; never hard-code it in source.
Set a spending limit first: go to Settings → Limits and set a monthly budget ($10–$25 to start). When the cap is hit, API calls fail cleanly with a 429 error rather than generating unbounded charges. Configure email notifications at 50% and 90% of the budget. This takes two minutes and prevents the single most common unpleasant surprise.
```python
import openai, os, json

client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Your system prompt."},
        {"role": "user", "content": "The input to process."}
    ],
    temperature=0.2,
    max_tokens=300,
    response_format={"type": "json_object"}
)
result = json.loads(response.choices[0].message.content)
```

Chat Completions API: parameter reference for automation
Key parameters with automation-specific guidance
| Parameter | Recommended values | What it does |
|---|---|---|
| model | gpt-4o-mini (volume), gpt-4o (quality) | Choose the model — the roughly 17x price gap matters at scale |
| temperature | 0.0–0.2 for classification; 0.4–0.7 for generation | Controls output randomness; lower = more consistent |
| max_tokens | 150–300 for classification; 500–1500 for generation | Sets maximum output length — prevents unexpectedly long responses |
| response_format | {"type":"json_object"} for structured tasks | Forces valid JSON; eliminates markdown fences in output |
| seed | Any fixed integer during testing | Reproducible outputs for debugging (not guaranteed in production) |
The model selection decision: use gpt-4o-mini for all classification, extraction, routing, and summarisation tasks where you process more than a few hundred items per month. Use gpt-4o for complex reasoning, nuanced content generation, long documents, and agentic workflows. Test both models on 20 real examples of your specific task before committing — for simple tasks the quality difference is often negligible, while the price difference (roughly 17x at the rates listed below) is not.
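The 20-example bake-off can be automated with a small harness. A sketch — `compare_models` and `fake_classify` are illustrative names, not part of the SDK, and `classify` is a placeholder for your real Chat Completions call:

```python
def compare_models(examples, classify, models=("gpt-4o-mini", "gpt-4o")):
    """Run each example through every model and collect outputs for review."""
    return {ex: {m: classify(ex, m) for m in models} for ex in examples}

# Usage with a stand-in classifier; swap in a real API call before judging quality:
fake_classify = lambda text, model: f"{model}:{len(text)}"
table = compare_models(["Refund request", "Where is my order?"], fake_classify)
```

Review the resulting table side by side: if the two models agree on nearly all 20 examples, the cheaper one wins.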
Other API endpoints for automation
Embeddings: the foundation of semantic search and RAG
The Embeddings API converts text into vectors that capture semantic meaning. Two texts with similar meaning have similar vectors, which enables semantic search, similarity-based classification, and the retrieval layer of RAG pipelines. text-embedding-3-small at $0.02/million tokens handles most use cases.
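"Similar meaning → similar vectors" is usually quantified with cosine similarity. A minimal sketch in plain Python, using toy 3-dimensional vectors in place of real 1536-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for real embeddings:
refund = [0.9, 0.1, 0.0]        # "I want my money back"
chargeback = [0.85, 0.2, 0.1]   # "Please reverse this charge"
weather = [0.0, 0.2, 0.95]      # "Lovely weather today"

close = cosine_similarity(refund, chargeback)  # related texts score near 1.0
far = cosine_similarity(refund, weather)       # unrelated texts score near 0.0
```

With real embeddings the same function works unchanged; only the vector length grows.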
```python
r = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Text to convert to vector"]
)
vector = r.data[0].embedding  # list of 1536 floats
```

Whisper: audio transcription at $0.006/minute
Transcribes audio to text with high accuracy. Used in meeting summarisation pipelines, voicemail processing, and voice-to-task workflows. Accepts MP3, M4A, WAV, and other common formats.
```python
with open("meeting.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-1", file=f, language="en"
    )
transcript = result.text  # cost: $0.006 per minute of audio
```

Vision: document extraction from images
GPT-4o and GPT-4o-mini accept image inputs alongside text. Used for invoice extraction from scanned documents, screenshot analysis, and any document that cannot be easily converted to plain text.
```python
import base64

with open("invoice.jpg", "rb") as f:
    img = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Extract invoice data as JSON."},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img}"}}
    ]}],
    response_format={"type": "json_object"}
)
```

Error handling: production automation error taxonomy
Complete error reference with recommended actions
| HTTP | Error | Cause | Action |
|---|---|---|---|
| 400 | Bad Request | Invalid params, bad model name, wrong format | Log and halt — this is a code bug, not transient |
| 401 | Auth Error | Invalid or revoked API key | Alert immediately — automation cannot proceed |
| 429 (rate) | Rate Limited | Too many requests per minute | Retry with exponential backoff: 1s, 2s, 4s, 8s |
| 429 (quota) | Budget Exceeded | Monthly spending cap reached | Halt and alert — don't retry until next billing cycle |
| 500 | Server Error | Transient OpenAI infrastructure issue | Retry 3 times with backoff; then skip item and log |
| 503 | Unavailable | OpenAI degradation/maintenance | Retry with 60s delay; check status.openai.com |
```python
import time
from openai import RateLimitError, APIStatusError, APIConnectionError

def api_with_retry(messages, model="gpt-4o-mini", max_retries=3):
    for attempt in range(max_retries):
        try:
            r = client.chat.completions.create(
                model=model, messages=messages, temperature=0.2,
                response_format={"type": "json_object"}
            )
            return r.choices[0].message.content
        except RateLimitError as e:
            if "insufficient_quota" in str(e):
                raise  # budget exhausted — don't retry
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            if e.status_code == 400:
                raise  # code bug — don't retry
            time.sleep(2 ** attempt * 2)
        except APIConnectionError:
            time.sleep(2 ** attempt * 5)
    raise Exception(f"API failed after {max_retries} attempts")
```

Cost tracking and the Batch API
Logging token costs per call
```python
response = client.chat.completions.create(...)  # any completed call
usage = response.usage

# Prices Nov 2024 — verify current at platform.openai.com/pricing
PRICES = {
    "gpt-4o-mini-2024-07-18": {"in": 0.15/1e6, "out": 0.60/1e6},
    "gpt-4o-2024-08-06": {"in": 2.50/1e6, "out": 10.0/1e6},
}
m = response.model
p = PRICES.get(m, PRICES["gpt-4o-mini-2024-07-18"])  # fall back if model unlisted
cost = usage.prompt_tokens * p["in"] + usage.completion_tokens * p["out"]
print(f"Tokens: {usage.total_tokens} | Cost: ${cost:.5f}")
```

The Batch API: 50% discount for non-real-time workloads
Submit a JSONL file of requests and receive results within 24 hours at 50% of the standard API price. Perfect for: nightly lead scoring batches, weekly content analysis, monthly report generation, and any automation task without real-time requirements. No change in output quality — same models, same results, half the cost.
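Submission (shown below) returns a batch id, but results arrive asynchronously: you poll the batch until it completes, then download and parse the output file. A sketch — `batches.retrieve`, `files.content`, and the JSONL result shape follow the v1 Python SDK; the two helper names are my own:

```python
import json
import time

def wait_for_batch(client, batch_id, poll_seconds=60):
    """Poll until the batch finishes, then return the raw output JSONL."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status == "completed":
            return client.files.content(batch.output_file_id).text
        if batch.status in ("failed", "expired", "cancelled"):
            raise RuntimeError(f"batch ended with status {batch.status}")
        time.sleep(poll_seconds)

def parse_batch_output(jsonl_text):
    """Map each custom_id to the model's output content."""
    results = {}
    for line in jsonl_text.splitlines():
        row = json.loads(line)
        results[row["custom_id"]] = \
            row["response"]["body"]["choices"][0]["message"]["content"]
    return results
```

The `custom_id` you set at submission time is how you join results back to your source records, since output order is not guaranteed.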
```python
import json

# Build the JSONL request file (items_to_process and SYSTEM_PROMPT are
# defined elsewhere in your pipeline)
requests_data = []
for item in items_to_process:
    requests_data.append(json.dumps({
        "custom_id": item["id"],
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": item["content"]}
            ],
            "response_format": {"type": "json_object"}
        }
    }))

with open("batch.jsonl", "w") as f:
    f.write("\n".join(requests_data))

# Submit
with open("batch.jsonl", "rb") as f:
    bf = client.files.create(file=f, purpose="batch")
batch = client.batches.create(
    input_file_id=bf.id, endpoint="/v1/chat/completions", completion_window="24h"
)
print(f"Batch submitted: {batch.id} — results within 24 hours")
```

Frequently asked questions
Is the OpenAI API the same thing as ChatGPT?
No. ChatGPT is a web application for manual use. The OpenAI API is a programmatic interface that software calls directly — it is what powers Make.com's OpenAI modules, n8n's AI nodes, and all Python-based AI automation. The two have separate billing accounts, though you can use the same email. ChatGPT Plus does not include API access; sign up separately at platform.openai.com.
What if my input is too long for the context window?
GPT-4o has a 128,000-token context window (roughly 96,000 words) — effectively unlimited for most business automation. If you do encounter long inputs: truncate to the relevant sections before the API call (a preprocessing step that extracts only what the AI needs); chunk long documents and process them in overlapping segments; or use RAG to retrieve only the relevant chunks. For most business use cases, truncating to the first N tokens is the simplest reliable approach.
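Truncation to the first N tokens can be approximated without a tokenizer using the rough 4-characters-per-token rule for English text; switch to an exact tokenizer like tiktoken when precision matters. A sketch (the heuristic and helper name are illustrative):

```python
def truncate_to_tokens(text, max_tokens, chars_per_token=4):
    """Rough truncation via the ~4 chars/token heuristic for English text."""
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    # Cut at a word boundary so the model never sees half a word
    return text[:max_chars].rsplit(" ", 1)[0]
```

Because the heuristic undercounts for code and non-English text, leave a safety margin below the real context limit.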
Will OpenAI train on my data?
Not by default: per the API data usage policy, inputs sent to the API are not used to train OpenAI's models. For sensitive data (healthcare, financial, legal), review their data processing agreement and your specific regulatory requirements. Options for higher sensitivity: data minimisation (send only the fields needed); OpenAI's zero data retention option; or self-hosted open-source models via Ollama, where no data leaves your infrastructure.
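Data minimisation can be as simple as whitelisting fields before anything leaves your system. A sketch with a hypothetical ticket schema (field names are illustrative):

```python
ALLOWED_FIELDS = {"subject", "body", "product"}  # hypothetical schema

def minimise(record, allowed=ALLOWED_FIELDS):
    """Keep only the fields the prompt actually needs."""
    return {k: v for k, v in record.items() if k in allowed}

ticket = {"subject": "Refund", "body": "Please refund order 1017",
          "email": "jane@example.com", "card_last4": "4242"}
safe = minimise(ticket)  # email and card number never reach the API
```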
ThinkForAI Editorial Team
All code verified in production. Updated November 2024.
