OpenAI API Automation: Complete Production Reference

The OpenAI API powers most business AI automation. This complete reference covers authentication, Chat Completions parameters, Embeddings, Vision, Whisper, comprehensive error handling, token cost tracking, and the Batch API for 50% cost reduction.

By ThinkForAI Editorial Team · November 2024 · ~20 min read
The OpenAI API is the engine behind most business AI automation — what Make.com calls when you use its OpenAI modules, what n8n uses in its AI nodes, and what every Python-based AI workflow connects to directly. This complete reference covers everything you need to use it reliably in production.

Authentication and setup: do this before writing any code

Go to platform.openai.com. Create an account (separate billing from ChatGPT, though same email works). Navigate to API Keys. Click "Create new secret key." Name it descriptively ("Make.com Production"). Copy it immediately — it is shown only once. Store in a password manager or environment variable; never in source code.

Set a spending limit first: Go to Settings → Limits. Set a Monthly budget ($10–$25 to start). When the cap is hit, API calls fail cleanly with a 429 error rather than generating unbounded charges. Configure email notifications at 50% and 90% of your budget. This step takes 2 minutes and prevents every unpleasant surprise.

Basic API call in Python
import openai, os, json
client = openai.OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Your system prompt."},
        {"role": "user", "content": "The input to process."}
    ],
    temperature=0.2,
    max_tokens=300,
    response_format={"type": "json_object"}
)
result = json.loads(response.choices[0].message.content)

Chat Completions API: parameter reference for automation

Key parameters with automation-specific guidance

| Parameter | Recommended values | What it does |
|---|---|---|
| model | gpt-4o-mini (volume), gpt-4o (quality) | Selects the model; the roughly 17x price gap matters at volume |
| temperature | 0.0–0.2 for classification; 0.4–0.7 for generation | Controls output randomness; lower = more consistent |
| max_tokens | 150–300 for classification; 500–1500 for generation | Caps output length, preventing unexpectedly long responses |
| response_format | {"type":"json_object"} for structured tasks | Forces valid JSON; eliminates markdown fences in output |
| seed | Any fixed integer during testing | Reproducible outputs for debugging (not guaranteed in production) |
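
Even with JSON mode enforced, a defensive parse step costs nothing and protects workflows that occasionally hit older models or misconfigured calls. A minimal sketch; parse_json_reply is an illustrative helper, not part of the OpenAI SDK:

```python
import json

def parse_json_reply(text: str) -> dict:
    """Parse a model reply as JSON, tolerating stray markdown fences.

    With response_format={"type": "json_object"} fences should not appear,
    but this fallback keeps a misconfigured call from crashing the workflow.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (with optional "json" tag) and the closing fence
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```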

The model selection decision: Use gpt-4o-mini for all classification, extraction, routing, and summarisation tasks where you process more than a few hundred items per month. Use gpt-4o for complex reasoning, nuanced content generation, long documents, and agentic workflows. Test both models on 20 real examples of your specific task before committing; for simple tasks the quality difference is often negligible despite a roughly 17x price gap (see the pricing figures in the cost-tracking section).

Other API endpoints for automation

Embeddings: the foundation of semantic search and RAG

The Embeddings API converts text to vectors that capture semantic meaning. Two texts with similar meaning have similar vectors — enabling semantic search, similarity-based classification, and as the retrieval layer in RAG pipelines. text-embedding-3-small at $0.02/million tokens handles most use cases.

r = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Text to convert to vector"]
)
vector = r.data[0].embedding  # List of 1536 floats
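
Similarity between two embedding vectors is conventionally measured with cosine similarity. A minimal pure-Python sketch, no numpy assumed:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors.

    1.0 means identical direction (same meaning), values near 0 mean
    unrelated. OpenAI embeddings are unit-normalised, so for them the
    dot product alone gives the same ranking.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

In a semantic-search loop, you embed the query once, compute this score against each stored vector, and sort descending.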

Whisper: audio transcription at $0.006/minute

Transcribes audio to text with high accuracy. Used in meeting summarisation pipelines, voicemail processing, and voice-to-task workflows. Accepts MP3, M4A, WAV, and other common formats.

with open("meeting.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-1", file=f, language="en"
    )
transcript = result.text  # Cost: $0.006 per minute of audio
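
For budgeting a transcription pipeline, cost scales linearly with audio length. A trivial estimator using the rate quoted above (verify the current rate and billing granularity at platform.openai.com/pricing):

```python
WHISPER_RATE_PER_MINUTE = 0.006  # USD, rate quoted above

def transcription_cost(duration_seconds: float) -> float:
    """Estimated Whisper cost for an audio file of the given length."""
    return duration_seconds / 60 * WHISPER_RATE_PER_MINUTE
```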

Vision: document extraction from images

GPT-4o and GPT-4o-mini accept image inputs alongside text. Used for invoice extraction from scanned documents, screenshot analysis, and any document that cannot be easily converted to plain text.

import base64
with open("invoice.jpg","rb") as f:
    img = base64.b64encode(f.read()).decode()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role":"user","content":[
        {"type":"text","text":"Extract invoice data as JSON."},
        {"type":"image_url","image_url":{"url":f"data:image/jpeg;base64,{img}"}}
    ]}],
    response_format={"type":"json_object"}
)

Error handling: production automation error taxonomy

Complete error reference with recommended actions

| HTTP | Error | Cause | Action |
|---|---|---|---|
| 400 | Bad Request | Invalid params, bad model name, wrong format | Log and halt; this is a code bug, not transient |
| 401 | Auth Error | Invalid or revoked API key | Alert immediately; automation cannot proceed |
| 429 (rate) | Rate Limited | Too many requests per minute | Retry with exponential backoff: 1s, 2s, 4s, 8s |
| 429 (quota) | Budget Exceeded | Monthly spending cap reached | Halt and alert; don't retry until next billing cycle |
| 500 | Server Error | Transient OpenAI infrastructure issue | Retry 3 times with backoff; then skip item and log |
| 503 | Unavailable | OpenAI degradation/maintenance | Retry with 60s delay; check status.openai.com |

import time
from openai import RateLimitError, APIStatusError, APIConnectionError

def api_with_retry(messages, model="gpt-4o-mini", max_retries=3):
    for attempt in range(max_retries):
        try:
            r = client.chat.completions.create(
                model=model, messages=messages, temperature=0.2,
                response_format={"type":"json_object"}
            )
            return r.choices[0].message.content
        except RateLimitError as e:
            if "insufficient_quota" in str(e): raise  # Budget gone — don't retry
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            if e.status_code == 400: raise  # Code bug — don't retry
            time.sleep(2 ** attempt * 2)
        except APIConnectionError:
            time.sleep(2 ** attempt * 5)
    raise Exception(f"API failed after {max_retries} attempts")

Cost tracking and the Batch API

Logging token costs per call

response = client.chat.completions.create(...)
usage = response.usage

# Prices Nov 2024 — verify current at platform.openai.com/pricing
PRICES = {
    "gpt-4o-mini-2024-07-18": {"in": 0.15/1e6, "out": 0.60/1e6},
    "gpt-4o-2024-08-06":      {"in": 2.50/1e6, "out": 10.0/1e6},
}
p = PRICES.get(response.model, PRICES["gpt-4o-mini-2024-07-18"])
cost = usage.prompt_tokens * p["in"] + usage.completion_tokens * p["out"]
print(f"  Tokens: {usage.total_tokens} | Cost: ${cost:.5f}")
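
The per-call computation above generalises to a running total across a whole workflow run. A sketch; CostTracker is an illustrative helper, not an SDK class, and it takes the same PRICES dict shown above:

```python
class CostTracker:
    """Accumulate API spend across calls in one workflow run."""

    def __init__(self, prices: dict, fallback: str = "gpt-4o-mini-2024-07-18"):
        self.prices = prices
        self.fallback = fallback
        self.total_usd = 0.0
        self.calls = 0

    def record(self, model: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Log one call's usage; returns that call's cost in USD."""
        p = self.prices.get(model, self.prices[self.fallback])
        cost = prompt_tokens * p["in"] + completion_tokens * p["out"]
        self.total_usd += cost
        self.calls += 1
        return cost
```

After each API response, call `tracker.record(response.model, usage.prompt_tokens, usage.completion_tokens)` and log `tracker.total_usd` at the end of the run.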

The Batch API: 50% discount for non-real-time workloads

Submit a JSONL file of requests and receive results within 24 hours at 50% of the standard API price. Perfect for: nightly lead scoring batches, weekly content analysis, monthly report generation, and any automation task without real-time requirements. No change in output quality — same models, same results, half the cost.

import json

# Build JSONL request file
requests_data = []
for item in items_to_process:
    requests_data.append(json.dumps({
        "custom_id": item["id"],
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": item["content"]}
            ],
            "response_format": {"type": "json_object"}
        }
    }))

with open("batch.jsonl", "w") as f:
    f.write("\n".join(requests_data))

# Submit
with open("batch.jsonl", "rb") as f:
    bf = client.files.create(file=f, purpose="batch")
batch = client.batches.create(
    input_file_id=bf.id, endpoint="/v1/chat/completions", completion_window="24h"
)
print(f"Batch submitted: {batch.id} — results within 24 hours")
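
When the batch completes, you download an output file in JSONL form, one result per line keyed by custom_id. A sketch of turning that file into a dict; it assumes the documented output shape (a "response" object whose "body" is a standard chat completion), so verify against OpenAI's current Batch API docs:

```python
import json

def parse_batch_output(jsonl_text: str) -> dict:
    """Map custom_id -> message content from a Batch API output file.

    Lines with a non-null "error" are skipped here; in production you
    would log them and decide whether to resubmit.
    """
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        if row.get("error"):
            continue
        body = row["response"]["body"]
        results[row["custom_id"]] = body["choices"][0]["message"]["content"]
    return results
```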

Frequently asked questions

Is the OpenAI API the same as ChatGPT?

No. ChatGPT is a web application for manual use. The OpenAI API is a programmatic interface that software calls directly — it is what powers Make.com's OpenAI modules, n8n's AI nodes, and all Python-based AI automation. They have separate billing accounts, though you can use the same email. ChatGPT Plus does not include API access; sign up separately at platform.openai.com.

How do I handle very long inputs?

GPT-4o has a 128,000-token context window (approximately 96,000 words) — effectively unlimited for most business automation. If you do encounter long inputs: truncate to the relevant sections before the API call (a preprocessing step extracting only what the AI needs); chunk long documents and process in overlapping segments; or use RAG to retrieve only the relevant chunks. For most business use cases, simple truncation to the first N tokens is the simplest reliable approach.
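
The chunking option can be sketched in a few lines. This splits on characters rather than tokens (roughly 4 characters per token for English text), which is a simplification; for exact token counts you would use the tiktoken library:

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks.

    The overlap repeats the tail of each chunk at the head of the next,
    so sentences cut at a boundary still appear whole in one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk is then processed with its own API call and the per-chunk results are merged (e.g. concatenated summaries, or a final summarise-the-summaries call).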

Is it safe to send customer data to the OpenAI API?

OpenAI's API does not use inputs to train its models by default, per their API data usage policy. For sensitive data (healthcare, financial, legal), review their data processing agreement and your specific regulatory requirements. Options for higher sensitivity: data minimisation (send only the fields needed); OpenAI's zero data retention option; or self-hosted open-source models via Ollama where no data leaves your infrastructure.
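
Data minimisation is mostly a one-line filter before the API call. A sketch; the field names here are hypothetical examples, not a prescribed schema:

```python
# Example allow-list: the only fields this task's prompt actually needs.
ALLOWED_FIELDS = {"subject", "body", "product"}

def minimise(record: dict, allowed: set = ALLOWED_FIELDS) -> dict:
    """Strip a record down to the fields the model needs.

    Everything else (emails, account IDs, payment details) never
    leaves your infrastructure.
    """
    return {k: v for k, v in record.items() if k in allowed}
```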

Keep building your AI expertise

The complete guide covers every tool, architecture, and strategy.

Complete AI Automation Guide →

ThinkForAI Editorial Team

All code verified in production. Updated November 2024.