How to Connect ChatGPT to Your App

Wire OpenAI into a real product — keys on the server, the Chat Completions call, error handling, and the rate-limit traps that break demos.

Written by

BSH Technologies

Published on2026-04-18

How do you connect ChatGPT to your app?

You connect ChatGPT to an app by calling OpenAI's Chat Completions API from your own backend with a server-side API key, never from the browser. Your frontend sends the user's message to your server, your server adds the key and forwards it to OpenAI, and the model's reply comes back through the same path. That one rule — the key lives on the server — is what separates a safe integration from a leaked credential and a surprise bill.

The model behind the ChatGPT product is reached through the API as a model identifier such as gpt-4o or gpt-4o-mini. You are not embedding the chat website; you are talking to the same underlying model over HTTPS and shaping its behaviour yourself.

Get a key and make the first call

Create an account at platform.openai.com, generate a secret key, and store it as an environment variable like OPENAI_API_KEY. The official openai SDK for Node.js or Python reads that variable automatically.

Install the SDK and load the key from the environment, not from a hardcoded string in your source.
Send a request to the Chat Completions endpoint with a messages array and a model name.
Read the assistant's reply from the first choice in the response and return it to your frontend.

The messages array is the heart of the call. It is an ordered list of turns, each tagged with a role: a system message that sets the assistant's behaviour, user messages for what the person typed, and assistant messages for prior model replies. To hold a conversation, you resend the running history each time — the API itself is stateless and remembers nothing between requests.

The system prompt is where your product's personality lives

A generic integration uses no system prompt and gets generic answers. A good one opens every request with a system message that defines tone, scope, and boundaries: who the assistant is, what it should refuse, and how it should format replies. This is the cheapest, highest-leverage lever you have, and it costs nothing extra per call.

Treat the system prompt as configuration, not an afterthought. Version it, review changes to it, and test it like any other part of your product.

Handle the failures that demos always ignore

The happy path is five lines of code. Production is everything around it. OpenAI returns HTTP 429 when you exceed your rate limit and occasional 500-class errors when the service is busy, and a request can simply time out. Wrap every call in error handling that degrades gracefully.

Retry transient failures with exponential backoff and a sensible ceiling, so a brief blip does not surface as an error to the user.
Set a request timeout so a slow response cannot hang your own endpoint indefinitely.
Cap the number of tokens you send and request, both to control cost and to avoid hitting the model's context limit.
Show the user a calm fallback message when the model is unavailable, never a raw stack trace.

Cost is the other quiet trap. You pay per token for input and output, so a chat that naively resends a thousand-message history on every turn gets expensive fast. Trim or summarise old turns, and watch your usage dashboard in the first week rather than discovering the bill at the end of the month.

Ship it behind your own endpoint

Expose a single endpoint on your backend — something like a chat route — that takes the user's message, attaches your system prompt and any conversation history, calls OpenAI, and returns the reply. Putting the model behind your own API means you can swap models, add logging, enforce per-user rate limits, and inject safety checks without touching the frontend. It also keeps the key where it belongs: on the server, out of reach of anyone inspecting network traffic.

That indirection pays off the moment you need to change something. Want to try a cheaper model, add content moderation, or cap how many requests a single user can make per minute? All of that lives in your endpoint, invisible to the client. You can also log every prompt and response for debugging and quality review, which becomes invaluable the first time a user reports a strange answer and you need to see exactly what the model was sent.

A short checklist before you call it done

Before you consider the integration finished, walk through the basics that separate a prototype from something you would put in front of customers.

The API key is an environment variable on the server, never in client code or source control.
Every call is wrapped in error handling with retries and a timeout.
A system prompt sets behaviour, scope, and a graceful way to decline out-of-scope requests.
Conversation history is trimmed or summarised so token cost stays predictable.
Usage alerts are set so a runaway loop cannot quietly drain your account.

Tick those off and you have an integration that behaves under load, not just in a happy-path demo.

Prefer it built for you?

Connecting ChatGPT to an app is straightforward in a demo and full of sharp edges in production — key safety, retries, cost control, and conversation state all have to be right. If you would rather skip the trial and error, talk to BSH Technologies about our software engineering services and we will wire OpenAI into your product the way it should be done.

Frequently asked questions

Can I call the OpenAI API directly from the browser?

No. Calling OpenAI from frontend JavaScript exposes your secret API key to anyone who opens the network tab, and they can run up your bill or abuse your account. Always route requests through your own backend, which holds the key as an environment variable and forwards calls to OpenAI server-side.

Is the ChatGPT API the same as ChatGPT?

They share the underlying models but are different products. ChatGPT is the website and app; the API lets you call the same models, such as gpt-4o, from your own code. With the API you control the system prompt, conversation history, and behaviour, rather than embedding the chat interface.

How much does it cost to connect ChatGPT to an app?

OpenAI bills per token for both the text you send and the text the model returns, with smaller models like gpt-4o-mini costing far less than larger ones. There is no fixed monthly fee for API access, so a low-traffic app can cost a few dollars a month while a busy one scales with usage.

How do I keep a conversation going across messages?

The API is stateless, so it does not remember earlier turns. To maintain context, your backend resends the relevant message history with each request. Store the conversation per user and trim or summarise older messages to stay within the context window and control token cost.

From the blog

View all posts

Applied AI

How to Build an AI Agent for Free in 2026

You can build a working AI agent for free in 2026 using n8n, open-source frameworks, and a free LLM tier. Here is the exact stack and the steps.

BSH Technologies · 2026-06-17