How to Use LLM Function Calling (Tools)

Connect an LLM to real data and actions with function calling — defining tools, the two-step loop, and validating arguments before you trust them.

Written by

BSH Technologies

Published on2026-04-15

What is LLM function calling and how do you use it?

Function calling, also called tool use, lets a language model request that your code run a specific function with structured arguments, then continue its answer using the result. You use it by describing your available functions to the model as a list of tools with names, descriptions, and parameter schemas. When the model decides a tool is needed, it returns the function name and arguments instead of a final answer; your code runs the function and sends the result back, and the model produces its reply grounded in real data.

This is how an LLM stops being a closed text box and starts checking live inventory, booking appointments, or querying your database. Both the OpenAI and Claude APIs support it, and the pattern is the same on each.

The call loop has two passes, not one

The crucial mental model is that a single user message can take two round trips to the model.

You send the user's message plus the list of tool definitions.
The model replies either with a final answer or with a request to call one or more tools, including the arguments it chose.
Your code executes the requested function — hitting a database, an API, whatever it maps to — and captures the result.
You send that result back to the model in a follow-up message, and the model returns its final answer using it.

The model never runs your code; it only asks you to. You stay firmly in control of what actually executes, which is exactly where the safety boundary belongs.

Describe your tools so the model picks the right one

The model decides which tool to call purely from the descriptions you give it, so those descriptions are not documentation for humans — they are instructions for the model.

Give each tool a clear, action-oriented name like get_order_status rather than something vague.
Write a one-line description of exactly when to use it, and when not to.
Define each parameter with a type and a description using JSON Schema, marking which are required.
Keep the tool set small and distinct; overlapping tools confuse the model into picking the wrong one.

If the model keeps calling the wrong function, fix the descriptions before you touch anything else. The selection is only as good as what you told it.

Never trust the arguments blindly

The model generates the function arguments, which means they are model output and must be validated like any untrusted input. It can hallucinate an ID that does not exist, omit a required field, or invent a value. Before you execute anything:

Validate every argument against a schema — a library like Zod or Pydantic makes this clean — and reject malformed calls.
Enforce permissions in your own code; never let a tool call bypass the access checks a normal request would face.
Treat destructive actions with extra care, ideally requiring confirmation rather than firing on the model's say-so.

If validation fails, you can send the error back to the model and let it correct itself on the next pass — a robust loop that turns a bad call into a retry rather than a crash.

Where function calling shines

The technique unlocks the integrations people actually want from AI: a support assistant that looks up a real order, a scheduling bot that checks a live calendar and books a slot, an internal tool that answers questions by querying your warehouse. It is also the foundation of agentic systems, where the model chains several tool calls together to complete a multi-step task. Start with one or two well-described, well-validated tools and grow from there; a small reliable tool set beats a sprawling one the model cannot navigate.

Watch out for loops and runaway chains

Once a model can call tools, and especially once it can chain them, you need guardrails so a single request cannot spiral. A confused model can call the same tool over and over, or string together far more steps than the task warrants, and each call costs tokens and time.

Cap the number of tool calls allowed per user request, and stop cleanly when the cap is hit.
Set timeouts on the tools themselves so a slow external API does not freeze the whole interaction.
Log the full chain of calls and arguments, so when something goes wrong you can see exactly what the model did and why.

These limits cost little to add and save you from the most common ways tool-using systems misbehave in production.

Prefer it built for you?

Function calling is powerful but unforgiving — argument validation, permissions, and the call loop all have to be solid before you connect a model to real actions. Talk to BSH Technologies about our software engineering services and we will build tool-using AI that touches your real systems safely.

Frequently asked questions

Does the LLM run my function itself?

No. The model only requests a call by returning the function name and arguments it chose. Your own code decides whether and how to execute it, then sends the result back. This keeps you in control of what actually runs and is where you enforce validation, permissions, and safety checks.

Why does the model keep calling the wrong tool?

The model chooses tools purely from the names and descriptions you provide, so wrong selections almost always trace back to vague or overlapping descriptions. Write a clear, action-oriented name and a one-line description of exactly when to use each tool, and keep the tool set small and distinct.

Do I need to validate the arguments the model generates?

Yes, always. Function arguments are model output and can be incorrect, incomplete, or hallucinated. Validate every argument against a schema using a library like Zod or Pydantic before executing, enforce your normal permission checks, and require confirmation for destructive actions rather than firing on the model output alone.

Do OpenAI and Claude both support function calling?

Yes. Both APIs let you define tools with names, descriptions, and parameter schemas, and both follow the same two-pass loop: the model requests a call, your code runs it, and you return the result for the final answer. The JSON formats differ slightly, but the concept is identical across providers.

From the blog

View all posts

Applied AI

How to Build an AI Agent for Free in 2026

You can build a working AI agent for free in 2026 using n8n, open-source frameworks, and a free LLM tier. Here is the exact stack and the steps.

BSH Technologies · 2026-06-17