Why Exa in a chatbot?
Whether you are building an internal chatbot for your employees, a customer-facing chatbot to field questions, or a personal project, adding Exa search brings several benefits:
- Model agnostic: Works with OpenAI, Anthropic, or any open-source model
- Superior search: Faster, more relevant, and more comprehensive than a model's built-in search tool
- Always current: Real-time information instead of stale training data
- Configurable: Exa's model parameters can be adjusted dynamically for any use case
Get Started
Install dependencies
npm install exa-js openai
Get your Exa API key from the Exa Dashboard.
You'll also need an API key from your model provider (OpenAI, OpenRouter, etc.).
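Both clients read their keys from environment variables, so a missing key tends to surface only later as an authentication error. A minimal sketch of an early check, assuming the variable names EXA_API_KEY and OPENAI_API_KEY used in this guide; missingEnv is a hypothetical helper, not part of either SDK:

```javascript
// Hypothetical helper: returns the names of required variables that are unset or blank.
function missingEnv(names, env = process.env) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

const missing = missingEnv(["EXA_API_KEY", "OPENAI_API_KEY"]);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```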
Initialize clients
import Exa from "exa-js";
import OpenAI from "openai";
const exa = new Exa(process.env.EXA_API_KEY);
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
Define the search tool
Give the model a tool it can call when it needs web information. The tool accepts 1-5 parallel searches:
const searchTool = {
  type: "function",
  function: {
    name: "web_search",
    description: `Search the web via Exa. Write queries as natural language.
RESULT COUNT:
- Default: numResults = 5 (use this for most queries)
- Complex queries needing depth: use multiple focused searches with numResults = 5 each
CATEGORIES - Use sparingly:
- company: ONLY for "what does X company do" or company research
- people: ONLY for non-public figures (finding someone's LinkedIn)
- research_paper: ONLY for academic papers or arxiv
For news, sports, general facts, quotes - DO NOT use a category.`,
    parameters: {
      type: "object",
      properties: {
        searches: {
          type: "array",
          items: {
            type: "object",
            properties: {
              query: { type: "string" },
              numResults: { type: "number", default: 5 },
              category: {
                type: "string",
                enum: ["company", "people", "research_paper"],
              },
            },
            required: ["query"],
          },
          description: "1-5 searches to run in parallel.",
          maxItems: 5,
        },
      },
      required: ["searches"],
    },
  },
};
Create the search function
When the model calls the tool, execute an Exa search:
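async function searchExa(query, category, numResults = 5) {
  const response = await exa.search(query, {
    numResults,
    highlights: true,
    type: "auto",
    // Only send a category when the model asked for one
    ...(category ? { category } : {}),
  });
  return response.results.map(r => ({
    title: r.title,
    url: r.url,
    // Join the highlight snippets and cap the length to keep the prompt small
    text: (r.highlights ?? []).join("\n").substring(0, 2000),
  }));
}
We use highlights: true to get relevant page snippets along with search results, so no separate scraping is needed.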
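Tool arguments are generated by the model, so they may not always respect the schema. One option is to normalize them before calling searchExa. A minimal sketch, assuming the searches shape from the tool definition above; normalizeSearches and its limits (at most 5 searches, at most 10 results each) are hypothetical choices, not part of exa-js:

```javascript
const ALLOWED_CATEGORIES = ["company", "people", "research_paper"];

// Hypothetical helper: cleans up model-generated tool arguments.
function normalizeSearches(args, maxSearches = 5, maxResults = 10) {
  const searches = Array.isArray(args?.searches) ? args.searches : [];
  return searches
    // Skip entries without a usable query string
    .filter((s) => typeof s?.query === "string" && s.query.trim() !== "")
    .slice(0, maxSearches)
    .map((s) => ({
      query: s.query.trim(),
      // Fall back to 5 results and clamp anything else into [1, maxResults]
      numResults: Number.isInteger(s.numResults)
        ? Math.min(Math.max(s.numResults, 1), maxResults)
        : 5,
      // Drop categories the tool schema does not list
      category: ALLOWED_CATEGORIES.includes(s.category) ? s.category : undefined,
    }));
}
```

Each normalized entry can then be passed on as searchExa(s.query, s.category, s.numResults).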
Write the system prompt
The system prompt tells the model when to search and when to answer directly. Customize it based on your chatbot's purpose.
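The chat flow reads the prompt from a systemPrompt variable. The following is an illustrative prompt, not Exa's recommended wording; every rule in it is an assumption to adapt:

```javascript
// Illustrative system prompt; the rules below are assumptions, not official wording.
const systemPrompt = [
  "You are a helpful assistant with access to a web_search tool.",
  // Giving the model today's date helps it judge what counts as recent
  `Today's date is ${new Date().toISOString().slice(0, 10)}.`,
  "Search when the question involves recent events, prices, releases, or facts you are unsure about.",
  "Answer directly for greetings, opinions, math, and stable general knowledge.",
  "When you use search results, base your answer on them and mention the source URLs.",
].join("\n");
```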
Implement the chat flow
The core pattern: call the model with the tool available, execute parallel searches if requested, then stream the final answer:
async function chat(userMessage) {
  const messages = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userMessage },
  ];
  // First call: model decides if it needs to search.
  // Not streamed, so the tool call arrives as one complete message.
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools: [searchTool],
  });
  const assistantMsg = response.choices[0].message;
  // No search needed: return the direct answer
  if (!assistantMsg.tool_calls?.length) {
    return assistantMsg.content;
  }
  // Execute the requested searches in parallel, one tool message per tool call
  messages.push(assistantMsg);
  for (const call of assistantMsg.tool_calls) {
    const args = JSON.parse(call.function.arguments);
    const allResults = await Promise.all(
      args.searches.map(s => searchExa(s.query, s.category, s.numResults))
    );
    messages.push({
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(allResults.flat()),
    });
  }
  // Second call: stream the answer with search context
  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });
  let answer = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content ?? "";
    process.stdout.write(delta); // forward each token as it arrives
    answer += delta;
  }
  return answer;
}
The model can request 1-5 parallel searches for complex queries. Streaming is supported for both the initial response and the final answer; this example streams only the final answer so that the tool call in the first response can be read as one complete message.
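If the first call is streamed as well, tool calls arrive as fragments spread over many chunks and have to be reassembled before JSON.parse can run. A minimal sketch, assuming the chunk shape of the OpenAI Chat Completions streaming API (entries in choices[0].delta.tool_calls carry an index, with id and function.name on the first fragment and function.arguments split across fragments); collectToolCalls is a hypothetical helper, not an SDK function:

```javascript
// Hypothetical helper: merges streamed fragments into complete tool calls.
async function collectToolCalls(stream) {
  const calls = [];
  let content = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta ?? {};
    if (delta.content) content += delta.content;
    for (const part of delta.tool_calls ?? []) {
      // Fragments of the same call share an index
      const call = (calls[part.index] ??= {
        id: "",
        type: "function",
        function: { name: "", arguments: "" },
      });
      if (part.id) call.id = part.id;
      if (part.function?.name) call.function.name += part.function.name;
      if (part.function?.arguments) call.function.arguments += part.function.arguments;
    }
  }
  return { content, tool_calls: calls };
}
```

The returned tool_calls array has the same shape as assistantMsg.tool_calls in the non-streaming flow, so the search step can stay unchanged.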
Showing Citations
Exa returns source metadata alongside every search result. You can use this to show users exactly where information came from.
What Exa returns
Each result from exa.search includes title, url, publishedDate, and author. These are your citations.
// Each Exa result includes citation metadata
const sources = response.results.map(r => ({
  title: r.title, // "OpenAI announces GPT-5"
  url: r.url, // "https://openai.com/blog/gpt-5"
  publishedDate: r.publishedDate, // "2026-02-15"
  author: r.author, // "OpenAI"
}));
How we display them
In this demo, we pass the source metadata to the frontend separately from the LLM response. After the model finishes answering, we render the sources as expandable cards grouped by search query — each showing the title, domain, date, and a link to the original page.
// Send citation metadata to the frontend
const citationData = searchResults.map(({ query, results }) => ({
  query,
  sources: results.map(r => ({
    title: r.title,
    url: r.url,
    date: r.publishedDate,
    author: r.author,
  })),
}));
Instead of showing all sources in a list, you could have the LLM cite inline (e.g. [1], [2]) by instructing it to reference specific URLs from the Exa results. You could also have the model report how many sources it actually used in its answer, giving users a confidence signal without cluttering the UI.
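For inline citations, one approach is to number the sources before they go into the tool message, so the model can refer to them as [1], [2] and the frontend can map those markers back to URLs. A minimal sketch; numberSources and citedSources are hypothetical helpers, and the result shape (title, url, text) is assumed to match what searchExa returns:

```javascript
// Hypothetical helper: assigns a stable citation number to each unique URL.
function numberSources(results) {
  const seen = new Map();
  for (const r of results) {
    if (!seen.has(r.url)) {
      seen.set(r.url, { id: seen.size + 1, title: r.title, url: r.url, text: r.text });
    }
  }
  return [...seen.values()];
}

// Hypothetical helper: finds which numbered sources the answer actually cites.
function citedSources(answer, sources) {
  const ids = new Set([...answer.matchAll(/\[(\d+)\]/g)].map((m) => Number(m[1])));
  return sources.filter((s) => ids.has(s.id));
}
```

Passing the numbered list as the tool message content, together with a system prompt instruction to cite as [n], gives the model the numbering. The length of citedSources(answer, sources) is then a simple count of the sources the answer relied on.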
That's it! The model now decides when to search, executes Exa queries for real-time information, and synthesizes answers with citations.
Get started with Exa for free.