Exa | Chatbot

Why Exa in a chatbot?

Whether you are building an internal chatbot for your employees, a customer-facing chatbot to field questions, or a personal project, adding Exa brings several benefits:

Model agnostic: Works with OpenAI, Anthropic, or any open-source model
Superior search: Faster, more relevant, and more comprehensive than a model's built-in search
Always current: Real-time information instead of stale training data
Configurable: Exa's search parameters can be adjusted dynamically for any use case

Get Started

Install dependencies

npm install exa-js openai

Get your Exa API key from the Exa Dashboard.

You'll also need an API key from your model provider (OpenAI, OpenRouter, etc.).

Initialize clients

import Exa from "exa-js";
import OpenAI from "openai";

const exa = new Exa(process.env.EXA_API_KEY);
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
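
If one of the environment variables is unset, an error that names the missing variable is easier to debug. A minimal sketch, assuming a hypothetical helper named `requireEnv` (it is not part of exa-js or openai):

```javascript
// Hypothetical helper: read a required environment variable or fail early.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (assumes the same variable names as above):
// const exa = new Exa(requireEnv("EXA_API_KEY"));
// const client = new OpenAI({ apiKey: requireEnv("OPENAI_API_KEY") });
```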

Define the search tool

Give the model a tool it can call when it needs web information. The tool accepts 1-5 parallel searches:

const searchTool = {
  type: "function",
  function: {
    name: "web_search",
    description: `Search the web via Exa. Write queries as natural language.

RESULT COUNT:
- Default: numResults = 5 (use this for most queries)
- Complex queries needing depth: use multiple focused searches with numResults = 5 each

CATEGORIES - Use sparingly:
- company: ONLY for "what does X company do" or company research
- people: ONLY for non-public figures (finding someone's LinkedIn)
- research_paper: ONLY for academic papers or arxiv

For news, sports, general facts, quotes - DO NOT use a category.`,
    parameters: {
      type: "object",
      properties: {
        searches: {
          type: "array",
          items: {
            type: "object",
            properties: {
              query: { type: "string" },
              numResults: { type: "number", default: 5 },
              category: {
                type: "string",
                enum: ["company", "people", "research_paper"],
              }
            },
            required: ["query"]
          },
          description: "1-5 searches to run in parallel.",
          maxItems: 5,
        },
      },
      required: ["searches"],
    },
  },
};
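
Models do not always respect schema constraints such as maxItems, so it can help to check the parsed arguments before running any search. A sketch, assuming a hypothetical helper `normalizeSearches` and limits copied from the schema above:

```javascript
const MAX_SEARCHES = 5;        // mirrors maxItems in the schema
const DEFAULT_NUM_RESULTS = 5; // mirrors the numResults default
const ALLOWED_CATEGORIES = ["company", "people", "research_paper"];

// Hypothetical helper: clean up the `searches` array from a tool call.
function normalizeSearches(searches) {
  if (!Array.isArray(searches)) return [];
  return searches
    // Drop entries without a usable query string
    .filter(s => s && typeof s.query === "string" && s.query.trim() !== "")
    .slice(0, MAX_SEARCHES)
    .map(s => ({
      query: s.query.trim(),
      numResults:
        Number.isInteger(s.numResults) && s.numResults > 0
          ? s.numResults
          : DEFAULT_NUM_RESULTS,
      // Drop categories that are not in the schema's enum
      category: ALLOWED_CATEGORIES.includes(s.category) ? s.category : undefined,
    }));
}
```

Calling `normalizeSearches(args.searches)` right after `JSON.parse` keeps malformed tool arguments from reaching the search step.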

Create the search function

When the model calls the tool, execute an Exa search:

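async function searchExa(query, category, numResults = 5) {
  const response = await exa.search(query, {
    numResults,
    highlights: true,
    type: "auto",
    // Only send a category when the model chose one
    ...(category && { category }),
  });
  // Keep only the fields the model needs, capping snippet length
  return response.results.map(r => ({
    title: r.title,
    url: r.url,
    text: r.highlights?.join("\n").substring(0, 2000),
  }));
}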

We use highlights: true to get relevant page snippets along with search results—no separate scraping needed.

Write the system prompt

The system prompt tells the model when to search and when to answer directly. Customize it for your chatbot's purpose.
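
The chat function in the next step refers to a `systemPrompt` variable. The wording below is an illustrative assumption, not the prompt used in the original demo; treat it as a starting point:

```javascript
// Illustrative system prompt (an assumption, not the original demo's prompt).
const systemPrompt = `You are a helpful assistant with access to a web_search tool.

Search when the question involves:
- recent events, prices, releases, or anything that may have changed
- specific facts, figures, or quotes that should be verified

Answer directly, without searching, when the question is:
- casual conversation or a follow-up about an earlier answer
- general knowledge that is stable over time

When you use search results, base the answer on them and mention the source URLs.`;
```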

Implement the chat flow

The core pattern: call the model with the tool available, execute parallel searches if requested, then stream the final answer:

async function chat(userMessage) {
  const messages = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userMessage },
  ];

  // First call: model decides if it needs to search.
  // Not streamed, so the complete message (including tool calls) is available.
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools: [searchTool],
  });

  const assistantMsg = response.choices[0].message;

  // No search needed: return the direct answer
  if (!assistantMsg.tool_calls?.length) {
    return assistantMsg.content;
  }

  // Every tool call needs its own tool message in the follow-up request
  messages.push(assistantMsg);
  for (const toolCall of assistantMsg.tool_calls) {
    // Execute the requested searches in parallel
    const args = JSON.parse(toolCall.function.arguments);
    const allResults = await Promise.all(
      args.searches.map(s => searchExa(s.query, s.category, s.numResults))
    );
    messages.push({
      role: "tool",
      tool_call_id: toolCall.id,
      content: JSON.stringify(allResults.flat()),
    });
  }

  // Second call: stream the answer with search context
  const final = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });

  // Accumulate the streamed chunks into the full answer
  let answer = "";
  for await (const chunk of final) {
    answer += chunk.choices[0]?.delta?.content ?? "";
  }
  return answer;
}

The model can request 1-5 parallel searches for complex queries. The first call is not streamed here so that the complete tool call is available before searching. Streaming the first call is also possible, but the tool-call arguments then arrive in fragments that must be reassembled before parsing.
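
Parallel searches on related queries can return the same page more than once, and `allResults.flat()` passes every copy to the model. One option is to drop duplicates first. A sketch, assuming a hypothetical helper `dedupeByUrl` that keeps the first occurrence of each URL:

```javascript
// Hypothetical helper: keep the first result for each URL.
function dedupeByUrl(results) {
  const seen = new Set();
  return results.filter(r => {
    if (seen.has(r.url)) return false;
    seen.add(r.url);
    return true;
  });
}

// Usage inside chat():
// content: JSON.stringify(dedupeByUrl(allResults.flat())),
```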

Showing Citations

Exa returns source metadata alongside every search result. You can use this to show users exactly where information came from.

What Exa returns

Each result from exa.search includes title, url, publishedDate, and author. These are your citations.

// Each Exa result includes citation metadata
const sources = response.results.map(r => ({
  title: r.title,         // "OpenAI announces GPT-5"
  url: r.url,             // "https://openai.com/blog/gpt-5"
  publishedDate: r.publishedDate, // "2026-02-15"
  author: r.author,       // "OpenAI"
}));
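
For display, a domain is usually easier to scan than a full URL, and a result may be missing publishedDate or author. A sketch, assuming a hypothetical helper `toSourceCard`, the standard `URL` class, and that publishedDate is an ISO 8601 string when present:

```javascript
// Hypothetical helper: turn one Exa result into display-ready fields.
function toSourceCard(result) {
  let domain = "";
  try {
    // URL is a global in Node.js and in browsers
    domain = new URL(result.url).hostname.replace(/^www\./, "");
  } catch {
    // Leave domain empty if the URL cannot be parsed
  }
  return {
    title: result.title || domain || result.url,
    url: result.url,
    domain,
    // Assumes an ISO 8601 string; keeps only the YYYY-MM-DD part
    date: result.publishedDate ? String(result.publishedDate).slice(0, 10) : null,
    author: result.author || null,
  };
}
```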

How we display them

In this demo, we pass the source metadata to the frontend separately from the LLM response. After the model finishes answering, we render the sources as expandable cards grouped by search query — each showing the title, domain, date, and a link to the original page.

// Send citation metadata to the frontend
const citationData = searchResults.map(({ query, results }) => ({
  query,
  sources: results.map(r => ({
    title: r.title,
    url: r.url,
    date: r.publishedDate,
    author: r.author,
  })),
}));
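
The snippet above assumes `searchResults` is an array of `{ query, results }` objects. Since `Promise.all` preserves input order, one way to build it is to pair each requested search with the results at the same index. A sketch with a hypothetical helper `groupResultsByQuery`:

```javascript
// Hypothetical helper: pair each search with its results by index.
// `searches` is args.searches; `allResults` is the array from Promise.all.
function groupResultsByQuery(searches, allResults) {
  return searches.map((s, i) => ({
    query: s.query,
    results: allResults[i] ?? [],
  }));
}

// Usage inside chat():
// const searchResults = groupResultsByQuery(args.searches, allResults);
```

Note that searchExa as written returns only title, url, and text, so publishedDate and author would need to be added to its return value for the date and author fields to be populated.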

Instead of showing all sources in a list, you could have the LLM cite inline (e.g. [1], [2]) by instructing it to reference specific URLs from the Exa results. You could also have the model report how many sources it actually used in its answer, giving users a confidence signal without cluttering the UI.
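
A sketch of the inline-citation approach, assuming two hypothetical helpers: `formatNumberedSources` numbers the results so the model can cite them as [n], and `extractCitedSources` finds which of those markers appear in the answer:

```javascript
// Hypothetical helper: number each source so the model can cite it as [n].
function formatNumberedSources(results) {
  return results
    .map((r, i) => `[${i + 1}] ${r.title} (${r.url})`)
    .join("\n");
}

// Hypothetical helper: return the sources whose [n] marker appears in the answer.
function extractCitedSources(answer, results) {
  const cited = new Set();
  for (const match of answer.matchAll(/\[(\d+)\]/g)) {
    const index = Number(match[1]) - 1;
    // Ignore markers that do not correspond to a source
    if (index >= 0 && index < results.length) cited.add(index);
  }
  return [...cited].sort((a, b) => a - b).map(i => results[i]);
}
```

The numbered list can be included in the tool message or the system prompt along with an instruction to cite by number. Comparing the length of the array from `extractCitedSources` with `results.length` gives the "sources used" count mentioned above.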

That's it! The model now decides when to search, executes Exa queries for real-time information, and synthesizes answers with citations.

Get started with Exa for free.