---
title: "Instrument AI Agents"
description: "Learn how to manually instrument your code to use Sentry's Agents module."
url: https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module/
---

# Instrument AI Agents | Sentry for Rack Middleware

With [Sentry AI Agent Monitoring](https://docs.sentry.io/ai/monitoring/agents/dashboards.md), you can monitor and debug your AI systems with full-stack context. You'll be able to track key insights like token usage, latency, tool usage, and error rates. AI Agent Monitoring data will be fully connected to your other Sentry data like logs, errors, and traces.

As a prerequisite to setting up AI Agent Monitoring with Ruby, you'll need to first [set up tracing](https://docs.sentry.io/platforms/ruby/guides/rack/tracing.md). Once this is done, you can use the custom instrumentation described below to capture AI agent spans.

## [Manual Instrumentation](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#manual-instrumentation)

For your AI agents data to show up in the Sentry [AI Agents Insights](https://sentry.io/orgredirect/organizations/:orgslug/insights/ai/agents/), at least one of the AI spans described below must be created with a well-defined name and data attributes.

Make sure that there's a transaction running when you create the spans. If you're using a web framework like Rails, those transactions will be created for you automatically.

## [Spans](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#spans)

### [AI Request span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#ai-request-span)

This span represents a request to an LLM or service that generates a response based on the input prompt.

AI Request span attributes

* The span `op` MUST be `"gen_ai.{gen_ai.operation.name}"`. (e.g. `"gen_ai.request"`)
* The span `name` SHOULD be `"{gen_ai.operation.name} {gen_ai.request.model}"`. (e.g. `"chat o3-mini"`)
* All [Common Span Attributes](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#common-span-attributes) SHOULD be set (all `required` common attributes MUST be set).

Additional attributes on the span:

| Data Attribute                          | Type   | Requirement Level | Description                                                                          | Example                                                                                                           |
| --------------------------------------- | ------ | ----------------- | ------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------- |
| `gen_ai.request.available_tools`        | string | optional          | List of objects describing the available tools. **\[0]**                             | `"[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]"` |
| `gen_ai.request.frequency_penalty`      | float  | optional          | Model configuration parameter.                                                       | `0.5`                                                                                                             |
| `gen_ai.request.max_tokens`             | int    | optional          | Model configuration parameter.                                                       | `500`                                                                                                             |
| `gen_ai.request.messages`               | string | optional          | List of objects describing the messages (prompts) sent to the LLM **\[0]**, **\[1]** | `"[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]"`                  |
| `gen_ai.request.presence_penalty`       | float  | optional          | Model configuration parameter.                                                       | `0.5`                                                                                                             |
| `gen_ai.request.temperature`            | float  | optional          | Model configuration parameter.                                                       | `0.1`                                                                                                             |
| `gen_ai.request.top_p`                  | float  | optional          | Model configuration parameter.                                                       | `0.7`                                                                                                             |
| `gen_ai.response.tool_calls`            | string | optional          | The tool calls in the model's response. **\[0]**                                     | `"[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]"`                          |
| `gen_ai.response.text`                  | string | optional          | The text representation of the model's responses. **\[0]**                           | `"[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]"`                                       |
| `gen_ai.usage.input_tokens.cache_write` | int    | optional          | The number of tokens written to the cache when processing the AI input (prompt).     | `100`                                                                                                             |
| `gen_ai.usage.input_tokens.cached`      | int    | optional          | The number of cached tokens used in the AI input (prompt).                           | `50`                                                                                                              |
| `gen_ai.usage.input_tokens`             | int    | optional          | The number of tokens used in the AI input (prompt).                                  | `10`                                                                                                              |
| `gen_ai.usage.output_tokens.reasoning`  | int    | optional          | The number of tokens used for reasoning.                                             | `30`                                                                                                              |
| `gen_ai.usage.output_tokens`            | int    | optional          | The number of tokens used in the AI response.                                        | `100`                                                                                                             |
| `gen_ai.usage.total_tokens`             | int    | optional          | The total number of tokens used to process the prompt (input and output).            | `190`                                                                                                             |

* **\[0]:** Span attributes only allow primitive data types (like `int`, `float`, `boolean`, `string`). This means you need to use a stringified version of a list of dictionaries. Do NOT set `[{"foo": "bar"}]` but rather the string `"[{\"foo\": \"bar\"}]"`.
* **\[1]:** Each message item uses the format `{role:"", content:""}`. The `role` can be `"user"`, `"assistant"`, or `"system"`. The `content` can be either a string or a list of dictionaries.
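For example, a `gen_ai.request.available_tools` value can be built by serializing an array of tool descriptors to a JSON string (the tool names and descriptions here are illustrative):

```ruby
require 'json'

# Span attributes only accept primitives, so the array of tool
# descriptors is serialized to a single JSON string.
available_tools = [
  { name: 'random_number', description: 'Returns a random number up to a given max' },
  { name: 'query_db', description: 'Runs a read-only database query' }
].to_json

# available_tools is now a String like "[{\"name\":\"random_number\",...}]"
# and can be passed to span.set_data('gen_ai.request.available_tools', ...)
```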

#### [Example AI Request span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#example-ai-request-span)

```ruby
require 'json'

messages = [{ role: 'user', content: 'Tell me a joke' }]

Sentry.with_child_span(op: 'gen_ai.request', description: 'chat o3-mini') do |span|
  span.set_data('gen_ai.request.model', 'o3-mini')
  span.set_data('gen_ai.request.messages', messages.to_json)
  span.set_data('gen_ai.operation.name', 'chat')

  # Call your LLM here
  result = client.chat(model: 'o3-mini', messages: messages)

  span.set_data('gen_ai.response.text', [result.choices[0].message.content].to_json)
  # Set token usage
  span.set_data('gen_ai.usage.input_tokens', result.usage.prompt_tokens)
  span.set_data('gen_ai.usage.output_tokens', result.usage.completion_tokens)
end
```

### [Invoke Agent Span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#invoke-agent-span)

This span represents the execution of an AI agent, capturing the full lifecycle from receiving a task to producing a final response.

Invoke Agent span attributes

Describes AI agent invocation.

* The span `op` MUST be `"gen_ai.invoke_agent"`.
* The span `name` SHOULD be `"invoke_agent {gen_ai.agent.name}"`.
* The `gen_ai.operation.name` attribute MUST be `"invoke_agent"`.
* The `gen_ai.agent.name` attribute SHOULD be set to the agent's name. (e.g. `"Weather Agent"`)
* All [Common Span Attributes](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#common-span-attributes) SHOULD be set (all `required` common attributes MUST be set).

Additional attributes on the span:

| Data Attribute                          | Type   | Requirement Level | Description                                                                          | Example                                                                                                           |
| --------------------------------------- | ------ | ----------------- | ------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------- |
| `gen_ai.request.available_tools`        | string | optional          | List of objects describing the available tools. **\[0]**                             | `"[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]"` |
| `gen_ai.request.frequency_penalty`      | float  | optional          | Model configuration parameter.                                                       | `0.5`                                                                                                             |
| `gen_ai.request.max_tokens`             | int    | optional          | Model configuration parameter.                                                       | `500`                                                                                                             |
| `gen_ai.request.messages`               | string | optional          | List of objects describing the messages (prompts) sent to the LLM **\[0]**, **\[1]** | `"[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]"`                  |
| `gen_ai.request.presence_penalty`       | float  | optional          | Model configuration parameter.                                                       | `0.5`                                                                                                             |
| `gen_ai.request.temperature`            | float  | optional          | Model configuration parameter.                                                       | `0.1`                                                                                                             |
| `gen_ai.request.top_p`                  | float  | optional          | Model configuration parameter.                                                       | `0.7`                                                                                                             |
| `gen_ai.response.tool_calls`            | string | optional          | The tool calls in the model's response. **\[0]**                                     | `"[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]"`                          |
| `gen_ai.response.text`                  | string | optional          | The text representation of the model's responses. **\[0]**                           | `"[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]"`                                       |
| `gen_ai.usage.input_tokens.cache_write` | int    | optional          | The number of tokens written to the cache when processing the AI input (prompt).     | `100`                                                                                                             |
| `gen_ai.usage.input_tokens.cached`      | int    | optional          | The number of cached tokens used in the AI input (prompt).                           | `50`                                                                                                              |
| `gen_ai.usage.input_tokens`             | int    | optional          | The number of tokens used in the AI input (prompt).                                  | `10`                                                                                                              |
| `gen_ai.usage.output_tokens.reasoning`  | int    | optional          | The number of tokens used for reasoning.                                             | `30`                                                                                                              |
| `gen_ai.usage.output_tokens`            | int    | optional          | The number of tokens used in the AI response.                                        | `100`                                                                                                             |
| `gen_ai.usage.total_tokens`             | int    | optional          | The total number of tokens used to process the prompt (input and output).            | `190`                                                                                                             |

* **\[0]:** Span attributes only allow primitive data types (like `int`, `float`, `boolean`, `string`). This means you need to use a stringified version of a list of dictionaries. Do NOT set `[{"foo": "bar"}]` but rather the string `"[{\"foo\": \"bar\"}]"`.
* **\[1]:** Each message item uses the format `{role:"", content:""}`. The `role` can be `"user"`, `"assistant"`, or `"system"`. The `content` can be either a string or a list of dictionaries.

#### [Example of an Invoke Agent Span:](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#example-of-an-invoke-agent-span)

```ruby
Sentry.with_child_span(op: 'gen_ai.invoke_agent', description: 'invoke_agent Weather Agent') do |span|
  span.set_data('gen_ai.request.model', 'o3-mini')
  span.set_data('gen_ai.operation.name', 'invoke_agent')
  span.set_data('gen_ai.agent.name', 'Weather Agent')

  # Run the agent
  result = my_agent.run

  span.set_data('gen_ai.response.text', result.to_s)
  # Set token usage
  span.set_data('gen_ai.usage.input_tokens', result.usage.input_tokens)
  span.set_data('gen_ai.usage.output_tokens', result.usage.output_tokens)
end
```

### [Execute Tool Span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#execute-tool-span)

This span represents the execution of a tool or function that was requested by an AI model, including the input arguments and resulting output.

Execute Tool span attributes

Describes a tool execution.

* The span `op` MUST be `"gen_ai.execute_tool"`.
* The span `name` SHOULD be `"execute_tool {gen_ai.tool.name}"`. (e.g. `"execute_tool query_database"`)
* The `gen_ai.tool.name` attribute SHOULD be set to the name of the tool. (e.g. `"query_database"`)
* All [Common Span Attributes](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#common-span-attributes) SHOULD be set (all `required` common attributes MUST be set).

Additional attributes on the span:

| Data Attribute            | Type   | Requirement Level | Description                                          | Example                                    |
| ------------------------- | ------ | ----------------- | ---------------------------------------------------- | ------------------------------------------ |
| `gen_ai.tool.description` | string | optional          | Description of the tool executed.                    | `"Tool returning a random number"`         |
| `gen_ai.tool.input`       | string | optional          | Input that was given to the executed tool as string. | `"{\"max\":10}"`                           |
| `gen_ai.tool.name`        | string | optional          | Name of the tool executed.                           | `"random_number"`                          |
| `gen_ai.tool.output`      | string | optional          | The output from the tool.                            | `"7"`                                      |
| `gen_ai.tool.type`        | string | optional          | The type of the tool.                                | `"function"`; `"extension"`; `"datastore"` |

#### [Example Execute Tool Span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#example-execute-tool-span)

```ruby
require 'json'

Sentry.with_child_span(op: 'gen_ai.execute_tool', description: 'execute_tool get_weather') do |span|
  span.set_data('gen_ai.tool.name', 'get_weather')
  span.set_data('gen_ai.tool.input', { location: 'Paris' }.to_json)

  # Call the tool
  result = get_weather(location: 'Paris')

  span.set_data('gen_ai.tool.output', result.to_json)
end
```

### [Handoff Span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#handoff-span)

This span marks the transition of control from one agent to another, typically when the current agent determines another agent is better suited to handle the task.

Handoff span attributes

A span that describes the handoff from one agent to another.

* The span `op` MUST be `"gen_ai.handoff"`.
* The span `name` SHOULD be `"handoff from {from_agent} to {to_agent}"`.
* All [Common Span Attributes](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#common-span-attributes) SHOULD be set.

#### [Example of a Handoff Span](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#example-of-a-handoff-span)

```ruby
Sentry.with_child_span(op: 'gen_ai.handoff', description: 'handoff from Weather Agent to Travel Agent') do |span|
  # Handoff span just marks the transition
end

Sentry.with_child_span(op: 'gen_ai.invoke_agent', description: 'invoke_agent Travel Agent') do |span|
  # Run the target agent here
end
```

## [Common Span Attributes](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#common-span-attributes)

Some attributes are common to all AI Agents spans:

| Data Attribute          | Type   | Requirement Level | Description                                          | Example           |
| ----------------------- | ------ | ----------------- | ---------------------------------------------------- | ----------------- |
| `gen_ai.request.model`  | string | required          | The name of the AI model a request is being made to. | `"o3-mini"`       |
| `gen_ai.operation.name` | string | optional          | The name of the operation being performed.           | `"summarize"`     |
| `gen_ai.agent.name`     | string | optional          | The name of the agent this span belongs to.          | `"Weather Agent"` |

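If you create several of these spans, a small helper can keep the common attributes consistent. This sketch assumes only a span object that responds to `set_data`; the helper name is ours, not part of the Sentry SDK:

```ruby
# Apply the common AI Agents attributes to a span.
# `model` is required; `operation` and `agent` are optional.
def set_common_gen_ai_attributes(span, model:, operation: nil, agent: nil)
  span.set_data('gen_ai.request.model', model)              # required
  span.set_data('gen_ai.operation.name', operation) if operation
  span.set_data('gen_ai.agent.name', agent) if agent
end
```

Inside any of the `Sentry.with_child_span` blocks above, you could then call `set_common_gen_ai_attributes(span, model: 'o3-mini', agent: 'Weather Agent')` instead of repeating the `set_data` calls.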
## [Token Usage and Cost Gotchas](https://docs.sentry.io/platforms/ruby/guides/rack/tracing/instrumentation/custom-instrumentation/ai-agents-module.md#token-usage-and-cost-gotchas)

When manually setting token attributes, be aware of how Sentry uses them to [calculate model costs](https://docs.sentry.io/ai/monitoring/agents/costs.md).

**Cached and reasoning tokens are subsets, not separate counts.** `gen_ai.usage.input_tokens` is the **total** input token count that already includes any cached tokens. Similarly, `gen_ai.usage.output_tokens` already includes reasoning tokens. Sentry subtracts the cached/reasoning counts from the totals to compute the "raw" portion, so reporting them incorrectly can produce wrong or negative costs.

For example, say your LLM call uses 100 input tokens total, 90 of which were served from cache. Using a standard rate of $0.01 per token and a cached rate of $0.001 per token:

**Correct** — `input_tokens` is the total (includes cached):

* `gen_ai.usage.input_tokens = 100`
* `gen_ai.usage.input_tokens.cached = 90`
* Sentry calculates: `(100 - 90) × $0.01 + 90 × $0.001` = `$0.10 + $0.09` = **$0.19** ✓

**Wrong** — `input_tokens` set to only the non-cached tokens, making cached larger than total:

* `gen_ai.usage.input_tokens = 10`
* `gen_ai.usage.input_tokens.cached = 90`
* Sentry calculates: `(10 - 90) × $0.01 + 90 × $0.001` = `−$0.80 + $0.09` = **−$0.71**

Because `input_tokens.cached` (90) is larger than `input_tokens` (10), the subtraction goes negative, resulting in a negative total cost.

The same applies to `gen_ai.usage.output_tokens` and `gen_ai.usage.output_tokens.reasoning`.
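The subtraction above can be sketched in Ruby. The rates are the illustrative ones from the example, not real model pricing, and the method is ours, not part of the SDK:

```ruby
INPUT_RATE        = 0.01   # illustrative $ per non-cached input token
CACHED_INPUT_RATE = 0.001  # illustrative $ per cached input token

# cached_tokens is a subset of input_tokens, so it is subtracted
# from the total before pricing the non-cached portion.
def input_cost(input_tokens, cached_tokens)
  (input_tokens - cached_tokens) * INPUT_RATE +
    cached_tokens * CACHED_INPUT_RATE
end

input_cost(100, 90) # ≈ $0.19  (correct: the total includes cached tokens)
input_cost(10, 90)  # ≈ -$0.71 (wrong: cached exceeds the reported total)
```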
