# Tracing CrewAI Pipelines

> Trace CrewAI multi-agent pipelines with Netra. Monitor agent handoffs, track per-agent costs, and debug task delegation in collaborative AI workflows.

This cookbook shows you how to add **complete observability** to CrewAI multi-agent pipelines: tracing agent-to-agent handoffs, measuring individual agent performance, and tracking per-agent costs.

<Card title="Open in Google Colab" icon="google" href="https://colab.research.google.com/github/KeyValueSoftwareSystems/netra-cookbooks/blob/master/Tracing_CrewAI_Pipelines.ipynb">
  Run the complete notebook in your browser
</Card>

<Note>
  All company names (ContentCraft) and scenarios in this cookbook are entirely fictional and used for demonstration purposes only.
</Note>

## What You'll Learn

<CardGroup cols={2}>
  <Card title="Trace Agent Handoffs" icon="arrows-turn-right">
    Capture the message flow between agents as tasks pass through the pipeline
  </Card>

  <Card title="Track Per-Agent Costs" icon="coins">
    Monitor token usage and costs for each agent role to identify cost drivers
  </Card>

  <Card title="Debug Multi-Agent Flows" icon="bug">
    Understand why agents made specific decisions and where quality degrades
  </Card>

  <Card title="Compare Configurations" icon="scale-balanced">
    Run experiments with different model assignments to find the cost/quality sweet spot
  </Card>
</CardGroup>

<Info>
  **Prerequisites:**

  * Python >=3.10, \<3.14
  * OpenAI API key
  * Netra API key ([Get started here](/quick-start/Overview))
  * CrewAI installed
</Info>

***

## Why Trace Multi-Agent Systems?

Multi-agent systems introduce complexity that single-agent workflows don't have:

| Failure Mode        | Symptom              | What Tracing Reveals            |
| ------------------- | -------------------- | ------------------------------- |
| Agent bottleneck    | Pipeline slow        | Which agent takes longest       |
| Handoff failure     | Context lost         | Message content between agents  |
| Cost explosion      | Budget exceeded      | Which agent uses most tokens    |
| Quality degradation | Poor output          | Where quality drops in pipeline |
| Model mismatch      | Inconsistent results | Which model for which role      |

Without per-agent visibility, you can't optimize individual roles or identify where the pipeline breaks down.

***

## CrewAI Architecture

CrewAI organizes multi-agent work into three components:

| Component | Description                                    | Example                                 |
| --------- | ---------------------------------------------- | --------------------------------------- |
| **Agent** | Autonomous unit with role, goal, backstory     | Research Specialist, Content Writer     |
| **Task**  | Work item with description and expected output | "Research the topic", "Write the draft" |
| **Crew**  | Team of agents executing tasks                 | Content creation team                   |

**Processes:**

* **Sequential**: Tasks execute one after another (A → B → C)
* **Hierarchical**: Manager agent delegates to workers
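To make the difference between the two processes concrete, here is a toy model in plain Python. It is only a sketch of the control flow, not CrewAI's implementation: the `run_sequential` and `run_hierarchical` helpers and the string-returning "workers" are hypothetical names invented for this illustration.

```python
from typing import Callable

# A "worker" here is just a function from input text to output text.
Worker = Callable[[str], str]


def run_sequential(workers: list[Worker], initial: str) -> str:
    """Feed each worker the previous worker's output (A -> B -> C)."""
    output = initial
    for worker in workers:
        output = worker(output)
    return output


def run_hierarchical(
    route: Callable[[str], str], workers: dict[str, Worker], request: str
) -> str:
    """Let a manager function pick which worker handles the request."""
    chosen = route(request)
    return workers[chosen](request)


def research(text: str) -> str:
    return f"facts({text})"


def write(text: str) -> str:
    return f"draft({text})"


def edit(text: str) -> str:
    return f"edited({text})"


sequential_result = run_sequential([research, write, edit], "topic")
hierarchical_result = run_hierarchical(
    lambda request: "write" if "draft" in request else "research",
    {"research": research, "write": write},
    "please draft an intro",
)

print(sequential_result)    # edited(draft(facts(topic)))
print(hierarchical_result)  # draft(please draft an intro)
```

In the sequential case every output becomes the next input, which is why a weak early step degrades everything downstream. In the hierarchical case the routing decision itself is something you will want to see in a trace.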

***

## Building an Example Pipeline

### Installation

```bash theme={null}
pip install netra-sdk crewai crewai-tools openai langchain-openai
```

### Environment Setup

```bash theme={null}
export NETRA_API_KEY="your-netra-api-key"
export NETRA_OTLP_ENDPOINT="your-netra-otlp-endpoint"
export OPENAI_API_KEY="your-openai-api-key"
```
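
If you prefer to fail fast when a variable is missing, a small check like the following can run before the pipeline starts. This is a minimal sketch using only the standard library; the `missing_env_vars` helper is a hypothetical name, not part of the Netra SDK.

```python
import os
from collections.abc import Mapping

REQUIRED_ENV_VARS = ("NETRA_API_KEY", "NETRA_OTLP_ENDPOINT", "OPENAI_API_KEY")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```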

### Define the Agents

Create a four-agent content pipeline (Researcher → Writer → Editor → SEO):

```python theme={null}
from crewai import Agent
from langchain_openai import ChatOpenAI

def create_agents(config: dict | None = None):
    """Create the content team agents with configurable models."""
    config = config or {
        "researcher": "gpt-4o",
        "writer": "gpt-4o",
        "editor": "gpt-3.5-turbo",
        "seo": "gpt-3.5-turbo",
    }

    researcher = Agent(
        role="Research Specialist",
        goal="Gather accurate facts, statistics, and expert opinions",
        backstory="Expert researcher with 10 years of experience in content research.",
        llm=ChatOpenAI(model=config["researcher"]),
        verbose=True,
    )

    writer = Agent(
        role="Content Writer",
        goal="Write engaging, well-structured blog articles",
        backstory="Professional copywriter with expertise in compelling content.",
        llm=ChatOpenAI(model=config["writer"]),
        verbose=True,
    )

    editor = Agent(
        role="Quality Editor",
        goal="Polish articles for clarity, grammar, and flow",
        backstory="Senior editor with a keen eye for detail.",
        llm=ChatOpenAI(model=config["editor"]),
        verbose=True,
    )

    seo_specialist = Agent(
        role="SEO Optimizer",
        goal="Optimize content for search engines",
        backstory="SEO expert who balances keywords with readability.",
        llm=ChatOpenAI(model=config["seo"]),
        verbose=True,
    )

    return {
        "researcher": researcher,
        "writer": writer,
        "editor": editor,
        "seo": seo_specialist,
    }
```
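
Note that `create_agents` falls back to the defaults only when no config is passed at all; a partial config such as `{"editor": "gpt-4o"}` would raise a `KeyError` for the missing roles. One way to allow partial overrides is to merge them onto the defaults first. The sketch below assumes a hypothetical `resolve_config` helper and a `DEFAULT_MODELS` constant that mirror the defaults above; neither is part of CrewAI or Netra.

```python
from typing import Optional

# Same defaults as create_agents() above.
DEFAULT_MODELS = {
    "researcher": "gpt-4o",
    "writer": "gpt-4o",
    "editor": "gpt-3.5-turbo",
    "seo": "gpt-3.5-turbo",
}


def resolve_config(overrides: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Merge partial overrides onto the defaults, rejecting unknown roles."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(DEFAULT_MODELS))
    if unknown:
        raise ValueError(f"Unknown agent roles: {', '.join(unknown)}")
    return {**DEFAULT_MODELS, **overrides}


config = resolve_config({"editor": "gpt-4o"})
print(config["editor"], config["seo"])  # gpt-4o gpt-3.5-turbo
```

You could then call `create_agents(resolve_config(overrides))` so that every role always has a model assigned.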

### Define the Tasks

Create tasks that chain together:

```python theme={null}
from crewai import Task

def create_tasks(agents: dict, topic: str):
    """Create the content pipeline tasks."""

    research_task = Task(
        description=f"Research the topic: '{topic}'. Find key facts and statistics.",
        expected_output="Research brief with facts, statistics, and sources",
        agent=agents["researcher"],
    )

    writing_task = Task(
        description="Write an 800-1000 word blog article based on the research.",
        expected_output="Draft blog article in markdown format",
        agent=agents["writer"],
        context=[research_task],
    )

    editing_task = Task(
        description="Edit the article for grammar, flow, and clarity.",
        expected_output="Polished blog article with improved clarity",
        agent=agents["editor"],
        context=[writing_task],
    )

    seo_task = Task(
        description="Optimize the article for SEO with meta description and keywords.",
        expected_output="SEO-optimized article with metadata",
        agent=agents["seo"],
        context=[editing_task],
    )

    return [research_task, writing_task, editing_task, seo_task]
```

### Create the Crew

```python theme={null}
from crewai import Crew, Process

def run_content_crew(topic: str, config: dict | None = None):
    """Execute the content creation pipeline."""
    agents = create_agents(config)
    tasks = create_tasks(agents, topic)

    crew = Crew(
        agents=list(agents.values()),
        tasks=tasks,
        process=Process.sequential,
        verbose=True,
    )

    return crew.kickoff()
```

***

## Adding Netra Observability

### Initialize Netra with Auto-Instrumentation

Netra's auto-instrumentation for CrewAI captures agent execution with minimal code:

```python theme={null}
from netra import Netra
from netra.instrumentation.instruments import InstrumentSet

# Initialize Netra with CrewAI and OpenAI instrumentation
Netra.init(
    app_name="contentcraft",
    environment="development",
    trace_content=True,
    instruments={InstrumentSet.CREWAI, InstrumentSet.OPENAI},
)
```

With auto-instrumentation enabled, Netra automatically captures:

* Agent execution spans with role and backstory
* Task execution with descriptions and outputs
* LLM calls with prompts, completions, and token usage
* Cost calculations per agent

### Using the Workflow Decorator

For more control, wrap your pipeline with the `@workflow` decorator:

```python theme={null}
from netra import Netra
from netra.decorators import workflow

@workflow(name="content-pipeline")
def create_article(topic: str, config_name: str = "default", config: dict | None = None):
    """Run the content creation pipeline with full tracing."""

    # Set custom attributes for filtering and analysis
    Netra.set_custom_attributes(key="topic", value=topic)
    Netra.set_custom_attributes(key="config_name", value=config_name)

    # Run the crew
    result = run_content_crew(topic, config)

    return {
        "topic": topic,
        "config": config_name,
        "output": result.raw,
    }
```

### Adding Custom Span Attributes

Track additional metadata for each pipeline run:

```python theme={null}
from crewai import Crew, Process
from netra import Netra, SpanType
from netra.decorators import workflow

@workflow(name="content-pipeline-detailed")
def create_article_detailed(topic: str, config_name: str, config: dict):
    """Run pipeline with detailed custom tracing."""

    with Netra.start_span("pipeline-setup") as setup_span:
        setup_span.set_attribute("topic", topic)
        setup_span.set_attribute("config_name", config_name)
        setup_span.set_attribute("model.researcher", config["researcher"])
        setup_span.set_attribute("model.writer", config["writer"])
        setup_span.set_attribute("model.editor", config["editor"])
        setup_span.set_attribute("model.seo", config["seo"])

        agents = create_agents(config)
        tasks = create_tasks(agents, topic)

    with Netra.start_span("pipeline-execution", as_type=SpanType.AGENT) as exec_span:
        crew = Crew(agents=list(agents.values()), tasks=tasks, process=Process.sequential)
        result = crew.kickoff()
        exec_span.set_attribute("output_length", len(result.raw))

    return {"topic": topic, "config": config_name, "output": result.raw}
```

***

## Viewing Traces in Netra

After running the pipeline, navigate to **Observability → Traces** in Netra.

### What the Trace Shows

<Frame>
  <img src="https://mintcdn.com/netra/bBpyEHE94z6nKy3p/images/crewai-pipeline-trace.png?fit=max&auto=format&n=bBpyEHE94z6nKy3p&q=85&s=352f9e8e3fc914b53fa830cef197eb91" alt="Netra trace view showing multi-agent pipeline" width="1920" height="1080" data-path="images/crewai-pipeline-trace.png" />
</Frame>

The trace shows:

* **Pipeline span**: Overall execution time
* **Agent spans**: Each agent's task execution
* **LLM calls**: Nested under each agent with prompts and completions
* **Token usage**: Per-agent and total
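
The per-agent and total token figures are roll-ups of the LLM calls nested under each agent span. The sketch below shows that roll-up on a hand-written dictionary. The trace shape, field names, and token counts are assumptions made up for illustration; they are not Netra's export format.

```python
# Hypothetical trace shape for illustration only; not Netra's export format.
trace = {
    "name": "content-pipeline",
    "children": [
        {
            "name": "Research Specialist",
            "children": [
                {"name": "llm-call", "tokens": 700},
                {"name": "llm-call", "tokens": 500},
            ],
        },
        {
            "name": "Content Writer",
            "children": [
                {"name": "llm-call", "tokens": 2000},
            ],
        },
    ],
}


def total_tokens(span: dict) -> int:
    """Sum tokens on this span and on everything nested beneath it."""
    own = span.get("tokens", 0)
    return own + sum(total_tokens(child) for child in span.get("children", []))


per_agent = {agent["name"]: total_tokens(agent) for agent in trace["children"]}

print(per_agent)            # {'Research Specialist': 1200, 'Content Writer': 2000}
print(total_tokens(trace))  # 3200
```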

{/* ### Per-Agent Cost Breakdown

Filter traces to see cost attribution by agent:

<Frame>
<img src="/images/agent-cost-breakdown.png" alt="Per-agent cost breakdown" />
</Frame>

| Agent | Model | Avg Tokens | Avg Cost |
|-------|-------|------------|----------|
| Researcher | gpt-4o | ~1,200 | ~$0.05 |
| Writer | gpt-4o | ~2,000 | ~$0.08 |
| Editor | gpt-3.5-turbo | ~1,500 | ~$0.01 |
| SEO | gpt-3.5-turbo | ~800 | ~$0.005 |

--- */}

## Running Configuration Experiments

Test different model configurations to find the optimal cost/quality balance.

### Define Configurations

```python theme={null}
CONFIGS = {
    "premium": {
        "researcher": "gpt-4o",
        "writer": "gpt-4o",
        "editor": "gpt-4o",
        "seo": "gpt-4o",
    },
    "budget": {
        "researcher": "gpt-4o",
        "writer": "gpt-4o",
        "editor": "gpt-3.5-turbo",
        "seo": "gpt-3.5-turbo",
    },
    "economy": {
        "researcher": "gpt-4o",
        "writer": "gpt-3.5-turbo",
        "editor": "gpt-3.5-turbo",
        "seo": "gpt-3.5-turbo",
    },
}
```

### Run Experiments

```python theme={null}
# Test each configuration
for config_name, config in CONFIGS.items():
    print(f"Running {config_name} configuration...")

    result = create_article(
        topic="The Future of AI in Healthcare",
        config_name=config_name,
        config=config,
    )

    print(f"{config_name}: {len(result['output'])} characters")
```

### Compare in Dashboard

After running all configurations, compare cost, latency, and output quality:

| Config  | Total Cost | Total Latency | Output Quality |
| ------- | ---------- | ------------- | -------------- |
| Premium | \~\$0.19   | \~45s         | Highest        |
| Budget  | \~\$0.145  | \~40s         | Good           |
| Economy | \~\$0.085  | \~35s         | Acceptable     |
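
To put those totals in relative terms, you can compute how much each configuration saves against the premium baseline. This is a small arithmetic sketch using the approximate costs from the table; the `savings_vs` helper is a hypothetical name.

```python
# Approximate totals copied from the comparison table (USD per run).
TOTAL_COST = {"premium": 0.19, "budget": 0.145, "economy": 0.085}


def savings_vs(baseline: str, costs: dict[str, float]) -> dict[str, float]:
    """Percent saved by each config relative to the baseline config."""
    base = costs[baseline]
    return {
        name: round((base - cost) / base * 100, 1) for name, cost in costs.items()
    }


savings = savings_vs("premium", TOTAL_COST)
print(savings)  # {'premium': 0.0, 'budget': 23.7, 'economy': 55.3}
```

With these figures, the budget configuration saves roughly a quarter of the premium cost and the economy configuration saves a little over half, which you can then weigh against the quality column.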

***

## Debugging Multi-Agent Issues

### Common Problems and Solutions

| Problem                         | What to Look For             | Solution                            |
| ------------------------------- | ---------------------------- | ----------------------------------- |
| **Slow pipeline**               | High latency on one agent    | Use faster model or shorter prompts |
| **Context lost between agents** | Missing info in task outputs | Improve task descriptions           |
| **Editor making no changes**    | Low edit delta               | Improve editor prompts              |
| **High total cost**             | One agent dominating         | Downgrade non-critical agents       |
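
The table does not define "edit delta"; one simple way to approximate it is to compare the editor's input and output with Python's `difflib`. The sketch below is an assumption about how you might measure it yourself, with a hypothetical `edit_delta` helper and made-up sample text.

```python
from difflib import SequenceMatcher


def edit_delta(before: str, after: str) -> float:
    """Fraction of the text that changed: 0.0 means the two texts are identical."""
    return 1.0 - SequenceMatcher(None, before, after).ratio()


draft = "AI are changing healthcare in many way."
polished = "AI is changing healthcare in many ways."

unchanged = edit_delta(draft, draft)
changed = edit_delta(draft, polished)

print(unchanged)    # 0.0
print(changed > 0)  # True
```

A delta that stays near zero across many runs suggests the editor agent is passing the draft through untouched.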

### Using Traces to Debug

1. **Find slow agents**: Sort spans by duration
2. **Trace context flow**: Check task outputs passed between agents
3. **Identify cost drivers**: Filter by token usage
4. **Compare successful vs. failed runs**: Look for pattern differences
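
Steps 1 and 3 can also be done offline if you have span data in hand. The following sketch sorts spans by duration and flags agents that consume a large share of tokens. The `AgentSpan` class, the 30% threshold, and all numbers are made up for illustration and do not come from Netra.

```python
from dataclasses import dataclass


@dataclass
class AgentSpan:
    agent: str
    duration_s: float
    tokens: int


# Made-up numbers for illustration only.
spans = [
    AgentSpan("Research Specialist", 9.5, 1200),
    AgentSpan("Content Writer", 18.0, 2000),
    AgentSpan("Quality Editor", 8.0, 1500),
    AgentSpan("SEO Optimizer", 4.5, 800),
]

# Step 1: slowest agents first.
by_duration = sorted(spans, key=lambda span: span.duration_s, reverse=True)

# Step 3: agents that consume more than 30% of all tokens.
total = sum(span.tokens for span in spans)
cost_drivers = [span.agent for span in spans if span.tokens / total > 0.30]

print(by_duration[0].agent)  # Content Writer
print(cost_drivers)          # ['Content Writer']
```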

***

## Summary

You've learned how to add comprehensive observability to CrewAI pipelines:

* **Auto-instrumentation** captures agent execution with minimal code
* **Per-agent tracing** reveals costs, latency, and token usage
* **Custom attributes** enable filtering by topic, config, and more
* **Configuration experiments** find the optimal cost/quality balance

### Key Takeaways

1. Multi-agent systems need per-agent visibility to identify bottlenecks
2. Cost allocation by role reveals which agents benefit from premium models
3. Trace context flow to debug handoff issues
4. Use configuration experiments for data-driven model selection

***

## See Also

<CardGroup cols={2}>
  <Card title="CrewAI Integration" icon="users" href="/Integrations/orchestrators/CrewAI">
    Complete CrewAI instrumentation guide
  </Card>

  <Card title="Agents Documentation" icon="robot" href="/Observability/Agents">
    Deep dive into agent observability features
  </Card>

  <Card title="Usage APIs" icon="chart-simple" href="/usage/usage-utilities">
    Query cost and usage data programmatically
  </Card>
</CardGroup>
