# Multi-Tenant Cost Tracking

> Track AI costs per customer in B2B apps with Netra tenant observability. Monitor per-tenant spending, enforce SLAs, and generate usage reports.

This cookbook shows you how to build **comprehensive multi-tenant observability** for B2B AI platforms: tracking costs per customer, monitoring SLA compliance, and attributing usage across your entire customer base.

All company names (MeetingMind, Apex Legal, Stratex Consulting, TechStart Inc) and scenarios in this cookbook are entirely fictional and used for demonstration purposes only.

## What You'll Learn

* Use Netra's native tenant tracking to attribute all traces to specific customers
* Query usage and cost data per tenant via API or dashboard
* Set up tier-specific alerts that trigger on latency or error rate breaches
* Understand session and user behavior within each tenant

**Prerequisites:**

* Python >=3.10, \<3.14
* OpenAI API key
* Netra API key ([Get your key here](/quick-start/Overview))

***

## The MeetingMind Scenario

**MeetingMind** is a fictional B2B SaaS platform that provides AI-powered meeting summarization.
The platform serves customers with different needs and budgets:

| Customer               | Industry     | Tier         |
| ---------------------- | ------------ | ------------ |
| **Apex Legal**         | Law Firm     | Enterprise   |
| **Stratex Consulting** | Consulting   | Professional |
| **TechStart Inc**      | Tech Startup | Starter      |

Each tier uses a different configuration and has different SLA commitments:

| Tier         | Model       | Latency SLA | Rate Limit   |
| ------------ | ----------- | ----------- | ------------ |
| Enterprise   | GPT-4o-mini | P95 \< 2s   | 60 calls/min |
| Professional | GPT-4o-mini | P95 \< 3s   | 30 calls/min |
| Starter      | GPT-4o-mini | Best effort | 10 calls/min |

***

## Step 1: Install Packages

```bash Python theme={null}
pip install netra-sdk openai
```

```bash TypeScript theme={null}
# uuid is imported by the TypeScript summarizer in Step 5
npm install netra-sdk openai uuid
```

## Step 2: Set Environment Variables

```bash Python theme={null}
export NETRA_API_KEY="your-netra-api-key"
export NETRA_OTLP_ENDPOINT="your-netra-otlp-endpoint"
export OPENAI_API_KEY="your-openai-api-key"
```

```bash TypeScript theme={null}
export NETRA_API_KEY="your-netra-api-key"
export NETRA_OTLP_ENDPOINT="your-netra-otlp-endpoint"
export OPENAI_API_KEY="your-openai-api-key"
```

## Step 3: Initialize Netra for Multi-Tenant Tracking

Initialize Netra at application startup with auto-instrumentation for OpenAI:

```python Python theme={null}
import os

from netra import Netra
from netra.instrumentation.instruments import InstrumentSet

# Initialize Netra for multi-tenant observability
Netra.init(
    app_name="meetingmind",
    headers=f"x-api-key={os.getenv('NETRA_API_KEY')}",
    environment="production",
    trace_content=True,
    instruments={InstrumentSet.OPENAI},
)
```

```typescript TypeScript theme={null}
import { Netra, NetraInstruments } from "netra-sdk";

// Initialize Netra for multi-tenant observability
await Netra.init({
  appName: "meetingmind",
  headers: `x-api-key=${process.env.NETRA_API_KEY}`,
  environment: "production",
  traceContent: true,
  instruments: new Set([NetraInstruments.OPENAI]),
});
```
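
The initialization above reads `NETRA_API_KEY` with `os.getenv`, which returns `None` when the variable is unset, so the header would silently become `x-api-key=None`. A minimal sketch of a fail-fast check, assuming the three variable names from Step 2; `missing_env_vars` and `require_env` are hypothetical helpers for illustration and not part of the Netra SDK:

```python
import os

# Variable names are taken from Step 2; the helpers below are hypothetical.
REQUIRED_ENV_VARS = ("NETRA_API_KEY", "NETRA_OTLP_ENDPOINT", "OPENAI_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def require_env(env=None):
    """Raise one error that lists every missing variable."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` before `Netra.init(...)` would turn a forgotten export into a single startup error that names every missing variable.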
## Step 4: Define Tenant Configuration

Configure tier-specific settings for each customer:

```python Python theme={null}
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TenantConfig:
    """Configuration for a tenant's service tier."""
    tenant_id: str
    tier: str
    model: str
    features: List[str]
    latency_sla_ms: Optional[int]
    max_calls_per_minute: int

# Tenant configurations (every tier uses gpt-4o-mini, matching the tier table
# and the pricing used for cost calculation in Step 5)
TENANT_CONFIGS = {
    "apex-legal": TenantConfig(
        tenant_id="apex-legal",
        tier="enterprise",
        model="gpt-4o-mini",
        features=["summary", "action_items", "decisions", "custom_reports"],
        latency_sla_ms=2000,
        max_calls_per_minute=60
    ),
    "stratex-consulting": TenantConfig(
        tenant_id="stratex-consulting",
        tier="professional",
        model="gpt-4o-mini",
        features=["summary", "action_items"],
        latency_sla_ms=3000,
        max_calls_per_minute=30
    ),
    "techstart-inc": TenantConfig(
        tenant_id="techstart-inc",
        tier="starter",
        model="gpt-4o-mini",
        features=["summary"],
        latency_sla_ms=None,  # Best effort
        max_calls_per_minute=10
    ),
}
```

```typescript TypeScript theme={null}
interface TenantConfig {
  tenantId: string;
  tier: string;
  model: string;
  features: string[];
  latencySlaMs: number | null;
  maxCallsPerMinute: number;
}

// Every tier uses gpt-4o-mini, matching the tier table and the pricing in Step 5
const TENANT_CONFIGS: Record<string, TenantConfig> = {
  "apex-legal": {
    tenantId: "apex-legal",
    tier: "enterprise",
    model: "gpt-4o-mini",
    features: ["summary", "action_items", "decisions", "custom_reports"],
    latencySlaMs: 2000,
    maxCallsPerMinute: 60,
  },
  "stratex-consulting": {
    tenantId: "stratex-consulting",
    tier: "professional",
    model: "gpt-4o-mini",
    features: ["summary", "action_items"],
    latencySlaMs: 3000,
    maxCallsPerMinute: 30,
  },
  "techstart-inc": {
    tenantId: "techstart-inc",
    tier: "starter",
    model: "gpt-4o-mini",
    features: ["summary"],
    latencySlaMs: null, // Best effort
    maxCallsPerMinute: 10,
  },
};
```

## Step 5: Create Multi-Tenant Meeting Summarizer

Build a service that tracks costs per tenant.
This class sets the tenant context, builds the prompt from the tenant's feature tier, calculates cost, and checks SLA compliance, recording the results on a Netra span.

```python Python theme={null}
from openai import OpenAI
import time
import uuid
import os
from typing import Optional
from netra import Netra, UsageModel

class MultiTenantMeetingSummarizer:
    """Meeting summarization service with per-tenant cost tracking."""

    def __init__(self):
        self.openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.tenant_usage = {}  # Track usage per tenant

    def summarize_meeting(self, tenant_id: str, meeting_transcript: str, user_id: Optional[str] = None) -> dict:
        """Summarize a meeting for a specific tenant with cost tracking."""
        # Validate tenant
        if tenant_id not in TENANT_CONFIGS:
            return {"error": f"Unknown tenant: {tenant_id}"}

        config = TENANT_CONFIGS[tenant_id]

        # Set tenant context - this is the key for multi-tenant observability
        Netra.set_tenant_id(tenant_id)
        Netra.set_session_id(str(uuid.uuid4()))
        if user_id:
            Netra.set_user_id(user_id)

        # Build the prompt from the features enabled for this tier
        prompt = "Summarize this meeting transcript into:\n"
        if "summary" in config.features:
            prompt += "- Executive Summary (2-3 paragraphs)\n"
        if "action_items" in config.features:
            prompt += "- Action Items (numbered list)\n"
        if "decisions" in config.features:
            prompt += "- Key Decisions Made\n"
        if "custom_reports" in config.features:
            prompt += "- Recommendations for Follow-up\n"
        prompt += f"\nMeeting Transcript:\n{meeting_transcript}"

        # Start a span for the summarization operation
        with Netra.start_span("meeting-summarization") as span:
            span.set_attribute("tenant_id", tenant_id)
            span.set_attribute("tier", config.tier)
            span.set_attribute("model", config.model)

            start_time = time.time()

            # Call the API (auto-traced)
            response = self.openai_client.chat.completions.create(
                model=config.model,
                messages=[
                    {"role": "system", "content": "You are an expert meeting summarizer."},
                    {"role": "user", "content": prompt}
                ],
                temperature=0.3
            )

            latency_ms = (time.time() - start_time) * 1000
            summary = response.choices[0].message.content

            # Calculate cost (simplified pricing model)
            # GPT-4o-mini pricing (approx): $0.15/1M input, $0.60/1M output
            input_price = 0.15 / 1_000_000
            output_price = 0.60 / 1_000_000
            prompt_tokens = response.usage.prompt_tokens
            completion_tokens = response.usage.completion_tokens
            total_tokens = response.usage.total_tokens
            cost = (prompt_tokens * input_price) + (completion_tokens * output_price)

            # Record detailed usage and cost in the span
            span.set_usage([
                UsageModel(
                    model=config.model,
                    cost_in_usd=cost,
                    usage_type="chat",
                    units_used=total_tokens
                )
            ])

            # Check SLA compliance (tiers without an SLA always count as compliant)
            sla_compliant = True
            if config.latency_sla_ms:
                sla_compliant = latency_ms <= config.latency_sla_ms
                span.set_attribute("sla_met", sla_compliant)
                if not sla_compliant:
                    span.add_event("sla-breach", {
                        "actual_ms": latency_ms,
                        "sla_ms": config.latency_sla_ms
                    })

            span.set_success()

            # Local tracking
            if tenant_id not in self.tenant_usage:
                self.tenant_usage[tenant_id] = {"count": 0, "tokens": 0, "total_cost": 0.0, "total_latency": 0}
            self.tenant_usage[tenant_id]["count"] += 1
            self.tenant_usage[tenant_id]["tokens"] += total_tokens
            self.tenant_usage[tenant_id]["total_cost"] += cost
            self.tenant_usage[tenant_id]["total_latency"] += latency_ms

            return {
                "tenant_id": tenant_id,
                "tier": config.tier,
                "summary": summary,
                "token_usage": {
                    "prompt": prompt_tokens,
                    "completion": completion_tokens,
                    "total": total_tokens
                },
                "latency_ms": latency_ms,
                "sla_compliant": sla_compliant,
                "cost": cost
            }

    def print_usage_summary(self):
        """Print usage summary by tenant."""
        for tenant_id, usage in self.tenant_usage.items():
            print(f"\n{tenant_id}:")
            print(f"  Calls: {usage['count']}")
            print(f"  Total Tokens: {usage['tokens']}")
            print(f"  Total Cost: ${usage['total_cost']:.4f}")
            print(f"  Avg Latency: {usage['total_latency']/usage['count']:.0f}ms")
```

```typescript TypeScript theme={null}
import { Netra } from "netra-sdk";
import OpenAI from "openai";
import { v4 as uuidv4 } from "uuid";

class MultiTenantMeetingSummarizer {
  private openaiClient: OpenAI;
  // Public and read-only so the reporting code in Step 7 can read it
  readonly tenantUsage: Record<
    string,
    { count: number; tokens: number; totalCost: number; totalLatency: number }
  > = {};

  constructor() {
    this.openaiClient = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  }

  async summarizeMeeting(
    tenantId: string,
    meetingTranscript: string,
    userId?: string
  ) {
    // Validate tenant
    if (!TENANT_CONFIGS[tenantId]) {
      return { error: `Unknown tenant: ${tenantId}` };
    }

    const config = TENANT_CONFIGS[tenantId];

    // Set tenant context - this is the key for multi-tenant observability
    Netra.setTenantId(tenantId);
    Netra.setSessionId(uuidv4());
    if (userId) {
      Netra.setUserId(userId);
    }

    // Build the prompt from the features enabled for this tier
    let prompt = "Summarize this meeting transcript into:\n";
    if (config.features.includes("summary")) {
      prompt += "- Executive Summary (2-3 paragraphs)\n";
    }
    if (config.features.includes("action_items")) {
      prompt += "- Action Items (numbered list)\n";
    }
    if (config.features.includes("decisions")) {
      prompt += "- Key Decisions Made\n";
    }
    if (config.features.includes("custom_reports")) {
      prompt += "- Recommendations for Follow-up\n";
    }
    prompt += `\nMeeting Transcript:\n${meetingTranscript}`;

    // Start a span for the summarization operation
    const span = Netra.startSpan("meeting-summarization").start();
    span.setAttribute("tenant_id", tenantId);
    span.setAttribute("tier", config.tier);
    span.setAttribute("model", config.model);

    const startTime = Date.now();

    try {
      // Call the API (auto-traced)
      const response = await this.openaiClient.chat.completions.create({
        model: config.model,
        messages: [
          { role: "system", content: "You are an expert meeting summarizer." },
          { role: "user", content: prompt },
        ],
        temperature: 0.3,
      });

      const latencyMs = Date.now() - startTime;
      const summary = response.choices[0].message.content;

      // Calculate cost (GPT-4o-mini pricing: $0.15/1M input, $0.60/1M output)
      const inputPrice = 0.15 / 1_000_000;
      const outputPrice = 0.60 / 1_000_000;
      const promptTokens = response.usage?.prompt_tokens || 0;
      const completionTokens = response.usage?.completion_tokens || 0;
      const totalTokens = promptTokens + completionTokens;
      const cost = promptTokens * inputPrice + completionTokens * outputPrice;

      // Record detailed usage and cost in the span
      span.setUsage([
        {
          model: config.model,
          costInUsd: cost,
          usageType: "chat",
          unitsUsed: totalTokens,
        },
      ]);

      // Check SLA compliance (tiers without an SLA always count as compliant)
      let slaCompliant = true;
      if (config.latencySlaMs) {
        slaCompliant = latencyMs <= config.latencySlaMs;
        span.setAttribute("sla_met", slaCompliant);
        if (!slaCompliant) {
          span.addEvent("sla-breach", {
            actual_ms: latencyMs,
            sla_ms: config.latencySlaMs,
          });
        }
      }

      span.setSuccess();

      // Local tracking
      if (!this.tenantUsage[tenantId]) {
        this.tenantUsage[tenantId] = {
          count: 0,
          tokens: 0,
          totalCost: 0,
          totalLatency: 0,
        };
      }
      this.tenantUsage[tenantId].count += 1;
      this.tenantUsage[tenantId].tokens += totalTokens;
      this.tenantUsage[tenantId].totalCost += cost;
      this.tenantUsage[tenantId].totalLatency += latencyMs;

      return {
        tenantId,
        tier: config.tier,
        summary,
        tokenUsage: {
          prompt: promptTokens,
          completion: completionTokens,
          total: totalTokens,
        },
        latencyMs,
        slaCompliant,
        cost,
      };
    } finally {
      span.end();
    }
  }

  printUsageSummary() {
    for (const [tenantId, usage] of Object.entries(this.tenantUsage)) {
      console.log(`\n${tenantId}:`);
      console.log(`  Calls: ${usage.count}`);
      console.log(`  Total Tokens: ${usage.tokens}`);
      console.log(`  Total Cost: $${usage.totalCost.toFixed(4)}`);
      console.log(
        `  Avg Latency: ${Math.round(usage.totalLatency / usage.count)}ms`
      );
    }
  }
}
```

The key pattern here is calling `set_tenant_id()` (`setTenantId()` in TypeScript) early in the request lifecycle.
This ensures all subsequent traces, including auto-instrumented OpenAI calls, are automatically attributed to the correct tenant.

## Step 6: Test with Sample Meetings

Simulate meeting summarization requests from different tenants:

```python Python theme={null}
# Initialize summarizer
summarizer = MultiTenantMeetingSummarizer()

# Enterprise tier (Apex Legal) - legal meeting
sample_meeting = """
Attendees: John (Partner), Sarah (Associate), Mike (Paralegal)
Duration: 45 minutes
Topic: Case Strategy for Smith v. Jones

John: Let's discuss our approach for the Smith case. The deposition is in 3 weeks.
Sarah: I've reviewed the discovery documents. The key issue is the contract's ambiguity around the liability clause.
Mike: I've created a timeline. The critical events are on pages 45-67 of the evidence log.
John: Good. Sarah, can you draft a summary of our position by Friday?
Sarah: I'll have it ready. Should I include recommendations for discovery?
John: Yes, especially around vendor communications. Mike, check if we have all related emails.
Mike: I'll pull those by tomorrow.
John: This looks solid. Let's reconvene next week after Sarah finishes the draft.
"""

result1 = summarizer.summarize_meeting(
    tenant_id="apex-legal",
    meeting_transcript=sample_meeting,
    user_id="john.smith@apexlegal.com"
)

print(f"Tier: {result1['tier']}")
print(f"SLA Compliant: {result1['sla_compliant']}")
print(f"Latency: {result1['latency_ms']:.0f}ms")
print(f"Tokens Used: {result1['token_usage']['total']}")

# Professional tier (Stratex Consulting) - strategy meeting
meeting_transcript_2 = """
Team sync for Q2 strategy planning.
Attendees: CEO, CFO, Head of Product

CEO: Let's review our market position and Q2 targets.
CFO: Revenue is up 15% YoY. We're tracking to beat forecast.
Head of Product: New features launched last month show strong adoption.
CEO: Great! What are our risks?
CFO: Supply chain delays could impact timeline.
Head of Product: We need to hire 3 more engineers to meet roadmap.
CEO: Let's make that happen. Budget approved.
"""

result2 = summarizer.summarize_meeting(
    tenant_id="stratex-consulting",
    meeting_transcript=meeting_transcript_2,
    user_id="cfo@stratex.com"
)

# Starter tier (TechStart Inc) - standup
meeting_transcript_3 = """
Daily standup
Attendees: Dev team

Tom: I finished the API integration yesterday.
Lisa: I'm working on the UI components.
Chris: Testing is on track for Thursday release.
Tom: Good. Any blockers?
Lisa: Waiting for design approval on the dashboard.
Chris: Should be done today.
"""

result3 = summarizer.summarize_meeting(
    tenant_id="techstart-inc",
    meeting_transcript=meeting_transcript_3,
    user_id="dev@techstart.io"
)
```

```typescript TypeScript theme={null}
// Initialize summarizer
const summarizer = new MultiTenantMeetingSummarizer();

// Enterprise tier (Apex Legal) - legal meeting
const sampleMeeting = `
Attendees: John (Partner), Sarah (Associate), Mike (Paralegal)
Duration: 45 minutes
Topic: Case Strategy for Smith v. Jones

John: Let's discuss our approach for the Smith case. The deposition is in 3 weeks.
Sarah: I've reviewed the discovery documents. The key issue is the contract's ambiguity around the liability clause.
Mike: I've created a timeline. The critical events are on pages 45-67 of the evidence log.
John: Good. Sarah, can you draft a summary of our position by Friday?
Sarah: I'll have it ready. Should I include recommendations for discovery?
John: Yes, especially around vendor communications. Mike, check if we have all related emails.
Mike: I'll pull those by tomorrow.
John: This looks solid. Let's reconvene next week after Sarah finishes the draft.
`;

const result1 = await summarizer.summarizeMeeting(
  "apex-legal",
  sampleMeeting,
  "john.smith@apexlegal.com"
);

console.log(`Tier: ${result1.tier}`);
console.log(`SLA Compliant: ${result1.slaCompliant}`);
console.log(`Latency: ${Math.round(result1.latencyMs)}ms`);
console.log(`Tokens Used: ${result1.tokenUsage.total}`);

// Professional tier (Stratex Consulting) - strategy meeting
const meetingTranscript2 = `
Team sync for Q2 strategy planning.
Attendees: CEO, CFO, Head of Product

CEO: Let's review our market position and Q2 targets.
CFO: Revenue is up 15% YoY. We're tracking to beat forecast.
Head of Product: New features launched last month show strong adoption.
CEO: Great! What are our risks?
CFO: Supply chain delays could impact timeline.
Head of Product: We need to hire 3 more engineers to meet roadmap.
CEO: Let's make that happen. Budget approved.
`;

const result2 = await summarizer.summarizeMeeting(
  "stratex-consulting",
  meetingTranscript2,
  "cfo@stratex.com"
);

// Starter tier (TechStart Inc) - standup
const meetingTranscript3 = `
Daily standup
Attendees: Dev team

Tom: I finished the API integration yesterday.
Lisa: I'm working on the UI components.
Chris: Testing is on track for Thursday release.
Tom: Good. Any blockers?
Lisa: Waiting for design approval on the dashboard.
Chris: Should be done today.
`;

const result3 = await summarizer.summarizeMeeting(
  "techstart-inc",
  meetingTranscript3,
  "dev@techstart.io"
);
```

## Step 7: Review Usage and Cost Breakdown

Analyze per-tenant usage patterns and costs:

```python Python theme={null}
# Print usage summary
summarizer.print_usage_summary()

# Calculate estimated costs (rough approximation)
# GPT-4o-mini pricing (approximate): $0.15/1M input tokens, $0.60/1M output tokens
input_price_per_token = 0.15 / 1_000_000
output_price_per_token = 0.60 / 1_000_000

for tenant_id, usage in summarizer.tenant_usage.items():
    # Rough split: assume 70% input, 30% output tokens
    input_tokens = int(usage['tokens'] * 0.7)
    output_tokens = int(usage['tokens'] * 0.3)
    cost = (input_tokens * input_price_per_token) + (output_tokens * output_price_per_token)

    print(f"\n{tenant_id}:")
    print(f"  Total Tokens: {usage['tokens']}")
    print(f"  Estimated Cost: ${cost:.4f}")
    print(f"  Cost per Call: ${cost/usage['count']:.4f}")
```

```typescript TypeScript theme={null}
// Print usage summary
summarizer.printUsageSummary();

// Calculate estimated costs (rough approximation)
// GPT-4o-mini pricing (approximate): $0.15/1M input tokens, $0.60/1M output tokens
const inputPricePerToken = 0.15 / 1_000_000;
const outputPricePerToken = 0.60 / 1_000_000;

for (const [tenantId, usage] of Object.entries(summarizer.tenantUsage)) {
  // Rough split: assume 70% input, 30% output tokens
  const inputTokens = Math.floor(usage.tokens * 0.7);
  const outputTokens = Math.floor(usage.tokens * 0.3);
  const cost =
    inputTokens * inputPricePerToken + outputTokens * outputPricePerToken;

  console.log(`\n${tenantId}:`);
  console.log(`  Total Tokens: ${usage.tokens}`);
  console.log(`  Estimated Cost: $${cost.toFixed(4)}`);
  console.log(`  Cost per Call: $${(cost / usage.count).toFixed(4)}`);
}
```

## Step 8: SLA Monitoring

Check which tenants are meeting their SLA commitments:

```python Python theme={null}
sla_results = [
    ("apex-legal", result1['sla_compliant'], result1['latency_ms']),
    ("stratex-consulting", result2['sla_compliant'], result2['latency_ms']),
    ("techstart-inc", result3['sla_compliant'], result3['latency_ms']),
]

for tenant_id, compliant, latency in sla_results:
    config = TENANT_CONFIGS[tenant_id]
    status = "PASS" if compliant else "FAIL"
    sla_text = f"{config.latency_sla_ms}ms" if config.latency_sla_ms else "Best effort"

    print(f"\n{tenant_id} ({config.tier}):")
    print(f"  SLA Target: {sla_text}")
    print(f"  Actual Latency: {latency:.0f}ms")
    print(f"  Status: {status}")
```

```typescript TypeScript theme={null}
const slaResults = [
  { tenantId: "apex-legal", ...result1 },
  { tenantId: "stratex-consulting", ...result2 },
  { tenantId: "techstart-inc", ...result3 },
];

for (const result of slaResults) {
  const config = TENANT_CONFIGS[result.tenantId];
  const status = result.slaCompliant ? "PASS" : "FAIL";
  const slaText = config.latencySlaMs
    ? `${config.latencySlaMs}ms`
    : "Best effort";

  console.log(`\n${result.tenantId} (${config.tier}):`);
  console.log(`  SLA Target: ${slaText}`);
  console.log(`  Actual Latency: ${Math.round(result.latencyMs)}ms`);
  console.log(`  Status: ${status}`);
}
```

### Setting Up Tenant-Specific Alerts

In the Netra dashboard, navigate to **Alert Rules** and create tenant-filtered alerts:

1. Click **Create Alert Rule** and name it "Enterprise Latency SLA Breach":
   * **Scope**: Trace (monitor end-to-end requests)
   * **Metric**: Latency
2. Add a filter for `tenant_id = apex-legal` to monitor only Enterprise tier requests:
   * **Condition**: Greater than 2000ms
   * **Time Window**: 5 minutes (to avoid alerting on single slow requests)
3. Select your Slack channel or email for notifications.
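
The tier table states the latency SLAs as P95 targets, while the Step 8 script compares a single request against the threshold. A minimal local sketch of a P95 check over many requests, assuming you collect `(tenant_id, latency_ms)` pairs yourself; `p95` and `sla_report` are hypothetical helpers, use the nearest-rank percentile method, and do not represent how Netra computes its alert metric:

```python
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def sla_report(records, sla_ms_by_tenant):
    """Group latencies by tenant and flag tenants whose P95 exceeds the SLA.

    records: iterable of (tenant_id, latency_ms) pairs (assumed to be
    collected by your own code, for example from summarize_meeting results).
    sla_ms_by_tenant: tenant_id -> SLA in ms, or None for best effort.
    """
    by_tenant = defaultdict(list)
    for tenant_id, latency_ms in records:
        by_tenant[tenant_id].append(latency_ms)

    report = {}
    for tenant_id, latencies in by_tenant.items():
        target = sla_ms_by_tenant.get(tenant_id)
        observed = p95(latencies)
        report[tenant_id] = {
            "p95_ms": observed,
            "sla_ms": target,
            # Best-effort tenants (no target) are never flagged.
            "breached": target is not None and observed > target,
        }
    return report
```

Passing `{"apex-legal": 2000, "stratex-consulting": 3000, "techstart-inc": None}` as `sla_ms_by_tenant` would mirror the SLA values in `TENANT_CONFIGS`.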