Skip to content

Add unified observability type system and Grafana Cloud exporter#13076

Open
epinzur wants to merge 2 commits intoesp/obs_core_additionsfrom
claude/add-notion-folder-t95gp
Open

Add unified observability type system and Grafana Cloud exporter#13076
epinzur wants to merge 2 commits intoesp/obs_core_additionsfrom
claude/add-notion-folder-t95gp

Conversation

@epinzur
Copy link
Contributor

@epinzur epinzur commented Feb 14, 2026

Description

This PR introduces a comprehensive observability infrastructure for Mastra with a unified type system and a production-ready Grafana Cloud exporter.

Core Changes

Observability Type System (packages/core/src/observability/types/)

  • Added core.ts: Top-level observability infrastructure types including ObservabilityContext (unified interface for tracing, logging, and metrics), ObservabilityEventBus, ObservabilityInstance, and configuration types
  • Added logging.ts: LoggerContext interface for structured logging with trace correlation
  • Added metrics.ts: MetricsContext interface supporting counters, gauges, and histograms
  • Added scores.ts and feedback.ts: Types for post-hoc span/trace annotation with scores and feedback
  • Updated tracing.ts: Introduced SpanData base type, RecordedSpan for persisted spans with annotation methods, and Trace for trace-level operations

Context Factory (packages/core/src/observability/context-factory.ts)

  • Added createObservabilityContext(): Creates observability contexts with optional tracing, logging, and metrics
  • Added resolveObservabilityContext(): Extracts observability context from partial objects, enabling flexible parameter passing
  • Provides no-op implementations for graceful degradation when observability is not configured

Grafana Cloud Exporter (observability/grafana-cloud/)

  • New package @mastra/grafana-cloud exporting traces to Tempo, metrics to Mimir, and logs to Loki via OTLP/HTTP JSON
  • GrafanaCloudExporter: Implements batching, configurable flush intervals, and Basic auth with instanceId:apiKey
  • Signal-specific formatters:
    • formatSpansForTempo(): Converts spans to OTLP trace format
    • formatMetricsForMimir(): Converts metrics to OTLP metrics format
    • formatLogsForLoki(): Converts logs to Loki JSON push API format
  • Comprehensive test coverage for configuration, batching, and signal formatting

Integration Updates

Updated core modules to use the new ObservabilityContext interface:

  • Agent, AgentLegacy: Use resolveObservabilityContext() to extract observability from parameters
  • Workflow, WorkflowEvented: Pass observability context through step execution
  • ProcessorRunner: Derives logger and metrics from current span
  • Various handlers and processors: Updated to accept ObservabilityContext instead of just TracingContext

Backward Compatibility

  • Existing TracingContext type remains unchanged
  • ObservabilityContext provides tracingContext alias for compatibility
  • No-op implementations ensure graceful degradation when observability is not configured
  • All changes are additive; existing code continues to work

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Code refactoring

Checklist

  • Added comprehensive type definitions with detailed JSDoc comments
  • Added unit tests for context factory and Grafana Cloud exporter (429 tests in exporter.test.ts, 327 in context-factory.test.ts, plus formatter tests)
  • Added new observability package with proper TypeScript configuration
  • Updated existing modules to use new observability context system

https://claude.ai/code/session_01V4TiknqfJt5SpaQtxQYpjD

Summary by CodeRabbit

  • New Features

    • Introduced unified observability context system combining tracing, logging, and metrics contexts.
    • Added Grafana Cloud observability exporter for shipping traces, logs, and metrics.
    • Added public loggerVNext and metrics getters to Mastra class for non-traced logging and metrics collection.
    • Created public type definitions for observability contexts: LoggerContext, MetricsContext, ScoreInput, FeedbackInput.
  • Refactor

    • Replaced TracingContext with unified ObservabilityContext across APIs for consistent observability handling.

Implement GrafanaCloudExporter that exports all three observability signals
to Grafana Cloud's managed backends:
- Traces → Grafana Tempo (via OTLP/HTTP JSON at /v1/traces)
- Metrics → Grafana Mimir (via OTLP/HTTP JSON at /otlp/v1/metrics)
- Logs → Grafana Loki (via JSON push API at /loki/api/v1/push)

Key features:
- Extends BaseExporter with onTracingEvent, onLogEvent, onMetricEvent handlers
- Configurable batching (batchSize, flushIntervalMs) with automatic periodic flush
- Basic auth with instanceId:apiKey for all endpoints
- Zone-based default endpoint construction
- Env var fallbacks for all configuration (GRAFANA_CLOUD_*)
- Re-buffering on transient failures with bounded growth cap
- Completion-only pattern for traces (only exports SPAN_ENDED events)
- OTLP-compliant span conversion with GenAI semantic attributes
- Loki log grouping by stream labels (low-cardinality only)
- Smart histogram bucket selection based on metric name

59 unit tests covering formatters and exporter behavior.

https://claude.ai/code/session_01V4TiknqfJt5SpaQtxQYpjD
@vercel
Copy link

vercel bot commented Feb 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
assistant-ui Error Error Feb 14, 2026 0:47am
mastra-docs Ready Ready Preview, Comment Feb 14, 2026 0:47am
mastra-docs-1.x Building Building Preview, Comment Feb 14, 2026 0:47am

Request Review

@changeset-bot
Copy link

changeset-bot bot commented Feb 14, 2026

⚠️ No Changeset found

Latest commit: b9535aa

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 14, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

Introduces a unified ObservabilityContext system combining tracing, logging, and metrics contexts. Adds factory functions (createObservabilityContext, resolveObservabilityContext) with no-op defaults for graceful degradation. Implements a complete Grafana Cloud exporter with batched buffering for traces, logs, and metrics. Replaces scattered TracingContext usage throughout the codebase with the new unified context.

Changes

Cohort / File(s) Summary
Observability Core Types & Factories
packages/core/src/observability/types/core.ts, packages/core/src/observability/types/logging.ts, packages/core/src/observability/types/metrics.ts, packages/core/src/observability/types/scores.ts, packages/core/src/observability/types/feedback.ts, packages/core/src/observability/context-factory.ts, packages/core/src/observability/context-factory.test.ts, packages/core/src/observability/no-op.ts
Establishes comprehensive observability model with new public interfaces (ObservabilityContext, LoggerContext, MetricsContext, ScoreInput, FeedbackInput, ObservabilityEventBus) and factory functions with no-op implementations for graceful degradation; ~900 lines of new infrastructure.
Grafana Cloud Exporter
observability/grafana-cloud/src/exporter.ts, observability/grafana-cloud/src/exporter.test.ts, observability/grafana-cloud/src/formatters/traces.ts, observability/grafana-cloud/src/formatters/traces.test.ts, observability/grafana-cloud/src/formatters/logs.ts, observability/grafana-cloud/src/formatters/logs.test.ts, observability/grafana-cloud/src/formatters/metrics.ts, observability/grafana-cloud/src/formatters/metrics.test.ts, observability/grafana-cloud/src/types.ts, observability/grafana-cloud/src/index.ts
New complete exporter module with batched buffering, OTLP/Loki/Mimir formatting, authenticated HTTP delivery, and comprehensive test coverage (~1,500 lines of production + test code).
Grafana Cloud Build Configuration
observability/grafana-cloud/package.json, observability/grafana-cloud/tsconfig.json, observability/grafana-cloud/tsconfig.build.json, observability/grafana-cloud/tsup.config.ts, observability/grafana-cloud/vitest.config.ts
Package setup, TypeScript build configs, and test harness for new Grafana Cloud exporter module.
Agent & LLM Context Refactoring
packages/core/src/agent/agent-legacy.ts, packages/core/src/agent/agent.ts, packages/core/src/agent/agent.types.ts, packages/core/src/agent/trip-wire.ts, packages/core/src/agent/types.ts, packages/core/src/llm/index.ts, packages/core/src/llm/model/base.types.ts, packages/core/src/llm/model/model.ts, packages/core/src/llm/model/model.loop.ts, packages/core/src/llm/model/model.loop.types.ts
Replaces TracingContext with ObservabilityContext across agent execution paths, LLM model options, and loop handling; updates method signatures and propagates observability context through tool conversion, processors, and model invocations.
Workflow Handler Context Updates
packages/core/src/workflows/default.ts, packages/core/src/workflows/workflow.ts, packages/core/src/workflows/step.ts, packages/core/src/workflows/types.ts, packages/core/src/workflows/evented/workflow.ts, packages/core/src/workflows/evented/step-executor.ts, packages/core/src/workflows/handlers/entry.ts, packages/core/src/workflows/handlers/step.ts, packages/core/src/workflows/handlers/sleep.ts, packages/core/src/workflows/handlers/control-flow.ts, packages/core/src/agent/workflows/prepare-stream/...
Extends all workflow-related public interfaces with Partial<ObservabilityContext> and replaces tracingContext parameter passing with spread observability context; updates control flow, entry, step, and sleep handlers.
Processor & Stream Context Refactoring
packages/core/src/processors/index.ts, packages/core/src/processors/runner.ts, packages/core/src/processors/memory/message-history.ts, packages/core/src/processors/memory/semantic-recall.ts, packages/core/src/processors/processors/language-detector.ts, packages/core/src/processors/processors/moderation.ts, packages/core/src/processors/processors/pii-detector.ts, packages/core/src/processors/processors/prompt-injection-detector.ts, packages/core/src/processors/processors/structured-output.ts, packages/core/src/processors/processors/system-prompt-scrubber.ts, packages/core/src/stream/...
Replaces TracingContext with ObservabilityContext across all processor signatures; updates memory processors, detection processors, and stream output handling to derive and propagate observability context.
Evaluation & Scoring Context Updates
packages/core/src/evals/base.ts, packages/core/src/evals/hooks.ts, packages/core/src/evals/types.ts, packages/core/src/evals/run/index.ts, packages/core/src/evals/scoreTraces/scoreTracesWorkflow.ts
Updates scorer run types to extend Partial<ObservabilityContext> and replaces tracingContext parameters with observability context throughout evaluation workflows and scoring hooks.
Core Framework & Tool Updates
packages/core/src/loop/loop.ts, packages/core/src/loop/types.ts, packages/core/src/loop/network/index.ts, packages/core/src/tools/types.ts, packages/core/src/tools/tool-builder/builder.ts, packages/core/src/utils.ts, packages/core/src/mastra/index.ts
Adds loggerVNext and metrics getters to Mastra; extends LoopOptions, ToolExecutionContext, and ToolOptions with Partial<ObservabilityContext>; replaces TracingContext in LLM loop and network routing.
Observability Index & Context Propagation
packages/core/src/observability/index.ts, packages/core/src/observability/context.ts, packages/core/src/observability/context.test.ts, packages/core/src/observability/types/index.ts, packages/core/src/observability/types/tracing.ts
Re-exports new context factory functions; updates context propagation proxies to use createObservabilityContext; introduces RecordedSpan and RecordedTrace types for post-hoc annotation; consolidates observability type exports.
Test Updates
packages/core/src/agent/__tests__/scorers.test.ts, packages/core/src/evals/run/index.test.ts
Updates assertions to use expect.objectContaining for partial object matching; accommodates new observability context fields in scorer and evaluation test payloads.
Configuration & Root Files
.changeset/thin-knives-accept.md, .gitignore
Changelog documenting observability context unification; adds build artifact ignore patterns.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

High number of affected files (70+) with consistent but pervasive pattern changes replacing TracingContext with ObservabilityContext. While changes are homogeneous in nature (reducing review friction), the sheer scope requires verification of consistency across all refactored locations. The new Grafana Cloud exporter adds significant logic density with proper error handling and batching semantics. Dense integration testing needed across processor, workflow, and evaluation pipelines.

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.79% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main changes: introducing a unified observability type system and a Grafana Cloud exporter, both of which are central to this PR.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into esp/obs_core_additions

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/add-notion-folder-t95gp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/tools/tool-builder/builder.ts (1)

248-258: ⚠️ Potential issue | 🟡 Minor

tracing, loggerVNext, and metrics leak into ...rest (and into logs).

The comment on line 248 explains that tracingContext is excluded because it "may contain sensitive observability credentials." However, the new Partial<ObservabilityContext> fields tracing (which is the same object as tracingContext), loggerVNext, and metrics are not stripped and will end up in rest, which is later passed to logger.debug(start, { ...rest, ... }) at line 472.

Proposed fix
     const {
       logger,
       mastra: _mastra,
       memory: _memory,
       requestContext,
       model,
       tracingContext: _tracingContext,
       tracingPolicy: _tracingPolicy,
+      tracing: _tracing,
+      loggerVNext: _loggerVNext,
+      metrics: _metrics,
       ...rest
     } = options;
🤖 Fix all issues with AI agents
In `@observability/grafana-cloud/src/exporter.ts`:
- Around line 333-350: sendRequest currently calls fetch without a timeout; add
a configurable timeout property to the exporter class (e.g., timeoutMs) and use
an AbortController inside sendRequest to abort the request after that timeout.
Create the AbortController, pass controller.signal to fetch, set a timer to call
controller.abort() after timeoutMs (and clear the timer on completion), and
catch the abort/timeout case to throw a clear timeout-specific Error before
rethrowing other errors; update any constructor or config parsing to accept
timeoutMs so it can be tuned.

In `@observability/grafana-cloud/src/formatters/logs.ts`:
- Around line 38-62: In buildLogLine, metadata entries can overwrite reserved
fields (message, traceId, spanId, data) because the code copies log.metadata
into entry; update buildLogLine to either skip reserved keys or namespace
metadata keys before merging (e.g., prefix keys with metadata. or put under a
metadata field) by checking each metadata key against the reserved set {message,
traceId, spanId, data} and only adding non-reserved keys (or adding them under
entry.metadata) so existing structured fields set earlier are never overwritten.
🧹 Nitpick comments (17)
packages/core/src/processors/processors/system-prompt-scrubber.ts (1)

103-112: Extra properties leak into resolveObservabilityContext.

After destructuring part and abort, rest still contains streamParts and state, which get passed into resolveObservabilityContext. This works correctly at runtime (extra keys are ignored) but reduces clarity about what's actually flowing into the observability resolver.

If you want to be explicit:

♻️ Optional: destructure non-observability fields
-    const { part, abort, ...rest } = args;
-    const observabilityContext = resolveObservabilityContext(rest);
+    const { part, streamParts, state, abort, ...observabilityFields } = args;
+    const observabilityContext = resolveObservabilityContext(observabilityFields);
packages/core/src/processors/processors/moderation.ts (1)

225-226: state leaks into rest, which is passed to resolveObservabilityContext.

state: Record<string, any> is declared in the args type (line 220) but not destructured here, so it ends up in rest alongside the ObservabilityContext partial fields. While resolveObservabilityContext constructs a fresh object (so state doesn't propagate further), destructuring it out keeps the intent clear and the pattern consistent with processInput.

Suggested fix
-      const { part, streamParts, abort, ...rest } = args;
+      const { part, streamParts, state, abort, ...rest } = args;
observability/grafana-cloud/vitest.config.ts (1)

3-9: isolate: false may cause test pollution if exporter tests use fake timers or global mocks.

The exporter tests likely manipulate timers (for flush intervals/batching). With isolate: false, all test files share a single worker context, so leaked timer state or global mocks from one test file could affect another. If you observe flaky tests, consider enabling isolation or ensuring proper cleanup in afterEach/afterAll hooks.

.changeset/thin-knives-accept.md (1)

5-15: Add a code example showing public API usage.

Per the changeset guidelines, new features should include a code example demonstrating the public API. The changeset introduces createObservabilityContext(), resolveObservabilityContext(), and the new Mastra getters but doesn't show how a developer would use them. A brief example (e.g., accessing mastra.loggerVNext or creating a context) would help users understand the change.

Also, verify that a separate changeset exists for the @mastra/grafana-cloud package, since this changeset's frontmatter only covers @mastra/core.

As per coding guidelines: "If the change is a breaking change or is adding a new feature, ensure that a code example is provided. This code example should show the public API usage."

packages/core/src/processors/processors/pii-detector.ts (1)

581-602: streamParts and state leak into rest passed to resolveObservabilityContext.

In processOutputStream, the destructuring const { part, abort, ...rest } = args leaves streamParts and state in rest alongside the ObservabilityContext fields. This is functionally harmless (the resolver ignores unknown keys), but it's less precise than the other methods. Consider explicitly destructuring the unused fields for clarity.

♻️ Suggested cleanup
-    const { part, abort, ...rest } = args;
+    const { part, streamParts: _, state: _state, abort, ...rest } = args;
     const observabilityContext = resolveObservabilityContext(rest);
packages/core/src/processors/runner.ts (2)

780-790: Stale JSDoc: @param args.tracingContext no longer exists in the signature.

Line 783 references args.tracingContext but the parameter is now Partial<ObservabilityContext> spread into args. Update to reflect the new shape.


988-1010: Same stale JSDoc at line 992 referencing @param args.tracingContext.

The runProcessOutputStep JSDoc still documents tracingContext as a named parameter, but the actual signature uses & Partial<ObservabilityContext>.

packages/core/src/observability/context.ts (1)

20-20: Pre-existing: duplicate 'createRun' entry and redundant condition.

WORKFLOW_METHODS_TO_WRAP lists 'createRun' twice (line 20), and line 150 checks prop === 'createRun' || prop === 'createRun' — both sides are identical. Not introduced by this PR, but worth cleaning up when convenient.

♻️ Suggested cleanup
-const WORKFLOW_METHODS_TO_WRAP = ['execute', 'createRun', 'createRun'];
+const WORKFLOW_METHODS_TO_WRAP = ['execute', 'createRun'];
-            if (prop === 'createRun' || prop === 'createRun') {
+            if (prop === 'createRun') {

Also applies to: 148-155

packages/core/src/evals/run/index.ts (1)

267-365: resolveObservabilityContext(item) is called repeatedly for the same item in runScorers.

Lines 282, 312, and 334 each call resolveObservabilityContext(item) for the same item within the same function invocation. Consider resolving once at the top of runScorers and reusing it.

♻️ Suggested refactor
 async function runScorers(
   scorers: MastraScorer<any, any, any, any>[] | WorkflowScorerConfig,
   targetResult: any,
   item: RunEvalsDataItem<any>,
 ): Promise<Record<string, any>> {
   const scorerResults: Record<string, any> = {};
+  const observabilityContext = resolveObservabilityContext(item);
 
   if (Array.isArray(scorers)) {
     for (const scorer of scorers) {
       try {
         const score = await scorer.run({
           input: targetResult.scoringData?.input,
           output: targetResult.scoringData?.output,
           groundTruth: item.groundTruth,
           requestContext: item.requestContext,
-          ...resolveObservabilityContext(item),
+          ...observabilityContext,
         });

Apply the same replacement at the other two call sites (lines 312 and 334).

packages/core/src/processors/processors/structured-output.ts (1)

87-88: retryCount leaks into rest and is passed to resolveObservabilityContext.

After destructuring { part, state, streamParts, abort, ...rest }, the rest object still contains retryCount (from the args type). While resolveObservabilityContext safely ignores unknown fields, it's slightly imprecise. Consider also destructuring retryCount to keep the intent clear.

♻️ Suggested fix
-    const { part, state, streamParts, abort, ...rest } = args;
+    const { part, state, streamParts, abort, retryCount, ...rest } = args;
observability/grafana-cloud/src/formatters/traces.ts (1)

164-171: Metadata values that aren't string | number | boolean will be silently cast.

If span.metadata contains a value like a bigint, symbol, or function, the typeof v === 'object' check won't stringify it, and the as string | number | boolean cast masks the type mismatch. kv would then produce an incorrect OTLP value (e.g., { doubleValue: NaN } for a symbol).

Consider adding a typeof guard for the non-object branch to ensure only primitive-safe values are forwarded:

Suggested tightening
      if (v === null || v === undefined) continue;
-      const val = typeof v === 'object' ? JSON.stringify(v) : v;
-      attrs.push(kv(`mastra.metadata.${k}`, val as string | number | boolean));
+      const val = typeof v === 'object' ? JSON.stringify(v)
+        : typeof v === 'string' || typeof v === 'number' || typeof v === 'boolean'
+          ? v
+          : String(v);
+      attrs.push(kv(`mastra.metadata.${k}`, val));
observability/grafana-cloud/src/formatters/logs.ts (2)

75-96: entity_name as a Loki stream label may cause high-cardinality issues.

Loki strongly penalizes high-cardinality label sets. If entityName varies per request (e.g., dynamically named agents or unique user IDs), this will create excessive streams and degrade query performance. Consider whether entity_name should remain a label or be moved into the log line body (where it's still searchable via LogQL filters).


30-32: dateToNanoString is duplicated across traces.ts and logs.ts.

Both formatters define the same function. Consider extracting it to a shared utility module (e.g., formatters/utils.ts).

observability/grafana-cloud/src/exporter.ts (2)

159-164: Avoid Object.defineProperty to bypass readonly — use a private mutable field instead.

Using Object.defineProperty to mutate a readonly field (line 162) undermines TypeScript's compile-time guarantee and is surprising to future readers. A cleaner approach is a private mutable field with the readonly modifier removed, or a separate _serviceName backing field.

♻️ Suggested approach
- private readonly serviceName: string;
+ private serviceName: string;

  override init(options: InitExporterOptions): void {
    if (options.config?.serviceName && this.serviceName === DEFAULTS.serviceName) {
-     Object.defineProperty(this, 'serviceName', { value: options.config.serviceName });
+     this.serviceName = options.config.serviceName;
    }
  }

78-108: Consider extracting the disabled-state initialization to reduce duplication.

The "missing instanceId" and "missing apiKey" blocks are nearly identical (21 lines each). A small private helper or early-return pattern would halve this code.

♻️ Example: extract helper
+ private initDisabled(reason: string): void {
+   this.instanceId = '';
+   this.apiKey = '';
+   this.authHeader = '';
+   this.tempoEndpoint = '';
+   this.mimirEndpoint = '';
+   this.lokiEndpoint = '';
+   this.serviceName = DEFAULTS.serviceName;
+   this.batchSize = DEFAULTS.batchSize;
+   this.flushIntervalMs = DEFAULTS.flushIntervalMs;
+   this.setDisabled(reason);
+ }

  // Then in constructor:
  if (!instanceId) {
-   this.instanceId = '';
-   this.apiKey = '';
-   ... (9 more lines)
+   this.initDisabled('Missing instanceId. Set GRAFANA_CLOUD_INSTANCE_ID env var or pass instanceId in config.');
    return;
  }
  if (!apiKey) {
+   this.initDisabled('Missing apiKey. Set GRAFANA_CLOUD_API_KEY env var or pass apiKey in config.');
-   this.instanceId = '';
-   ... (9 more lines)
    return;
  }
packages/core/src/observability/types/core.ts (1)

370-426: ObservabilityExporter — consider making exportTracingEvent optional or documenting its relationship with onTracingEvent.

The interface has both onTracingEvent? (optional) and exportTracingEvent (required). The GrafanaCloudExporter implementation shows _exportTracingEvent delegating to onTracingEvent, which suggests exportTracingEvent is a legacy path. This dual-path design is fine for backward compatibility, but a brief JSDoc note on exportTracingEvent explaining its relationship to onTracingEvent (e.g., "Legacy path; prefer implementing onTracingEvent for new exporters") would help implementers.

packages/core/src/agent/agent.ts (1)

2199-2244: Minor: variable shadowing of rest inside listClientTools.

The outer ...rest (line 2207) captures the ObservabilityContext fields from the function params, while the inner const { execute, ...rest } = tool (line 2227) shadows it within the loop body. There's no functional bug since observabilityContext is already resolved on line 2217, but the shadowing can confuse readers.

Consider renaming one of the two for clarity, e.g., toolRest or toolWithoutExecute for the inner variable.

✏️ Optional rename to avoid shadowing
-        const { execute, ...rest } = tool;
+        const { execute, ...toolDef } = tool;
         const options: ToolOptions = {
           ...
         };
-        const convertedToCoreTool = makeCoreTool(rest, options, 'client-tool', autoResumeSuspendedTools);
+        const convertedToCoreTool = makeCoreTool(toolDef, options, 'client-tool', autoResumeSuspendedTools);

Comment on lines 333 to 350
private async sendRequest(url: string, body: unknown): Promise<void> {
const response = await fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: this.authHeader,
'X-Scope-OrgID': this.instanceId,
},
body: JSON.stringify(body),
});

if (!response.ok) {
const responseText = await response.text().catch(() => '(no body)');
throw new Error(
`Grafana Cloud API error: ${response.status} ${response.statusText} - ${responseText}`,
);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a timeout to fetch calls to avoid indefinite hangs.

sendRequest makes HTTP calls without a timeout or AbortSignal. If Grafana Cloud is unreachable or slow, this will block the flush indefinitely — particularly problematic in serverless environments where you have limited execution time, and during shutdown() which awaits flush().

⏱️ Proposed fix: add a configurable timeout
+ private readonly requestTimeoutMs: number;
+
  // In constructor:
+ this.requestTimeoutMs = config.requestTimeoutMs ?? 30_000;

  private async sendRequest(url: string, body: unknown): Promise<void> {
+   const controller = new AbortController();
+   const timeout = setTimeout(() => controller.abort(), this.requestTimeoutMs);
+
    const response = await fetch(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: this.authHeader,
        'X-Scope-OrgID': this.instanceId,
      },
      body: JSON.stringify(body),
+     signal: controller.signal,
-   });
+   }).finally(() => clearTimeout(timeout));

    if (!response.ok) {
      const responseText = await response.text().catch(() => '(no body)');
      throw new Error(
        `Grafana Cloud API error: ${response.status} ${response.statusText} - ${responseText}`,
      );
    }
  }
🤖 Prompt for AI Agents
In `@observability/grafana-cloud/src/exporter.ts` around lines 333 - 350,
sendRequest currently calls fetch without a timeout; add a configurable timeout
property to the exporter class (e.g., timeoutMs) and use an AbortController
inside sendRequest to abort the request after that timeout. Create the
AbortController, pass controller.signal to fetch, set a timer to call
controller.abort() after timeoutMs (and clear the timer on completion), and
catch the abort/timeout case to throw a clear timeout-specific Error before
rethrowing other errors; update any constructor or config parsing to accept
timeoutMs so it can be tuned.

Comment on lines +38 to +62
function buildLogLine(log: ExportedLog): string {
const entry: Record<string, unknown> = {
message: log.message,
};

// Add trace correlation
if (log.traceId) entry.traceId = log.traceId;
if (log.spanId) entry.spanId = log.spanId;

// Add structured data
if (log.data && Object.keys(log.data).length > 0) {
entry.data = log.data;
}

// Add metadata fields (runId, sessionId, userId, etc.)
if (log.metadata && Object.keys(log.metadata).length > 0) {
for (const [k, v] of Object.entries(log.metadata)) {
if (v !== undefined && v !== null) {
entry[k] = v;
}
}
}

return JSON.stringify(entry);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Metadata keys can silently overwrite reserved log-line fields.

buildLogLine spreads log.metadata entries directly into entry (lines 54-58). If metadata contains a key like message, traceId, spanId, or data, it will overwrite the structured fields set earlier, corrupting the log line.

Consider namespacing metadata or skipping reserved keys:

Option: skip reserved keys
+  const RESERVED_KEYS = new Set(['message', 'traceId', 'spanId', 'data']);
+
   // Add metadata fields (runId, sessionId, userId, etc.)
   if (log.metadata && Object.keys(log.metadata).length > 0) {
     for (const [k, v] of Object.entries(log.metadata)) {
-      if (v !== undefined && v !== null) {
+      if (v !== undefined && v !== null && !RESERVED_KEYS.has(k)) {
         entry[k] = v;
       }
     }
   }
🤖 Prompt for AI Agents
In `@observability/grafana-cloud/src/formatters/logs.ts` around lines 38 - 62, In
buildLogLine, metadata entries can overwrite reserved fields (message, traceId,
spanId, data) because the code copies log.metadata into entry; update
buildLogLine to either skip reserved keys or namespace metadata keys before
merging (e.g., prefix keys with metadata. or put under a metadata field) by
checking each metadata key against the reserved set {message, traceId, spanId,
data} and only adding non-reserved keys (or adding them under entry.metadata) so
existing structured fields set earlier are never overwritten.

…lf-hosted

Rename the package from @mastra/grafana-cloud to @mastra/grafana since
the exporter works with both Grafana Cloud and self-hosted Grafana stack
(Tempo, Mimir, Loki) — the wire protocols are identical.

Key changes:
- Rename GrafanaCloudExporter → GrafanaExporter
- Add GrafanaAuth union type (basic, bearer, custom, none)
- Add grafanaCloud() config helper for Cloud setup (zone-based endpoints,
  Basic auth with instanceId:apiKey)
- Add grafana() config helper for self-hosted setup (direct endpoints,
  flexible auth)
- Per-signal endpoint gating: skip signals with no endpoint configured
- X-Scope-OrgID header only sent when tenantId is set
- Export GrafanaCloudConfig, GrafanaSelfHostedConfig, GrafanaAuth types
- Add README.md with usage examples for both deployment modes

74 passing tests (up from 59).

https://claude.ai/code/session_01V4TiknqfJt5SpaQtxQYpjD
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants