Core Trail Labs

Core Trail Kit

Docs

CTK Education Layer: from runtime signals to trusted operational decisions.

Most tools show signals. CTK helps explain what those signals mean, why they conflict, and what decisions are safe. Use this docs system to learn how to read CTK before you install it into production-adjacent workflows.

Evidence Ledger • Contradiction Detection • Confidence Evolution • Truth Adjudication • Deterministic Replay • Safety Gate

Documentation Map

Category onboarding before implementation details

Read CTK in this order: model first, decisions second, interface third, then runtime connectors and rollout details.

1. Start Here

Core idea, category boundary, and first outcomes.

2. How CTK Thinks

Central reasoning lifecycle from signals to replayable operational memory.

3. Mental Model

How CTK reasons from evidence to decisions.

4. Understanding Decisions

Why CTK blocked, allowed, or downgraded a decision.

5. Reading the Desktop

How to read Timeline, Signals, Incidents, and Safety Gate states.

6. Replay & Incident Analysis

Deterministic replay flows and confidence transitions.

7. Runtime Modes & Connectors

Observation scope for local, Docker, cloud, and cluster runtimes.

8. Trust & Safety

Local-first boundaries, access model, and safety policy.

9. Walkthroughs

Operational stories with actionable operator outcomes.

10. Terminology

Semantic foundation for replayable operational decisions.

11. Incident Stories

Case studies with contradiction and outcome.

1. Start Here

Understand CTK in plain operational language

This section answers what CTK is, what it changes after install, and where its boundaries are.

What is Core Trail Kit?

Core Trail Kit is deterministic operational intelligence: it turns conflicting runtime signals into replayable, confidence-scored decisions.

  • One shared operational truth per evidence state
  • Clear reason for every allowed or blocked action
  • Replayable decision lineage for postmortems and handoffs

CTK in 5 minutes

Attach workspace context, detect services, build topology, generate timeline moments, then replay how decisions were adjudicated.

  • You see why a contradiction exists
  • You see confidence evolution over time
  • You see whether Safety Gate should block action

From signals to trusted decisions

Most tools show logs, metrics, or traces separately. CTK connects them into one reasoning chain and emits operator-safe conclusions.

  • Signals are not treated as isolated alerts
  • Conflicts are modeled as contradictions
  • Actions are risk-aware by default

What CTK is NOT

CTK is not a black-box AI incident bot, not an ingest-priced observability clone, and not an autonomous production mutation agent.

  • No hidden action logic
  • No blind auto-remediation
  • Evidence before recommendation

2. How CTK Thinks

Central reasoning lifecycle: from signals to operational memory

Use this as the canonical mental model page for CTK reasoning. It explains how signals become evidence, how contradictions affect confidence, and how decisions become replayable operational memory.

Signals -> Evidence -> Contradictions -> Confidence Evolution -> Truth Adjudication -> Safety Gate -> Replay Chain -> Operational Memory

This chain is where CTK differs from dashboards. Each transition is explicit, deterministic, and auditable by operators.

Open the full model

Read the complete lifecycle, operator examples, and what CTK does vs does not do.

Open “How CTK Thinks”

3. Core Mental Model

How CTK thinks

Each concept entry covers: the definition, why it exists, what traditional tooling misses, how CTK handles it, a real scenario, and what you see in the desktop app.

Operational Truth

Definition: The current state CTK can defend with explicit evidence right now.

Why this exists: Teams need one trusted conclusion, not competing dashboards.

Traditional gap: Signal surfaces disagree but tooling rarely commits to one narrative.

CTK handles it: CTK adjudicates final posture from evidence quality, contradictions, and confidence.

Scenario: Health check is green while queue lag and runtime latency degrade.

Desktop view: Appears as adjudicated state in Timeline, Replay Chain, and Recommendations.

Related: Evidence Ledger • Truth Adjudication • Confidence Evolution

Evidence Ledger

Definition: Structured record of all observed runtime evidence used in decisions.

Why this exists: Operators must audit why CTK concluded something.

Traditional gap: Most incident notes lose source quality and ordering.

CTK handles it: Each decision points to evidence records with freshness and source context.

Scenario: A blocked restart links to stale probe + fresh latency evidence.

Desktop view: Visible as supporting evidence under Timeline moments and replay steps.

Related: Operational Memory • Deterministic Replay
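
A minimal sketch of the ledger idea, assuming a hypothetical record shape: the names EvidenceRecord, EvidenceLedger, explain, and the five-minute freshness limit are illustrative and are not CTK's actual schema or API. It shows how a decision can point at evidence records that carry source and freshness context, using the stale probe plus fresh latency scenario above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EvidenceRecord:
    """One observed signal accepted into the ledger (hypothetical shape)."""
    record_id: str
    source: str            # e.g. "health-probe", "latency-metrics"
    claim: str             # what the source reported
    observed_at: datetime  # when the source last reported it

    def is_stale(self, now: datetime, max_age: timedelta) -> bool:
        # Freshness here is just the age of the observation against a limit.
        return now - self.observed_at > max_age


class EvidenceLedger:
    """Append-only store; decisions reference records by id."""

    def __init__(self) -> None:
        self._records = {}

    def append(self, record: EvidenceRecord) -> None:
        if record.record_id in self._records:
            raise ValueError(f"duplicate record id: {record.record_id}")
        self._records[record.record_id] = record

    def explain(self, record_ids: list[str], now: datetime,
                max_age: timedelta) -> list[str]:
        # Render the evidence behind a decision, flagging stale entries.
        lines = []
        for rid in record_ids:
            rec = self._records[rid]
            state = "stale" if rec.is_stale(now, max_age) else "fresh"
            lines.append(f"{rec.source}: {rec.claim} ({state})")
        return lines


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ledger = EvidenceLedger()
ledger.append(EvidenceRecord("e1", "health-probe", "ready",
                             now - timedelta(minutes=20)))
ledger.append(EvidenceRecord("e2", "latency-metrics", "p99 rising",
                             now - timedelta(seconds=30)))

# Evidence linked to a blocked restart: one stale record, one fresh record.
blocked_restart_evidence = ledger.explain(["e1", "e2"], now,
                                          timedelta(minutes=5))
print(blocked_restart_evidence)
```

Keeping the store append-only is a design assumption of this sketch: it preserves the ordering and source quality that incident notes tend to lose.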

Contradiction Detection

Definition: Modeling of conflicting signals that cannot be trusted equally.

Why this exists: Green probes can hide real runtime degradation.

Traditional gap: Conflicting telemetry is treated as noise instead of risk.

CTK handles it: CTK creates contradiction objects that reduce confidence and influence gates.

Scenario: Runtime throughput drops while readiness remains healthy.

Desktop view: Shown in Analyze surfaces, timeline chains, and consistency findings.

Related: Confidence Evolution • Safety Gate
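
To make "contradiction object" concrete, here is a small sketch under stated assumptions: the Contradiction class, the detect_contradiction function, and the drop_ratio threshold are hypothetical and do not describe CTK's internal detection rules. It models the scenario above, where throughput drops while readiness stays healthy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contradiction:
    """A first-class conflict object instead of a discarded noisy alert."""
    status_claim: str
    behavior_claim: str
    reason: str


def detect_contradiction(readiness_healthy: bool,
                         throughput_now: float,
                         throughput_baseline: float,
                         drop_ratio: float = 0.5) -> Optional[Contradiction]:
    # drop_ratio is an assumed threshold: flag when throughput falls below
    # this fraction of its baseline while readiness still reports healthy.
    if throughput_baseline <= 0:
        return None  # no usable baseline, nothing to compare against
    degraded = throughput_now < throughput_baseline * drop_ratio
    if readiness_healthy and degraded:
        return Contradiction(
            status_claim="readiness healthy",
            behavior_claim=(f"throughput {throughput_now:g} "
                            f"vs baseline {throughput_baseline:g}"),
            reason="status and behavior signals disagree",
        )
    return None


conflict = detect_contradiction(True, throughput_now=120.0,
                                throughput_baseline=400.0)
no_conflict = detect_contradiction(True, throughput_now=390.0,
                                   throughput_baseline=400.0)
print(conflict)
print(no_conflict)
```

Returning an object rather than a boolean lets later steps (confidence, gates) carry the reason forward.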

Confidence Evolution

Definition: Continuous trust score changes based on evidence quality and conflict pressure.

Why this exists: Incident risk changes faster than static severity labels.

Traditional gap: Most tools show severity but not trust drift.

CTK handles it: Confidence rises with corroboration and drops with stale or conflicting evidence.

Scenario: Confidence falls 0.82 -> 0.41 while contradiction persists.

Desktop view: Displayed as confidence chips, trend transitions, and gate inputs.

Related: Evidence Ledger • Safety Gate • Truth Adjudication
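
The direction of confidence movement can be sketched as a single deterministic update step. The weights below are assumptions chosen for illustration only; CTK's real scoring model is not documented here, and this sketch does not try to reproduce the 0.82 -> 0.41 figures from the scenario.

```python
# Assumed, illustrative weights (not CTK's actual values).
GAIN_PER_CORROBORATION = 0.05
PENALTY_PER_STALE = 0.08
PENALTY_PER_CONTRADICTION = 0.15


def update_confidence(confidence: float,
                      corroborating: int = 0,
                      stale: int = 0,
                      open_contradictions: int = 0) -> float:
    """Raise confidence with corroboration, lower it with stale or
    conflicting evidence, and keep the result inside [0, 1]."""
    updated = (confidence
               + GAIN_PER_CORROBORATION * corroborating
               - PENALTY_PER_STALE * stale
               - PENALTY_PER_CONTRADICTION * open_contradictions)
    # Clamp, then round so repeated runs report identical values.
    return round(min(1.0, max(0.0, updated)), 2)


# Three observation windows in which one source is stale and one
# contradiction stays open: confidence keeps falling.
trajectory = [0.82]
for _ in range(3):
    trajectory.append(
        update_confidence(trajectory[-1], stale=1, open_contradictions=1))
print(trajectory)
```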

Truth Adjudication

Definition: Deterministic resolution step that emits final operational posture.

Why this exists: Teams need action-ready conclusions during pressure.

Traditional gap: Operators manually arbitrate competing signals with no traceable logic.

CTK handles it: CTK evaluates evidence + contradiction + confidence and commits one state.

Scenario: Service marked degraded despite healthy endpoint due to runtime conflict.

Desktop view: Appears in Replay chain final nodes and recommendation context.

Related: Operational Truth • Deterministic Replay
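
As a rough illustration of "commits one state", the sketch below uses a fixed rule order so the same inputs always yield the same posture. The function name, the posture labels other than "degraded", and the MIN_CONFIDENCE floor are assumptions, not CTK's adjudication logic.

```python
MIN_CONFIDENCE = 0.5  # assumed floor below which no posture is asserted


def adjudicate(probe_healthy: bool,
               open_contradictions: int,
               confidence: float) -> str:
    """Return exactly one posture; rule order is fixed, so the result is
    fully determined by the inputs."""
    if open_contradictions > 0:
        # A runtime conflict outweighs a green probe.
        return "degraded"
    if confidence < MIN_CONFIDENCE:
        return "unknown"
    return "healthy" if probe_healthy else "unhealthy"


# Healthy endpoint, but an unresolved runtime conflict.
posture = adjudicate(probe_healthy=True, open_contradictions=1,
                     confidence=0.41)
print(posture)
```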

Deterministic Replay

Definition: Re-openable chain: evidence -> contradiction -> adjudication -> decision.

Why this exists: Postmortems must explain exactly why actions happened.

Traditional gap: Chat + screenshots cannot reconstruct causal order.

CTK handles it: CTK stores stable replay chains for every key timeline decision.

Scenario: Blocked action later becomes allowed after confidence recovery.

Desktop view: Timeline card opens reasoning modal with full chain.

Related: Operational Memory • Decision Trust Layer
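
One way to think about a stable replay chain is as the output of a pure function over recorded evidence, with a content digest to confirm that two replays agree. This is a sketch under assumptions: the step names follow the chain in the definition above, but the dictionary layout, the replay and chain_digest helpers, and the one-record-per-kind simplification are hypothetical.

```python
import hashlib
import json


def replay(evidence: list[dict]) -> list[dict]:
    """Pure function: the same evidence always produces the same chain.
    Assumes at most one record per kind ("probe", "behavior")."""
    chain = [{"step": "evidence",
              "items": sorted(e["id"] for e in evidence)}]
    claims = {e["kind"]: e["claim"] for e in evidence}
    conflict = (claims.get("probe") == "healthy"
                and claims.get("behavior") == "degraded")
    chain.append({"step": "contradiction", "present": conflict})
    posture = "degraded" if conflict else claims.get("probe", "unknown")
    chain.append({"step": "adjudication", "posture": posture})
    chain.append({"step": "decision",
                  "restart": "blocked" if conflict else "allowed"})
    return chain


def chain_digest(chain: list[dict]) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the digest depends
    # only on the chain's content.
    payload = json.dumps(chain, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


evidence = [
    {"id": "e2", "kind": "behavior", "claim": "degraded"},
    {"id": "e1", "kind": "probe", "claim": "healthy"},
]
first = replay(evidence)
second = replay(list(reversed(evidence)))  # arrival order must not matter
print(chain_digest(first) == chain_digest(second))
```

Comparing digests rather than screenshots is what makes a postmortem reconstruction checkable.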

Safety Gate

Definition: Policy layer that blocks unsafe actions when trust is weak.

Why this exists: Fast actions under weak evidence can worsen incidents.

Traditional gap: Recommendation lists rarely encode action risk.

CTK handles it: Gate checks confidence, freshness, and contradiction state before allowing action.

Scenario: Unsafe restart blocked until stale topology evidence is refreshed.

Desktop view: Shown as allowed/review/blocked state with reason in replay and recommendations.

Related: Confidence Evolution • Contradiction Detection
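
The gate's three inputs (confidence, freshness, contradiction state) and three outcomes (allowed, review, blocked) can be sketched as a small policy function. The thresholds ALLOW_AT and BLOCK_BELOW and the rule order are assumptions for illustration; they are not CTK's published policy.

```python
from enum import Enum


class GateState(Enum):
    ALLOWED = "allowed"
    REVIEW = "review"
    BLOCKED = "blocked"


ALLOW_AT = 0.70     # assumed minimum confidence to allow an action
BLOCK_BELOW = 0.45  # assumed confidence below which actions are blocked


def safety_gate(confidence: float,
                evidence_fresh: bool,
                open_contradictions: int) -> tuple[GateState, str]:
    """Return the gate state together with a human-readable reason."""
    if open_contradictions > 0 and not evidence_fresh:
        return GateState.BLOCKED, "active contradiction with stale evidence"
    if confidence < BLOCK_BELOW:
        return (GateState.BLOCKED,
                f"confidence {confidence:.2f} below {BLOCK_BELOW:.2f}")
    if confidence >= ALLOW_AT and evidence_fresh and open_contradictions == 0:
        return GateState.ALLOWED, "trust threshold satisfied"
    return GateState.REVIEW, "trust is partial; operator review required"


state, reason = safety_gate(confidence=0.41, evidence_fresh=False,
                            open_contradictions=1)
print(state.value, "-", reason)
```

Every outcome carries a reason string, mirroring the requirement that a blocked or allowed action is always explained.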

Operational Memory

Definition: Shared replayable incident and decision history across shifts.

Why this exists: Without memory, every shift re-discovers root causes.

Traditional gap: Context reset at handoff causes repeated mistakes.

CTK handles it: Timeline moments persist decisions, evidence states, and rationale.

Scenario: Night shift replays day shift contradiction before acting.

Desktop view: Timeline, incidents, and replay panels show memory continuity.

Related: Deterministic Replay • Evidence Ledger

Topology Delta

Definition: Confidence-aware change tracking of service relationships over time.

Why this exists: Runtime drift often starts in dependency changes.

Traditional gap: Many tools show static topology snapshots with weak delta meaning.

CTK handles it: CTK highlights changed/removed/added entities and confidence movement.

Scenario: Consumer dependency edge disappears, contradiction pressure rises.

Desktop view: Topology Tree and Delta Details expose relation-level confidence shifts.

Related: Relationship Confidence • Contradiction Detection
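
A topology delta can be sketched as a set difference over dependency edges, plus the confidence movement on edges present in both snapshots. The snapshot shape (edge tuple mapped to a confidence value) and the service names are hypothetical; this is not CTK's topology format.

```python
Edge = tuple[str, str]  # (from_service, to_service)


def topology_delta(before: dict[Edge, float],
                   after: dict[Edge, float]) -> dict[str, list]:
    """Compare two snapshots that map each edge to its confidence."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = sorted(
        (edge, before[edge], after[edge])
        for edge in before.keys() & after.keys()
        if before[edge] != after[edge]
    )
    return {"added": added, "removed": removed, "changed": changed}


before = {
    ("orders-api", "orders-db"): 0.90,
    ("orders-consumer", "orders-topic"): 0.77,
}
after = {
    ("orders-api", "orders-db"): 0.85,
    ("orders-api", "cache"): 0.60,
}
delta = topology_delta(before, after)
print(delta)
```

In this example the consumer edge disappears, which is the kind of removal the scenario above treats as rising contradiction pressure.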

Decision Trust Layer

Definition: Combined model of evidence quality, contradiction pressure, and action safety.

Why this exists: Operators need to know if a recommendation is safe right now.

Traditional gap: Action advice is often binary and context-free.

CTK handles it: CTK scores trust dynamically and surfaces risk posture before action.

Scenario: Recommendation visible but blocked because confidence remains low.

Desktop view: Trust badges, confidence labels, and gate outcomes in Analyze + Timeline.

Related: Safety Gate • Confidence Evolution • Operational Truth

What CTK Is Not

Category boundary for operational trust

CTK is a deterministic operational intelligence system. It is not generic monitoring software, and it does not mutate production autonomously.

CTK is not

  • Another metrics dashboard
  • A black-box AI incident bot
  • An autonomous production mutation agent
  • An ingest-priced telemetry SaaS clone
  • Generic monitoring software
  • AI magic without evidence lineage

CTK focuses on

  • Deterministic reasoning
  • Replayable evidence
  • Confidence-aware operations
  • Contradiction visibility
  • Safety-gated actions
  • Shared operational memory

4. Why Did CTK Do This?

Operator-facing reasoning patterns

Use this section when a blocked action, confidence drop, contradiction, or status mismatch appears and you need fast interpretation.

Why did CTK block a restart?

Scenario: Health checks were green, but queue latency and throughput degraded under load.

Evidence

  • health endpoint green
  • queue latency rising
  • consumer throughput dropping
  • topology snapshot stale

Contradiction: The service appears healthy to the probe, but runtime behavior signals show degradation.

Confidence: 0.82 -> 0.41

Decision: Safety Gate blocked restart recommendation.

Operator meaning: Do not act on health checks alone; refresh evidence and inspect replay before disruptive actions.

Why did confidence drop?

Scenario: Incident pressure rose while one evidence source stopped refreshing.

Evidence

  • runtime status still available
  • logs became stale
  • contradiction remained unresolved

Contradiction: Signal freshness and behavior quality diverged, reducing trust in current posture.

Confidence: 0.74 -> 0.58

Decision: Recommendations moved from allowed to review state.

Operator meaning: Confidence drop means slower, safer action until trust is restored.

Why did CTK create a contradiction?

Scenario: Service marked healthy while incident pressure and latency trend escalated.

Evidence

  • probe healthy
  • latency critical
  • runtime throughput unstable

Contradiction: Status signals and behavior signals disagree materially.

Confidence: High -> Medium

Decision: Contradiction node created and propagated into recommendations.

Operator meaning: CTK prevents false certainty by modeling conflict directly.

Why was a recommendation allowed?

Scenario: Lag pressure reduced after fresh corroborating evidence arrived.

Evidence

  • logs fresh
  • runtime stable
  • topology path confirmed
  • contradiction resolved

Contradiction: Prior contradiction cleared with aligned fresh evidence.

Confidence: 0.63 -> 0.84

Decision: Safety Gate allowed lower-risk remediation action.

Operator meaning: Allowed does not mean automatic; it means trust threshold is now satisfied.

Why did topology confidence change?

Scenario: A dependency edge disappeared during deploy and reappeared later.

Evidence

  • delta removed relation
  • runtime logs showed fallback path
  • relation restored after rollback

Contradiction: Observed topology and expected topology were temporarily inconsistent.

Confidence: 0.77 -> 0.52 -> 0.73

Decision: CTK downgraded relation trust until evidence stabilized.

Operator meaning: Treat topology confidence shifts as runtime risk indicators, not UI noise.

Why is evidence marked stale?

Scenario: A mapped provider log stream stopped sending updates.

Evidence

  • no new log records
  • health unchanged
  • runtime pressure active

Contradiction: Stable status claims continue while key diagnostic source is stale.

Confidence: 0.69 -> 0.49

Decision: CTK added freshness penalty and moved actions toward review/blocked.

Operator meaning: Restore source freshness before trusting high-impact recommendations.

Why is service configured but inactive?

Scenario: Service exists in workspace config but has no active observation evidence.

Evidence

  • service mapped
  • runtime feed absent
  • no fresh health/log updates

Contradiction: Configured intent and observed runtime activity are misaligned.

Confidence: 0.61 -> 0.43

Decision: CTK raises consistency finding and gates risky actions.

Operator meaning: Confirm runtime is actually running before treating config as truth.

5. Reading the Desktop App

How to read dense CTK surfaces without confusion

Timeline is dense by design. These guides tell you where to focus first and what can be ignored unless deep debugging is needed.

How to read Timeline

What you see: Moments, primary event, supporting evidence, confidence state, and action branch.

Read order: Start from newest primary event -> contradiction notes -> confidence -> action outcome.

What matters first: Blocked/review decisions; they carry the highest operational risk.

Ignore unless deep debugging: Deep supporting nodes, which matter only when you are debugging decision quality.

How to read Replay Chains

What you see: Deterministic causal chain from evidence ingestion to final operational decision.

Read order: Evidence -> Contradiction -> Adjudication -> Safety Gate -> Final.

What matters first: Verify contradiction and confidence transitions before reviewing recommendations.

Ignore unless deep debugging: Low-impact supporting events during rapid incident response.

How to read Topology Tree

What you see: Service/entity relationships with confidence and delta direction.

Read order: Inspect changed/removed edges first, then stable edges.

What matters first: Edges touching the impacted runtime path.

Ignore unless deep debugging: Low-confidence inferred edges, unless they intersect the incident path.

How to read Relationship Inspector

What you see: Edge-level confidence trend, relation evidence, and historical updates.

Read order: Check relation status -> confidence movement -> evidence freshness.

What matters first: Validate whether relation drift maps to a runtime symptom.

Ignore unless deep debugging: Unchanged relation metadata during active outage triage.

How to read Recommendations

What you see: Action suggestions with confidence, contradiction context, and safety state.

Read order: Read risk context before action text.

What matters first: Blocked/high-risk recommendations, to understand the guardrails.

Ignore unless deep debugging: Lower-priority suggestions during critical incident containment.

How to read Incidents

What you see: Pressure-classified runtime anomalies tied to affected services.

Read order: Severity + confidence + contradiction posture, then detailed evidence.

What matters first: Incidents with low confidence and high pressure.

Ignore unless deep debugging: Stale/resolved incident cards in live response windows.

How to read Signals

What you see: Coverage quality, freshness posture, missing diagnostics, and signal class health.

Read order: Availability -> quality -> freshness -> critical missing signal.

What matters first: Remediate missing high-impact diagnostics (logs/health/runtime gaps).

Ignore unless deep debugging: Cosmetic low-noise metrics while a contradiction is active.

How to read Safety Gate badges

What you see: Allowed / Review / Blocked action readiness with reason metadata.

Read order: Badge state -> reason -> confidence trend -> required operator step.

What matters first: Resolve why the block or review occurred; do not bypass the gate.

Ignore unless deep debugging: Optimistic action text while the badge state remains blocked.

How to read confidence labels

What you see: Trust score labels mapped to evolving evidence quality and conflict pressure.

Read order: Current confidence -> change direction -> contradiction + freshness contributors.

What matters first: Inspect recent confidence drops before taking disruptive actions.

Ignore unless deep debugging: An absolute confidence number without its transition context.
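
Reading a label as a transition rather than a single number can be sketched with a small helper. The function name describe_transitions is hypothetical; the input uses the 0.77 -> 0.52 -> 0.73 trajectory from the topology confidence example in section 4.

```python
def describe_transitions(trajectory: list[float]) -> list[str]:
    """Turn a confidence trajectory into per-step direction and size."""
    steps = []
    for previous, current in zip(trajectory, trajectory[1:]):
        # Round to two decimals to avoid floating-point noise in the output.
        delta = round(current - previous, 2)
        direction = "up" if delta > 0 else "down" if delta < 0 else "flat"
        steps.append(
            f"{previous:.2f} -> {current:.2f} ({direction} {abs(delta):.2f})")
    return steps


steps = describe_transitions([0.77, 0.52, 0.73])
for line in steps:
    print(line)
# 0.77 -> 0.52 (down 0.25)
# 0.52 -> 0.73 (up 0.21)
```

The final value of 0.73 looks close to the starting 0.77, but the steps show a sharp drop followed by a recovery, which is the context the label alone hides.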

6. Runtime Modes & Connectors

Observation scope, permissions, and boundaries

Connector pages explain what CTK observes, what it never mutates automatically, and the exact trust/security boundaries for each runtime model.

Already Running mode

What CTK observes: Existing runtime process/service signals without re-launching workloads.

What CTK does not modify: Does not auto-restart, reconfigure, or mutate running services.

Required permissions: Local runtime read access for mapped services and evidence sources.

Data flow: Runtime signals -> Evidence Ledger -> Contradiction + Confidence -> Timeline/Replay.

Security boundary: Workspace-scoped observation only.

Typical use case: Attach CTK to already running environments for immediate reasoning.

Limitations: Quality depends on available runtime signal mappings.

Start with Toolkit mode

What CTK observes: Processes/services launched through Toolkit startup flow.

What CTK does not modify: Not a CI/CD replacement or long-running process orchestrator.

Required permissions: Local execution rights for configured workspace startup commands.

Data flow: Toolkit launch context + runtime signals feed deterministic timeline.

Security boundary: Bound to local workspace session context.

Typical use case: Controlled local boot with immediate evidence baseline.

Limitations: Covers workloads started via Toolkit flow only.

Docker connector

What CTK observes: Container health, status, logs, and mapped service runtime behavior.

What CTK does not modify: Does not become your Docker orchestrator or deployment manager.

Required permissions: Docker context read access scoped to mapped workloads.

Data flow: Container signals are normalized into service-level evidence lineage.

Security boundary: No automatic host-wide crawling outside mapped services.

Typical use case: Containerized local/runtime systems requiring contradiction-aware reasoning.

Limitations: Coverage follows explicit service mapping quality.

Kubernetes connector

What CTK observes: Mapped cluster service/pod signals, logs, and relationship drift evidence.

What CTK does not modify: Does not replace kubectl/Helm or cluster operations.

Required permissions: Scoped kube context permissions (namespace/selector aware).

Data flow: Cluster runtime evidence is converted into workspace decision lineage.

Security boundary: Only mapped cluster scope is observed.

Typical use case: Service contradictions across k8s runtime behavior and probes.

Limitations: Requires explicit mapping for meaningful confidence.
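
The "only mapped cluster scope is observed" boundary can be illustrated with a namespace plus label-selector filter. This is a sketch under assumptions: ScopeMapping, covers, and observed_pods are hypothetical names, the pod dictionaries are made-up data, and nothing here reflects CTK's real connector configuration. The matching rule (every selector label must match) follows the usual equality-based label selector semantics.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeMapping:
    """Hypothetical mapping: one namespace plus required pod labels."""
    namespace: str
    selector: dict[str, str] = field(default_factory=dict)

    def covers(self, pod_namespace: str, pod_labels: dict[str, str]) -> bool:
        if pod_namespace != self.namespace:
            return False
        # Every selector label must match; extra pod labels are fine.
        return all(pod_labels.get(key) == value
                   for key, value in self.selector.items())


def observed_pods(pods: list[dict], mappings: list[ScopeMapping]) -> list[str]:
    """Keep only pods that fall inside at least one explicit mapping."""
    return [
        pod["name"] for pod in pods
        if any(m.covers(pod["namespace"], pod["labels"]) for m in mappings)
    ]


mappings = [ScopeMapping("payments", {"app": "orders-consumer"})]
pods = [
    {"name": "orders-consumer-1", "namespace": "payments",
     "labels": {"app": "orders-consumer", "tier": "worker"}},
    {"name": "billing-1", "namespace": "payments",
     "labels": {"app": "billing"}},
    {"name": "orders-consumer-9", "namespace": "staging",
     "labels": {"app": "orders-consumer"}},
]
in_scope = observed_pods(pods, mappings)
print(in_scope)
```

Anything outside the mapping simply never enters the evidence set, which is why unmapped services cannot contribute to confidence.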

AWS connector

What CTK observes: Mapped cloud runtime evidence (health endpoints, logs, runtime status).

What CTK does not modify: No infrastructure provisioning or mutation by default.

Required permissions: Provider-profile scoped access to mapped resources.

Data flow: AWS evidence enters local CTK reasoning chain via connector mappings.

Security boundary: Workspace service mappings constrain account visibility.

Typical use case: Local-connected reasoning for cloud-hosted services.

Limitations: Coverage depends on configured resource mappings and freshness.

Azure connector

What CTK observes: Mapped Azure runtime diagnostics and service-level health evidence.

What CTK does not modify: Does not act as deployment or infra control layer.

Required permissions: Scoped provider profile access for mapped resources.

Data flow: Azure signals are transformed into deterministic confidence-aware decisions.

Security boundary: No tenant-wide observation by default.

Typical use case: Cross-service contradiction and drift analysis in Azure-hosted runtimes.

Limitations: Requires explicit mapping for each critical service.

GCP connector

What CTK observes: Mapped GCP runtime logs/status/health evidence.

What CTK does not modify: Not a cloud operations replacement.

Required permissions: Scoped project/resource access via mapping profile.

Data flow: GCP evidence -> local CTK evidence ledger -> adjudicated decisions.

Security boundary: Only configured scope participates in analysis.

Typical use case: Operational truth for multi-service GCP environments.

Limitations: Unmapped services remain outside confidence model.

SSH / remote runtime model

What CTK observes: Mapped host/process-level runtime evidence where connector is enabled.

What CTK does not modify: Not a remote automation agent.

Required permissions: Host access scoped to explicit connection mappings.

Data flow: Remote runtime evidence is streamed into local-first reasoning session.

Security boundary: Connection and service mapping define strict observation boundary.

Typical use case: Legacy host-level services not exposed through orchestrators.

Limitations: Coverage varies by host instrumentation and mapping depth.

7. Trust Boundaries

Why CTK should never feel like a black-box agent touching production

This section defines local-first behavior, what stays scoped, and where CTK draws security and action boundaries.

Local-first architecture

CTK reasoning runs in your local session context first; cloud-scale concerns come second.

No cloud agent required

CTK does not require always-on external mutation agents to produce decisions.

Workspace-scoped intelligence

Evidence and decisions are scoped to active workspace mappings, not to a blind global crawl.

What data stays local

Connector configuration, local execution context, and session-level reasoning remain local-first.

What data may sync

Configured metadata and selected organizational state can sync for shared memory workflows.

Provider access model

Provider profiles and mappings define exact resource boundaries for observation.

Evidence retention model

Timeline moments and replay chains preserve auditability of key operational decisions.

Read-only vs action-capable behavior

CTK observes by default; action pathways are explicit, confidence-aware, and safety-gated.

8. Walkthroughs

Scenario-first operational learning

Walkthroughs are the shortest path to understanding how CTK removes operational pain in real runtime situations.

Kafka lag contradiction

What operators saw: Lag grew while service endpoint stayed green.

Traditional tools missed: No single tool connected queue pressure with runtime contradiction.

CTK detected: Conflict between throughput behavior and healthy probe posture.

Confidence changed: 0.79 -> 0.55 while contradiction persisted.

Blocked/allowed action: Unsafe restart blocked; safer consumer scaling allowed.

Operator learned: When trust is low, avoid disruptive actions and take a validated safer path instead.

Replay chain summary: Replay showed why scaling was approved before restart trust recovered.

Green checks, degraded runtime

What operators saw: Health was green but latency and error bursts escalated.

Traditional tools missed: Alert streams lacked contradiction context.

CTK detected: Behavior-health mismatch contradiction with freshness penalties.

Confidence changed: 0.83 -> 0.46

Blocked/allowed action: Recommendation moved to review state until fresh corroboration.

Operator learned: Healthy checks are insufficient when runtime behavior diverges.

Replay chain summary: Replay chain linked confidence drop directly to unresolved mismatch.

Configured but inactive service

What operators saw: Service existed in config but showed no live evidence.

Traditional tools missed: Config looked complete, so inactivity was overlooked.

CTK detected: Configuration-runtime inconsistency with trust downgrade.

Confidence changed: 0.64 -> 0.43

Blocked/allowed action: CTK gated risky actions and raised consistency finding.

Operator learned: Configuration is not runtime truth.

Replay chain summary: Timeline exposed inactive window and missing evidence path.

Missing consumer contradiction

What operators saw: Producer throughput steady, consumer pipeline silently stalled.

Traditional tools missed: Signals looked healthy in isolation.

CTK detected: Cross-signal contradiction between ingress and processing egress.

Confidence changed: 0.71 -> 0.49

Blocked/allowed action: Scale action deferred until consumer evidence refreshed.

Operator learned: Contradiction detection prevents a false impression of stability.

Replay chain summary: Replay isolated exact divergence moment between producer and consumer state.

Topology drift investigation

What operators saw: Intermittent failures after deploy despite healthy base metrics.

Traditional tools missed: Topology change impact was unclear.

CTK detected: Critical relationship confidence dropped after edge removal.

Confidence changed: 0.76 -> 0.57

Blocked/allowed action: CTK prioritized dependency rollback path recommendation.

Operator learned: Topology delta is an operational risk signal, not visual metadata.

Replay chain summary: Replay proved drift timing aligned with incident onset.

Safety Gate blocked risky restart

What operators saw: Pressure high, restart looked tempting.

Traditional tools missed: No risk-aware guard to prevent panic action.

CTK detected: Active contradiction + stale evidence made trust insufficient.

Confidence changed: 0.68 -> 0.42

Blocked/allowed action: Restart blocked with explicit operator guidance.

Operator learned: Safety Gate exists to protect production from weak-evidence actions.

Replay chain summary: Replay chain documented exact gate rule trigger.

Replayable postmortem handoff

What operators saw: Shift handoff during unresolved multi-service incident.

Traditional tools missed: Narrative-only handoff lost confidence context.

CTK detected: Preserved evidence/contradiction state as operational memory.

Confidence changed: 0.51 -> 0.73 after follow-up evidence.

Blocked/allowed action: Next shift resumed from replay chain, not from scratch.

Operator learned: Shared operational memory reduces repeated triage effort.

Replay chain summary: Deterministic lineage prevented context loss between teams.

9. Terminology Preview

Shared CTK language for teams

Use this preview to align product, SRE, and engineering conversations. For full terminology doctrine, open the dedicated terminology page.

Signal

Raw runtime observation from health/log/status/topology sources.

Evidence

Signal record accepted into CTK ledger with source and freshness context.

Freshness

Recency quality of evidence; stale evidence reduces confidence.

Contradiction

Conflict between signals that cannot be equally trusted.

Confidence

Trust score in the current operational truth.

Adjudication

Deterministic resolution that emits final state from evidence + conflict.

Replay Chain

Causal sequence from evidence to final action decision.

Safety Gate

Policy mechanism that allows risky actions, sends them to review, or blocks them.

Operational Memory

Replayable shared timeline of decisions and rationale.

Timeline Moment

A stored incident/decision event in chronological context.

Topology Delta

Confidence-aware change in service relationships over time.

Relationship Confidence

Trust score for a specific dependency edge.

Decision Lineage

Audit trail for why CTK made a recommendation or block.

Runtime Truth

Current behavior reality CTK can defend with evidence.

10. Incident Stories

See operational outcomes in real narratives

Use case studies and replay demos to understand how contradiction, confidence, and safety-gated decisions behave in production-like scenarios.

Docs to Trial Path

Turn concepts into live operational reasoning

1. Replay first

Open a deterministic replay to see evidence, contradiction, confidence, and gate state in one chain.

Open Replay Demo

2. Install desktop

Download CTK Desktop and run local-first observation for your workspace runtime boundaries.

Download Desktop

3. Initialize workspace

Connect repo + source mode, then start deterministic timeline and contradiction-aware recommendations.

Initialize Workspace