Features

Data Exploration

Automatically discover and map entities and relationships in your observability data through AI-driven exploration.

Data Exploration is one of Castrel's core features. After you connect a new data source, it helps your team quickly understand the data structure, discover queryable IT resources and monitoring entities, and build reusable query knowledge. Whether you're integrating a new logging system or mapping the structure of existing metrics data, Castrel helps you complete these data governance tasks efficiently.

What is Data Exploration?

Data Exploration is an AI-driven data discovery and governance system that automatically identifies key entities in your observability data, establishes relationships between them, and generates reusable query templates. When you connect a new data source (such as Elasticsearch, Prometheus, or Loki), Castrel automatically scans its structure, identifies services, instances, infrastructure, and other entities, and persists the findings as knowledge.

Unlike traditional manual data mapping approaches, Data Exploration can:

  • Automatically discover entities: Identify services, instances, hosts, and other key entities in large datasets
  • Establish relationships: Map association paths and query methods between entities
  • Generate query templates: Produce directly reusable query statements, lowering the barrier for later use

How to Use Data Exploration

1. Start Exploration

You can start Data Exploration in the following ways:

  • Auto-triggered after connecting a data source: Once you finish configuring a data source connection, Castrel prompts you to start data exploration
  • Manual trigger: Navigate to the Chat page and select the Data Exploration tab, then choose a connector you want to explore. You can configure the following option:
    • Save exploration results: When enabled, Castrel will automatically submit discovered resources (services, service instances, infrastructure entities, etc.) to the resource review queue. You can review and approve these resources through the interface, and approved resources will be committed to the resource library.

    Once configured, start a chat to begin the exploration process.

2. View Exploration Report

After exploration completes, Castrel generates a detailed exploration report containing:

| Content | Description |
| --- | --- |
| Exploration Overview | Data source info, data type, confirmed data collections, and timestamp fields |
| Entity Discovery | Identified services, service_entities, infra_entities, and their relationships |
| Reusable Query Templates | 3-5 query templates for the data source, with usage and parameter descriptions |
| Field Dictionary | Key field paths, types, meanings, and common value examples |

3. Persist as Knowledge

Exploration results are automatically persisted as knowledge for use in subsequent incident investigation, alert triage, and other scenarios. You can also view and edit this knowledge in the Knowledge Base.

Core Concepts

Entity Types

Data Exploration identifies three types of entities:

| Entity Type | Description | Common Field Examples |
| --- | --- | --- |
| service | Stable identifier for a logical service or application; moderate cardinality; interpretable | service.name, service, app |
| service_entity | Service instance; higher cardinality; can be mapped back to a service | k8s.pod.name, container.id, instance, process.pid |
| infra_entity | Infrastructure resources hosting services | host.name, node.name, ip, k8s.cluster.name |
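
To make this concrete, here is a hypothetical log record (the field names are illustrative, not required by Castrel) in which all three entity types co-occur:

```json
{
  "@timestamp": "2024-05-01T12:00:00Z",
  "log.level": "error",
  "message": "payment request timed out",
  "service.name": "order-service",
  "k8s.pod.name": "order-service-7d9f4b8c6-x2k9p",
  "host.name": "node-1"
}
```

Here service.name identifies the service, k8s.pod.name the service_entity, and host.name the infra_entity.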

Relationships

Castrel establishes relationships between entities through the following methods:

  • Same-record co-occurrence: Fields appearing together in the same log or metric record have a natural association
  • Strong co-occurrence aggregation: High-confidence associations verified through aggregation statistics, such as pod name to service name mappings (see the sketch below)
  • Routing relationships: Upstream/downstream dependencies discovered from service call traces
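
As a sketch of what a strong co-occurrence check might look like against an Elasticsearch data source (field names assumed, and assumed to be keyword-mapped), a nested terms aggregation groups records by pod and counts the service names seen in each bucket:

```json
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-24h" } }
  },
  "aggs": {
    "pods": {
      "terms": { "field": "k8s.pod.name", "size": 50 },
      "aggs": {
        "services": {
          "terms": { "field": "service.name", "size": 5 }
        }
      }
    }
  }
}
```

If every pod bucket contains exactly one service sub-bucket, the pod-to-service mapping is unambiguous and can be recorded as a relationship.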

Exploration Principles

Castrel's Data Exploration follows these principles to ensure accuracy and verifiability of discoveries:

| Principle | Description |
| --- | --- |
| Global to local | Explore the overall structure of data collections first, then dive into field details and entity identification |
| Validate before expanding | Verify with small time windows and result limits first, then expand the scope after confirmation (see the example below) |
| Prefer aggregation | Use aggregation statistics to discover entity distributions, avoiding full data pulls that degrade performance |
| Evidence-based | Every conclusion must be supported by query results; no guessing allowed |
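
To illustrate "validate before expanding", a minimal probe in Elasticsearch syntax (field name assumed) checks that a candidate field exists and is populated within a narrow window before any wider query runs:

```json
{
  "size": 5,
  "query": {
    "bool": {
      "must": [
        { "exists": { "field": "service.name" } },
        { "range": { "@timestamp": { "gte": "now-15m" } } }
      ]
    }
  }
}
```

Only after a probe like this succeeds would exploration widen the time window or raise the result limit.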

Exploration Flow

Data Collection Discovery → Field Structure Analysis → Sample Retrieval → Aggregation Statistics → Relationship Verification → Template Persistence
  1. Data Collection Discovery: List all available data collections in the data source (such as indices, tables, log streams, etc.)
  2. Field Structure Analysis: Get field lists, confirm timestamp fields and candidate entity fields
  3. Sample Retrieval: Verify with small samples that fields exist, are non-empty, and appear consistently
  4. Aggregation Statistics: Group by candidate fields to determine whether they form clear entity distributions (see the example below)
  5. Relationship Verification: Verify relationships between entities, ensuring they can be reproduced via queries
  6. Template Persistence: Generate reusable query templates with key parameter descriptions
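
As an example of step 4, a combined cardinality and terms aggregation in Elasticsearch syntax (candidate field assumed) shows whether a field forms a clear entity distribution:

```json
{
  "size": 0,
  "aggs": {
    "distinct_services": {
      "cardinality": { "field": "service.name" }
    },
    "top_services": {
      "terms": { "field": "service.name", "size": 20 }
    }
  }
}
```

A moderate distinct count with stable, readable values suggests a service field; very high cardinality points instead to a service_entity or a non-entity field.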

Example Exploration Report

Here's a typical data exploration report structure:

Exploration Overview

  • Data Source: Elasticsearch (production-logs)
  • Data Type: Logs
  • Target Application: order-service
  • Data Collection: logs-*
  • Timestamp Field: @timestamp (type: date, format: ISO8601)

Entity Discovery

Service

  • Identification field: service.name
  • Discovered entities: order-service, payment-service, inventory-service, and 12 other services
  • Evidence: Aggregation statistics show clear service distribution

Service Entity

  • Identification field: kubernetes.pod.name
  • Relationship: Can be associated with its corresponding service via the service.name field
  • Evidence: service.name and kubernetes.pod.name co-occur in the same log entry

Infra Entity

  • Identification field: kubernetes.node.name
  • Discovered entities: 3 K8s nodes
  • Relationship: Can be traced to the node it runs on via kubernetes.pod.name

Reusable Query Templates

1. Query Error Logs by Service

```json
{
  "query": {
    "bool": {
      "must": [
        { "term": { "service.name": "${service_name}" } },
        { "term": { "log.level": "error" } },
        { "range": { "@timestamp": { "gte": "${start_time}", "lte": "${end_time}" } } }
      ]
    }
  },
  "size": 100
}
```
  • Purpose: Query error logs for a specific service within a time range
  • Parameters: service_name (service name), start_time/end_time (time range)
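
2. Count Logs by Level for a Service

The report above shows one template; a hypothetical second template for the same index layout might aggregate log volume by level:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "term": { "service.name": "${service_name}" } },
        { "range": { "@timestamp": { "gte": "${start_time}", "lte": "${end_time}" } } }
      ]
    }
  },
  "aggs": {
    "by_level": {
      "terms": { "field": "log.level", "size": 10 }
    }
  }
}
```
  • Purpose: Count log entries per log level for a specific service within a time range
  • Parameters: service_name (service name), start_time/end_time (time range)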

Tips for Better Results

| Tip | Description |
| --- | --- |
| Ensure the data source connection is healthy | Check the data source connection status before exploring, and ensure sufficient permissions to read the data structure and samples |
| Choose an appropriate time range | The default of the last three days is usually sufficient; if data volume is small, consider expanding the range |
| Specify the target application | If the data source contains data from multiple applications, specifying the target application improves exploration efficiency and accuracy |
| Review and supplement knowledge | Exploration results are persisted as knowledge; we recommend reviewing them and adding business context |

FAQ