Full-Text Search Meets Time-Series! GreptimeDB's Unified Observability Approach


Traditional observability stacks force you into uncomfortable trade-offs. Fast metrics queries or powerful log search. Real-time performance or historical analysis. GreptimeDB v0.14 shatters these limitations with advanced full-text indexing that seamlessly integrates with time-series data.

The Problem with Fragmented Observability

Most organizations run a separate system for each signal type: one for metrics, another for logs, and another for traces.

Each system excels individually, but correlation across data types becomes a nightmare. When your application fails at 3 AM, you're frantically jumping between dashboards, trying to piece together the story.

GreptimeDB's unified approach changes this fundamentally. Metrics, logs, and traces share the same storage engine, query language, and operational model.

Full-Text Search Evolution in v0.14

The latest release introduces dual-backend full-text indexing that adapts to your specific use case:

Bloom Filter Backend

Perfect for general-purpose log search:

  • Low storage overhead: ~10% of raw data size
  • Consistent performance across query patterns
  • Stable resource usage for production workloads

Tantivy Backend

Optimized for high-selectivity queries:

  • 5x faster for unique identifier searches (trace IDs, user IDs)
  • Inverted index architecture for precise matching
  • Higher storage cost but unmatched precision
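To make the trade-off between the two backends concrete, here is a minimal Python sketch of the two underlying ideas. It is a toy illustration, not GreptimeDB's implementation: the `BloomFilter` class, the whitespace tokenizer, the chunk layout, and the sample log lines are all assumptions made for the example. A Bloom filter answers "might this chunk contain the term?" cheaply, while an inverted index stores the exact location of every term.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, term: str):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, term: str) -> None:
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def tokenize(line: str) -> list[str]:
    # Assumption: plain whitespace tokenization, lowercased.
    return line.lower().split()


# Hypothetical log lines, grouped into two chunks of rows.
chunks = [
    ["request ok user=alice", "request ok user=bob"],
    ["timeout contacting db trace=abc123", "request ok user=carol"],
]

# Bloom-style approach: one small filter per chunk, used to skip chunks.
chunk_filters = []
for rows in chunks:
    bf = BloomFilter()
    for row in rows:
        for token in tokenize(row):
            bf.add(token)
    chunk_filters.append(bf)

# Inverted-index approach: term -> exact (chunk, row) locations.
inverted = defaultdict(list)
for c, rows in enumerate(chunks):
    for r, row in enumerate(rows):
        for token in tokenize(row):
            inverted[token].append((c, r))

term = "trace=abc123"
candidate_chunks = [c for c, bf in enumerate(chunk_filters) if bf.might_contain(term)]
exact_hits = inverted.get(term, [])
```

The filter only narrows the search to candidate chunks that still have to be scanned, which keeps the index small. The inverted index jumps straight to the matching row, which is why it suits unique identifiers but costs more storage.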

The new matches_term function and @@ operator make log analysis intuitive:

sql
-- Find all error-related entries
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';

-- Combine with time-series filtering
SELECT * FROM logs 
WHERE ts > '2024-01-01' 
  AND message @@ 'timeout' 
  AND service = 'api-gateway';
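Term matching differs from a plain substring search. The Python sketch below shows one plausible reading of it, under the assumption that a term matches only when it is bounded by non-alphanumeric characters (or the start or end of the text) and that matching is case-sensitive. `matches_term_sketch` is a hypothetical helper written for this post, not a GreptimeDB API.

```python
import re


def matches_term_sketch(text: str, term: str) -> bool:
    """Return True if `term` occurs in `text` as a whole term.

    Assumption: a term boundary is any non-alphanumeric character,
    or the start or end of the text.
    """
    pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
    return re.search(pattern, text) is not None


examples = [
    ("connection timeout after 30s", "timeout"),  # whole word
    ("timeouts increased overnight", "timeout"),  # only a prefix of a longer word
    ("error: disk full", "error"),                # followed by punctuation
]
results = [matches_term_sketch(text, term) for text, term in examples]
```

Under these assumptions the second example does not match, because `timeout` is followed by another letter.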

Real-World Log Processing Performance

Benchmark results show GreptimeDB outperforming traditional solutions:

[Table: Ingestion Performance (rows/second)]

[Table: Resource Efficiency]

The 32x memory efficiency advantage over Elasticsearch is particularly striking for resource-constrained environments.

Compression That Actually Matters

In structured mode, GreptimeDB stores logs in about 13% of the space the raw data occupies. This isn't just about saving disk space: it also reduces bandwidth costs in distributed deployments and allows longer data retention periods.
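A quick back-of-the-envelope calculation shows what the 13% figure means for retention. The workload numbers below (100 GB of raw logs per day, a 1300 GB disk budget) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
STORED_FRACTION = 0.13  # from the text: stored size is about 13% of raw size


def stored_size_gb(raw_gb: float, fraction: float = STORED_FRACTION) -> float:
    """Disk space needed to store `raw_gb` of raw logs."""
    return raw_gb * fraction


def retention_days(disk_budget_gb: float, raw_gb_per_day: float,
                   fraction: float = STORED_FRACTION) -> float:
    """How many days of logs fit into a fixed disk budget."""
    return disk_budget_gb / stored_size_gb(raw_gb_per_day, fraction)


# Hypothetical workload: 100 GB of raw logs per day, 1300 GB of disk.
daily_stored = stored_size_gb(100)                          # about 13 GB per day
days_compressed = retention_days(1300, 100)                 # about 100 days
days_uncompressed = retention_days(1300, 100, fraction=1.0)  # 13 days
```

With the same disk budget, retention grows by a factor of 1 / 0.13, or roughly 7.7x.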

The Pipeline engine automatically parses unstructured logs into optimized columnar format:

yaml
processors:
  - dissect:
      fields:
        - line
      patterns:
        - '%{ip} - - [%{ts}] "%{method} %{path}" %{status} %{size}'
  - date:
      fields:
        - ts
      formats:
        - "%d/%b/%Y:%H:%M:%S %z"

This transformation improves both query performance and storage efficiency while maintaining the flexibility to handle diverse log formats.
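To see what the two processors do to a single line, here is a rough Python equivalent. It is a sketch of the idea, not the Pipeline engine itself: the regular expression is a hand-written stand-in for the dissect pattern above, the sample log line is made up, and the integer conversion of `status` and `size` is an extra step added for illustration. Note that Python's `strptime` needs `%z` to parse a numeric offset such as `+0000`.

```python
import re
from datetime import datetime, timezone

# Hand-written regex standing in for the dissect pattern:
# each %{name} becomes a named capture group.
LINE_RE = re.compile(
    r'(?P<ip>\S+) - - \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>[^"]*)" (?P<status>\d+) (?P<size>\d+)'
)
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_line(line: str) -> dict:
    """Split one access-log line into typed columns."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"line does not match pattern: {line!r}")
    row = match.groupdict()
    # Equivalent of the `date` processor: string -> timezone-aware timestamp.
    row["ts"] = datetime.strptime(row["ts"], TS_FORMAT)
    # Illustrative extra step: store numeric fields as integers.
    row["status"] = int(row["status"])
    row["size"] = int(row["size"])
    return row


# Made-up sample line shaped to fit the pattern.
sample = '203.0.113.7 - - [25/May/2024:20:16:37 +0000] "GET /index.html" 200 612'
row = parse_line(sample)
```

Once every field sits in its own typed column, filters such as `status = 500` no longer need to scan raw text, which is where the query and storage gains come from.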

The Observability Data Model Revolution

Here's what makes this approach powerful. Instead of maintaining separate schemas for metrics and logs, GreptimeDB uses a unified table structure:

sql
CREATE TABLE observability_data (
  service STRING,
  environment STRING,
  message STRING FULLTEXT INDEX,
  level STRING INVERTED INDEX,
  latency_ms DOUBLE,
  error_count INT,
  ts TIMESTAMP,
  PRIMARY KEY(service, environment),
  TIME INDEX(ts)
);

Single queries can now correlate metrics with log events:

sql
SELECT 
  service,
  AVG(latency_ms) as avg_latency,
  COUNT(*) as log_entries
FROM observability_data 
WHERE ts > now() - INTERVAL '1 hour'
  AND (latency_ms > 1000 OR matches(message, 'error'))
GROUP BY service;
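For readers who think more easily in code than in SQL, the following Python sketch reproduces the logic of the query above over a handful of in-memory rows. The rows are invented, they are assumed to fall inside the one-hour window already, and `has_term` is a simplified, case-insensitive stand-in for the full-text match.

```python
import re
from collections import defaultdict

# Hypothetical rows, assumed to be from the last hour already.
rows = [
    {"service": "api-gateway", "latency_ms": 1500.0, "message": "upstream slow"},
    {"service": "api-gateway", "latency_ms": 20.0, "message": "error: bad token"},
    {"service": "api-gateway", "latency_ms": 15.0, "message": "request ok"},
    {"service": "billing", "latency_ms": 2500.0, "message": "error: db timeout"},
]


def has_term(message: str, term: str) -> bool:
    # Simplified full-text match: compare against lowercase alphanumeric tokens.
    return term in re.findall(r"[a-z0-9]+", message.lower())


def correlate(rows: list[dict]) -> dict:
    """Keep slow or error rows, then aggregate per service."""
    groups = defaultdict(list)
    for row in rows:
        if row["latency_ms"] > 1000 or has_term(row["message"], "error"):
            groups[row["service"]].append(row["latency_ms"])
    return {
        service: {
            "avg_latency": sum(values) / len(values),
            "log_entries": len(values),
        }
        for service, values in groups.items()
    }


summary = correlate(rows)
```

The metric condition and the log condition are evaluated on the same row, so no join between two systems is needed.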

Vector search capabilities in v0.10+ enable semantic log analysis. Find logs with similar meanings even when exact wording differs:

sql
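-- query_vector is a placeholder for the embedding of the search text
SELECT *, vec_dot_product(embedding, query_vector) AS similarity
FROM logs
WHERE vec_dot_product(embedding, query_vector) > 0.8
ORDER BY similarity DESC;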

This is particularly valuable for anomaly detection and incident pattern recognition.
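The similarity filter itself is simple arithmetic. The Python sketch below applies the same threshold-and-sort logic to three made-up log lines with tiny two-dimensional embeddings. Real embeddings come from a model and have hundreds of dimensions, and the vectors here are assumed to be unit length so that the dot product behaves like cosine similarity.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


# Made-up unit-length embeddings for three log messages.
logs = [
    ("no space left on device", [0.8, 0.6]),
    ("user logged in", [-0.8, 0.6]),
    ("disk quota exceeded", [0.6, 0.8]),
]
query_vector = [0.6, 0.8]  # hypothetical embedding of the search text
THRESHOLD = 0.8

# Score every row, keep those above the threshold, most similar first.
scored = [(message, dot(embedding, query_vector)) for message, embedding in logs]
matches = sorted(
    ((message, score) for message, score in scored if score > THRESHOLD),
    key=lambda pair: pair[1],
    reverse=True,
)
matching_messages = [message for message, _ in matches]
```

The two disk-related messages pass the threshold even though they share no words, while the login message is filtered out.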

Performance Optimization Strategies

Cold vs. Hot Query Optimization:

  • Cold queries: GreptimeDB's object storage integration shines here
  • Hot queries: In-memory caching and write buffers provide sub-second response times

Partitioning by service or environment enables massive scale-out while maintaining query performance.
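As a rough mental model of why partitioning by service helps, consider the Python sketch below. It routes rows to partitions using range boundaries on the `service` column. The split points and service names are hypothetical, and the routing function is an illustration of the general idea rather than GreptimeDB's partitioning logic.

```python
from bisect import bisect_right

# Hypothetical range boundaries on the `service` column:
#   partition 0: service < "g"
#   partition 1: "g" <= service < "n"
#   partition 2: service >= "n"
SPLIT_POINTS = ["g", "n"]


def partition_for(service: str) -> int:
    """Return the index of the partition that owns this service."""
    return bisect_right(SPLIT_POINTS, service)


services = ["api-gateway", "billing", "inventory", "payments"]
placement = {service: partition_for(service) for service in services}

# A query filtered on one service only has to touch one partition.
partitions_to_scan = {partition_for("payments")}
```

Writes spread across partitions, while a query that filters on `service` can skip every partition that cannot contain matching rows.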

Ready to unify your observability stack? GreptimeDB's full-text search capabilities represent the next evolution in observability databases – where time-series analytics and log search finally work together seamlessly.


About Greptime

GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

  • GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.

  • GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.

  • GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors: get started with issues labeled good first issue and connect with our community.
