
Traditional observability stacks force you into uncomfortable trade-offs. Fast metrics queries or powerful log search. Real-time performance or historical analysis. GreptimeDB v0.14 shatters these limitations with advanced full-text indexing that seamlessly integrates with time-series data.
The Problem with Fragmented Observability
Most organizations run something like this:
- Prometheus for metrics
- Elasticsearch for logs
- Jaeger for traces
Each system excels individually, but correlation across data types becomes a nightmare. When your application fails at 3 AM, you're frantically jumping between dashboards, trying to piece together the story.
GreptimeDB's unified approach changes this fundamentally. Metrics, logs, and traces share the same storage engine, query language, and operational model.
Full-Text Search Evolution in v0.14
The latest release introduces dual-backend full-text indexing that adapts to your specific use case:
Bloom Filter Backend
Perfect for general-purpose log search:
- Low storage overhead: ~10% of raw data size
- Consistent performance across query patterns
- Stable resource usage for production workloads
Tantivy Backend
Optimized for high-selectivity queries:
- 5x faster for unique identifier searches (trace IDs, user IDs)
- Inverted index architecture for precise matching
- Higher storage cost but unmatched precision
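To make the trade-off between the two backends concrete, here is a minimal Python sketch of the two underlying ideas. It is a toy model, not GreptimeDB's implementation: the `BloomFilter` class, the two-row block size, and the sample log lines are all assumptions made for illustration. A Bloom filter can only say "this block might contain the term", so candidate blocks still get scanned, while an inverted index maps a term straight to its rows.

```python
import hashlib
import re
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def tokenize(text):
    return re.findall(r"[A-Za-z0-9_]+", text.lower())


# Hypothetical sample data
rows = [
    "user login ok trace_id=abc123",
    "payment timeout trace_id=def456",
    "user logout ok trace_id=ghi789",
    "payment error trace_id=abc123",
]

# Bloom approach: one small filter per block of rows (block size 2 is arbitrary).
BLOCK_SIZE = 2
blocks = []
for start in range(0, len(rows), BLOCK_SIZE):
    bloom = BloomFilter()
    for text in rows[start:start + BLOCK_SIZE]:
        for token in tokenize(text):
            bloom.add(token)
    blocks.append((start, bloom))

# Inverted approach: every term maps directly to the rows that contain it.
inverted = defaultdict(list)
for row_id, text in enumerate(rows):
    for token in set(tokenize(text)):
        inverted[token].append(row_id)


def bloom_search(term):
    """Skip blocks whose filter rules the term out, then scan the rest."""
    hits = []
    for start, bloom in blocks:
        if not bloom.might_contain(term):
            continue  # the whole block is pruned without reading its rows
        for offset, text in enumerate(rows[start:start + BLOCK_SIZE]):
            if term in tokenize(text):
                hits.append(start + offset)
    return hits


def inverted_search(term):
    """Direct lookup: no scanning, at the cost of storing every posting list."""
    return sorted(inverted.get(term, []))


print(bloom_search("def456"))     # rows found after scanning candidate blocks
print(inverted_search("abc123"))  # rows found by direct lookup
```

The filter stores a fixed number of bits per block regardless of how many distinct terms appear, which is where the low storage overhead comes from. The inverted index stores a posting list per term, which costs more space but answers rare-term lookups such as trace IDs without scanning.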
The new matches_term function and @@ operator make log analysis intuitive:
-- Find all error-related entries
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';
-- Combine with time-series filtering
SELECT * FROM logs
WHERE ts > '2024-01-01'
AND message @@ 'timeout'
AND service = 'api-gateway';
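If you want a feel for what term matching means in practice, the sketch below models it in Python. It rests on an assumption: that a term matches only when it appears as a whole token, bounded by non-alphanumeric characters or the ends of the string, and that matching is case-sensitive. The helper name `term_match` is hypothetical and is not a GreptimeDB API.

```python
import re


def term_match(text, term):
    """Return True if `term` occurs in `text` as a whole, case-sensitive token.

    Assumed semantics: the character before and after the term must not be
    alphanumeric (or the term sits at the start or end of the string).
    """
    pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
    return re.search(pattern, text) is not None


print(term_match("connection timeout after 30s", "timeout"))
print(term_match("timeouts exceeded", "timeout"))
print(term_match("err=timeout;", "timeout"))
```

Under this assumption a search for `timeout` would not pick up `timeouts`, which is usually what you want when hunting for exact identifiers.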
Real-World Log Processing Performance
Benchmark results show GreptimeDB ingesting about three times faster than Elasticsearch and coming close to ClickHouse, while using the least memory of the three:
Ingestion Performance (rows/second):
- GreptimeDB: 120,000-130,000
- ClickHouse: 150,000
- Elasticsearch: 40,000
Resource Efficiency:
- GreptimeDB: 400MB memory usage
- ClickHouse: 600MB memory usage
- Elasticsearch: 12GB+ memory usage
The roughly 30x memory advantage over Elasticsearch (400MB versus 12GB+) is particularly striking for resource-constrained environments.
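The ratios behind these comparisons are easy to recompute from the figures listed above. The short Python sketch below does that. It treats "12GB+" as a lower bound of exactly 12 GB (1 GB taken as 1024 MB), which is an assumption.

```python
# Figures taken from the benchmark lists above
ingest_rows_per_sec = {
    "GreptimeDB": (120_000, 130_000),
    "ClickHouse": (150_000, 150_000),
    "Elasticsearch": (40_000, 40_000),
}
memory_mb = {
    "GreptimeDB": 400,
    "ClickHouse": 600,
    "Elasticsearch": 12 * 1024,  # "12GB+" treated as a lower bound (assumption)
}

low, high = ingest_rows_per_sec["GreptimeDB"]
es_rate = ingest_rows_per_sec["Elasticsearch"][0]
ch_rate = ingest_rows_per_sec["ClickHouse"][0]

ingest_vs_es = (low / es_rate, high / es_rate)
ingest_vs_ch = (low / ch_rate, high / ch_rate)
mem_vs_es = memory_mb["Elasticsearch"] / memory_mb["GreptimeDB"]
mem_vs_ch = memory_mb["ClickHouse"] / memory_mb["GreptimeDB"]

print(f"Ingestion vs Elasticsearch: {ingest_vs_es[0]:.2f}x to {ingest_vs_es[1]:.2f}x")
print(f"Ingestion vs ClickHouse:    {ingest_vs_ch[0]:.2f}x to {ingest_vs_ch[1]:.2f}x")
print(f"Memory: Elasticsearch uses at least {mem_vs_es:.1f}x as much")
print(f"Memory: ClickHouse uses {mem_vs_ch:.1f}x as much")
```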
Compression That Actually Matters
In structured mode, GreptimeDB stores logs in about 13% of the raw data size. This isn't just about saving disk space. It also reduces bandwidth costs in distributed deployments and enables longer data retention periods.
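To see what a 13% footprint means for retention, here is a back-of-the-envelope calculation in Python. The daily log volume and disk budget are hypothetical numbers picked for illustration. Only the 13% figure comes from the text above.

```python
compressed_fraction = 0.13   # from the text: stored size relative to raw logs
raw_gb_per_day = 100         # hypothetical daily raw log volume
disk_budget_gb = 1000        # hypothetical disk budget

retention_days_raw = disk_budget_gb / raw_gb_per_day
retention_days_compressed = disk_budget_gb / (raw_gb_per_day * compressed_fraction)

print(f"Raw logs:        {retention_days_raw:.1f} days")
print(f"Compressed logs: {retention_days_compressed:.1f} days")
print(f"Retention gain:  {1 / compressed_fraction:.1f}x")
```

With these inputs the same disk holds a bit under eight times as many days of logs.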
The Pipeline engine automatically parses unstructured logs into optimized columnar format:
processors:
  - dissect:
      fields:
        - line
      patterns:
        - '%{ip} - - [%{ts}] "%{method} %{path}" %{status} %{size}'
  - date:
      fields:
        - ts
      formats:
        - "%d/%b/%Y:%H:%M:%S %Z"
This transformation improves both query performance and storage efficiency while maintaining the flexibility to handle diverse log formats.
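For readers who have not used dissect-style patterns before, the following Python sketch imitates what the two processors above do to a single access-log line. It is an approximation written for illustration, not the Pipeline engine's code: the `dissect` helper and the sample line are hypothetical, and Python's `%z` (numeric offset) stands in for the `%Z` in the config because the sample timestamp carries `+0000`.

```python
import re
from datetime import datetime, timezone

PATTERN = '%{ip} - - [%{ts}] "%{method} %{path}" %{status} %{size}'


def dissect(pattern, line):
    """Split `line` into named fields using a dissect-style pattern.

    Each %{name} becomes a non-greedy named capture group; everything else
    in the pattern is matched literally.
    """
    pieces = re.split(r"%\{(\w+)\}", pattern)  # literal, name, literal, name, ...
    regex = "".join(
        re.escape(piece) if i % 2 == 0 else f"(?P<{piece}>.+?)"
        for i, piece in enumerate(pieces)
    )
    match = re.fullmatch(regex, line)
    return match.groupdict() if match else None


# Hypothetical access-log line
sample = '127.0.0.1 - - [25/May/2024:20:16:37 +0000] "GET /index.html" 200 612'

fields = dissect(PATTERN, sample)
# The "date" step: turn the extracted string into a real timestamp.
fields["ts"] = datetime.strptime(fields["ts"], "%d/%b/%Y:%H:%M:%S %z")

print(fields)
```

Once the line is split into typed fields, each one can be stored as its own column, which is what makes columnar compression and per-field filtering possible.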
The Observability Data Model Revolution
Here's what makes this approach powerful. Instead of maintaining separate schemas for metrics and logs, GreptimeDB uses a unified table structure:
CREATE TABLE observability_data (
  service STRING,
  environment STRING,
  message STRING FULLTEXT,
  level STRING INVERTED INDEX,
  latency_ms DOUBLE,
  error_count INT,
  ts TIMESTAMP,
  PRIMARY KEY(service, environment),
  TIME INDEX(ts)
);
Single queries can now correlate metrics with log events:
SELECT
  service,
  AVG(latency_ms) AS avg_latency,
  COUNT(*) AS log_entries
FROM observability_data
WHERE ts > now() - INTERVAL '1 hour'
  AND (latency_ms > 1000 OR matches(message, 'error'))
GROUP BY service;
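You can try the shape of this query without a running cluster. The Python sketch below uses an in-memory SQLite table as a stand-in. It is not GreptimeDB: SQLite has no full-text `matches` function here, so a plain `LIKE '%error%'` substitutes for it, the timestamps are epoch seconds, and all rows are made up for illustration.

```python
import sqlite3

NOW = 1_700_000_000  # fixed "current time" (epoch seconds) so results are repeatable

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observability_data (
        service TEXT, environment TEXT, message TEXT, level TEXT,
        latency_ms REAL, error_count INTEGER, ts INTEGER
    )
    """
)

# Hypothetical rows mixing metric values and log messages
sample_rows = [
    ("api-gateway", "prod", "request ok",     "INFO",   120.0, 0, NOW - 10),
    ("api-gateway", "prod", "upstream error", "ERROR",  300.0, 1, NOW - 20),
    ("api-gateway", "prod", "slow request",   "WARN",  1500.0, 0, NOW - 30),
    ("billing",     "prod", "db error",       "ERROR",   50.0, 1, NOW - 7200),
    ("billing",     "prod", "slow request",   "WARN",  2000.0, 0, NOW - 60),
]
conn.executemany(
    "INSERT INTO observability_data VALUES (?, ?, ?, ?, ?, ?, ?)", sample_rows
)

query = """
    SELECT service, AVG(latency_ms) AS avg_latency, COUNT(*) AS log_entries
    FROM observability_data
    WHERE ts > ?
      AND (latency_ms > 1000 OR message LIKE '%error%')
    GROUP BY service
    ORDER BY service
"""
result = conn.execute(query, (NOW - 3600,)).fetchall()

for service, avg_latency, log_entries in result:
    print(service, avg_latency, log_entries)
```

The point of the exercise is that one `WHERE` clause filters on a numeric metric and on log text at the same time, because both live in the same row.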
Advanced Features Beyond Basic Search
Vector search capabilities in v0.10+ enable semantic log analysis. Find logs with similar meanings even when exact wording differs:
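-- '[0.1, 0.2, 0.3]' is a placeholder for the query embedding
SELECT *, vec_dot_product(embedding, '[0.1, 0.2, 0.3]') AS similarity
FROM logs
WHERE vec_dot_product(embedding, '[0.1, 0.2, 0.3]') > 0.8
ORDER BY similarity DESC;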
This is particularly valuable for anomaly detection and incident pattern recognition.
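The filtering and ranking logic of such a query is simple enough to sketch in plain Python. The two-dimensional embeddings and log IDs below are invented for illustration. Real embeddings would come from a model and have hundreds of dimensions.

```python
def dot_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical log embeddings keyed by log ID
log_embeddings = {
    "log-a": [0.9, 0.1],
    "log-b": [0.1, 0.9],
    "log-c": [1.0, 0.0],
}
query_vector = [1.0, 0.0]
THRESHOLD = 0.8

# Score every log, keep those above the threshold, rank by similarity.
scored = [(log_id, dot_product(vec, query_vector))
          for log_id, vec in log_embeddings.items()]
similar = sorted(
    [(log_id, score) for log_id, score in scored if score > THRESHOLD],
    key=lambda item: item[1],
    reverse=True,
)

print(similar)
```

Note that a raw dot product is only a fair similarity score when the vectors are normalized to the same length. Otherwise long vectors win regardless of direction.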
Performance Optimization Strategies
Cold vs. Hot Query Optimization:
- Cold queries: GreptimeDB's object storage integration shines here
- Hot queries: In-memory caching and write buffers provide sub-second response times
Partitioning by service or environment enables massive scale-out while maintaining query performance.
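Why partitioning helps is easiest to see in a toy model. The Python sketch below groups rows by service and shows that a query filtered on one service only has to look at that service's partition. It is a simplification made under assumptions: the rows are invented, and GreptimeDB's actual partition rules and pruning logic are not modeled here.

```python
from collections import defaultdict

# Hypothetical log rows
rows = [
    {"service": "api-gateway", "message": "timeout"},
    {"service": "billing", "message": "ok"},
    {"service": "api-gateway", "message": "ok"},
    {"service": "auth", "message": "error"},
]

# Route each row to a partition keyed by its service.
partitions = defaultdict(list)
for row in rows:
    partitions[row["service"]].append(row)


def query_service(service, term):
    """Search one service's partition and report how many rows were scanned."""
    candidates = partitions.get(service, [])
    hits = [row for row in candidates if term in row["message"]]
    return hits, len(candidates)


hits, scanned = query_service("api-gateway", "timeout")
print(f"{len(hits)} hit(s) after scanning {scanned} of {len(rows)} rows")
```

In a distributed setup each partition can also live on a different node, so the work for unrelated services never touches the same machine.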
Ready to unify your observability stack? GreptimeDB's full-text search capabilities represent the next evolution in observability databases – where time-series analytics and log search finally work together seamlessly.
About Greptime
GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.
GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.
GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.
GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.
🚀 We’re open to contributors. Get started with issues labeled "good first issue" and connect with our community.