Full-Text Indexing in GreptimeDB: Revolutionizing Log Search Performance

Finding the needle in the haystack of your log data shouldn't take forever. Yet many engineering teams struggle with slow, inefficient log searches that drain productivity. GreptimeDB v0.14's enhanced full-text indexing capabilities are changing the game for observability professionals.

The Log Search Dilemma

When troubleshooting production issues, every minute of delay has a cost. Traditional approaches to log searching face significant limitations:

Regular expression searches scan entire datasets, causing performance bottlenecks
Complex query language requirements create steep learning curves
Separate systems for metrics and logs create context-switching overhead

"Before implementing GreptimeDB's full-text search, our team sometimes waited minutes for critical log queries to complete," shares a DevOps lead at a mid-sized tech company.

Dual-Backend Architecture for Optimal Performance

GreptimeDB v0.14 introduces a groundbreaking dual-backend architecture for full-text indexing, allowing users to choose the perfect solution for their specific use case:

Bloom Backend: The Balanced Performer

The new Bloom-based full-text index backend offers impressive all-around capabilities:

Best for: General-purpose log searching with consistent performance
Highlights: Efficient filtering using Bloom filters with minimal storage overhead
Storage efficiency: Indexes typically require only ~10% of raw data size
Performance profile: Consistent response times across query patterns
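As a rough sketch of how the Bloom backend might be selected, the statement below creates a log table whose message column carries a Bloom-backed full-text index. The table name app_logs and its columns are hypothetical, and the analyzer, case_sensitive, and backend options are written as we understand GreptimeDB's FULLTEXT INDEX WITH (...) syntax, so check the documentation for your version before relying on them.

```sql
-- Hypothetical table for illustration; names are not from the original post.
CREATE TABLE IF NOT EXISTS app_logs (
  ts TIMESTAMP TIME INDEX,
  host STRING,
  -- Assumed option syntax: Bloom-filter backend for general-purpose log search
  message STRING FULLTEXT INDEX WITH (
    analyzer = 'English',
    case_sensitive = 'false',
    backend = 'bloom'
  ),
  PRIMARY KEY (host)
);
```

Because the Bloom index stays small relative to the raw data, this is a reasonable starting point when query patterns are not yet known.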

Tantivy Backend: The Precision Specialist

When searching for highly specific terms like trace IDs, the Tantivy backend delivers unmatched speed:

Best for: High-selectivity queries (finding specific identifiers)
Highlights: Lightning-fast matching using inverted indexes
Storage requirements: Comparable to raw data size, but with dramatic performance gains
Performance profile: About 5x faster than Bloom for high-selectivity queries, but about 5x slower for low-selectivity (broad) searches
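For comparison, a table intended mainly for identifier lookups might opt into the Tantivy backend instead. This is a minimal sketch under the same assumptions as any hand-written DDL: trace_logs and its columns are hypothetical names, and backend = 'tantivy' is the option value as we understand it from the GreptimeDB documentation.

```sql
-- Hypothetical table whose messages embed trace IDs that are looked up exactly.
CREATE TABLE IF NOT EXISTS trace_logs (
  ts TIMESTAMP TIME INDEX,
  service STRING,
  -- Assumed option syntax: inverted-index (Tantivy) backend for precise lookups
  message STRING FULLTEXT INDEX WITH (backend = 'tantivy'),
  PRIMARY KEY (service)
);
```

The trade-off is the one listed above: the index can approach the size of the raw data, so it is worth reserving for columns that really are queried by specific terms.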

Real-World Performance Comparison

The choice between backends becomes clear when examining performance metrics:

Query Type	Bloom (baseline)	Tantivy	LIKE Query
High Selectivity (e.g., TraceID)	1x	5x faster	50x slower
Low Selectivity (e.g., "HTTP")	1x	5x slower	1x
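To make the high-selectivity row more concrete, the two queries below search for the same identifier in a hypothetical app_logs table with a full-text-indexed message column. The trace ID is a made-up value, and matches_term is the function introduced in this release. Note that the two predicates are not strictly equivalent: LIKE matches any substring, while matches_term, as we understand it, matches whole terms only.

```sql
-- Substring scan: the pattern has a leading wildcard, so the index cannot help.
SELECT ts, message
FROM app_logs
WHERE message LIKE '%4bf92f3577b34da6%';

-- Term lookup: can be served by the full-text index on message.
SELECT ts, message
FROM app_logs
WHERE matches_term(message, '4bf92f3577b34da6');
```

Running both against your own data is the most reliable way to see how the ratios in the table translate to your workload.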

"The ability to choose the right indexing strategy based on our query patterns has been transformative," notes a platform engineer. "We use Bloom for our general logging and Tantivy for our trace ID lookups."

Beyond Search: The `matches_term` Function

GreptimeDB v0.14 also introduces a new `matches_term` function and `@@` operator for precise text matching:

```sql
-- Using matches_term function
SELECT * FROM logs WHERE matches_term(message, 'error') OR matches_term(message, 'fail');

-- Using @@ operator (shorthand for matches_term)
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';
```

These additions make writing efficient log queries more intuitive, especially for SQL users.
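In practice a term match is usually combined with a time window and an exclusion. The query below is a sketch of that pattern, assuming a logs table with a ts time-index column (the column name is hypothetical) and using the function form throughout to keep operator precedence unambiguous.

```sql
-- Recent timeouts, excluding health-check noise (illustrative terms).
SELECT ts, message
FROM logs
WHERE ts >= now() - INTERVAL '1 hour'
  AND matches_term(message, 'timeout')
  AND NOT matches_term(message, 'healthcheck')
ORDER BY ts DESC
LIMIT 100;
```

Restricting the time range first keeps the amount of data the full-text predicate has to consider small, whichever backend is in use.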

Starting Your Full-Text Journey

Implementing advanced full-text searching in your observability stack doesn't have to be complicated. GreptimeDB's unified platform for metrics, logs, and traces eliminates the need for multiple specialized systems.

Ready to transform your log search experience? Download GreptimeDB today or explore GreptimeCloud for a fully-managed solution.

About Greptime

GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.
GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.
GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors. Get started with issues labeled "good first issue" and connect with our community.

