Full-Text Indexing in GreptimeDB: Revolutionizing Log Search Performance

Finding the needle in the haystack of your log data shouldn't take forever. Yet many engineering teams struggle with slow, inefficient log searches that drain productivity. GreptimeDB v0.14's enhanced full-text indexing capabilities are changing the game for observability professionals.

The Log Search Dilemma

When troubleshooting production issues, time is literally money. Traditional approaches to log searching face significant limitations:

  • Regular expression searches scan entire datasets, causing performance bottlenecks
  • Complex query language requirements create steep learning curves
  • Separate systems for metrics and logs create context-switching overhead

"Before implementing GreptimeDB's full-text search, our team sometimes waited minutes for critical log queries to complete," shares a DevOps lead at a mid-sized tech company.

Dual-Backend Architecture for Optimal Performance

GreptimeDB v0.14 introduces a groundbreaking dual-backend architecture for full-text indexing, allowing users to choose the perfect solution for their specific use case:

Bloom Backend: The Balanced Performer

The new Bloom-based full-text index backend offers impressive all-around capabilities:

  • Best for: General-purpose log searching with consistent performance
  • Highlights: Efficient filtering using Bloom filters with minimal storage overhead
  • Storage efficiency: Indexes typically require only ~10% of raw data size
  • Performance profile: Consistent response times across query patterns

Tantivy Backend: The Precision Specialist

When searching for highly specific terms like trace IDs, the Tantivy backend delivers unmatched speed:

  • Best for: High-selectivity queries (finding specific identifiers)
  • Highlights: Lightning-fast matching using inverted indexes
  • Storage requirements: Comparable to raw data size, but with dramatic performance gains
  • Performance profile: About 5x faster than the Bloom backend on high-selectivity queries, but about 5x slower on low-selectivity searches such as "HTTP"
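
To make the choice concrete, here is a minimal sketch of how a backend might be selected per column. It assumes the `backend` option of `FULLTEXT INDEX WITH (...)` as described in the GreptimeDB documentation for v0.14. The table and column names (`app_logs`, `trace_logs`, `ts`, `message`) are hypothetical and not taken from this article, so check the exact syntax against the docs for your version.

```sql
-- Hypothetical table for general-purpose log search.
-- Assumption: backend = 'bloom' selects the Bloom-based full-text index.
CREATE TABLE app_logs (
  ts TIMESTAMP TIME INDEX,
  message STRING FULLTEXT INDEX WITH (backend = 'bloom')
);

-- Hypothetical table queried mostly by specific identifiers such as trace IDs.
-- Assumption: backend = 'tantivy' selects the inverted-index backend.
CREATE TABLE trace_logs (
  ts TIMESTAMP TIME INDEX,
  message STRING FULLTEXT INDEX WITH (backend = 'tantivy')
);

-- Assumed syntax for adding a full-text index to an existing column.
ALTER TABLE app_logs MODIFY COLUMN message SET FULLTEXT INDEX WITH (backend = 'bloom');
```

Keeping broad-search logs and identifier-heavy logs in separate tables is only one way to apply the trade-off above. The point is that the backend is chosen per indexed column, so each can follow its dominant query pattern.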

Real-World Performance Comparison

The choice between backends becomes clear when examining performance metrics:

| Query Type | Bloom | Tantivy | LIKE Query |
|---|---|---|---|
| High Selectivity (e.g., TraceID) | 1x | 5x faster | 50x slower |
| Low Selectivity (e.g., "HTTP") | 1x | 5x slower | 1x |

"The ability to choose the right indexing strategy based on our query patterns has been transformative," notes a platform engineer. "We use Bloom for our general logging and Tantivy for our trace ID lookups."
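
As a rough illustration of the query shapes being compared, the sketch below contrasts a `LIKE` pattern scan with a term match that can be served by a full-text index. The `logs` table and `message` column follow this article's own example; the trace ID value is made up for illustration. The two predicates are not strictly equivalent: `LIKE` with wildcards matches any substring, while `matches_term` is expected to match whole terms only.

```sql
-- Pattern match on raw text: corresponds to the "LIKE Query" column.
-- The trace ID below is a made-up example value.
SELECT * FROM logs WHERE message LIKE '%4bf92f3577b34da6%';

-- Term match: corresponds to the Bloom and Tantivy columns,
-- depending on which backend indexes the message column.
SELECT * FROM logs WHERE matches_term(message, '4bf92f3577b34da6');
```

The same term query applies to both backends. Only the index definition on the column changes, which is why the backend can be tuned without rewriting queries.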

Beyond Search: The matches_term Function

GreptimeDB v0.14 also introduces a powerful new matches_term function and @@ operator for precise text matching:

```sql
-- Using matches_term function
SELECT * FROM logs WHERE matches_term(message, 'error') OR matches_term(message, 'fail');

-- Using @@ operator (shorthand for matches_term)
SELECT * FROM logs WHERE message @@ 'error' OR message @@ 'fail';
```

These additions make writing efficient log queries more intuitive, especially for SQL users.

Starting Your Full-Text Journey

Implementing advanced full-text searching in your observability stack doesn't have to be complicated. GreptimeDB's unified platform for metrics, logs, and traces eliminates the need for multiple specialized systems.

Ready to transform your log search experience? Download GreptimeDB today or explore GreptimeCloud for a fully-managed solution.


About Greptime

GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

  • GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.

  • GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.

  • GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors: get started with issues labeled "good first issue" and connect with our community.

GitHub | 🌐 Website | 📚 Docs

💬 Slack | 🐦 Twitter | 💼 LinkedIn

