

Elasticsearch Protocol Support, Inverted Index Optimization, and Performance Improvements – Individual Contributor Has 'Gone the Extra Mile' Again! | Greptime Biweekly Report

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

  • Support log ingestion via Elasticsearch protocol
  • Support inverted index modification using ALTER command
  • Introduce sparse primary key encoding to optimize Metrics performance
  • Begin implementing BloomFilter as an alternative to full-text indexing

Contributors

Over the past two weeks, our community has been very active, with a total of 113 PRs merged. Of these, 23 PRs came from 6 individual contributors, and many more are pending merge.

Congrats on becoming our most active contributors in the past 2 weeks:

👏 Welcome @mtrbpr to the community as a new contributor with a successfully merged PR. More PRs from other individual contributors are waiting to be merged.

New Contributor of GreptimeDB

🎉 A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#5261 Support Elasticsearch _bulk API for Log Ingestion

Users can now ingest logs using either Elasticsearch _bulk API or Logstash, further enriching GreptimeDB's support for the logging ecosystem.
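As a rough illustration of what a client sends, here is a minimal sketch that builds a request body in the newline-delimited format used by the Elasticsearch _bulk API: one action line followed by one document line, with a trailing newline. The index name `app_logs`, the sample documents, and the helper names are made up for this example. The endpoint URL mentioned in the comment is an assumption and should be checked against the GreptimeDB documentation, as should the set of bulk actions GreptimeDB accepts.

```python
import json
import urllib.request


def build_bulk_body(index, docs):
    """Serialize documents into the NDJSON body used by the Elasticsearch _bulk API."""
    lines = []
    for doc in docs:
        # Action line: ask the server to create a document in the target index.
        lines.append(json.dumps({"create": {"_index": index}}))
        # Source line: the log document itself.
        lines.append(json.dumps(doc))
    # The _bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def post_bulk(url, body):
    """POST a bulk body to `url` and return the HTTP status code.

    The URL is supplied by the caller. For a local GreptimeDB it might look like
    http://localhost:4000/v1/elasticsearch/_bulk (an assumption, not verified here).
    """
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical sample logs, used only to show the shape of the body.
logs = [
    {"message": "user login ok", "level": "info"},
    {"message": "disk usage at 91%", "level": "warn"},
]
body = build_bulk_body("app_logs", logs)
print(body, end="")
```

Only `build_bulk_body` runs here; `post_bulk` is left uncalled so the sketch works without a running server.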

db#5131 Support Inverted Index Modification via ALTER Command

Users can configure inverted indexes using the ALTER command, making index adjustments more flexible and straightforward.

db#5365 Introduce SparsePrimaryKeyCodec and SparsePrimaryKeyFilter

In Metrics scenarios, when the number of primary key columns in physical tables becomes excessive, the CPU overhead required for encoding all primary keys increases significantly. This has led to notable performance bottlenecks in both write and query operations.

This PR introduces sparse primary keys to encode only non-null keys, reducing CPU overhead and improving performance.
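To make the idea concrete, the following sketch contrasts a dense encoding, which visits every primary key column for every row, with a sparse encoding that writes only the non-null columns as (column id, length, value) entries. It is a simplified illustration under assumed names and an invented byte layout, not the actual format of SparsePrimaryKeyCodec.

```python
import struct


def encode_dense(columns, row):
    """Encode every primary key column in order, writing a marker even for nulls."""
    out = bytearray()
    for name in columns:
        value = row.get(name)
        if value is None:
            out += b"\x00"  # null marker
        else:
            data = value.encode("utf-8")
            out += b"\x01" + struct.pack(">H", len(data)) + data
    return bytes(out)


def encode_sparse(column_ids, row):
    """Encode only the non-null columns as (column id, length, value) entries."""
    out = bytearray()
    # Sort by column id so equal rows always produce the same bytes.
    for name in sorted(row, key=lambda n: column_ids[n]):
        value = row[name]
        if value is None:
            continue
        data = value.encode("utf-8")
        out += struct.pack(">IH", column_ids[name], len(data)) + data
    return bytes(out)


def decode_sparse(buf):
    """Recover a {column id: value} mapping from a sparse encoding."""
    result = {}
    offset = 0
    while offset < len(buf):
        col_id, length = struct.unpack_from(">IH", buf, offset)
        offset += struct.calcsize(">IH")
        result[col_id] = buf[offset:offset + length].decode("utf-8")
        offset += length
    return result


# Hypothetical physical table with 1000 tag columns, of which one row sets only two.
columns = [f"tag_{i}" for i in range(1000)]
column_ids = {name: i for i, name in enumerate(columns)}
row = {"tag_3": "host-1", "tag_42": "us-west"}

dense = encode_dense(columns, row)
sparse = encode_sparse(column_ids, row)
print(len(dense), len(sparse))
```

In this toy layout the dense encoder loops over all 1000 columns per row, while the sparse encoder touches only the two columns that carry values, which is where the CPU saving comes from.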

For complete details, refer to Tracking Issue db#5282.

db#5406 Initial Implementation of BloomFilter as an Alternative to Full-Text Indexing

Full-text indexing in logging scenarios incurs substantial resource overhead. To address this, this PR begins implementing BloomFilter as an alternative indexing method that can significantly reduce resource consumption.
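For readers unfamiliar with the data structure, here is a minimal Bloom filter sketch showing why it is cheap: it stores only a fixed-size bit array per data segment and can say "definitely not present" or "possibly present" for a token. The class, its parameters, and the segment names are illustrative assumptions and do not reflect how GreptimeDB implements or sizes its index.

```python
import hashlib


class BloomFilter:
    """A fixed-size bit array probed by several hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means the item was never added; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical log segments, each summarized by its own filter over whitespace tokens.
segments = {
    "segment-a": ["connection timeout", "retry scheduled"],
    "segment-b": ["user login ok"],
}
filters = {}
for name, lines in segments.items():
    bf = BloomFilter()
    for line in lines:
        for token in line.split():
            bf.add(token)
    filters[name] = bf

# Only segments whose filter reports a possible match need to be scanned.
candidates = [name for name, bf in filters.items() if bf.might_contain("timeout")]
print(candidates)
```

A query for a token can skip every segment whose filter answers "no". The trade-off is that a "yes" still requires reading the segment to confirm, since false positives are possible while false negatives are not.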

Good First Issue

db#5084 Add the HTTP API for Querying Pipelines

Although we have decided not to expose many HTTP APIs for the DB, it is natural to offer an HTTP API for querying pipelines alongside the existing create and delete operations for pipeline management.

For a better developer experience, after creating a pipeline it would be convenient to query it through a similar API rather than using SQL to query greptime_private, for example:

curl -XGET "http://localhost:4000/v1/events/pipelines/test?db=public"
  • Level: Simple

  • Keyword: Logs


About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.
