Elasticsearch Protocol Support, Inverted Index Optimization, and Performance Improvements – Individual Contributor Has 'Gone the Extra Mile' Again! | Greptime Biweekly Report

Together with our global community of contributors, GreptimeDB continues to evolve and flourish as a growing open-source project. We are grateful to each and every one of you.

Below are the highlights among recent commits:

Support log ingestion via Elasticsearch protocol
Support inverted index modification using ALTER command
Introduce sparse primary key encoding to optimize Metrics performance
Begin implementing BloomFilter as an alternative to full-text indexing

Contributors

Over the past two weeks, our community has been very active, with a total of 113 PRs merged. 23 PRs from 6 individual contributors were merged successfully, and many more are waiting to be merged.

Congrats on becoming our most active contributors in the past 2 weeks:

@linyihai (db#5303)
@lyang24 (db#5131)
@mtrbpr (promql-parser#101)
@NiwakaDev (docs#1468 db#5393 db#5383 db#5262)
@Xuanwo (db#5354)
@yihong0618 (db#5400 db#5388 db#5383 db#5363 db#5362 db#5352 db#5349 db#5342 db#5339 db#5329 db#5328 db#5325 db#5313 db#5311 db#5301)

👏 Welcome @mtrbpr to the community as a new contributor with a successfully merged PR. More PRs from other individual contributors are waiting to be merged.

🎉 A big THANK YOU to all our members and contributors! It is people like you who are making GreptimeDB a great product. Let's build an even greater community together.

Highlights of Recent PRs

db#5261 Support Elasticsearch `_bulk` API for Log Ingestion

Users can now ingest logs using either the Elasticsearch `_bulk` API or Logstash, further enriching GreptimeDB's support for the logging ecosystem.
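As a rough illustration of what a `_bulk` request carries, the sketch below builds the newline-delimited JSON (NDJSON) body defined by the Elasticsearch bulk format: an action line followed by the document, repeated per log entry. The index name `app_logs` and the log fields are hypothetical, and the GreptimeDB endpoint path is not stated in this report, so the example stops at building the payload; check the GreptimeDB documentation for the URL to post it to.

```python
import json

def build_bulk_body(index, docs):
    """Serialize documents into the NDJSON body used by the Elasticsearch _bulk API."""
    lines = []
    for doc in docs:
        # Action line: asks the server to index the following line into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the log document itself.
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical log records, used only for illustration.
logs = [
    {"message": "user login ok", "level": "INFO"},
    {"message": "disk usage high", "level": "WARN"},
]
body = build_bulk_body("app_logs", logs)
print(body)
```

Such a body is normally sent with the `Content-Type: application/x-ndjson` header.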

db#5131 Support Inverted Index Modification via `ALTER` Command

Users can configure inverted indexes using the ALTER command, making index adjustments more flexible and straightforward.

db#5365 Introduce `SparsePrimaryKeyCodec` and `SparsePrimaryKeyFilter`

In Metrics scenarios, when the number of primary key columns in physical tables becomes excessive, the CPU overhead required for encoding all primary keys increases significantly. This has led to notable performance bottlenecks in both write and query operations.

This PR introduces sparse primary keys to encode only non-null keys, reducing CPU overhead and improving performance.

For complete details, refer to Tracking Issue db#5282.
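The idea behind sparse encoding can be sketched in a few lines. This is a conceptual illustration only, not the actual `SparsePrimaryKeyCodec`: the column names, the integer column ids, and the tuple-based output are all assumptions made for the example. A dense encoder visits every primary key column, including the null ones, while a sparse encoder emits only the columns that have a value.

```python
def encode_dense(columns, row):
    """Encode every primary key column, using None as the marker for missing ones."""
    return [(name, row.get(name)) for name in columns]

def encode_sparse(column_ids, row):
    """Encode only non-null columns as (column_id, value) pairs, ordered by id."""
    pairs = [(column_ids[name], value) for name, value in row.items() if value is not None]
    return sorted(pairs)

# Hypothetical physical table with five label columns; this row sets only two.
columns = ["host", "region", "pod", "container", "job"]
column_ids = {name: i for i, name in enumerate(columns)}
row = {"host": "web-1", "job": "api"}

dense = encode_dense(columns, row)
sparse = encode_sparse(column_ids, row)
print(len(dense), len(sparse))  # prints: 5 2
```

The work done by the sparse encoder grows with the number of labels a row actually sets rather than with the total number of columns in the physical table, which is where the CPU saving comes from when tables are wide and rows are sparse.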

db#5406 Initial Implementation of BloomFilter as an Alternative to Full-Text Indexing

Full-text indexing in logging scenarios incurs substantial resource overhead. To address this, this PR begins implementing BloomFilter as an alternative indexing method that can significantly reduce resource consumption compared to full-text indexing.
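For readers unfamiliar with the data structure, here is a minimal Bloom filter sketch. It is not GreptimeDB's implementation; the bit-array size, the number of hashes, and the SHA-256-based hashing are arbitrary choices for illustration. The key property is that a lookup can return a false positive but never a false negative, so a negative answer lets a query skip a data segment without reading it.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter over strings, for illustration only."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

# Index the tokens of one hypothetical log line.
tokens = "connection timeout while calling upstream".split()
segment_filter = BloomFilter()
for token in tokens:
    segment_filter.add(token)

print(segment_filter.might_contain("timeout"))  # True: added items are never missed
```

The filter stores only a fixed-size bit array rather than the terms themselves, which is why it is much cheaper to build and keep than a full-text index, at the cost of supporting only membership checks.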

Good First Issue

db#5084 Add the HTTP API for Querying Pipelines

Although we decided not to expose many HTTP APIs for the database, it would be natural to have an HTTP API for querying pipelines alongside the existing create and delete operations for pipeline management.

For a better developer experience, after creating a pipeline it would be convenient to query it through a similar API instead of using SQL against greptime_private, for example:

curl -XGET "http://localhost:4000/v1/events/pipelines/test?db=public"
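The same request could be prepared from Python with the standard library, as sketched below. The helper name `pipeline_query_request` is hypothetical, and the endpoint is the one proposed in the issue, so it may not exist in a given release. The example only builds the request object; passing it to `urllib.request.urlopen` would send it to a running server.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

def pipeline_query_request(base_url, pipeline_name, db):
    """Build a GET request for the proposed pipeline query endpoint."""
    # Percent-encode the pipeline name so it is safe as a single path segment.
    path = f"/v1/events/pipelines/{quote(pipeline_name, safe='')}"
    query = urlencode({"db": db})
    return Request(f"{base_url}{path}?{query}", method="GET")

req = pipeline_query_request("http://localhost:4000", "test", "public")
print(req.get_method(), req.full_url)
```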

Level: Simple
Keyword: Logs

About Greptime

Greptime offers industry-leading time series database products and solutions to empower IoT and Observability scenarios, enabling enterprises to uncover valuable insights from their data with less time, complexity, and cost.

GreptimeDB is an open-source, high-performance time-series database offering unified storage and analysis for metrics, logs, and events. Try it out instantly with GreptimeCloud, a fully-managed DBaaS solution—no deployment needed!

The Edge-Cloud Integrated Solution combines multimodal edge databases with cloud-based GreptimeDB to optimize IoT edge scenarios, cutting costs while boosting data performance.

Star us on GitHub or join GreptimeDB Community on Slack to get connected.
