
Data Modeling Best Practices for Observability Workloads

The Foundation of Observability Success

Poor data modeling can make even the fastest database crawl. We've seen teams struggle with query performance issues that trace back to fundamental schema design problems.

Observability data has unique characteristics that require thoughtful modeling approaches. Unlike traditional business applications, monitoring data involves high-volume writes, time-based queries, and varying cardinality patterns.

Understanding Column Cardinality

Think of your data like a library:

  • Low-cardinality columns are like book genres (Sci-Fi, History, Art)
  • High-cardinality columns are like ISBNs or user IDs

This distinction directly impacts performance. In one e-commerce project, region had only 7 values while user_id reached billions.
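
Before choosing keys, it helps to measure cardinality directly. A minimal sketch, assuming a hypothetical orders table with the region and user_id columns from the example above:

```sql
-- Gauge candidate key columns by counting distinct values (table name is illustrative):
SELECT count(DISTINCT region)  AS region_cardinality,   -- low: a handful of values
       count(DISTINCT user_id) AS user_id_cardinality   -- high: millions or more
FROM orders;
```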

Primary Key Design Rules

For time-series databases, follow these guidelines (a sketch combining them follows the list):

  • Choose low-cardinality columns only
  • Keep the key's combined cardinality under 100k unique values
  • Limit to ≤5 key columns
  • Prefer strings and integers over floats
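
Put together, a minimal sketch of a key that follows these rules; the http_requests table and its columns are hypothetical:

```sql
CREATE TABLE http_requests (
  service     STRING,     -- low cardinality: dozens of services
  status_code STRING,     -- low cardinality: a handful of classes
  request_id  STRING,     -- high cardinality: stays out of the key
  latency_ms  DOUBLE,     -- float: stays out of the key
  ts          TIMESTAMP,
  PRIMARY KEY(service, status_code),  -- 2 low-cardinality columns, bounded combinations
  TIME INDEX(ts)
);
```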

Wide Tables vs. Multiple Tables

Best practice: Store related metrics in wide tables, especially when collected together.

```sql
CREATE TABLE node_metrics (
  host STRING,
  cpu_user DOUBLE,
  cpu_system DOUBLE,
  memory_used DOUBLE,
  disk_read_bytes DOUBLE,
  net_in_bytes DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(host),
  TIME INDEX(ts)
);
```

Benefits include:

  • 30-50% better compression
  • Simplified queries (no JOINs needed; see the query sketch below)
  • Improved query performance
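
For example, a single scan over the node_metrics table above can answer questions that would otherwise span several narrow tables. A minimal sketch (the time window is illustrative):

```sql
-- One pass over the wide table, no JOINs:
SELECT host,
       avg(cpu_user + cpu_system) AS avg_cpu,
       max(memory_used)           AS peak_memory
FROM node_metrics
WHERE ts >= now() - INTERVAL '1 hour'
GROUP BY host;
```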

Index Strategy for Different Query Types

GreptimeDB offers three index types optimized for specific patterns:

Inverted Index

Best for low-cardinality filtering:

  • Supports =, <, >, IN, BETWEEN
  • Efficient categorical queries
  • Moderate storage overhead

Skipping Index

Great for high-cardinality equality filters:

  • Minimal write performance impact
  • Very storage efficient
  • Limited to equality queries only

Fulltext Index

Essential for log keyword search:

  • Supports complex text matching
  • English analyzer for better relevance
  • Higher storage requirements
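
All three index types can be declared per column at table creation. A minimal sketch, assuming GreptimeDB's column-level index syntax and the matches() fulltext function; the app_logs table and its columns are illustrative:

```sql
CREATE TABLE app_logs (
  host     STRING INVERTED INDEX,   -- low-cardinality filtering (=, <, >, IN, BETWEEN)
  trace_id STRING SKIPPING INDEX,   -- high-cardinality equality lookups
  message  STRING FULLTEXT INDEX WITH (analyzer = 'English', case_sensitive = 'false'),
  ts       TIMESTAMP,
  PRIMARY KEY(host),
  TIME INDEX(ts)
);

-- Keyword search over the fulltext-indexed column:
SELECT * FROM app_logs
WHERE matches(message, 'timeout OR refused')
  AND ts >= now() - INTERVAL '15 minutes';
```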

Partition Strategy for Scale

When data exceeds TB scale, distributed partitioning becomes essential:

```sql
CREATE TABLE global_metrics (
  region STRING,
  datacenter STRING,
  host STRING,
  cpu DOUBLE,
  memory DOUBLE,
  ts TIMESTAMP,
  PRIMARY KEY(region, datacenter, host),
  TIME INDEX(ts)
)
PARTITION ON COLUMNS (region) (
  -- GreptimeDB expects explicit, non-overlapping partition rules;
  -- the region bounds here are illustrative.
  region < 'eu',
  region >= 'eu' AND region < 'us',
  region >= 'us'
);
```

Best practice: Partition by columns with even distribution and query-aligned patterns.
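
A simple way to check whether a candidate column distributes evenly is to compare row counts per value, as in this sketch against the global_metrics table above:

```sql
-- Heavily skewed counts suggest a poor partition key:
SELECT region, count(*) AS rows_per_region
FROM global_metrics
GROUP BY region
ORDER BY rows_per_region DESC;
```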

Performance Tuning Guidelines

Based on real-world experience:

  • Start simple without primary keys for write-heavy workloads
  • Avoid over-indexing (impacts write performance)
  • Consider partitioning when tables exceed 500GB
  • Set appropriate TTL policies for data retention (see the sketch below)
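
For the retention point, a minimal sketch assuming GreptimeDB's table-level ttl option (the table name and retention windows are illustrative):

```sql
-- Set retention at creation time via the ttl table option:
CREATE TABLE short_lived_metrics (
  host STRING,
  cpu  DOUBLE,
  ts   TIMESTAMP,
  PRIMARY KEY(host),
  TIME INDEX(ts)
) WITH (ttl = '30d');   -- rows older than 30 days become eligible for removal

-- Tighten retention later (assuming the ALTER TABLE ... SET option form):
ALTER TABLE short_lived_metrics SET 'ttl' = '7d';
```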

GreptimeDB's flexible architecture supports these optimization strategies while maintaining operational simplicity.


About Greptime

GreptimeDB is an open-source, cloud-native database purpose-built for real-time observability. Built in Rust and optimized for cloud-native environments, it provides unified storage and processing for metrics, logs, and traces, delivering sub-second insights from edge to cloud at any scale.

  • GreptimeDB OSS – The open-source database for small to medium-scale observability and IoT use cases, ideal for personal projects or dev/test environments.

  • GreptimeDB Enterprise – A robust observability database with enhanced security, high availability, and enterprise-grade support.

  • GreptimeCloud – A fully managed, serverless DBaaS with elastic scaling and zero operational overhead. Built for teams that need speed, flexibility, and ease of use out of the box.

🚀 We’re open to contributors—get started with issues labeled good first issue and connect with our community.

GitHub | 🌐 Website | 📚 Docs

💬 Slack | 🐦 Twitter | 💼 LinkedIn

