Solutions: Refurbish "Long-term store"

amotl · amotl · commit 16d03d55b68a · 2025-10-24T20:49:42.000+02:00
diff --git a/docs/integrate/airflow/data-retention-hot-cold.md b/docs/integrate/airflow/data-retention-hot-cold.md
@@ -1,5 +1,5 @@
 (airflow-data-retention-hot-cold)=
-# Build a hot and cold storage data retention policy in CrateDB with Apache Airflow
+# Build a hot/cold storage data retention policy in CrateDB with Apache Airflow
 
 This fourth article on automating recurring CrateDB queries with [Apache Airflow](https://airflow.apache.org/) presents a second data‑retention strategy. Previously, the {ref}`Data Retention Delete DAG <airflow-data-retention-policy>` dropped old partitions after a set period. This article adds a complementary hot/cold storage approach.
 
diff --git a/docs/solution/index.md b/docs/solution/index.md
@@ -7,6 +7,7 @@
 :hidden:
 time-series/index
 industrial/index
+longterm/index
 analytics/index
 machine-learning/index
 :::
@@ -15,7 +16,7 @@ machine-learning/index
 ## Explanations
 
 :::{div} sd-text-muted
-About time series data storage and analytics, and machine learning.
+About time series and long-term data storage, real-time analytics, and machine learning.
 :::
 
 ::::{grid} 1 2 2 2
@@ -36,6 +37,20 @@ and how to apply time series modeling and analysis procedures to your data.
 - Scientific computing
 :::
 
+:::{grid-item-card} {material-outlined}`manage_history;2em` Long-term store
+:link: longterm-store
+:link-type: ref
+:link-alt: About storing time series data for the long term
+Keeping your raw data permanently accessible for querying opens up
+analysis opportunities that other systems can't easily provide.
++++
+**What's inside:**
+- Time-based bucketing.
+- Advanced querying.
+- Import data using Dask.
+- Optimizing storage for historic time series data.
+:::
+
 :::{grid-item-card} {material-outlined}`model_training;2em` Machine learning
 :link: machine-learning
 :link-type: ref
diff --git a/docs/solution/longterm/index.md b/docs/solution/longterm/index.md
@@ -0,0 +1,103 @@
+(longterm-store)=
+(timeseries-longterm)=
+(timeseries-long-term-storage)=
+
+# Long-term store
+
+:::{div} sd-text-muted
+Never retire data just because your other systems can't handle the cardinality.
+:::
+
+CrateDB stores large volumes of data and keeps it accessible for querying
+and analysis, including historical records.
+
+Many organizations need to retain data for years or decades to meet regulatory
+requirements, support historical analysis, or preserve valuable insights for
+future use. However, traditional storage systems force you to choose between
+accessibility and affordability, often leading to data exports, archival
+systems, or downsampling that sacrifice query capabilities.
+
+CrateDB eliminates this trade-off by storing large volumes of data efficiently
+while keeping it fully accessible for querying and analysis. Unlike systems
+that struggle with high cardinality or require expensive tiered architectures,
+CrateDB handles billions of unique records in a single platform, maintaining
+fast query performance even on historic datasets spanning years.
+
+By keeping all your data in one place, you avoid the complexity and costs of
+exporting to specialized long-term storage systems, data lakes, or cold storage
+tiers. Your historical data remains as queryable as your recent data, enabling
+seamless analysis across any time range without data movement, ETL pipelines,
+or rehydration processes.
+
+With CrateDB, which is compatible with PostgreSQL, you can do all of that
+using plain SQL. Besides integrating well with commodity systems through
+standard database access interfaces like ODBC or JDBC, it also provides its
+own HTTP interface.
+
+## Use cases
+
+:::{rubric} Metrics and monitoring
+:::
+
+::::{grid} 1 1 1 2
+:gutter: 2
+:padding: 0
+
+:::{grid-item-card} Prometheus
+:link: prometheus
+:link-type: ref
+Prometheus and similar monitoring systems excel at real-time alerting but face challenges
+with long-term metric retention due to storage costs and query performance at scale. CrateDB
+addresses these challenges by providing:
+- **Scalable long-term storage**: Store years of metrics without compromising query performance.
+- **High cardinality support**: Handle millions of unique label combinations that would overwhelm traditional TSDBs.
+- **Rich SQL analytics**: Perform complex analytical queries on historic metrics using standard SQL.
+- **Seamless integration**: Use CrateDB's Prometheus Adapter for transparent remote write/read operations.
++++
+Set up CrateDB as a long-term metrics store for Prometheus.
+:::
+
+:::{grid-item-card} OpenTelemetry
+:link: opentelemetry
+:link-type: ref
+OpenTelemetry and similar observability frameworks excel at generating rich telemetry data
+but face challenges with long-term retention due to storage scale and query complexity.
+CrateDB addresses these challenges by providing:
+- **Scalable long-term storage**: Store large volumes of telemetry through CrateDB's distributed architecture.
+- **Vendor-neutral ingestion**: Use OpenTelemetry SDKs/agents and Telegraf to send telemetry into your CrateDB observability pipeline.
+- **Rich SQL analytics**: Run SQL/time-series queries, aggregations and joins on telemetry data for troubleshooting and analytics.
+- **Flexible attribute mapping**: Customize which span/log/profile attributes become columns/tags for dimensional queries.
++++
+Set up CrateDB as a long-term observability backend for OpenTelemetry.
+:::
+
+::::
+
+## Related sections
+
+{ref}`metrics-store` explains how to
+store and analyze high volumes of system monitoring data,
+such as metrics and logs, with CrateDB.
+
+{ref}`analytics` describes how
+CrateDB provides real-time analytics on raw data stored for the long term.
+Keep massive amounts of data ready in the hot zone for analytics purposes.
+
+[Optimizing storage efficiency for historic time series data]
+illustrates how to reduce table storage size by 80%
+by using arrays for time-based bucketing, a historical table with
+a dedicated layout, and the UNNEST table function for querying.
+
+{ref}`Build a hot/cold storage data retention policy <airflow-data-retention-hot-cold>`
+describes how to manage aging data by leveraging CrateDB cluster
+features to mix nodes with different hardware setups, i.e. hot
+nodes using the latest generation of NVMe drives for responding
+to analytics queries quickly, and cold nodes that have access to
+cheap mass storage for retaining historic data.
+
+{ref}`weather-data-storage` provides information about how to
+use CrateDB for mass storage of synoptic weather observations,
+allowing you to query them efficiently.
+
+
+[Optimizing storage efficiency for historic time series data]: https://community.cratedb.com/t/optimizing-storage-for-historic-time-series-data/762
diff --git a/docs/solution/time-series/index.md b/docs/solution/time-series/index.md
@@ -69,21 +69,6 @@ Machine Learning on Time Series Data: EDA, Decomposition, AutoML.
 :::
 
 
-:::{grid-item-card} {material-outlined}`manage_history;2em` Long-term storage
-:link: timeseries-longterm
-:link-type: ref
-:link-alt: About storing time series data for the long term
-
-Run efficient data operations for current and historical time series data.
-
-+++
-**What's inside:**
-Time-based bucketing.
-Import data using Dask.
-Optimizing storage for historic time series data.
-:::
-
-
 ::::
 
 
@@ -92,6 +77,7 @@ Optimizing storage for historic time series data.
 **Domains:**
 {ref}`analytics` •
 {ref}`industrial` •
+{ref}`longterm-store` •
 {ref}`machine-learning` •
 {ref}`metrics-store`
 
@@ -114,7 +100,6 @@ Optimizing storage for historic time series data.
 Fundamentals <fundamentals>
 Advanced analysis <analysis>
 video
-Long-term store <longterm>
 :::
 
 
diff --git a/docs/solution/time-series/longterm.md b/docs/solution/time-series/longterm.md
diff --git a/docs/start/application/index.md b/docs/start/application/index.md
@@ -1,5 +1,5 @@
 (example-applications)=
-# Sample Applications
+# Sample applications
 
 
 :::{rubric} Starter
@@ -87,3 +87,34 @@ Users can ask questions of the knowledge base using natural language.
 :::
 
 ::::
+
+
+:::{rubric} Community
+:::
+
+:::::{grid} 1 2 2 3
+:gutter: 2
+
+::::{grid-item-card}
+:link: https://wetterdienst.readthedocs.io/en/latest/usage/python-api.html#export
+:link-type: url
+(weather-data-storage)=
+:::{rubric} Store and analyze massive amounts of synoptic weather data
+:::
+Wetterdienst uses CrateDB for mass storage of weather data, allowing you to
+query it efficiently. It provides access to more than ten canonical
+sources of raw weather data from domestic weather agencies.
++++
+**What's inside:**
+
+{tags-primary}`Earth observations`
+{tags-primary}`Metadata`
+{tags-primary}`Sensor data`
+{tags-primary}`Time series`
+
+{tags-secondary}`pandas`
+{tags-secondary}`Polars`
+{tags-secondary}`SQL`
+::::
+
+:::::