|
| 1 | +--- |
| 2 | +title: Advanced Observability |
| 3 | +description: Deploy self-hosted observability solutions with ELK stack or Prometheus & Grafana. |
| 4 | +--- |
| 5 | + |
| 6 | +Zerops Advanced Observability provides one-click, self-hosted, pre-configured deployments of the **[ELK stack](#elk-stack)** or **[Prometheus with Grafana](#prometheus-with-grafana)**. This enables comprehensive observability for your deployments while maintaining full control over your data and infrastructure. |
| 7 | + |
| 8 | +## Deployment Modes |
| 9 | + |
| 10 | +Both ELK and Prometheus can be deployed in two ways: |
| 11 | + |
| 12 | +- **Local deployment**: Services are deployed within your target project |
| 13 | +- **Global deployment**: A dedicated project is created specifically for observability |
| 14 | + |
| 15 | +## ELK Stack |
| 16 | + |
| 17 | +The ELK observability setup deploys and configures two base services: |
| 18 | + |
| 19 | +- **Elasticsearch** (as `elkstorage`) |
| 20 | +- **Kibana** (as `kibana`) - the observability access point |
| 21 | + |
| 22 | +### Logging |
| 23 | + |
| 24 | +For log collection with ELK, see the [Log Forwarding](/references/log-forwarding#self-hosted-logstash) guide, which covers the Logstash setup and configuration. |
| 25 | + |
| 26 | +### Tracing |
| 27 | + |
| 28 | +To collect traces in the ELK stack, an APM Server instance (deployed as the `apmserver` service) is required. APM Server listens for incoming traces securely over HTTPS and indexes them to Elasticsearch. |
| 29 | + |
| 30 | +First, follow the one-click GUI integration setup to deploy the required infrastructure. Then configure your application to send traces to APM Server. |
| 31 | + |
| 32 | +#### Setting Up APM Agents |
| 33 | + |
| 34 | +1. Update your code according to the [APM documentation](https://www.elastic.co/guide/en/apm/agent/index.html) for your application's language. See this [simple Go application](https://github.com/zerops-recipe-apps/go-hello-world-app/tree/with-apm-and-metrics) using APM libraries for reference. |
| 35 | + |
| 36 | +2. Add the following environment variables to your application service (these are specific to the official Go APM Agent library and may differ for other agents): |
| 37 | +``` |
| 38 | +ELASTIC_APM_ACTIVE=true |
| 39 | +ELASTIC_APM_SERVICE_NAME=recipe-go |
| 40 | +ELASTIC_APM_SERVER_URL=https://apmserver.url.copy.from.gui |
| 41 | +ELASTIC_APM_SECRET_TOKEN=secret_token_copy_from_gui |
| 42 | +``` |
| 43 | + |
| 44 | +:::note |
| 45 | +The `ELASTIC_APM_ACTIVE` variable is set to `false` by default on Zerops, so you must explicitly set it to `true` to enable APM for the official Go APM agent. |
| 46 | +::: |
| 47 | + |
| 48 | +3. Restart or reload your application service. |
| 49 | + |
| 50 | +You should start seeing traces appear in Kibana's "Applications > Traces" section. |
| 51 | + |
| 52 | +#### What Happens Behind the Scenes |
| 53 | + |
| 54 | +1. The `elkstorage`, `kibana`, and `apmserver` services are deployed and configured in the target project (if not already present) |
| 55 | +2. The `apmserver` is made publicly accessible over HTTPS via a Zerops subdomain and configured with a secret token for secure access |
| 56 | + |
| 57 | +Access information for both Kibana and APM Server can be found in the **Advanced Observability** section of your project in the GUI. |
| 58 | + |
| 59 | +### Next Steps |
| 60 | + |
| 61 | +- **Security**: Set up custom domains for Kibana and APM Server |
| 62 | +- **Optimization**: Connect to APM Server locally via HTTP for improved performance |
| 63 | +- **Customization**: Fork the [ELK recipe](https://github.com/zeropsio/recipe-elk), customize the Logstash configuration, and redeploy the `logstash` service via `zcli` |
| 64 | + |
| 65 | +## Prometheus with Grafana |
| 66 | + |
| 67 | +The full Prometheus setup consists of: |
| 68 | + |
| 69 | +- **Prometheus** (as `prometheus`) |
| 70 | +- **Grafana** (as `grafana`) - the observability access point |
| 71 | +- **PostgreSQL database** (as `grafanadb`) - for Grafana data storage |
| 72 | +- **S3 bucket** (as `prometheusbackups`) - for storing Prometheus backups |
| 73 | + |
| 74 | +Metrics can be forwarded to the target project by a Prometheus forwarder: a single `prometheuslight` service configured [with remote write](https://grafana.com/docs/agent/latest/flow/reference/components/prometheus.remote_write/#prometheusremote_write). When present in a source project, it collects metrics there and forwards them to Prometheus in the target project. |
| 75 | + |
| 76 | +Because Prometheus supports only local filesystem storage and runs as a single, non-HA instance, the `prometheus` service runs a backup cron job that stores metric data snapshots in object storage. |
| 77 | + |
| 78 | +### Metrics |
| 79 | + |
| 80 | +Once the full or forwarder Prometheus setup is deployed, Prometheus scrapes metrics automatically. Log in to Grafana using the credentials found in the **Advanced Observability** section of your project in the GUI, then navigate to `/dashboards` to start exploring your metrics. |
| 81 | + |
| 82 | +Zerops provides several metrics out of the box: |
| 83 | + |
| 84 | +- Service scaling and resource metrics |
| 85 | +- PostgreSQL database service metrics (with HAProxy balancer metrics in HA setups) |
| 86 | +- MariaDB database service metrics |
| 87 | +- Valkey database metrics |
| 88 | + |
| 89 | +:::note |
| 90 | +Some PostgreSQL database metrics are available only after enabling the `pg_stat_statements` extension. Run `CREATE EXTENSION IF NOT EXISTS pg_stat_statements;` as a superuser and restart the database service. |
| 91 | +::: |
| 92 | + |
| 93 | +You can also expose custom metrics from your applications. |
| 94 | + |
| 95 | +#### Exposing Custom Metrics |
| 96 | + |
| 97 | +1. Configure your application to expose an HTTP `/metrics` endpoint on an arbitrary port. See this [simple Go application using `promauto`](https://github.com/zerops-recipe-apps/go-hello-world-app/tree/with-apm-and-metrics) for reference. |
| 98 | + |
| 99 | + :::note |
| 100 | + Only the `/metrics` path is scraped; the path cannot currently be changed. |
| 101 | + ::: |
| 102 | + |
| 103 | +2. Add and commit the `ZEROPS_PROMETHEUS_PORT` environment variable to your application service with the port where you exposed the `/metrics` endpoint: |
| 104 | +``` |
| 105 | +Single port: |
| 106 | +9090 |
| 107 | + |
| 108 | +Multiple ports can be defined using commas: |
| 109 | +9090,9091 |
| 110 | +``` |
| 111 | + |
| 112 | +For example: `ZEROPS_PROMETHEUS_PORT=9090`. After setting the environment variable, your service's ports will be automatically added to metrics discovery and scraped by Prometheus. |
| 113 | + |
| 114 | +#### What Happens Behind the Scenes |
| 115 | + |
| 116 | +**Full setup:** |
| 117 | + |
| 118 | +1. The `prometheus`, `grafana`, `grafanadb`, and `prometheusbackups` services are deployed and configured (if not already present) |
| 119 | +2. `grafana` is made publicly accessible over HTTPS and protected by a generated password |
| 120 | + |
| 121 | +After that, metrics are scraped and available for visualization in Grafana. |
| 122 | + |
| 123 | +**Forwarder setup:** |
| 124 | + |
| 125 | +1. The `prometheuslight` forwarder service is deployed in the source project |
| 126 | +2. The `prometheus`, `grafana`, `grafanadb`, and `prometheusbackups` services are deployed and configured in the target project (if not already present) |
| 127 | +3. `grafana` is made publicly accessible over HTTPS and protected by a generated password |
| 128 | +4. `prometheus` is made publicly accessible over HTTPS via a Zerops zone domain and secured with basic access authentication (username and password) |
| 129 | +5. `prometheuslight` is configured to forward metrics to the target `prometheus` using the generated credentials over secure HTTPS |
| 130 | + |
| 131 | +### Next Steps |
| 132 | + |
| 133 | +- **Security**: Set up custom domains for Grafana and Prometheus |
| 134 | +- **Customization**: Fork the [Prometheus recipe](https://github.com/zeropsio/recipe-prometheus), customize dashboards and alerting rules, and redeploy via `zcli` |
| 135 | +- **Advanced monitoring**: Explore Grafana's alerting capabilities and create custom dashboards for your specific use cases |