ClickHouse High Availability Setup

This guide describes how to deploy Observo Manager with ClickHouse in High Availability (HA) mode and how to migrate existing standalone deployments to HA.

New Deployment with ClickHouse HA

For new Observo deployments that require ClickHouse HA, add the following configuration to the values.yaml of your Observo Manager Helm chart:

clickhouse:
  enabled: true
  nameOverride: "clickhouse"
  fullnameOverride: "clickhouse"
  zookeeper:
    enabled: true
  shards: 1
  replicaCount: 3  # Three replicas for HA
  persistence:
    size: 5Gi
  auth:
    username: default
    password: your_password
  startdbScripts:
    create_pattern_tables.sh: |
      #!/bin/bash
      clickhouse-client --multiquery --host "127.0.0.1" -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" < /docker-entrypoint-startdb.d/create_pattern_tables.sql
    create_pattern_tables.sql: |
      CREATE DATABASE IF NOT EXISTS patterns ON CLUSTER 'default';

      CREATE TABLE IF NOT EXISTS patterns.patterns ON CLUSTER 'default'
      (
        `template` String,
        `regex` String,
        `count` UInt64,
        `tags` Map(String, String),
        `id` String,
        `start` DateTime,
        `end` DateTime,
        `modelName` String,
        `sourceId` String
      )
      ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/patterns.patterns', '{replica}')
      ORDER BY (sourceId, start)
      TTL end + INTERVAL 1 MONTH DELETE;

      CREATE TABLE IF NOT EXISTS patterns.tags ON CLUSTER 'default'
      (
        `sourceId` String,
        `tagName` String,
        `tagValue` String,
        `count` UInt64,
        `start` DateTime,
        `end` DateTime
      )
      ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/patterns.tags', '{replica}')
      ORDER BY (sourceId, tagName, start)
      TTL end + INTERVAL 1 MONTH DELETE;

Verify New HA Deployment

After deploying, verify the setup:

1. Check pod status:
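A minimal sketch of the pod check, assuming the chart is installed in a namespace called `observo` (hypothetical) and that the pod names contain `clickhouse` or `zookeeper`, which is likely given the `fullnameOverride` above but is not confirmed by this guide:

```shell
# Hypothetical namespace; override to match your installation.
NAMESPACE="${NAMESPACE:-observo}"

# List the ClickHouse and ZooKeeper pods with their READY and STATUS columns.
check_pods() {
  kubectl get pods -n "$NAMESPACE" | grep -E 'NAME|clickhouse|zookeeper'
}
```

Run `check_pods` after the install finishes. With `replicaCount: 3`, three ClickHouse pods should eventually report `Running`.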

2. Verify cluster configuration:
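One way to inspect the cluster definition is to query `system.clusters` from inside a pod. The sketch below assumes a namespace named `observo` and a pod named `clickhouse-shard0-0` (both hypothetical), and reuses the `CLICKHOUSE_ADMIN_USER` and `CLICKHOUSE_ADMIN_PASSWORD` container variables that the startup script above relies on:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed pod name

# Pipe SQL into clickhouse-client inside the pod, authenticating with the
# admin credentials from the container environment.
ch_query() {
  printf '%s\n' "$1" | kubectl exec -i "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" --multiquery'
}

# Show the members of the "default" cluster used in the ON CLUSTER clauses.
show_cluster() {
  ch_query "SELECT cluster, shard_num, replica_num, host_name
            FROM system.clusters
            WHERE cluster = 'default';"
}
```

Calling `show_cluster` should list one row per replica, so three rows for the configuration in this guide.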

If you have already deployed the Observo Manager Helm chart with standalone ClickHouse (the default), follow the steps below to migrate to HA mode.

Migrating Existing Standalone to HA

1. Backup Existing Data

Connect to the existing ClickHouse pod:
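As a sketch, an interactive `clickhouse-client` session can be opened with `kubectl exec`. The namespace `observo` and the pod name `clickhouse-shard0-0` are assumptions; check `kubectl get pods` for the actual name of your standalone pod:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed name of the standalone pod

# Open an interactive clickhouse-client session inside the existing pod,
# using the admin credentials from the container environment.
ch_shell() {
  kubectl exec -it "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD"'
}
```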

Create backup tables and copy data:
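The backup could look like the following sketch. The table names `patterns_backup` and `tags_backup` are hypothetical, as are the namespace and pod name. The backups are plain (non-replicated) `MergeTree` tables, so they stay on the pod where they are created:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed name of the standalone pod

# Pipe SQL into clickhouse-client inside the pod.
ch_query() {
  printf '%s\n' "$1" | kubectl exec -i "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" --multiquery'
}

# Create local backup tables, fill them from the live tables, and print
# the copied row counts so they can be compared after the migration.
backup_tables() {
  ch_query "
    CREATE TABLE IF NOT EXISTS patterns.patterns_backup
      ENGINE = MergeTree ORDER BY (sourceId, start)
      AS SELECT * FROM patterns.patterns;
    CREATE TABLE IF NOT EXISTS patterns.tags_backup
      ENGINE = MergeTree ORDER BY (sourceId, tagName, start)
      AS SELECT * FROM patterns.tags;
    SELECT 'patterns' AS tbl, count() FROM patterns.patterns_backup;
    SELECT 'tags' AS tbl, count() FROM patterns.tags_backup;"
}
```

Note the row counts printed by `backup_tables`; they are the reference values for the restore step.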

2. Update values.yaml

Update the values file of your Observo Manager Helm chart with the HA configuration shown in the New Deployment with ClickHouse HA section.

3. Apply the Update
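A sketch of the upgrade, assuming a release named `observo-manager`, a chart reference `observo/observo-manager`, and a namespace `observo`. All three are placeholders; substitute the values used in your original installation:

```shell
NAMESPACE="${NAMESPACE:-observo}"              # hypothetical namespace
RELEASE="${RELEASE:-observo-manager}"          # hypothetical release name
CHART="${CHART:-observo/observo-manager}"      # placeholder chart reference
VALUES_FILE="${VALUES_FILE:-values.yaml}"

# Apply the updated values, then watch the new replicas start.
# Press Ctrl-C once all pods report Running.
apply_update() {
  helm upgrade "$RELEASE" "$CHART" -n "$NAMESPACE" -f "$VALUES_FILE" &&
    kubectl get pods -n "$NAMESPACE" -w
}
```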

4. Restart Original ClickHouse Pod

After all pods are running, restart the original ClickHouse pod to force it to join the cluster:
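Deleting the pod is one way to restart it, since its controller recreates it with the new configuration. This sketch assumes the original pod is `clickhouse-shard0-0` in the `observo` namespace (both hypothetical):

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed name of the original pod

# Delete the original pod so that it is recreated, then watch it come back.
# Press Ctrl-C once it reports Running again.
restart_original_pod() {
  kubectl delete pod "$CH_POD" -n "$NAMESPACE" &&
    kubectl get pods -n "$NAMESPACE" -w
}
```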

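5. Verify Deployment

Check the pod status and the cluster configuration as described in the Verify New HA Deployment section.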

6. Restore Data and Verify Replication

Connect to any ClickHouse pod:
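A sketch for opening a session on a specific replica. It assumes the pods follow the naming pattern `clickhouse-shard0-<index>` in the `observo` namespace; both are assumptions to confirm with `kubectl get pods`:

```shell
NAMESPACE="${NAMESPACE:-observo}"   # hypothetical namespace

# Open an interactive clickhouse-client session on the replica with the
# given index (defaults to 0). The pod naming pattern is an assumption.
ch_shell() {
  pod="clickhouse-shard0-${1:-0}"
  kubectl exec -it "$pod" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD"'
}
```

For example, `ch_shell 1` would connect to the second replica.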

Check table engines across cluster:
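The engine of each table on every replica can be read from `system.tables` through the `clusterAllReplicas` table function. The sketch below assumes the same hypothetical namespace and pod name as before:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed pod name

# Pipe SQL into clickhouse-client inside the pod.
ch_query() {
  printf '%s\n' "$1" | kubectl exec -i "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" --multiquery'
}

# List every table in the patterns database on every replica with its engine.
show_engines() {
  ch_query "SELECT hostName() AS host, database, name, engine
            FROM clusterAllReplicas('default', system.tables)
            WHERE database = 'patterns'
            ORDER BY host, name;"
}
```

`patterns.patterns` and `patterns.tags` should report `ReplicatedMergeTree` on each replica before any data is restored.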

Restore data from backup:
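A possible restore, assuming the backups were saved as `patterns.patterns_backup` and `patterns.tags_backup` (hypothetical names) and that the replicated tables do not already contain these rows. Non-replicated backup tables exist only on the pod where they were created, so the sketch targets that pod, assumed here to be `clickhouse-shard0-0`:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed pod that holds the backups

# Pipe SQL into clickhouse-client inside the pod.
ch_query() {
  printf '%s\n' "$1" | kubectl exec -i "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" --multiquery'
}

# Copy the backed-up rows into the replicated tables; replication then
# distributes them to the other replicas.
restore_tables() {
  ch_query "
    INSERT INTO patterns.patterns SELECT * FROM patterns.patterns_backup;
    INSERT INTO patterns.tags SELECT * FROM patterns.tags_backup;"
}
```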

Verify data and replication:
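To check that the restored rows reached every replica, one option is to compare per-host row counts and inspect `system.replicas`. This is a sketch under the same assumptions about namespace and pod name:

```shell
NAMESPACE="${NAMESPACE:-observo}"          # hypothetical namespace
CH_POD="${CH_POD:-clickhouse-shard0-0}"    # assumed pod name

# Pipe SQL into clickhouse-client inside the pod.
ch_query() {
  printf '%s\n' "$1" | kubectl exec -i "$CH_POD" -n "$NAMESPACE" -- bash -c \
    'clickhouse-client --host 127.0.0.1 -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" --multiquery'
}

# Row counts per replica, followed by the replication status of each table.
verify_replication() {
  ch_query "
    SELECT hostName() AS host, count() AS row_count
      FROM clusterAllReplicas('default', patterns.patterns)
      GROUP BY host ORDER BY host;
    SELECT database, table, is_readonly, queue_size, absolute_delay,
           total_replicas, active_replicas
      FROM system.replicas
      WHERE database = 'patterns';"
}
```

Once replication has caught up, every host should report the same `row_count`, and `active_replicas` should equal `total_replicas`.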

Resource Requirements

Total Suggested Resources for HA Setup

For a typical 3-replica HA setup:

  • ClickHouse Total (3 replicas):
    • CPU: 6 cores (2 × 3)
    • Memory: 12GB (4GB × 3)
    • Storage: 15GB (5GB × 3)

  • ZooKeeper Total (3 nodes):
    • CPU: 1.5 cores (0.5 × 3)
    • Memory: 3GB (1GB × 3)
    • Storage: 3GB (1GB × 3)
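The totals above follow from multiplying the per-node figures by the node count. The sketch below redoes that arithmetic so it can be adapted to a different replica count. CPU is expressed in millicores because shell arithmetic is integer-only, and the combined totals are derived here rather than stated in the list:

```shell
REPLICAS=3    # ClickHouse replicas
ZK_NODES=3    # ZooKeeper nodes

# Per-node figures from the list above (CPU in millicores, sizes in GB).
ch_cpu_m=$((2000 * REPLICAS)); ch_mem_gb=$((4 * REPLICAS)); ch_disk_gb=$((5 * REPLICAS))
zk_cpu_m=$((500 * ZK_NODES));  zk_mem_gb=$((1 * ZK_NODES)); zk_disk_gb=$((1 * ZK_NODES))

# Combined requirement for the whole HA setup.
total_cpu_m=$((ch_cpu_m + zk_cpu_m))
total_mem_gb=$((ch_mem_gb + zk_mem_gb))
total_disk_gb=$((ch_disk_gb + zk_disk_gb))

echo "ClickHouse: ${ch_cpu_m}m CPU, ${ch_mem_gb}GB memory, ${ch_disk_gb}GB storage"
echo "ZooKeeper:  ${zk_cpu_m}m CPU, ${zk_mem_gb}GB memory, ${zk_disk_gb}GB storage"
echo "Total:      ${total_cpu_m}m CPU, ${total_mem_gb}GB memory, ${total_disk_gb}GB storage"
```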
