ClickHouse High Availability Setup

This guide describes how to deploy Observo Manager with ClickHouse in High Availability (HA) mode and how to migrate existing standalone deployments to HA.

New Deployment with ClickHouse HA

For new Observo deployments that require ClickHouse HA, add the following configuration to the values.yaml of your Observo Manager Helm chart:

clickhouse:
  enabled: true
  nameOverride: "clickhouse"
  fullnameOverride: "clickhouse"
  zookeeper:
    enabled: true
  shards: 1
  replicaCount: 3  # Three replicas for HA
  persistence:
    size: 5Gi
  auth:
    username: default
    password: your_password
  startdbScripts:
    create_pattern_tables.sh: |
      #!/bin/bash
      clickhouse-client --multiquery --host "127.0.0.1" -u "$CLICKHOUSE_ADMIN_USER" --password "$CLICKHOUSE_ADMIN_PASSWORD" < /docker-entrypoint-startdb.d/create_pattern_tables.sql
    create_pattern_tables.sql: |
      CREATE DATABASE IF NOT EXISTS patterns ON CLUSTER 'default';

      CREATE TABLE IF NOT EXISTS patterns.patterns ON CLUSTER 'default'
      (
        `template` String,
        `regex` String,
        `count` UInt64,
        `tags` Map(String, String),
        `id` String,
        `start` DateTime,
        `end` DateTime,
        `modelName` String,
        `sourceId` String
      )
      ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/patterns.patterns', '{replica}')
      ORDER BY (sourceId, start)
      TTL end + INTERVAL 1 MONTH DELETE;

      CREATE TABLE IF NOT EXISTS patterns.tags ON CLUSTER 'default'
      (
        `sourceId` String,
        `tagName` String,
        `tagValue` String,
        `count` UInt64,
        `start` DateTime,
        `end` DateTime
      )
      ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/patterns.tags', '{replica}')
      ORDER BY (sourceId, tagName, start)
      TTL end + INTERVAL 1 MONTH DELETE;

Verify New HA Deployment

After deploying, verify the setup:

1. Check pod status:

kubectl get pods -n observo | grep clickhouse

# Expected output:
clickhouse           clickhouse-shard0-0                          1/1     Running   2 (19m ago)   20m
clickhouse           clickhouse-shard0-1                          1/1     Running   2 (20m ago)   20m
clickhouse           clickhouse-shard0-2                          1/1     Running   2 (20m ago)   20m
clickhouse           clickhouse-zookeeper-0                       1/1     Running   0             20m
clickhouse           clickhouse-zookeeper-1                       1/1     Running   0             20m
clickhouse           clickhouse-zookeeper-2                       1/1     Running   0             20m
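
The pod check above can also be scripted. The following is a minimal sketch, not part of the chart: `not_running_pods` is a hypothetical helper name, and it assumes the column layout that `kubectl get pods -n <namespace> --no-headers` prints (NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Observo): reads the output of
# `kubectl get pods --no-headers` on stdin and prints the name of every
# ClickHouse or ZooKeeper pod whose STATUS column is not "Running".
# Assumes a single-namespace listing: NAME READY STATUS RESTARTS AGE.
not_running_pods() {
  awk '$1 ~ /clickhouse/ && $3 != "Running" { print $1 }'
}

# Usage (prints nothing when every pod is Running):
#   kubectl get pods -n observo --no-headers | not_running_pods
```

An empty result means all three ClickHouse replicas and all three ZooKeeper nodes report Running.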

2. Verify cluster configuration:

Connect to a ClickHouse pod with clickhouse-client and run:

SELECT * FROM system.clusters;
SELECT * FROM system.replicas;

Migrating Existing Standalone to HA

If you have already deployed the Observo Manager Helm chart with standalone ClickHouse (the default), follow these steps to migrate to HA mode.

1. Backup Existing Data

Connect to the existing ClickHouse pod:

kubectl exec -it clickhouse-shard0-0 -n observo -- clickhouse-client -u default --password your_password

Create backup tables and copy data:

-- Backup patterns table with data
CREATE TABLE patterns.patterns_backup AS patterns.patterns;
INSERT INTO patterns.patterns_backup SELECT * FROM patterns.patterns;

-- Backup tags table with data
CREATE TABLE patterns.tags_backup AS patterns.tags;
INSERT INTO patterns.tags_backup SELECT * FROM patterns.tags;

-- Verify backup data
SELECT 
    'patterns' as table_name,
    (SELECT COUNT(*) FROM patterns.patterns) as actual_count,
    (SELECT COUNT(*) FROM patterns.patterns_backup) as backup_count
UNION ALL
SELECT 
    'tags' as table_name,
    (SELECT COUNT(*) FROM patterns.tags) as actual_count,
    (SELECT COUNT(*) FROM patterns.tags_backup) as backup_count;
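
The same comparison can be run non-interactively before continuing with the migration. This is a sketch under assumptions: `ch_query` and `backup_matches` are hypothetical helper names, and `CLICKHOUSE_PASSWORD` is an assumed environment variable that holds the password configured in values.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one query on the standalone pod and print the result.
# CLICKHOUSE_PASSWORD is an assumed environment variable, not something the
# chart sets for you.
ch_query() {
  kubectl exec clickhouse-shard0-0 -n observo -- \
    clickhouse-client -u default --password "$CLICKHOUSE_PASSWORD" --query "$1"
}

# Succeeds only when a table and its backup contain the same number of rows.
backup_matches() {
  local table="$1" actual backup
  actual="$(ch_query "SELECT count() FROM patterns.${table}")" || return 1
  backup="$(ch_query "SELECT count() FROM patterns.${table}_backup")" || return 1
  echo "${table}: actual=${actual} backup=${backup}"
  [ "$actual" = "$backup" ]
}

# Usage:
#   backup_matches patterns && backup_matches tags && echo "backups verified"
```

Stopping on a mismatch keeps an incomplete backup from being discovered only after the upgrade.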

2. Update values.yaml

Update the values file for your Observo Manager Helm chart with the HA configuration shown in the New Deployment with ClickHouse HA section.

3. Apply the Update

helm upgrade -i -n observo observo-manager observo-manager-VERSION.tgz --values values.yaml

4. Restart Original ClickHouse Pod

After all pods are running, restart the original ClickHouse pod to force it to join the cluster:

kubectl delete pod clickhouse-shard0-0 -n observo
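
To avoid moving on before the recreated pod is ready, the restart can be combined with a readiness wait. This is a rough sketch: `restart_and_wait` is a hypothetical helper name, `RETRY_DELAY` is an assumed optional environment variable, and the retry loop is there because the replacement pod may not exist yet when the first wait runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete a pod, then wait until its replacement is Ready.
# Defaults match the pod and namespace used in this guide.
restart_and_wait() {
  local pod="${1:-clickhouse-shard0-0}" ns="${2:-observo}" attempt
  kubectl delete pod "$pod" -n "$ns" || return 1
  # The controller recreates the pod under the same name. Retry a few times,
  # since `kubectl wait` fails immediately if the new pod does not exist yet.
  for attempt in 1 2 3 4 5 6; do
    if kubectl wait --for=condition=Ready "pod/$pod" -n "$ns" --timeout=60s; then
      return 0
    fi
    sleep "${RETRY_DELAY:-5}"   # RETRY_DELAY is an assumed optional override
  done
  return 1
}

# Usage:
#   restart_and_wait clickhouse-shard0-0 observo
```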

5. Verify Deployment

# Check pod status
kubectl get pods -n observo | grep clickhouse

# Expected output:
clickhouse           clickhouse-shard0-0                          1/1     Running   2 (19m ago)   20m
clickhouse           clickhouse-shard0-1                          1/1     Running   2 (20m ago)   20m
clickhouse           clickhouse-shard0-2                          1/1     Running   2 (20m ago)   20m
clickhouse           clickhouse-zookeeper-0                       1/1     Running   0             20m
clickhouse           clickhouse-zookeeper-1                       1/1     Running   0             20m
clickhouse           clickhouse-zookeeper-2                       1/1     Running   0             20m

6. Restore Data and Verify Replication

Connect to any ClickHouse pod:

kubectl exec -it clickhouse-shard0-1 -n observo -- clickhouse-client -u default --password your_password

Check table engines across the cluster:

SELECT 
    hostName(),
    database,
    name,
    engine
FROM clusterAllReplicas('default', system.tables)
WHERE database = 'patterns'
ORDER BY hostName();

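The engine check is easier to act on when only the problem rows are printed. The sketch below assumes the tab-separated output that clickhouse-client produces in non-interactive mode; `non_replicated_tables` is a hypothetical helper name, and skipping tables that end in `_backup` is an assumption based on the backup step in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-separated rows (host, database, table, engine)
# on stdin and prints host, table, and engine for every table that is not on
# a Replicated* engine. Tables ending in _backup are skipped because the
# backups created earlier are plain local copies.
non_replicated_tables() {
  awk -F'\t' '$3 !~ /_backup$/ && $4 !~ /^Replicated/ { print $1 "\t" $3 "\t" $4 }'
}

# Usage (prints nothing when every table is replicated):
#   kubectl exec clickhouse-shard0-1 -n observo -- \
#     clickhouse-client -u default --password your_password \
#     --query "SELECT hostName(), database, name, engine
#              FROM clusterAllReplicas('default', system.tables)
#              WHERE database = 'patterns'" | non_replicated_tables
```
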
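Restore data from backup. The backup tables were created without ON CLUSTER, so they exist only on the original pod (clickhouse-shard0-0). If you are connected to a different pod, reconnect to clickhouse-shard0-0 before running: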

INSERT INTO patterns.patterns SELECT * FROM patterns.patterns_backup;
INSERT INTO patterns.tags SELECT * FROM patterns.tags_backup;

Verify data and replication:

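-- Check row counts on each replica.
-- Scalar subqueries may be evaluated once on the node that receives the query,
-- so count rows through clusterAllReplicas to get a value per host.
SELECT hostName() AS host, count() AS patterns_count
FROM clusterAllReplicas('default', patterns.patterns)
GROUP BY host
ORDER BY host;

SELECT hostName() AS host, count() AS tags_count
FROM clusterAllReplicas('default', patterns.tags)
GROUP BY host
ORDER BY host;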

-- Check replication status
SELECT * FROM system.replicas;

-- Check cluster status
SELECT * FROM system.clusters;
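
`SELECT * FROM system.replicas` returns many columns, so a narrower check can help. The sketch below assumes the `is_readonly`, `total_replicas`, and `active_replicas` columns of `system.replicas` and tab-separated output; `unhealthy_replicas` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-separated rows
# (database, table, is_readonly, total_replicas, active_replicas) on stdin
# and prints every replicated table that is read-only or has fewer active
# replicas than total replicas.
unhealthy_replicas() {
  awk -F'\t' '$3 == 1 || $5 < $4 { print $1 "." $2 " readonly=" $3 " active=" $5 "/" $4 }'
}

# Usage (prints nothing when replication looks healthy):
#   kubectl exec clickhouse-shard0-1 -n observo -- \
#     clickhouse-client -u default --password your_password \
#     --query "SELECT database, table, is_readonly, total_replicas, active_replicas
#              FROM system.replicas WHERE database = 'patterns'" | unhealthy_replicas
```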

Resource Requirements

Total Suggested Resources for HA Setup

For a typical 3-replica HA setup:

  • ClickHouse Total (3 replicas):
    • CPU: 6 cores (2 × 3)
    • Memory: 12GB (4GB × 3)
    • Storage: 15GB (5GB × 3)

  • ZooKeeper Total (3 nodes):
    • CPU: 1.5 cores (0.5 × 3)
    • Memory: 3GB (1GB × 3)
    • Storage: 3GB (1GB × 3)

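The totals above are the per-node figures multiplied by the node count, so they can be recomputed for a different replica count. This is a small sketch: `total` is a hypothetical helper, and CPU is expressed in millicores so that 0.5 cores stays an integer in shell arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: multiply a per-node request by the node count.
# CPU values are in millicores (2 cores = 2000m, 0.5 cores = 500m).
total() {
  echo $(( $1 * $2 ))
}

nodes=3   # change this to size a different replica count
echo "ClickHouse: cpu=$(total 2000 "$nodes")m memory=$(total 4 "$nodes")GB storage=$(total 5 "$nodes")GB"
echo "ZooKeeper:  cpu=$(total 500 "$nodes")m memory=$(total 1 "$nodes")GB storage=$(total 1 "$nodes")GB"
```
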