ClickHouse Deployment and Usage Guide
1. Prerequisites
System Requirements:
- Linux, macOS, or FreeBSD operating system
- x86_64 CPU architecture with SSE 4.2 instruction set support
- Minimum 4GB RAM (16GB+ recommended for production)
- At least 10GB free disk space
Required Tools:
- curl or wget for installation
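- For building from source: C++17 compiler (gcc 10+ or clang 10+), CMake 3.21+, Ninja or GNU Make, Python 3. Recent ClickHouse releases build only with Clang, so check the build documentation for the version you are building.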
Network:
- Port 8123 (HTTP interface) and 9000 (native TCP protocol) should be accessible
- For distributed deployments: port 9009 (interserver communication)
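The SSE 4.2 requirement above can be checked before installing. The following is a minimal sketch for Linux, assuming the CPU flags are exposed in /proc/cpuinfo; the function name cpu_supports_sse42 is a hypothetical helper, not part of ClickHouse.

```shell
# Returns 0 if the given cpuinfo file lists the sse4_2 flag.
# The file argument is optional and defaults to /proc/cpuinfo (Linux only).
cpu_supports_sse42() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    grep -q 'sse4_2' "$cpuinfo_file" 2>/dev/null
}

if cpu_supports_sse42; then
    echo "SSE 4.2 supported"
else
    echo "SSE 4.2 not detected (or /proc/cpuinfo is unavailable)"
fi
```

On macOS and FreeBSD there is no /proc/cpuinfo, so this check reports "not detected" there and a platform-specific tool would be needed instead.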
2. Installation
Quick Install (Recommended)
For Linux, macOS, and FreeBSD systems:
curl https://clickhouse.com/ | sh
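This script will:
- Detect your operating system and CPU architecture
- Download the matching ClickHouse binary (a single file named clickhouse) into the current directory
To install the server and client system-wide and create the default configuration, run the downloaded binary's install command afterwards:
sudo ./clickhouse install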
Alternative Installation Methods
Using Package Managers:
Ubuntu/Debian:
sudo apt-get install -y apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754
echo "deb https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
RHEL/CentOS:
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
sudo yum install -y clickhouse-server clickhouse-client
macOS (Homebrew):
brew install clickhouse
3. Configuration
Basic Configuration
After installation, the main configuration files are located at:
- /etc/clickhouse-server/config.xml (server configuration)
- /etc/clickhouse-server/users.xml (user authentication and quotas)
Essential configuration settings to review:
- Data Directory (in config.xml):
<path>/var/lib/clickhouse/</path>
<tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
- Network Configuration (in config.xml):
<listen_host>0.0.0.0</listen_host> <!-- Listen on all IPv4 interfaces -->
<!-- Or use <listen_host>::</listen_host> to listen on all IPv4 and IPv6 interfaces -->
- Default User (in users.xml):
<users>
    <default>
        <password></password> <!-- Empty password by default -->
        <profile>default</profile>
        <networks>
            <ip>::/0</ip> <!-- Allow from anywhere -->
        </networks>
    </default>
</users>
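Because the default user shown above has an empty password and accepts connections from anywhere, it is usually worth replacing both. The sketch below prints a users.xml fragment that uses the password_sha256_hex element instead of a plain-text password. It assumes sha256sum from GNU coreutils is available; the helper names and the localhost-only network restriction are illustrative choices, not defaults.

```shell
# Print the SHA-256 hex digest of the password given as the first argument.
# Assumes sha256sum (GNU coreutils); on macOS, shasum -a 256 is the usual substitute.
sha256_hex() {
    printf '%s' "$1" | sha256sum | awk '{print $1}'
}

# Print a users.xml fragment for the default user with a hashed password.
# Restricting access to localhost is an illustrative choice.
print_default_user_xml() {
    hash="$(sha256_hex "$1")"
    cat <<EOF
<users>
    <default>
        <password_sha256_hex>${hash}</password_sha256_hex>
        <profile>default</profile>
        <networks>
            <ip>127.0.0.1</ip>
            <ip>::1</ip>
        </networks>
    </default>
</users>
EOF
}

print_default_user_xml 'example-password'
```

Passing a real password as a command-line argument exposes it in the process list and shell history, so for anything beyond a demo it is safer to read it from a prompt or a file.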
Environment Variables:
- CLICKHOUSE_CONFIG: Override config file location
- CLICKHOUSE_LOG_FILE: Specify log file location
- CLICKHOUSE_DATA: Override data directory
4. Build & Run
Building from Source
If you need to build ClickHouse from source (for development or custom features):
# Clone the repository
git clone --recursive https://github.com/ClickHouse/ClickHouse.git
cd ClickHouse
# Create build directory
mkdir build
cd build
# Configure with CMake
cmake .. -DCMAKE_CXX_COMPILER=$(which g++-10) -DCMAKE_C_COMPILER=$(which gcc-10)
# Build (adjust -j for your CPU cores)
cmake --build . --config Release -j $(nproc)
# Run tests (optional)
ctest . --config Release
Running ClickHouse
Start the Server:
# Systemd (Linux):
sudo systemctl start clickhouse-server
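# Manual start (as the clickhouse user, which owns the data directory; the server refuses to run as root when the data belongs to another user):
sudo -u clickhouse clickhouse server --config-file=/etc/clickhouse-server/config.xml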
# Development mode (foreground):
clickhouse server --config-file=/etc/clickhouse-server/config.xml
Connect with Client:
# Interactive mode:
clickhouse-client
# With connection parameters:
clickhouse-client --host localhost --port 9000 --user default
# Execute SQL directly:
clickhouse-client --query "SELECT version()"
Verify Installation:
# Check server status
clickhouse-client --query "SELECT version()"
# Expected output: 24.x.x.x
# Create test table and query
clickhouse-client --query "CREATE TABLE test (id Int32, name String) ENGINE = Memory"
clickhouse-client --query "INSERT INTO test VALUES (1, 'Hello'), (2, 'World')"
clickhouse-client --query "SELECT * FROM test"
5. Deployment
Single Node Deployment
For development or small workloads, a single node is sufficient:
- Install ClickHouse using the quick install method
- Configure firewall to allow ports 8123 (HTTP) and 9000 (TCP)
- Set up authentication in users.xml
- Configure data retention policies
- Enable monitoring (the Prometheus metrics endpoint, once enabled in config.xml, listens on port 9363 by default)
Cluster Deployment
For production workloads, deploy a ClickHouse cluster:
Minimum Production Setup:
- 3 ZooKeeper nodes (or ClickHouse Keeper) for coordination
- 2+ ClickHouse shards with 2+ replicas each
- Load balancer for client connections
Deployment Steps:
- Install ClickHouse on all nodes
- Configure ZooKeeper/ClickHouse Keeper
- Update config.xml with cluster configuration:
<remote_servers>
    <my_cluster>
        <shard>
            <replica>
                <host>node1</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>node2</host>
                <port>9000</port>
            </replica>
        </shard>
    </my_cluster>
</remote_servers>
- Configure replication on each node
- Set up distributed tables
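The last two steps can be sketched as a pair of table definitions: a replicated local table on every node and a Distributed table that routes queries across the cluster. The helper below only prints the DDL. It assumes the my_cluster name from the configuration above, that the {shard} and {replica} macros are defined on each node, and that ZooKeeper or ClickHouse Keeper is configured so ON CLUSTER statements can run. The table name events and its columns are hypothetical.

```shell
# Print DDL for a replicated local table plus a Distributed table over it.
# Arguments: cluster name, table name (both hypothetical examples).
print_cluster_ddl() {
    cluster="${1:-my_cluster}"
    table="${2:-events}"
    cat <<EOF
CREATE TABLE ${table}_local ON CLUSTER ${cluster}
(
    id UInt64,
    ts DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/${table}_local', '{replica}')
ORDER BY (ts, id);

CREATE TABLE ${table} ON CLUSTER ${cluster} AS ${table}_local
ENGINE = Distributed(${cluster}, currentDatabase(), ${table}_local, rand());
EOF
}

print_cluster_ddl my_cluster events

# To apply the statements on a running cluster (requires a reachable server):
# print_cluster_ddl my_cluster events | clickhouse-client --multiquery
```

Inserts then go to the Distributed table (events), which spreads rows over shards using the rand() sharding key, while replication between node1 and node2 is handled by the ReplicatedMergeTree engine of the local table.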
Cloud Deployment Options
Managed and Self-Managed Cloud Options:
- ClickHouse Cloud: Fully managed service by ClickHouse creators
- AWS: ClickHouse on EC2 or through Marketplace
- Google Cloud: ClickHouse on GCE
- Azure: ClickHouse on Azure VMs
Containerized Deployment:
# Using Docker
docker run -d --name clickhouse-server \
-p 8123:8123 -p 9000:9000 \
-v /path/to/data:/var/lib/clickhouse \
clickhouse/clickhouse-server:latest
# Using Docker Compose
version: '3'
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
    volumes:
      - ./data:/var/lib/clickhouse
Kubernetes:
- Use the ClickHouse Operator
- Helm charts available for deployment
6. Troubleshooting
Common Issues and Solutions
1. Server Won't Start
# Check logs
sudo tail -f /var/log/clickhouse-server/clickhouse-server.log
# Common causes:
# - Port already in use: sudo lsof -i :9000
# - Insufficient memory: check dmesg | grep -i kill
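# - Corrupted data or metadata: find the failing table or part in the log and back up /var/lib/clickhouse/ before changing anything. Do not delete /var/lib/clickhouse/metadata/ wholesale, since that removes every table definition.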
2. "Connection Refused" Errors
# Verify server is running
sudo systemctl status clickhouse-server
# Check network configuration
grep -A5 '<listen_host>' /etc/clickhouse-server/config.xml
# Test connectivity
curl http://localhost:8123/ping
# Should return "Ok." (with a trailing period)
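Right after a restart the server may need a few seconds before it accepts connections, so a single ping can fail even on a healthy install. A minimal polling sketch follows, assuming curl is installed and the HTTP interface is on its default port. The function names and default values are hypothetical.

```shell
# True when the response body matches what /ping returns on a healthy server.
is_ping_ok() {
    [ "$1" = "Ok." ]
}

# Poll the HTTP ping endpoint until it answers or the attempts run out.
# Arguments (all optional): URL, maximum attempts, delay in seconds.
wait_for_clickhouse() {
    url="${1:-http://localhost:8123/ping}"
    attempts="${2:-30}"
    delay="${3:-1}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Command substitution strips the trailing newline from the response.
        body="$(curl -s --max-time 2 "$url" 2>/dev/null || true)"
        if is_ping_ok "$body"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage: wait_for_clickhouse http://localhost:8123/ping 30 1 && echo "server is up"
```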
3. Performance Issues
# Check system resources
clickhouse-client --query "SELECT * FROM system.metrics"
# Monitor queries
clickhouse-client --query "SELECT * FROM system.processes"
# Check table sizes
clickhouse-client --query "SELECT database, table, formatReadableSize(sum(bytes_on_disk)) AS size FROM system.parts WHERE active GROUP BY database, table ORDER BY sum(bytes_on_disk) DESC"
4. Memory Errors
<!-- These are query-level settings: set them in a settings profile in users.xml, not in config.xml -->
<max_memory_usage>10000000000</max_memory_usage> <!-- 10 GB -->
<max_bytes_before_external_group_by>5000000000</max_bytes_before_external_group_by> <!-- 5 GB -->
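The byte values above are easy to mistype, so they can be derived from a limit in gigabytes. This sketch prints a profile fragment where the external GROUP BY threshold is half of the per-query memory limit, matching the 10 GB and 5 GB pair shown above. The helper names are hypothetical, and decimal gigabytes (10^9 bytes) are assumed.

```shell
# Convert whole decimal gigabytes to bytes (1 GB = 1000^3 bytes).
gb_to_bytes() {
    echo $(( $1 * 1000 * 1000 * 1000 ))
}

# Print a settings-profile fragment for the given per-query limit in GB.
# The GROUP BY spill threshold is set to half of that limit.
print_memory_profile() {
    max_bytes="$(gb_to_bytes "$1")"
    spill_bytes=$(( max_bytes / 2 ))
    cat <<EOF
<profiles>
    <default>
        <max_memory_usage>${max_bytes}</max_memory_usage>
        <max_bytes_before_external_group_by>${spill_bytes}</max_bytes_before_external_group_by>
    </default>
</profiles>
EOF
}

print_memory_profile 10
```

The printed fragment is meant to be merged into the existing profiles section of users.xml rather than pasted as a second one.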
5. Replication Issues
# Check replication status
clickhouse-client --query "SELECT * FROM system.replicas"
# Check ZooKeeper connection
clickhouse-client --query "SELECT * FROM system.zookeeper WHERE path='/'"
6. Getting Help
- Check official documentation
- Join Slack community
- Search GitHub issues
- Attend community events
Useful Diagnostic Commands:
# Check system settings
clickhouse-client --query "SELECT * FROM system.settings WHERE name LIKE '%memory%'"
# View server logs in real-time
sudo tail -f /var/log/clickhouse-server/clickhouse-server.log
# Check table health
clickhouse-client --query "SELECT database, table, is_leader, total_replicas, active_replicas FROM system.replicas"
# Monitor background merges
clickhouse-client --query "SELECT * FROM system.merges"