Apache Superset Deployment & Usage Guide

Prerequisites

Runtime Requirements:

  • Python: 3.9, 3.10, or 3.11 (3.12 not yet supported)
  • Node.js: 16.x or 18.x LTS (for frontend builds)
  • npm: 8.x+ or Yarn 1.22.x
  • Database: PostgreSQL 12+, MySQL 5.7+, or SQLite (dev only)
  • Redis: 5.0+ (required for caching, Celery, and async queries)
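
The Python version constraint above can be checked before installing anything. The following is a minimal sketch; the helper name python_supported is hypothetical and not part of Superset.

```python
import sys

# Minor versions of Python 3 listed as supported in this guide (3.12 is excluded).
SUPPORTED_MINORS = {9, 10, 11}


def python_supported(version=sys.version_info) -> bool:
    """Return True if `version` (a (major, minor, ...) tuple) is 3.9, 3.10, or 3.11."""
    return version[0] == 3 and version[1] in SUPPORTED_MINORS


if __name__ == "__main__":
    major, minor = sys.version_info[0], sys.version_info[1]
    status = "supported" if python_supported() else "NOT supported"
    print(f"Python {major}.{minor} is {status} by this guide")
```

Running the script with the interpreter you plan to use for the virtual environment reports whether it falls in the supported range.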

System Dependencies:

  • macOS/Linux: gcc, libffi-dev, libpq-dev (or postgresql-devel), libsasl2-dev
  • Windows: WSL2 recommended; Visual C++ 14.0+ build tools if native

Accounts & Services:

  • Database credentials with CREATE privileges
  • (Optional) S3/GCS/Azure blob storage for chart exports
  • (Optional) OAuth/OpenID Connect provider credentials for SSO

Installation

1. Clone the Repository

git clone https://github.com/apache/superset.git
cd superset
git checkout 3.1.0  # Or latest stable tag

2. Backend Setup (Python)

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install Superset in editable mode with the PostgreSQL, Redis, and thumbnail extras
pip install -e ".[postgres,redis,thumbnails]"  # Add other extras as needed

# Initialize database
superset db upgrade

# Create admin user
superset fab create-admin

# Load example data (optional)
superset load_examples

# Initialize roles/permissions
superset init

3. Frontend Setup (TypeScript/React)

cd superset-frontend

# Install dependencies (npm or yarn)
npm ci  # Recommended for reproducible builds
# OR
yarn install --frozen-lockfile

# Return to root
cd ..

Configuration

1. Create Configuration File

Create superset_config.py in a directory on your PYTHONPATH (e.g., the repository root):

import os

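# Security
# The fallback string is a placeholder for local development only;
# always set SUPERSET_SECRET_KEY in production.
SECRET_KEY = os.environ.get('SUPERSET_SECRET_KEY', 'your-secure-random-key-here')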
WTF_CSRF_ENABLED = True

# Database
SQLALCHEMY_DATABASE_URI = os.environ.get(
    'DATABASE_URL', 
    'postgresql://user:password@localhost:5432/superset'
)

# Redis (used here for caching; the same instance can also back Celery)
REDIS_HOST = os.environ.get('REDIS_HOST', 'localhost')
REDIS_PORT = int(os.environ.get('REDIS_PORT', 6379))
CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 300,
    'CACHE_KEY_PREFIX': 'superset_',
    'CACHE_REDIS_HOST': REDIS_HOST,
    'CACHE_REDIS_PORT': REDIS_PORT,
}

# Feature Flags (referenced in source: FeatureFlag, isFeatureEnabled)
FEATURE_FLAGS = {
    'ENABLE_TEMPLATE_PROCESSING': True,
    'DASHBOARD_CROSS_FILTERS': True,  # Enables cross-filtering (seen in dashboardState.ts)
    'DASHBOARD_NATIVE_FILTERS': True,
    'THUMBNAILS': False,
}

# Theme Configuration (optional, references ThemeController.ts)
APP_THEME = 'LIGHT'  # or 'DARK' - controls default ThemeMode

# Superset specific settings
SUPERSET_WEBSERVER_PORT = 8088
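
The configuration above points the cache at Redis but does not configure Celery itself. A minimal sketch of the Celery section of superset_config.py follows, assuming the same REDIS_HOST and REDIS_PORT variables. The Redis database numbers (0 and 1) and the tuning values are assumptions to adjust for your deployment.

```python
import os

REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6379))


class CeleryConfig:
    # Broker and result backend on separate Redis databases
    # (the numbers 0 and 1 are arbitrary choices, not requirements).
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/1"
    # Task modules the workers should import at startup.
    imports = ("superset.sql_lab", "superset.tasks.scheduler")
    # Conservative defaults; tune for your workload.
    worker_prefetch_multiplier = 1
    task_acks_late = False


# Superset reads the Celery settings from this name.
CELERY_CONFIG = CeleryConfig
```

Keeping the broker and the result backend in different Redis databases makes it easier to flush one without touching the other.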

2. Environment Variables

export FLASK_APP=superset
export PYTHONPATH=/path/to/superset_config/directory:$PYTHONPATH
export SUPERSET_SECRET_KEY=$(openssl rand -base64 42)
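
Before starting the server it can help to confirm that the variables exported above are actually visible to Python. This is a small sketch; the function name missing_env and the REQUIRED_VARS tuple are hypothetical helpers, not part of Superset.

```python
import os
from typing import List, Mapping, Sequence

# Variables exported in the step above.
REQUIRED_VARS = ("FLASK_APP", "PYTHONPATH", "SUPERSET_SECRET_KEY")


def missing_env(
    required: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set")
```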

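3. Database Connection Strings

Data sources are normally added through the UI as SQLAlchemy URIs. The database that receives the example datasets can be set in superset_config.py:

# Example: load the example datasets into Athena
SQLALCHEMY_EXAMPLES_URI = 'awsathena+rest://...'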
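
SQLAlchemy URIs break when the user name or password contains characters such as @ or /, so those parts need to be percent-encoded. The sketch below builds a PostgreSQL URI with the standard library; build_postgres_uri is a hypothetical helper and the credentials are made up for illustration.

```python
from urllib.parse import quote_plus


def build_postgres_uri(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a PostgreSQL SQLAlchemy URI with percent-encoded credentials."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Illustrative credentials only.
    print(build_postgres_uri("superset", "p@ss/word", "localhost", 5432, "superset"))
    # prints postgresql://superset:p%40ss%2Fword@localhost:5432/superset
```

The resulting string can be used as the value of SQLALCHEMY_DATABASE_URI or pasted into the database connection form in the UI.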

Build & Run

Development Mode (Full Stack)

Terminal 1 - Backend:

source venv/bin/activate
flask run -p 8088 --with-threads --reload --debugger
# OR
superset run -p 8088 --with-threads --reload --debugger

Terminal 2 - Frontend (Hot Reload):

cd superset-frontend
npm run dev-server

Access the app at http://localhost:9000. The webpack dev server proxies backend requests to Flask on port 8088.
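
If the page at port 9000 loads but shows errors, a quick check is whether the Flask backend on port 8088 is answering at all. The sketch below polls Superset's /health endpoint using only the standard library; backend_is_up is a hypothetical helper name, and the default URL assumes the ports used in this guide.

```python
import urllib.request


def backend_is_up(base_url: str = "http://localhost:8088", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        # (urllib.error.URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    print("backend reachable" if backend_is_up() else "backend NOT reachable")
```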

Production Build

  1. Build Frontend Assets:
cd superset-frontend
npm run build  # Outputs to superset/static/assets
  2. Production Server:
# Using Gunicorn (recommended)
gunicorn -w 10 -k gevent --timeout 120 -b 0.0.0.0:8088 "superset.app:create_app()"

# Or using the Superset CLI (this starts the Flask development server;
# use it for smoke tests only, not for production traffic)
superset run -h 0.0.0.0 -p 8088
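
The Gunicorn flags shown above can also live in a Python config file so the command line stays short. This is a sketch mirroring those flags; the file name gunicorn.conf.py and the GUNICORN_WORKERS environment variable are assumptions, not Superset conventions.

```python
# gunicorn.conf.py (assumed file name)
import os

# Listen address and port, matching -b 0.0.0.0:8088.
bind = "0.0.0.0:8088"

# Number of worker processes, matching -w 10 unless overridden.
# GUNICORN_WORKERS is a hypothetical variable name chosen for this sketch.
workers = int(os.environ.get("GUNICORN_WORKERS", 10))

# Async worker type, matching -k gevent (requires the gevent package).
worker_class = "gevent"

# Seconds before an unresponsive worker is restarted, matching --timeout 120.
timeout = 120
```

With this file in place the server could be started as: gunicorn -c gunicorn.conf.py "superset.app:create_app()"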

Celery Workers (for async queries, thumbnails, alerts):

celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4
celery --app=superset.tasks.celery_app:app beat  # For scheduled jobs

Deployment

Docker (Recommended for Quick Deploy)

# Clone and run official compose
git clone https://github.com/apache/superset.git
cd superset
docker-compose -f docker-compose-non-dev.yml up -d

Kubernetes (Production Scale)

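# Using the official Helm chart
helm repo add superset https://apache.github.io/superset
# configOverrides values are Python snippets appended to superset_config.py,
# so the override must be a full assignment; replace the placeholder with your own key
helm install superset superset/superset \
  --set-string "configOverrides.secret=SECRET_KEY = '<your-random-key>'" \
  --set postgresql.enabled=true \
  --set redis.enabled=true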

Cloud Platforms:

  • AWS: Deploy via ECS Fargate or EKS; use RDS PostgreSQL and ElastiCache Redis
  • GCP: Cloud Run (stateless) + Cloud SQL; or GKE with persistent volumes
  • Azure: Container Instances or AKS with Azure Database for PostgreSQL
  • Heroku: Use apache-superset buildpack (note: ephemeral filesystem limits caching)

Security Checklist for Production:

  • Change default SECRET_KEY (minimum 32 bytes)
  • Enable HTTPS/TLS termination
  • Configure authentication (OAuth, LDAP, or SAML) via Flask-AppBuilder
  • Set SESSION_COOKIE_SECURE = True and SESSION_COOKIE_HTTPONLY = True
  • Restrict database connections (read-only service accounts recommended)
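
Parts of the checklist above can be enforced in superset_config.py itself. The sketch below sets the two cookie flags and refuses to start with a short key. The helpers secret_key_is_strong and load_secret_key are hypothetical, and the 32-byte minimum is taken from the checklist.

```python
import os
from typing import Mapping

# Cookie flags from the checklist: send only over HTTPS, hide from JavaScript.
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_HTTPONLY = True


def secret_key_is_strong(key: str, min_bytes: int = 32) -> bool:
    """Return True if `key` is at least `min_bytes` long when UTF-8 encoded."""
    return len(key.encode("utf-8")) >= min_bytes


def load_secret_key(env: Mapping[str, str] = os.environ) -> str:
    """Read SUPERSET_SECRET_KEY from `env`, raising if it is missing or too short."""
    key = env.get("SUPERSET_SECRET_KEY", "")
    if not secret_key_is_strong(key):
        raise RuntimeError("SUPERSET_SECRET_KEY must be set to at least 32 bytes")
    return key


# In superset_config.py this would be used as:
# SECRET_KEY = load_secret_key()
```

Failing fast at startup is a deliberate choice here: a missing key then shows up as a clear error instead of a silently insecure default.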

Troubleshooting

Build Issues:

Frontend build fails with "JavaScript heap out of memory"

export NODE_OPTIONS="--max-old-space-size=8192"
npm run build

Module not found errors in superset-frontend

# Remove installed modules, clear the npm cache, and reinstall from the lockfile
# (keep package-lock.json: npm ci fails without it)
rm -rf node_modules
npm cache clean --force
npm ci

Runtime Issues:

Database migration errors

# Reset (WARNING: destroys data)
superset db downgrade base
superset db upgrade

# Or stamp to specific version
superset db stamp <revision>

"No module named superset"

# Ensure you're in the virtual environment and installed in editable mode
pip install -e .
export PYTHONPATH=$(pwd):$PYTHONPATH

Chart rendering errors (NVD3/Pivot Table plugins)

  • Check browser console for missing CSS (source files reference nv.d3.css)
  • Verify supersetThemeObject is loaded (ThemeController.ts dependency)
  • Clear localStorage if theme-related crashes occur (localStorage.clear())

Cross-filtering not working

  • Verify DASHBOARD_CROSS_FILTERS feature flag is enabled in superset_config.py
  • Check browser console for getCrossFiltersConfiguration errors (referenced in dashboardState.ts)

Celery tasks stuck in "Pending"

  • Verify Redis connectivity: redis-cli ping
  • Check worker logs: celery -A superset.tasks.celery_app:app worker -l info
  • Ensure the CELERY_CONFIG class in superset_config.py defines both broker_url and result_backend
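
When redis-cli is not installed on the host running the workers, the same connectivity check can be done from Python. This sketch sends an inline PING over a plain socket and expects Redis to answer +PONG. redis_ping is a hypothetical helper, and it assumes an instance without authentication or TLS.

```python
import socket


def redis_ping(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Return True if a Redis server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Redis accepts inline commands terminated by CRLF.
            sock.sendall(b"PING\r\n")
            return sock.recv(64).startswith(b"+PONG")
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


if __name__ == "__main__":
    print("Redis reachable" if redis_ping() else "Redis NOT reachable")
```

A server that requires a password replies with an error instead of +PONG, so a False result there means "check credentials", not necessarily "Redis is down".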

Performance Tuning:

  • Enable caching: Set CACHE_CONFIG with Redis
  • Configure async queries for large datasets: enable the GLOBAL_ASYNC_QUERIES feature flag ('GLOBAL_ASYNC_QUERIES': True in FEATURE_FLAGS)
  • Increase SUPERSET_WEBSERVER_TIMEOUT for long-running queries
  • Use a production WSGI server (Gunicorn/gevent) instead of Flask dev server
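
The tuning options above could look like this in superset_config.py. This is a sketch under assumptions: the timeout values are illustrative, the Redis database number is arbitrary, and the JWT secret is read from an environment variable whose name is chosen here for convenience. Async queries also need running Celery workers.

```python
import os

REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6379))

# Merge this flag into your existing FEATURE_FLAGS dict instead of replacing it.
FEATURE_FLAGS = {
    "GLOBAL_ASYNC_QUERIES": True,
}

# Async queries sign their channel tokens with this secret (use at least 32 bytes).
# Reading it from the environment is an assumption of this sketch.
GLOBAL_ASYNC_QUERIES_JWT_SECRET = os.environ.get("GLOBAL_ASYNC_QUERIES_JWT_SECRET", "")

# Seconds the web server waits on a request; keep it at or below the Gunicorn --timeout.
SUPERSET_WEBSERVER_TIMEOUT = 120

# Cache for chart query results, separate from the metadata cache in CACHE_CONFIG.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 3600,  # illustrative: one hour
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_HOST": REDIS_HOST,
    "CACHE_REDIS_PORT": REDIS_PORT,
    "CACHE_REDIS_DB": 2,  # arbitrary database number
}
```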