Theano Deployment and Usage Guide

Note: Theano is no longer actively developed and receives maintenance fixes only. New projects should use PyTensor. This guide covers legacy Theano for existing codebases.

1. Prerequisites

System Requirements

Python: 2.7 or 3.4–3.9 (Python 3.10+ not officially supported)
Operating System: Linux, macOS, or Windows (with limitations)
Memory: Minimum 4GB RAM; 8GB+ recommended for large graph compilations

Required Dependencies

NumPy: >=1.9.1
SciPy: >=0.14 (optional but recommended for sparse matrix support)
C/C++ Compiler:
- Linux: gcc (4.2+) or clang
- macOS: Xcode Command Line Tools
- Windows: MinGW-w64 or Microsoft Visual C++ Build Tools
BLAS Implementation: OpenBLAS, ATLAS, MKL, or Accelerate (macOS)

Optional Dependencies

CUDA Toolkit: 7.5–11.x (for GPU support)
cuDNN: Compatible version with your CUDA (for optimized GPU operations)
libgpuarray: For CUDA-based GPU operations (pygpu)
pydot/graphviz: For graph visualization (theano.printing.pydotprint)
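
The prerequisites above can be checked from the interpreter you plan to use. The following is a minimal sketch using only the standard library; the function name check_prerequisites and the list of compiler executables are illustrative choices, not part of Theano.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(compilers=("g++", "gcc", "clang++", "clang")):
    """Report which prerequisites from this guide are visible to this interpreter."""
    found = [shutil.which(name) for name in compilers]
    return {
        # Interpreter version as a (major, minor) tuple
        "python": tuple(sys.version_info[:2]),
        # True if the package can be imported from the current environment
        "numpy": importlib.util.find_spec("numpy") is not None,
        "scipy": importlib.util.find_spec("scipy") is not None,
        # Full path of the first compiler found on PATH, or None
        "compiler": next((path for path in found if path), None),
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

A compiler value of None means that none of the listed executables is on PATH, which usually explains later compilation failures.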

2. Installation

Method A: pip (Recommended for Users)

# Basic installation
pip install theano

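# With the optional graph-visualization dependency
# (Theano 1.0.5 does not appear to define an "all" extra, so name optional packages explicitly)
pip install theano pydot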

# Specific version (last release was 1.0.5)
pip install theano==1.0.5

Method B: conda (Recommended for GPU support)

conda install -c conda-forge theano
# For GPU support
conda install -c conda-forge pygpu

Method C: Build from Source (For Development)

git clone https://github.com/Theano/Theano.git
cd Theano
pip install -e .

Verify Installation

import theano
print(theano.__version__)
# Test basic functionality
from theano import tensor as T
a = T.scalar('a')
b = T.scalar('b')
c = a + b
f = theano.function([a, b], c)
print(f(1.5, 2.5))  # Should output 4.0

3. Configuration

Configuration File (`~/.theanorc`)

Create this file in your home directory:

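[global]
device = cpu
floatX = float32
force_device = True
# Location of the compiled C module cache (a top-level option, so it belongs in [global])
base_compiledir = /tmp/.theano

[gcc]
cxxflags = -march=native

[blas]
ldflags = -lopenblas

[nvcc]
fastmath = True
# Adjust for your GPU architecture. Keep comments on their own line:
# an inline "#" comment would be read as part of the value.
flags = -arch=sm_52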
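
If you would rather generate the file than edit it by hand, Python's configparser can write the same INI layout. This is a sketch under two assumptions: write_theanorc is a hypothetical helper, and base_compiledir is treated as a top-level option that belongs in [global]. The demo writes to a scratch path so that it does not overwrite a real ~/.theanorc.

```python
import configparser
import os
import tempfile


def make_parser():
    """Create a parser that preserves option case (e.g. floatX)."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # the default would lowercase option names
    return parser


def write_theanorc(path, device="cpu", floatx="float32", compiledir=None):
    """Write a minimal Theano config file to `path`."""
    parser = make_parser()
    parser["global"] = {"device": device, "floatX": floatx}
    if compiledir is not None:
        # Assumption: base_compiledir is a top-level option, so it sits in [global]
        parser["global"]["base_compiledir"] = compiledir
    parser["blas"] = {"ldflags": "-lopenblas"}
    with open(path, "w") as handle:
        parser.write(handle)


# Write to a scratch location, then read the file back to confirm its contents
demo_path = os.path.join(tempfile.mkdtemp(), "theanorc")
write_theanorc(demo_path, compiledir="/tmp/.theano")

check = make_parser()
check.read(demo_path)
print(dict(check["global"]))
```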

Environment Variables

Set these before running Python:

# Quick device switching without editing config
export THEANO_FLAGS="device=cuda,floatX=float32,force_device=True"

# Disable GPU usage
export THEANO_FLAGS="device=cpu,floatX=float64"

# Debug mode (very slow, for debugging graphs)
export THEANO_FLAGS="mode=DebugMode,DebugMode.check_py=False"

# Apply only a minimal set of graph optimizations (faster compilation, slower runtime)
export THEANO_FLAGS="optimizer=fast_compile"
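
Each export above replaces the previous value of THEANO_FLAGS, so options that should apply together must go into one comma-separated string. The sketch below builds that string from a dictionary and sets it from Python. build_theano_flags is a hypothetical helper, and the variable has to be set before the first import of theano in the process.

```python
import os


def build_theano_flags(options):
    """Join a mapping of options into the comma-separated THEANO_FLAGS format."""
    return ",".join(f"{key}={value}" for key, value in options.items())


flags = build_theano_flags({
    "device": "cpu",
    "floatX": "float64",
    "optimizer": "fast_compile",
})

# Set this before the first `import theano`; later changes are not picked up
os.environ["THEANO_FLAGS"] = flags
print(flags)
```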

Key Configuration Options

device: cpu, cuda (optionally with an index, such as cuda0), or gpu (legacy backend, removed in Theano 1.0)
floatX: float32 (recommended for GPU) or float64
mode: FAST_RUN (default), FAST_COMPILE, or DebugMode
optimizer: fast_run, fast_compile, or None
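
A typo in a flag value is easy to miss in a long THEANO_FLAGS string. The following sketch parses such a string and reports values that do not appear in the list above. The list in this guide is not exhaustive, so a reported value is a prompt to double-check rather than proof of an error; parse_flags and unexpected_values are hypothetical helper names.

```python
# Values taken from the option list in this guide (not exhaustive)
KNOWN_VALUES = {
    "floatX": {"float32", "float64"},
    "mode": {"FAST_RUN", "FAST_COMPILE", "DebugMode"},
    "optimizer": {"fast_run", "fast_compile", "None"},
}


def parse_flags(flag_string):
    """Split 'key=value,key=value' into a dict, ignoring malformed items."""
    flags = {}
    for item in flag_string.split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            flags[key.strip()] = value.strip()
    return flags


def unexpected_values(flag_string):
    """Return the options whose value is not in KNOWN_VALUES."""
    return {
        key: value
        for key, value in parse_flags(flag_string).items()
        if key in KNOWN_VALUES and value not in KNOWN_VALUES[key]
    }


print(unexpected_values("device=cuda0,floatX=float23,mode=FAST_RUN"))
```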

4. Build & Run

Basic Usage Pattern

import theano
import theano.tensor as T
import numpy as np

# 1. Define symbolic variables
x = T.matrix('x')
y = T.vector('y')

# 2. Build expression graph (uses gof/graph.py internals)
z = T.dot(x, y) + T.mean(x)

# 3. Compile function (triggers C code generation via gof/cmodule)
f = theano.function([x, y], z)

# 4. Execute
x_data = np.random.randn(10, 5).astype(theano.config.floatX)
y_data = np.random.randn(5).astype(theano.config.floatX)
result = f(x_data, y_data)

Development Mode (Faster Compilation)

import theano
theano.config.mode = 'FAST_COMPILE'
theano.config.optimizer = 'fast_compile'

Production Mode (Faster Execution)

import theano
theano.config.mode = 'FAST_RUN'
theano.config.optimizer = 'fast_run'
theano.config.reoptimize_unpickled_function = False

Saving/Loading Compiled Functions

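import pickle

import theano
import theano.tensor as T

# Placeholder function for illustration; substitute your own compiled function
x = T.vector('x')
predict_fn = theano.function([x], 2 * x)

# Save
with open('model.pkl', 'wb') as fh:
    pickle.dump(predict_fn, fh)

# Load
with open('model.pkl', 'rb') as fh:
    predict_fn = pickle.load(fh)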

5. Deployment

Docker Deployment

Dockerfile (CPU-only):

FROM python:3.8-slim

RUN apt-get update && apt-get install -y \
    build-essential \
    libopenblas-dev \
    && rm -rf /var/lib/apt/lists/*

RUN pip install numpy scipy theano

ENV THEANO_FLAGS="floatX=float32,device=cpu,optimizer=fast_run"

COPY . /app
WORKDIR /app
CMD ["python", "your_script.py"]

Dockerfile (GPU with CUDA):

# The devel image includes the CUDA and cuDNN headers needed when Theano compiles GPU ops
FROM nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04

RUN apt-get update && apt-get install -y \
    python3-pip \
    python3-dev \
    build-essential \
    libopenblas-dev \
    && rm -rf /var/lib/apt/lists/*

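# pygpu depends on the libgpuarray C library; if this step fails, build libgpuarray first or use conda (Method B)
RUN pip3 install numpy scipy theano pygpu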

ENV THEANO_FLAGS="floatX=float32,device=cuda,force_device=True"

COPY . /app
WORKDIR /app
CMD ["python3", "your_script.py"]

Cloud Deployment

AWS (EC2 GPU Instances):

Launch a p2 or p3 instance with the Deep Learning AMI (Ubuntu).
Theano may be pre-installed, depending on the AMI release; check with python -c "import theano; print(theano.__version__)" and install it with pip if the import fails.
Set THEANO_FLAGS="device=cuda,floatX=float32" in /etc/environment.

Google Cloud Platform:

# Create GPU instance with CUDA
gcloud compute instances create theano-server \
    --zone=us-central1-a \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-k80,count=1 \
    --image-family=tf-latest-gpu \
    --image-project=deeplearning-platform-release \
    --maintenance-policy=TERMINATE \
    --boot-disk-size=50GB

Heroku (CPU-only): Add to requirements.txt:

theano==1.0.5
numpy
scipy

Set the buildpack:

heroku buildpacks:set heroku/python

Configure via environment variables in the dashboard or CLI:

heroku config:set THEANO_FLAGS="floatX=float32,device=cpu,compiledir=/tmp/.theano"

Performance Optimization for Production

# Apply these settings, then compile every function once at startup rather than per request
import theano
theano.config.compute_test_value = 'off'  # Disable debug computations (the default)
theano.config.exception_verbosity = 'low'  # Short error messages (the default)

# Use the C implementation of the VM linker (see theano/gof/vm.py)
theano.config.linker = 'cvm'  # Default, fastest
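
Because compilation is the slow step, a serving process typically compiles each function once at startup and reuses it for every request. The sketch below shows that pattern with a stand-in compile step, so it runs without Theano. CompiledFunctionCache and fake_compile are hypothetical names; in a real service compile_fn would wrap your own theano.function(...) calls.

```python
import time


class CompiledFunctionCache:
    """Compile each named function once at startup and reuse it per request."""

    def __init__(self, compile_fn):
        # compile_fn stands in for a slow step such as theano.function(...)
        self._compile_fn = compile_fn
        self._functions = {}

    def warm_up(self, names):
        """Compile every function up front; return seconds spent per name."""
        timings = {}
        for name in names:
            start = time.perf_counter()
            self._functions[name] = self._compile_fn(name)
            timings[name] = time.perf_counter() - start
        return timings

    def get(self, name):
        # Raises KeyError if the function was never warmed up
        return self._functions[name]


def fake_compile(name):
    """Hypothetical stand-in that returns a plain Python callable."""
    return {"double": lambda x: 2 * x, "square": lambda x: x * x}[name]


cache = CompiledFunctionCache(fake_compile)
timings = cache.warm_up(["double", "square"])
print(cache.get("double")(21))
```

Logging the timings returned by warm_up at startup makes it easy to spot a function that is being recompiled instead of loaded from the compile cache.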

6. Troubleshooting

ImportError: No module named 'theano'

Solution:

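# Install with the same interpreter that runs your code
python -m pip install --upgrade --force-reinstall theano

# If conda and pip copies are mixed, remove both and install once.
# Run these separately so a failing step does not stop the rest.
conda uninstall -y theano
pip uninstall -y theano
pip install theano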
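
Before reinstalling, it helps to confirm which interpreter is running and whether it can see the package at all, since this error often means pip installed into a different environment. The following is a minimal standard-library sketch; locate_module is a hypothetical helper name.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
print("theano:", locate_module("theano"))
```

If the second line prints None, install Theano with the interpreter shown on the first line, for example by running that executable with -m pip install theano.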

Compilation Errors (gcc/clang)

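Error: g++: error: unrecognized command line option '-march=native'
Solution:

# In ~/.theanorc (keep comments on their own line; an inline "#" comment becomes part of the value)
[gcc]
# Use a more widely supported architecture flag
cxxflags = -march=core2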

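Error: ld: library not found for -lblas
Solution:

# macOS (brew --prefix resolves the path on both Intel and Apple Silicon installs)
brew install openblas
export LDFLAGS="-L$(brew --prefix openblas)/lib"
export CPPFLAGS="-I$(brew --prefix openblas)/include"
# Point Theano's own link flags at the same library
export THEANO_FLAGS="blas.ldflags=-L$(brew --prefix openblas)/lib -lopenblas"

# Linux
sudo apt-get install libopenblas-dev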
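
To find out in advance whether a flag such as -march=native will be rejected, you can ask the compiler to preprocess an empty input with that flag. This is a sketch that assumes a gcc- or clang-style command line; compiler_accepts is a hypothetical helper, and a False result can also mean that the compiler is not on PATH.

```python
import os
import subprocess


def compiler_accepts(flag, compiler="g++"):
    """Return True if `compiler` can preprocess an empty input with `flag`."""
    command = [compiler, flag, "-E", "-x", "c++", os.devnull]
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Compiler missing from PATH, not executable, or unresponsive
        return False
    return result.returncode == 0


# Fall back to a more conservative flag when the preferred one is rejected
cxxflags = "-march=native" if compiler_accepts("-march=native") else "-march=core2"
print("cxxflags =", cxxflags)
```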

GPU/CUDA Issues

Error: ERROR (theano.sandbox.cuda): CUDA is installed, but device gpu is not available
Solution:

# Check CUDA installation
nvidia-smi
nvcc --version

# Set correct device
export THEANO_FLAGS="device=cuda0"  # Instead of 'gpu'

Error: cuDNN not found
Solution:

# Add to ~/.theanorc
[dnn]
enabled = True
include_path = /usr/local/cuda/include
library_path = /usr/local/cuda/lib64
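
A script that must run on machines with and without a working GPU can probe for the driver first and fall back to the CPU. The sketch below uses the nvidia-smi check from this section as the probe. pick_device is a hypothetical helper, and a successful nvidia-smi run only shows that the driver is reachable, not that Theano's GPU backend is installed.

```python
import os
import shutil
import subprocess


def pick_device(probe="nvidia-smi"):
    """Return "cuda" if the probe command runs successfully, else "cpu"."""
    if shutil.which(probe) is None:
        return "cpu"
    try:
        result = subprocess.run(
            [probe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "cpu"
    return "cuda" if result.returncode == 0 else "cpu"


device = pick_device()
# Set this before the first `import theano` in the process
os.environ["THEANO_FLAGS"] = f"device={device},floatX=float32"
print(os.environ["THEANO_FLAGS"])
```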

Memory Issues

Error: MemoryError during compilation
Solution:

# Clear the compile cache
theano-cache purge

# Or point Theano at a fresh, empty cache directory
export THEANO_FLAGS="base_compiledir=/tmp/small_theano_cache"
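
Before purging, it can be useful to see how large the compile cache has grown. This sketch sums file sizes under a directory. It assumes the cache lives in ~/.theano, which is the usual default on Linux and macOS; adjust the path if you set base_compiledir. directory_size_bytes is a hypothetical helper name.

```python
import os


def directory_size_bytes(root):
    """Return the total size in bytes of regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.islink(path):  # do not follow symlinks
                total += os.path.getsize(path)
    return total


# Assumption: the default cache location; change it if base_compiledir is set
cache_root = os.path.expanduser("~/.theano")
if os.path.isdir(cache_root):
    size_mb = directory_size_bytes(cache_root) / (1024 * 1024)
    print(f"{cache_root}: {size_mb:.1f} MiB")
else:
    print(f"{cache_root} does not exist")
```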

Slow Compilation