Theano Deployment and Usage Guide
Note: Theano is no longer actively developed and receives maintenance fixes only. New projects should use PyTensor. This guide covers legacy Theano for existing codebases.
1. Prerequisites
System Requirements
- Python: 2.7 or 3.4–3.9 (Python 3.10+ not officially supported)
- Operating System: Linux, macOS, or Windows (with limitations)
- Memory: Minimum 4GB RAM; 8GB+ recommended for large graph compilations
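The Python version constraint above can be checked at startup before Theano is imported. The following is a minimal sketch, assuming the supported range is exactly the one listed in this guide; the helper name `python_supported` is hypothetical and not part of Theano.

```python
import sys

def python_supported(version=None):
    """Return True if (major, minor) falls in the range this guide lists."""
    major, minor = (version or sys.version_info)[:2]
    if major == 2:
        return minor == 7
    return major == 3 and 4 <= minor <= 9

if not python_supported():
    print("Warning: Python %d.%d is outside the range listed for Theano"
          % sys.version_info[:2])
```

Passing an explicit tuple, for example `python_supported((3, 10))`, lets the same check be used in CI scripts that target a different interpreter.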
Required Dependencies
- NumPy: >=1.9.1
- SciPy: >=0.14 (optional but recommended for sparse matrix support)
- C/C++ Compiler:
  - Linux: gcc (4.2+) or clang
  - macOS: Xcode Command Line Tools
  - Windows: MinGW-w64 or Microsoft Visual C++ Build Tools
- BLAS Implementation: OpenBLAS, ATLAS, MKL, or Accelerate (macOS)
Optional Dependencies
- CUDA Toolkit: 7.5–11.x (for GPU support)
- cuDNN: Compatible version with your CUDA (for optimized GPU operations)
- libgpuarray: For CUDA-based GPU operations (pygpu)
- pydot/graphviz: For graph visualization (theano.printing.pydotprint)
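Before installing, it can help to see which of the dependencies listed above are already visible from the current interpreter. This is a rough sketch for Python 3 only; the function name `check_dependencies` and the compiler names it probes (`g++`, `clang++`) are illustrative assumptions, and it does not check version numbers.

```python
import importlib.util
import shutil

def check_dependencies():
    """Report which listed dependencies are visible from this interpreter."""
    modules = ["numpy", "scipy", "pygpu", "pydot"]
    compilers = ["g++", "clang++"]
    report = {name: importlib.util.find_spec(name) is not None
              for name in modules}
    # A compiler counts as present if any candidate is found on PATH.
    report["c++ compiler"] = any(shutil.which(c) for c in compilers)
    return report

for name, found in check_dependencies().items():
    print("%-14s %s" % (name, "found" if found else "missing"))
```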
2. Installation
Method A: pip (Recommended for Users)
# Basic installation
pip install theano
# With optional dependencies (installed explicitly; Theano does not define an "all" extra)
pip install theano scipy pydot
# Specific version (last release was 1.0.5)
pip install theano==1.0.5
Method B: conda (Recommended for GPU support)
conda install -c conda-forge theano
# For GPU support
conda install -c conda-forge pygpu
Method C: Build from Source (For Development)
git clone https://github.com/Theano/Theano.git
cd Theano
pip install -e .
Verify Installation
import theano
print(theano.__version__)
# Test basic functionality
from theano import tensor as T
a = T.scalar('a')
b = T.scalar('b')
c = a + b
f = theano.function([a, b], c)
print(f(1.5, 2.5)) # Should output 4.0
3. Configuration
Configuration File (~/.theanorc)
Create this file in your home directory:
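[global]
device = cpu
floatX = float32
force_device = True
# Location for compiled C cache (a top-level option, so it goes under [global])
base_compiledir = /tmp/.theano
[gcc]
cxxflags = -march=native
[blas]
ldflags = -lopenblas
[nvcc]
fastmath = True
# Adjust for your GPU architecture. Keep comments on their own line:
# text after a value on the same line may be read as part of the value.
flags = -arch=sm_52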
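One pitfall with this file format is the inline comment. The sketch below assumes the file is read by Python's standard `configparser` with its default settings (an assumption about Theano's loader, not something this guide confirms) and shows that text following a value on the same line is kept as part of the value. The `SAMPLE` string is made up for the demonstration.

```python
import configparser

SAMPLE = """
[global]
device = cpu
floatX = float32

[nvcc]
flags = -arch=sm_52 # inline note
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# With default settings the trailing "# ..." text stays in the value.
print(repr(parser["nvcc"]["flags"]))  # '-arch=sm_52 # inline note'
print(parser["global"]["device"])     # cpu
```

Placing each comment on its own line avoids the problem regardless of how the parser is configured.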
Environment Variables
Set these before running Python:
# Quick device switching without editing config
export THEANO_FLAGS="device=cuda,floatX=float32,force_device=True"
# Disable GPU usage
export THEANO_FLAGS="device=cpu,floatX=float64"
# Debug mode (very slow, for debugging graphs)
export THEANO_FLAGS="mode=DebugMode,DebugMode.check_py=False"
# Apply only minimal graph optimizations (faster compilation, slower runtime)
export THEANO_FLAGS="optimizer=fast_compile"
Key Configuration Options
- device: cpu, cuda, or gpu (legacy)
- floatX: float32 (recommended for GPU) or float64
- mode: FAST_RUN (default), FAST_COMPILE, or DebugMode
- optimizer: fast_run, fast_compile, or None
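When the options above are chosen programmatically, the `THEANO_FLAGS` string can be assembled from a dictionary. This is a minimal sketch; `build_theano_flags` is a hypothetical helper, not a Theano API. The variable has to be set before `import theano`, because the flags are read when the package is first imported.

```python
import os

def build_theano_flags(options):
    """Join a dict of options into the comma-separated THEANO_FLAGS format."""
    return ",".join("%s=%s" % (key, value) for key, value in options.items())

flags = build_theano_flags({"device": "cpu", "floatX": "float32"})

# Set this before `import theano`; later changes to the variable are not seen.
os.environ["THEANO_FLAGS"] = flags
print(flags)  # device=cpu,floatX=float32
```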
4. Build & Run
Basic Usage Pattern
import theano
import theano.tensor as T
import numpy as np
# 1. Define symbolic variables
x = T.matrix('x')
y = T.vector('y')
# 2. Build expression graph (uses gof/graph.py internals)
z = T.dot(x, y) + T.mean(x)
# 3. Compile function (triggers C code generation via gof/cmodule)
f = theano.function([x, y], z)
# 4. Execute
x_data = np.random.randn(10, 5).astype(theano.config.floatX)
y_data = np.random.randn(5).astype(theano.config.floatX)
result = f(x_data, y_data)
Development Mode (Faster Compilation)
import theano
theano.config.mode = 'FAST_COMPILE'
theano.config.optimizer = 'fast_compile'
Production Mode (Faster Execution)
import theano
theano.config.mode = 'FAST_RUN'
theano.config.optimizer = 'fast_run'
theano.config.reoptimize_unpickled_function = False
Saving/Loading Compiled Functions
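import pickle
import theano
import theano.tensor as T
# Example function to persist (replace with your own compiled function)
x = T.vector('x')
predict_fn = theano.function([x], 2 * x)
# Save
with open('model.pkl', 'wb') as fh:
    pickle.dump(predict_fn, fh, protocol=pickle.HIGHEST_PROTOCOL)
# Load
with open('model.pkl', 'rb') as fh:
    predict_fn = pickle.load(fh)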
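A common way to use the save/load pattern is to compile only when no pickle exists yet. The sketch below shows that idea with the standard library alone; `load_or_build` is a hypothetical helper, and a plain dict stands in for a compiled Theano function so the example runs without Theano installed.

```python
import os
import pickle
import tempfile

def load_or_build(path, build_fn):
    """Return the object pickled at `path`, building and saving it if absent."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    obj = build_fn()
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return obj

# Demo: count how often the (stand-in) build step actually runs.
build_calls = []

def build_model():
    build_calls.append(1)
    return {"weights": [0.1, 0.2]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    first = load_or_build(path, build_model)   # builds and saves
    second = load_or_build(path, build_model)  # loads from disk

print(first == second, len(build_calls))  # True 1
```

In real use, `build_fn` would wrap the `theano.function(...)` call. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.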
5. Deployment
Docker Deployment
Dockerfile (CPU-only):
FROM python:3.8-slim
RUN apt-get update && apt-get install -y \
build-essential \
libopenblas-dev \
&& rm -rf /var/lib/apt/lists/*
RUN pip install numpy scipy theano
ENV THEANO_FLAGS="floatX=float32,device=cpu,optimizer=fast_run"
COPY . /app
WORKDIR /app
CMD ["python", "your_script.py"]
Dockerfile (GPU with CUDA):
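# The devel image includes nvcc and the CUDA headers, which Theano needs to compile GPU code at runtime
FROM nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04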
RUN apt-get update && apt-get install -y \
python3-pip \
python3-dev \
build-essential \
libopenblas-dev \
&& rm -rf /var/lib/apt/lists/*
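# pygpu depends on the libgpuarray C library; if this pip install fails,
# build libgpuarray from source or install pygpu with conda instead
RUN pip3 install numpy scipy theano pygpu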
ENV THEANO_FLAGS="floatX=float32,device=cuda,force_device=True"
COPY . /app
WORKDIR /app
CMD ["python3", "your_script.py"]
Cloud Deployment
AWS (EC2 GPU Instances):
- Launch a p3/p2 instance with the Deep Learning AMI (Ubuntu)
- Older AMI releases shipped with Theano pre-installed; verify with python -c "import theano; print(theano.__version__)" and install it if the import fails
- Set THEANO_FLAGS="device=cuda,floatX=float32" in /etc/environment
Google Cloud Platform:
# Create a GPU instance from a CUDA-enabled image (Theano is not included; install it after boot)
gcloud compute instances create theano-server \
--zone=us-central1-a \
--machine-type=n1-standard-4 \
--accelerator=type=nvidia-tesla-k80,count=1 \
--image-family=tf-latest-gpu \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--boot-disk-size=50GB
Heroku (CPU-only):
Add to requirements.txt:
theano==1.0.5
numpy
scipy
Set buildpack: heroku buildpacks:set heroku/python
Configure via environment variables in dashboard or CLI:
heroku config:set THEANO_FLAGS="floatX=float32,device=cpu,compiledir=/tmp/.theano"
Performance Optimization for Production
# Pre-compile functions before serving requests
import theano
theano.config.compute_test_value = 'off' # Disable debug computations
theano.config.exception_verbosity = 'low'
# Use C linker for VM (see theano/gof/vm.py)
theano.config.linker = 'cvm' # Default, fastest
6. Troubleshooting
ImportError: No module named 'theano'
Solution:
pip install --upgrade --force-reinstall theano
# If both pip and conda copies are installed, remove both and reinstall with one tool
conda uninstall -y theano; pip uninstall -y theano; pip install theano
Compilation Errors (gcc/clang)
Error: g++: error: unrecognized command line option '-march=native'
Solution:
# In ~/.theanorc (use a more conservative architecture flag)
[gcc]
cxxflags = -march=core2
Error: ld: library not found for -lblas
Solution:
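# macOS (brew --prefix resolves the install path on both Intel and Apple silicon)
brew install openblas
export LDFLAGS="-L$(brew --prefix openblas)/lib"
export CPPFLAGS="-I$(brew --prefix openblas)/include"
# Theano takes its BLAS link flags from blas.ldflags, so set that as well
export THEANO_FLAGS="blas.ldflags=-L$(brew --prefix openblas)/lib -lopenblas"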
# Linux
sudo apt-get install libopenblas-dev
GPU/CUDA Issues
Error: ERROR (theano.sandbox.cuda): CUDA is installed, but device gpu is not available
Solution:
# Check CUDA installation
nvidia-smi
nvcc --version
# Set correct device
export THEANO_FLAGS="device=cuda0" # Use 'cuda0' (gpuarray backend) instead of the legacy 'gpu' device name
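The driver check above can also be scripted, so that a launcher falls back to the CPU when no GPU driver is reachable. This is a rough sketch under the assumption that a successful `nvidia-smi` run is a good enough signal; the function name `gpu_driver_visible` is hypothetical, and the check says nothing about whether libgpuarray or cuDNN are usable.

```python
import shutil
import subprocess

def gpu_driver_visible(timeout=10):
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

device = "cuda0" if gpu_driver_visible() else "cpu"
print("Suggested THEANO_FLAGS: device=%s" % device)
```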
Error: cuDNN not found
Solution:
# Add to ~/.theanorc
[dnn]
enabled = True
include_path = /usr/local/cuda/include
library_path = /usr/local/cuda/lib64
Memory Issues
Error: MemoryError during compilation
Solution:
# Clear the compile cache
theano-cache purge
# Or point Theano at a fresh, empty cache directory
export THEANO_FLAGS="base_compiledir=/tmp/small_theano_cache"
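To decide whether purging is worthwhile, it can be useful to measure how large the compile cache has grown. The sketch below sums file sizes under a directory; `dir_size_bytes` is a hypothetical helper, and `~/.theano` is assumed to be the cache location (it differs if `base_compiledir` is set, and on Windows).

```python
import os

def dir_size_bytes(root):
    """Sum the sizes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total

cache_dir = os.path.expanduser("~/.theano")  # assumed default location
if os.path.isdir(cache_dir):
    size_mib = dir_size_bytes(cache_dir) / 2**20
    print("%.1f MiB in %s" % (size_mib, cache_dir))
```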
Slow Compilation
Solution:
# Disable expensive optimizations
theano.config.optimizer = 'fast_compile'
# Or use precompiled pickles
Debug Mode Crashes
Error: DebugModeError
Solution:
# Disable DebugMode in production
export THEANO_FLAGS="mode=FAST_RUN"
Random Number Generator Issues (MRG31k3p)
If using theano.sandbox.rng_mrg.MRG_RandomStreams and getting stream errors:
from theano.sandbox.rng_mrg import MRG_RandomStreams
# Use the same seed on every device for reproducible streams
# (the use_cuda argument from pre-1.0 releases is not accepted by Theano 1.0.x)
srng = MRG_RandomStreams(seed=123)
Windows-Specific Issues
Error: gcc.exe: error: unrecognized command line option '-mno-cygwin'
Solution:
Edit theano/configdefaults.py or set:
set THEANO_FLAGS=gcc.cxxflags=-march=core2