How to Deploy & Use rasbt/python-machine-learning-book

# Python Machine Learning (1st Edition) - Setup and Usage Guide

Comprehensive guide for running the code examples from the "Python Machine Learning" book by Sebastian Raschka.

## 1. Prerequisites

**Required:**
- Python 3.6+ (Python 2.7 supported but deprecated)
- Git
- pip or conda package manager

**Core Dependencies:**
- NumPy ≥1.9.1
- SciPy ≥0.14
- scikit-learn ≥0.15 (0.18+ recommended)
- matplotlib ≥1.4.0
- pandas ≥0.16
- Jupyter Notebook or JupyterLab

**Optional (for specific chapters):**
- Theano ≥0.7 (Chapter 13 - Neural Networks) *Note: Theano development ceased in 2017*
- Flask ≥0.10.1 (Chapter 9 - Web Application)
- PyYAML (Chapter 9)

## 2. Installation

### Clone Repository
```bash
git clone https://github.com/rasbt/python-machine-learning-book.git
cd python-machine-learning-book
```

### Option A: Using pip (Recommended)
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install numpy scipy scikit-learn matplotlib pandas jupyter
pip install flask pyyaml  # For Chapter 9
pip install theano  # Optional, for Chapter 13
```

### Option B: Using Conda
```bash
conda create -n pyml python=3.8
conda activate pyml
conda install numpy scipy scikit-learn matplotlib pandas jupyter flask pyyaml
conda install theano  # Optional
```

### Verify Installation
```bash
python -c "import sklearn; print(sklearn.__version__)"
jupyter --version
```
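
The one-line check only covers scikit-learn. As a minimal sketch, the script below compares each core dependency against the minimum versions listed under Prerequisites. The helper names (`version_tuple`, `meets_minimum`, `check_environment`) are illustrative and not part of the book's code.

```python
import importlib
import re

# Minimum versions copied from the Prerequisites list (import name -> version).
MINIMUMS = {
    "numpy": "1.9.1",
    "scipy": "0.14",
    "sklearn": "0.15",
    "matplotlib": "1.4.0",
    "pandas": "0.16",
}


def version_tuple(version):
    """Turn '1.4.0rc1' into (1, 4, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare versions numerically, so that 0.9 < 0.14 (unlike string comparison)."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return {module_name: 'ok' | 'too old' | 'missing'}."""
    report = {}
    for module_name, required in minimums.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[module_name] = "missing"
            continue
        installed = getattr(module, "__version__", "0")
        report[module_name] = "ok" if meets_minimum(installed, required) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Run it inside the activated environment; any package reported as `missing` or `too old` can be installed or upgraded with pip or conda as shown above.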

## 3. Configuration

### Theano Configuration (Chapter 13 only)

Create `~/.theanorc` to choose the compute device (CPU or GPU) and the default float type. The example below selects the CPU:
```ini
[global]
device = cpu
floatX = float32
```

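### Data Paths

Most notebooks download their datasets automatically. For Chapter 9 (Web Application), make sure you have write permission in the `code/ch09/` directory, which is needed for the SQLite database.

### Environment Variables (Optional)

To make modules under `code/` importable from any directory, add it to `PYTHONPATH` (run this from the repository root):
```bash
export PYTHONPATH="${PYTHONPATH}:$(pwd)/code"
```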
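
The same effect can be had from inside a running notebook, which helps when the server was started without the variable set. This is a minimal sketch; `add_to_sys_path` is a hypothetical helper, and the example assumes the notebook was started from the repository root.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Prepend `directory` to sys.path (only once) and return the resolved path."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumption: the current working directory is the repository root.
code_dir = add_to_sys_path(Path.cwd() / "code")
print("added to module search path:", code_dir)
```

Unlike the `export` line, this change lasts only for the current kernel session.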

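## 4. Build & Run

### Launch Jupyter Notebook
```bash
jupyter notebook
```

Navigate to the `code/chXX/` directories and open the `.ipynb` files.

### Alternative: JupyterLab
```bash
jupyter lab
```

### Running Individual Scripts

Some chapters contain standalone `.py` files:
```bash
cd code/ch09
python app.py  # Starts the movie review classifier web service
```

### Chapter 9 Web Application Deployment

The Chapter 9 example includes a Flask application:
```bash
cd code/ch09
export FLASK_APP=app.py
export FLASK_ENV=development  # Removed in Flask 2.3; use `flask --debug run` there
flask run --host=0.0.0.0 --port=5000
```

Access the app at http://localhost:5000. Note that `--host=0.0.0.0` listens on all network interfaces; omit it to accept connections from the local machine only.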

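## 5. Deployment Options

### Static Notebook Viewing (No Installation)

Rendered notebooks can be viewed read-only via NbViewer.

### Production Deployment (Chapter 9 Web App)

For deploying the sentiment analysis web application:

**Heroku:**
```bash
cd code/ch09
pip install gunicorn  # The Procfile below starts the app with gunicorn
echo "web: gunicorn app:app" > Procfile
pip freeze > requirements.txt
heroku create your-ml-app
git push heroku master
```

Heroku builds from the root of the pushed repository, so `Procfile` and `requirements.txt` must sit at the root of the repository you push.

**Docker:**
```dockerfile
# Build from the repository root; assumes code/ch09/requirements.txt exists
FROM python:3.8-slim
WORKDIR /app
COPY code/ch09/requirements.txt .
RUN pip install -r requirements.txt
COPY code/ch09/ .
EXPOSE 5000
# app.py must bind to 0.0.0.0 to be reachable from outside the container
CMD ["python", "app.py"]
```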

### Model Serialization

Export trained models for production use:
```python
import pickle

# Run in a notebook cell where `model` is an already-fitted estimator
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
```
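
The serving side needs the reverse step. The sketch below shows a save/load round trip; `save_model` and `load_model` are illustrative names, and a plain dictionary stands in for a fitted estimator so the example runs without scikit-learn.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize `model` to `path` with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled object. Only unpickle files you created yourself:
    pickle can execute arbitrary code while loading."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a fitted estimator (assumption made for this sketch only).
stand_in_model = {"weights": [0.5, -1.2], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    save_model(stand_in_model, path)
    restored = load_model(path)

print("round trip preserved the object:", restored == stand_in_model)
```

For real scikit-learn estimators, load the pickle with the same scikit-learn version that created it, since pickles are not guaranteed to work across releases.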

## 6. Troubleshooting

### Theano Issues (Chapter 13)

**Problem:** `ImportError: No module named theano` or GPU errors

**Solution:** Theano is deprecated. If the module is missing, install it with `pip install theano`. For GPU errors, run in CPU mode, or migrate to TensorFlow/PyTorch for production:
```python
# Force CPU mode; set this before the first `import theano`
import os
os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"
```

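### Scikit-learn API Changes

**Problem:** `ModuleNotFoundError: No module named 'sklearn.cross_validation'` (or `AttributeError: 'module' object has no attribute 'cross_validation'`)

**Solution:** The book uses older scikit-learn APIs. The `sklearn.cross_validation` module was deprecated in 0.18 and removed in 0.20. Update imports:
```python
# Old (book code)
from sklearn.cross_validation import train_test_split

# New (modern sklearn)
from sklearn.model_selection import train_test_split
```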
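
If the same notebook has to run against both old and new scikit-learn releases, one option is a fallback import. This is a sketch only; `import_first` is a hypothetical helper and not part of scikit-learn.

```python
import importlib


def import_first(*module_names):
    """Return the first module in `module_names` that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


if __name__ == "__main__":
    try:
        # Prefer the modern location, fall back to the one used in the book.
        selection = import_first("sklearn.model_selection", "sklearn.cross_validation")
        train_test_split = selection.train_test_split
        print("using", selection.__name__)
    except ImportError as err:
        print(err)
```

For a one-off fix, editing the import line as shown above is simpler; the fallback is only worth it when several scikit-learn versions must be supported.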

### Python 2 vs 3 Compatibility

**Problem:** `print` statement syntax errors

**Solution:** Use a Python 3.x environment, or manually update `print "text"` to `print("text")`.

### Missing Data Files

**Problem:** `FileNotFoundError` for CSV/datasets

**Solution:** Run notebooks from their respective chapter directories so that relative data paths resolve:
```bash
cd code/ch08
jupyter notebook ch08.ipynb
```
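
To confirm which directory a notebook is actually running in, a quick check from a cell can help. The sketch below is illustrative: `locate` is a hypothetical helper and `movie_data.csv` is a placeholder file name, so substitute the file the notebook fails to find.

```python
from pathlib import Path


def locate(filename, start=None):
    """Return the path to `filename` inside `start` (default: the current
    working directory), or None if no such file exists there."""
    base = Path(start) if start is not None else Path.cwd()
    candidate = base / filename
    return candidate if candidate.is_file() else None


# Placeholder file name, used only for illustration.
print("working directory:", Path.cwd())
print("found:", locate("movie_data.csv"))
```

If the working directory is not the chapter directory, restart Jupyter from there as shown above.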

### Permission Errors (macOS/Linux)

**Problem:** Cannot write model pickles or database files

**Solution:** Give the directory owner write access:
```bash
chmod 755 code/ch09/
```

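### Jupyter Kernel Issues

**Problem:** Module not found in notebook but available in terminal

**Solution:** Register a kernel from the environment that has the packages installed:
```bash
# Run with the target environment activated (requires the ipykernel package)
python -m ipykernel install --user --name=pyml --display-name "Python ML"
# Then select Kernel > Change kernel > Python ML in Jupyter
```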
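
To see which interpreter a notebook is really using, run a short diagnostic in a cell and compare it with the output of the same code in the terminal. This is a minimal sketch; `interpreter_info` is an illustrative name.

```python
import sys


def interpreter_info():
    """Describe the Python interpreter running this code."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # True for venv/virtualenv environments; conda environments report False.
        "in_venv": sys.prefix != sys.base_prefix,
    }


for key, value in interpreter_info().items():
    print(f"{key}: {value}")
```

If `executable` differs between the notebook and the terminal, the notebook is running a kernel from another environment.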

### Memory Errors (Large Datasets)

**Problem:** Kernel dies processing large arrays

**Solution:** Reduce the dataset size in the notebook parameters, or raise the notebook server's buffer limit (the value is in bytes; 2 GiB here):
```bash
jupyter notebook --NotebookApp.max_buffer_size=2147483648
```
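
Before loading a large dataset, it can help to estimate how much memory a dense array will need. The sketch below is simple arithmetic; `array_mebibytes` is a hypothetical helper and the shape in the example is made up.

```python
def array_mebibytes(rows, cols, bytes_per_value=8):
    """Approximate size of a dense rows x cols array in MiB.

    bytes_per_value is 8 for float64 and 4 for float32.
    """
    return rows * cols * bytes_per_value / 1024 ** 2


# Made-up shape for illustration: 50,000 samples with 10,000 features.
print("float64:", round(array_mebibytes(50_000, 10_000)), "MiB")
print("float32:", round(array_mebibytes(50_000, 10_000, bytes_per_value=4)), "MiB")
```

Switching from float64 to float32 halves the footprint, which is sometimes enough to keep the kernel alive.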