How to Deploy & Use hangtwenty/dive-into-machine-learning

# Deployment and Usage Guide

## Prerequisites

* **Python 3.x** (3.6 or newer recommended)
* **Git** (for cloning the repository)
* **Web browser** (Chrome, Firefox, or Edge recommended for Jupyter Notebook)
* **2GB free disk space** (if using Anaconda distribution)

### Optional Accounts
* **GitHub account** (for Binder deployment)
* **Google account** (for Google Colab GPU access)
* **Deepnote account** (for real-time collaboration)

## Installation

### Option 1: Anaconda Distribution (Recommended)

The Anaconda Python distribution includes Python 3, Jupyter Notebook, and all required scientific computing packages (numpy, pandas, scikit-learn, matplotlib) in a single installation.

1. Download Anaconda from [anaconda.com/download](https://www.anaconda.com/download)
2. Run the installer for your OS (Windows/macOS/Linux)
3. Verify installation:
   ```bash
   conda --version
   python --version
   ```

### Option 2: Minimal pip Installation

If you prefer a lightweight setup using `venv` and `pip`:

```bash
# Create a virtual environment
python -m venv ml-env
source ml-env/bin/activate  # On Windows: ml-env\Scripts\activate

# Install core packages
pip install numpy pandas scikit-learn matplotlib jupyter
```
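To confirm that the environment was actually activated before installing packages, a small check like the following can help. This is a minimal sketch: `in_virtual_environment` is a hypothetical helper name, and the comparison only detects environments created with `venv` (or a recent `virtualenv`), not conda environments.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
```

If this prints `False` after running the activate script, the `pip install` above would go into the system Python rather than `ml-env`.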

### Clone the Repository

```bash
git clone https://github.com/dive-into-machine-learning/dive-into-machine-learning.git
cd dive-into-machine-learning
```

## Configuration

### Jupyter Notebook Setup

Generate a default configuration file (optional):

```bash
jupyter notebook --generate-config
```

By default, the config file is written to:

* **Linux/macOS:** `~/.jupyter/jupyter_notebook_config.py`
* **Windows:** `%USERPROFILE%\.jupyter\jupyter_notebook_config.py`

Run `jupyter --config-dir` to confirm the directory on your system.

### Environment Verification

Verify that all packages are available:

```bash
python -c "import numpy, pandas, sklearn, matplotlib; print('All packages imported successfully')"
```
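The one-liner above stops at the first missing import. If you would rather see every package and its version at once, a sketch along these lines may be useful. It assumes Python 3.8 or newer (for `importlib.metadata`); `installed_versions` is a hypothetical helper name, and it looks packages up by their pip distribution names, so `scikit-learn` rather than `sklearn`.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as used by pip (note: scikit-learn, not sklearn).
    required = ["numpy", "pandas", "scikit-learn", "matplotlib", "jupyter"]
    for name, version in installed_versions(required).items():
        print(f"{name:<14}{version or 'MISSING'}")
```

Any line reading `MISSING` points at a package to install before launching the notebook.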

## Build & Run

### Launch Jupyter Notebook

From the repository directory:

```bash
jupyter notebook
```

This will:

1. Start the Jupyter server (default port 8888)
2. Open your default browser to `http://localhost:8888`
3. Display the file browser showing the repository contents

### Follow the Learning Path

1. **Complete the scikit-learn tutorial:** Create a new notebook and follow along with *An introduction to machine learning with scikit-learn*, executing the code cells as you read.
2. **Explore the digits classification example:** Run the hand-written digits classification code mentioned in the README to verify that your installation works.
3. **Access external resources:** Open linked resources such as "A Visual Introduction to Machine Learning" in your browser alongside your notebook.
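For step 2, a digits classification run might look roughly like the sketch below. It is modeled on the kind of example scikit-learn ships for its bundled digits dataset and is not necessarily the exact code the README links to; the 50/50 split, `random_state=0`, and `gamma=0.001` are illustrative assumptions.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Load the 8x8 hand-written digit images bundled with scikit-learn.
digits = datasets.load_digits()

# Flatten each 8x8 image into a 64-element feature vector.
n_samples = len(digits.images)
X = digits.images.reshape((n_samples, -1))
y = digits.target

# Hold out half of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Support vector classifier; gamma=0.001 is an illustrative choice.
classifier = svm.SVC(gamma=0.001)
classifier.fit(X_train, y_train)

accuracy = metrics.accuracy_score(y_test, classifier.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

If this cell runs to completion and prints an accuracy, numpy and scikit-learn are installed and working together.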

### Cloud-Based Alternative (No Local Installation)

If you skip local installation, use a cloud environment instead:

**Google Colab** (for GPU access):

* Requires a Google account

**Binder** (official Jupyter choice):

* Navigate to mybinder.org
* Enter the repository URL: `https://github.com/dive-into-machine-learning/dive-into-machine-learning`
* Click "Launch" to run notebooks in the browser

**Deepnote** (for collaboration):

* Import the repository at deepnote.com for real-time collaborative editing

## Deployment

Since this is a curated learning resource rather than a production application, "deployment" refers to sharing notebooks and environments.

### Deploy Notebooks via Binder

To share your working notebooks with others:

1. Push your notebook to a GitHub repository
2. Visit mybinder.org
3. Enter your GitHub repository URL
4. Copy the generated Binder badge or link to share
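The link that the mybinder.org form generates for a GitHub repository follows a predictable pattern, so it can also be built by hand. The sketch below assumes the `v2/gh/<owner>/<repo>/<ref>` URL layout and the standard badge image; the function names and the `yourusername/my-notebooks` repository are hypothetical placeholders.

```python
def binder_launch_url(owner: str, repo: str, ref: str = "HEAD") -> str:
    """Build a mybinder.org launch URL for a public GitHub repository."""
    return f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"


def binder_badge_markdown(launch_url: str) -> str:
    """Wrap a launch URL in the Markdown for the standard Binder badge."""
    return f"[![Binder](https://mybinder.org/badge_logo.svg)]({launch_url})"


if __name__ == "__main__":
    # "yourusername" and "my-notebooks" are placeholders, not real repositories.
    url = binder_launch_url("yourusername", "my-notebooks", ref="main")
    print(url)
    print(binder_badge_markdown(url))
```

Pasting the badge Markdown into your repository's README gives readers a one-click launch button.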

### Deploy to GitHub Pages (Static Guide)

If you fork this repository to create your own learning guide:

1. Enable GitHub Pages in the repository settings
2. Select the source branch (usually `main` or `master`)
3. Access the site at `https://yourusername.github.io/dive-into-machine-learning`

### Production ML Deployment (Advanced)

When you are ready to deploy actual ML models built using these tutorials:

* **scikit-learn models:** Use `joblib.dump()` to serialize models, then deploy via Flask or FastAPI
* **Cloud deployment:** Consider AWS SageMaker, Google Vertex AI, or Azure Machine Learning
* **Containerization:** Package environments using Docker with official Python images
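As a rough illustration of the serialization step, the sketch below trains a small classifier, saves it with `joblib.dump()`, and loads it back the way a serving process might at startup. The model choice, the `digits-svc.joblib` file name, and the temporary directory are assumptions for the example; the web framework layer is left out.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Train a small model to have something to serialize.
X, y = load_digits(return_X_y=True)
model = SVC(gamma=0.001).fit(X, y)

# Serialize the fitted model to disk; the file name is a placeholder.
model_path = os.path.join(tempfile.mkdtemp(), "digits-svc.joblib")
joblib.dump(model, model_path)

# In the serving process (for example, at web app startup), load it once
# and reuse it for every prediction request.
restored = joblib.load(model_path)
predictions = restored.predict(X[:5])
print(predictions.tolist())
```

Load the file with the same scikit-learn version that wrote it, and only load files you trust, since joblib files are pickle-based.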

## Troubleshooting

### Jupyter Won't Start (Port 8888 in Use)

```bash
# Specify an alternate port
jupyter notebook --port 8889
```
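If 8889 is taken as well, you can probe for an open port instead of guessing. This is a minimal sketch using only the standard library; `port_is_free` and `first_free_port` are hypothetical helper names, and the check only covers the local loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 8888, attempts: int = 20) -> int:
    """Return the first bindable port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"No free port found from {start} to {start + attempts - 1}")


if __name__ == "__main__":
    # Print a ready-to-run command using the first available port.
    print(f"jupyter notebook --port {first_free_port()}")
```

A port can still be claimed by another process between the check and the launch, so treat the result as a hint rather than a guarantee.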

### Module Not Found Errors

If `import sklearn` fails:

```bash
# If using conda
conda install scikit-learn

# If using pip
pip install --upgrade scikit-learn
```

### Python 2 vs 3 Conflicts

Ensure Jupyter uses Python 3:

```bash
python3 -m pip install ipykernel
python3 -m ipykernel install --user
jupyter notebook
```
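To check which interpreter a running kernel actually uses, you can execute a cell like the one below inside the notebook. It is a small sketch relying only on the standard `sys` module.

```python
import sys

# Run this in a notebook cell to see which interpreter backs the kernel.
print("Executable:", sys.executable)
print("Version:   ", sys.version.split()[0])

# Fail loudly if the kernel is still running Python 2.
if sys.version_info.major < 3:
    raise RuntimeError("This kernel runs Python 2; select a Python 3 kernel.")
```

If the executable path is not the Python 3 installation you expect, switch kernels from the notebook's Kernel menu after registering the new one.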

### Anaconda Path Issues (Windows)

If the `conda` command is not recognized:

1. Add Anaconda to the system PATH during installation, OR
2. Use "Anaconda Prompt" from the Start Menu instead of the standard Command Prompt

### Browser Doesn't Open Automatically

Copy the URL from the terminal output:

```
http://localhost:8888/?token=your_token_here
```

Paste it directly into the browser address bar.

### Package Version Conflicts

Create an isolated environment specifically for this project:

```bash
conda create -n dive-ml python=3.9 numpy pandas scikit-learn matplotlib jupyter
conda activate dive-ml
jupyter notebook
```