
Eclipse Deeplearning4J Deployment & Usage Guide

1. Prerequisites

Runtime Requirements

  • JDK: Java 8 or higher (JDK 11 required for running the full test suite due to Spark/Scala compatibility)
  • Build Tool: Maven 3.6+ or Gradle 6+
  • Git: For cloning examples and source builds

Platform Support

  • OS: Windows, Linux, macOS (CPU only for macOS)
  • Architectures: x86_64 (AVX2/AVX512), ARM (arm64, armhf), PowerPC (ppc64le)

GPU Requirements (Optional)

  • CUDA: Version 10.0, 10.1, or 10.2 (Linux/Windows only; macOS not supported for GPU)
  • cuDNN: Matching CUDA version
  • NVIDIA Driver: Compatible with CUDA 10.x

Build from Source Prerequisites

If building LibND4J (C++ backend) from source:

  • CMake 3.14+
  • C++ compiler with C++14 support (GCC 7+, Clang 6+, MSVC 2019+)
  • OpenBLAS or MKL (optional but recommended for CPU acceleration)
  • For CUDA builds: CUDA Toolkit matching target version

2. Installation

Quick Start (Maven)

Add to your pom.xml:

<dependencies>
  <!-- Core DL4J -->
  <dependency>
    <groupId>org.eclipse.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-M2.1</version>
  </dependency>
  
  <!-- CPU Backend (ND4J Native) -->
  <dependency>
    <groupId>org.eclipse.deeplearning4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
    <version>1.0.0-M2.1</version>
  </dependency>
</dependencies>

GPU Backend (CUDA)

Replace the CPU backend with:

<dependency>
  <groupId>org.eclipse.deeplearning4j</groupId>
  <artifactId>nd4j-cuda-10.2-platform</artifactId>
  <version>1.0.0-M2.1</version>
</dependency>

Note: Adjust version to match your CUDA installation (10.0, 10.1, or 10.2).

Gradle Configuration

dependencies {
    implementation 'org.eclipse.deeplearning4j:deeplearning4j-core:1.0.0-M2.1'
    implementation 'org.eclipse.deeplearning4j:nd4j-native-platform:1.0.0-M2.1'
}
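For GPU builds under Gradle, swap the backend artifact the same way as in the Maven example (sketch; adjust the CUDA version suffix to match your installation):

```gradle
dependencies {
    implementation 'org.eclipse.deeplearning4j:deeplearning4j-core:1.0.0-M2.1'
    // CUDA backend in place of nd4j-native-platform; the version suffix must match the installed CUDA toolkit
    implementation 'org.eclipse.deeplearning4j:nd4j-cuda-10.2-platform:1.0.0-M2.1'
}
```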

Clone Examples Repository

git clone https://github.com/eclipse/deeplearning4j-examples.git
cd deeplearning4j-examples
mvn clean install

3. Configuration

Backend Selection

DL4J automatically detects available backends. Priority order:

  1. CUDA (if available and configured)
  2. Native CPU (AVX512 > AVX2 > Generic)

Force CPU backend via system property:

java -Dorg.nd4j.linalg.factory.Nd4jBackend=org.nd4j.linalg.cpu.nativecpu.CpuBackend -jar app.jar
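To confirm which backend was actually picked up at runtime, a small check like the following can be run (a sketch; it requires an ND4J backend on the classpath and triggers backend initialization on first use):

```java
import org.nd4j.linalg.factory.Nd4j;

public class BackendCheck {
    public static void main(String[] args) {
        // Prints the loaded backend implementation class,
        // e.g. a CPU (nativecpu) or CUDA (jcublas) backend
        System.out.println("Backend: " + Nd4j.getBackend().getClass().getName());
        // Default floating-point data type in use
        System.out.println("Data type: " + Nd4j.dataType());
    }
}
```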

Memory Configuration

Native libraries allocate off-heap memory that -Xmx does not cover. Configure the JavaCPP limits explicitly; note that maxphysicalbytes caps total process memory (heap plus off-heap), so it should be at least Xmx + maxbytes:

# Minimum recommended for small models
java -Xmx8g -Dorg.bytedeco.javacpp.maxbytes=8G -Dorg.bytedeco.javacpp.maxphysicalbytes=16G -jar app.jar

# For large models or training
java -Xmx16g -Dorg.bytedeco.javacpp.maxbytes=16G -Dorg.bytedeco.javacpp.maxphysicalbytes=32G -jar app.jar
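A plain-JDK sanity check (no DL4J required) can confirm that the flags actually reached the JVM; the property names below are the JavaCPP ones shown above:

```java
public class MemCheck {
    public static void main(String[] args) {
        // JavaCPP reads these at native allocation time; null means the flag was not passed
        System.out.println("maxbytes         = " + System.getProperty("org.bytedeco.javacpp.maxbytes"));
        System.out.println("maxphysicalbytes = " + System.getProperty("org.bytedeco.javacpp.maxphysicalbytes"));
        // Heap ceiling from -Xmx, in bytes
        System.out.println("heap max         = " + Runtime.getRuntime().maxMemory());
    }
}
```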

Workspace Configuration

Enable memory workspaces for better performance. Workspaces are set per network on the configuration builder (WorkspaceMode lives in org.deeplearning4j.nn.conf):

.trainingWorkspaceMode(WorkspaceMode.ENABLED)
.inferenceWorkspaceMode(WorkspaceMode.ENABLED)

GPU-Specific Settings

// Select specific GPU (if multiple available)
Nd4j.getAffinityManager().attachThreadToDevice(Thread.currentThread(), 0);

// Configure the CUDA environment, e.g. allow use of all detected GPUs
CudaEnvironment.getInstance().getConfiguration().allowMultiGPU(true);
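For training across several GPUs in a single process, DL4J provides ParallelWrapper; a hedged sketch follows (the builder values are illustrative and should be tuned to your hardware; `model` is an initialized MultiLayerNetwork and `trainIterator` a DataSetIterator):

```java
import org.deeplearning4j.parallelism.ParallelWrapper;

// ParallelWrapper clones the model once per worker/device
ParallelWrapper wrapper = new ParallelWrapper.Builder(model)
    .prefetchBuffer(24)       // batches pre-loaded per worker
    .workers(2)               // typically one worker per GPU
    .averagingFrequency(3)    // average parameters every 3 minibatches
    .build();

wrapper.fit(trainIterator);
```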

4. Build & Run

Basic Project Structure

src/
├── main/
│   ├── java/
│   │   └── com/example/
│   │       └── NeuralNetExample.java
│   └── resources/
│       └── logback.xml (optional logging config)
└── test/
    └── java/

Minimal Working Example

Create src/main/java/DL4JQuickstart.java:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class DL4JQuickstart {
    public static void main(String[] args) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .updater(new Adam(0.001))
            .list()
            .layer(0, new DenseLayer.Builder().nIn(784).nOut(256)
                .activation(Activation.RELU).build())
            .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .activation(Activation.SOFTMAX).nIn(256).nOut(10).build())
            .build();
            
        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        System.out.println("Model initialized successfully");
    }
}
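As a quick smoke test, the initialized network can be fed a random input inside main() — a sketch, assuming the same 784-in/10-out shape as above:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// One random "image" of 784 features; the softmax output should have shape [1, 10]
INDArray input = Nd4j.rand(1, 784);
INDArray output = model.output(input);
System.out.println("Output shape: " + java.util.Arrays.toString(output.shape()));
```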

Compile and Run

mvn clean compile exec:java -Dexec.mainClass="DL4JQuickstart"

Or package and run:

mvn clean package
java -Xmx4g -cp "target/myapp-1.0-SNAPSHOT.jar:target/dependency/*" DL4JQuickstart

(On Windows, use ";" instead of ":" as the classpath separator.)

Running Tests

The full test suite requires JDK 11:

cd platform-tests
mvn test -Djdk.version=11

5. Deployment

Maven Central (Recommended)

Use pre-built artifacts for production:

  • Stable releases available on Maven Central
  • Snapshot versions via OSSRH repository (if needed)

Docker Deployment

Create Dockerfile:

FROM openjdk:11-jdk-slim
WORKDIR /app
COPY target/myapp-1.0-SNAPSHOT.jar app.jar
COPY target/dependency/* ./lib/
ENV JAVA_OPTS="-Xmx8g -Dorg.bytedeco.javacpp.maxbytes=8G"
# Shell form so $JAVA_OPTS is expanded; the exec (JSON) form would pass it as a literal string
ENTRYPOINT java $JAVA_OPTS -cp "app.jar:lib/*" com.example.Main

Build and run:

docker build -t dl4j-app .
docker run --memory=10g dl4j-app

Apache Spark Deployment

For distributed training:

<dependency>
    <groupId>org.eclipse.deeplearning4j</groupId>
    <artifactId>dl4j-spark_2.12</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
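With that dependency in place, distributed training wraps an ordinary network configuration. A minimal sketch, assuming `sc` (JavaSparkContext), `conf` (MultiLayerConfiguration), and `trainData` (JavaRDD&lt;DataSet&gt;) already exist:

```java
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer;
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster;

// Parameter averaging: workers train locally, parameters are averaged periodically
ParameterAveragingTrainingMaster tm =
    new ParameterAveragingTrainingMaster.Builder(32)  // examples per DataSet object in the RDD
        .batchSizePerWorker(32)
        .averagingFrequency(5)   // average every 5 minibatches
        .build();

SparkDl4jMultiLayer sparkNet = new SparkDl4jMultiLayer(sc, conf, tm);
sparkNet.fit(trainData);
```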

Submit job:

spark-submit --class com.example.TrainingJob \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 16g \
  --driver-memory 8g \
  myapp.jar

Model Import Deployment

For inference-only deployment with imported models:

import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.nd4j.autodiff.samediff.SameDiff;
import java.io.File;

// Import a Keras model (requires the deeplearning4j-modelimport dependency)
MultiLayerNetwork model = KerasModelImport.importKerasSequentialModelAndWeights("model.h5");

// Import a frozen TensorFlow graph via SameDiff
SameDiff sd = SameDiff.importFrozenTF(new File("model.pb"));

6. Troubleshooting

Native Library Loading Errors

Issue: UnsatisfiedLinkError: no jnind4jcpu in java.library.path

Solution:

  • Ensure nd4j-native-platform is in dependencies (brings native binaries for all platforms)
  • For custom builds, set library path:
    java -Djava.library.path=/path/to/libnd4j/libs ...
    

CUDA Version Mismatch

Issue: CUDA error: CUDA driver version is insufficient for CUDA runtime version

Solution:

  • Verify CUDA toolkit version matches driver capabilities
  • Use nd4j-cuda-10.2-platform only if CUDA 10.2 is installed
  • Check available CUDA version: nvcc --version

OutOfMemoryError (Off-Heap)

Issue: java.lang.OutOfMemoryError: Physical memory usage is too high

Solution:

  • Increase off-heap limits (maxphysicalbytes should be at least Xmx + maxbytes):
    -Dorg.bytedeco.javacpp.maxbytes=16G -Dorg.bytedeco.javacpp.maxphysicalbytes=32G
    
  • Enable workspaces: .trainingWorkspaceMode(WorkspaceMode.ENABLED) in configuration
  • Reduce batch size

macOS GPU Not Supported

Issue: Cannot load CUDA backend on macOS

Solution:

  • Use CPU backend (nd4j-native-platform) on macOS
  • GPU acceleration only available on Linux/Windows

AVX Optimization Issues

Issue: Illegal instruction (core dumped) on older CPUs

Solution:

  • Force generic x86 backend by excluding optimized binaries:
    <dependency>
      <groupId>org.eclipse.deeplearning4j</groupId>
      <artifactId>nd4j-native</artifactId>
      <version>1.0.0-M2.1</version>
      <classifier>linux-x86_64</classifier> <!-- Generic, no AVX -->
    </dependency>
    

Build from Source Failures

Issue: CMake or C++ compilation errors

Solution:

  • Install build essentials: apt-get install build-essential cmake (Ubuntu) or yum groupinstall "Development Tools" (RHEL)
  • Set explicit architecture: -Dnative.cpu.skip=true -Djavacpp.platform=linux-x86_64
  • For CUDA builds: Ensure CUDA_HOME environment variable is set

Community Support

For additional issues: