Compilation and Installation
Procedure
- Use PuTTY to log in to the server as the root user.
- Install the Docker toolkit.
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo
dnf clean expire-cache --refresh
dnf install -y nvidia-docker2
systemctl restart docker
- Obtain the base image.
docker pull nvidia/cuda-arm64:11.1-devel-centos8
- Run Docker.
docker run -it nvidia/cuda-arm64:11.1-devel-centos8 /bin/bash
- Modify the /etc/yum.repos.d/nvidia-ml.repo file.
- Open the /etc/yum.repos.d/nvidia-ml.repo file.
vi /etc/yum.repos.d/nvidia-ml.repo
- Press i to enter the edit mode and add the following content:
[nvidia-ml]
name=nvidia-ml
baseurl=https://developer.download.nvidia.com/compute/machine-learning/repos/rhel8/sbsa
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-NVIDIA
sslverify=false
- Press Esc, type :wq!, and press Enter to save the settings and exit.
- Modify the /etc/yum.repos.d/cuda.repo file.
- Open the /etc/yum.repos.d/cuda.repo file.
vi /etc/yum.repos.d/cuda.repo
- Press i to enter the edit mode and add the following content:
[cuda]
name=cuda
baseurl=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/sbsa
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-NVIDIA
sslverify=false
- Press Esc, type :wq!, and press Enter to save the settings and exit.
- Refresh the Yum source cache.
yum makecache
- Install the system dependencies.
yum install -y zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel python3 python3-devel gcc wget cmake3 libarchive git which
- Download and install cudnn8.
wget https://developer.download.nvidia.cn/compute/machine-learning/repos/rhel8/sbsa/libcudnn8-8.0.4.30-1.cuda11.1.aarch64.rpm --no-check-certificate
rpm -ivh libcudnn8-8.0.4.30-1.cuda11.1.aarch64.rpm
wget https://developer.download.nvidia.cn/compute/machine-learning/repos/rhel8/sbsa/libcudnn8-devel-8.0.4.30-1.cuda11.1.aarch64.rpm --no-check-certificate
rpm -ivh libcudnn8-devel-8.0.4.30-1.cuda11.1.aarch64.rpm
- Deploy Anaconda.
When executing the ./Anaconda3-2021.05-Linux-aarch64.sh script, press Enter and type yes when prompted. The script installs Anaconda in the /root/anaconda3 directory by default.
wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-aarch64.sh
chmod +x Anaconda3-2021.05-Linux-aarch64.sh
./Anaconda3-2021.05-Linux-aarch64.sh
export PATH=/root/anaconda3/bin/:$PATH
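If the interactive prompts are unwanted, the installer can also be run in silent mode (a sketch; -b auto-accepts the license and -p sets the install prefix, matching the default path above):

```shell
# Non-interactive Anaconda install (sketch); skipped if the installer
# script has not been downloaded yet.
installer=Anaconda3-2021.05-Linux-aarch64.sh
if [ -f "$installer" ]; then
  sh "$installer" -b -p /root/anaconda3
  export PATH=/root/anaconda3/bin:$PATH
fi
```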
- Install the PyTorch dependencies.
conda install astunparse numpy ninja pyyaml setuptools cmake cffi typing_extensions future six requests dataclasses
(Optional) Configure the conda proxy as follows:
cat > /root/.condarc <<EOF
channels:
  - conda-forge
  - defaults
proxy_servers:
  http: xxxxxx
  https: xxxxxx
ssl_verify: false
EOF
- Set the environment variables.
export PATH=/root/anaconda3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64
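These exports only affect the current shell. To make them survive new shell sessions, they can be appended to the root user's .bashrc (a sketch; $HOME is /root inside this container, and the paths assume the default install locations above):

```shell
# Persist the environment variables for future shells (sketch).
cat >> "$HOME/.bashrc" <<'EOF'
export PATH=/root/anaconda3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64
EOF
```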
- Download and compile the PyTorch source code.
export GIT_SSL_NO_VERIFY=1
git clone --recursive https://github.com/pytorch/pytorch --depth=1
cd pytorch
git submodule sync
git submodule update --init --recursive --jobs 0
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
python setup.py install
- Create another terminal shell to export the container.
docker ps

Move large files such as the PyTorch source code out of the container, delete the files, and export the image.
- Back up the source code.
docker cp 4ad81495b7ef:/pytorch .
- Delete the PyTorch source code from the container.
rm -rf pytorch/
rm -f Anaconda3-2021.05-Linux-aarch64.sh
rm -f libcudnn8-8.0.4.30-1.cuda11.1.aarch64.rpm
rm -f libcudnn8-devel-8.0.4.30-1.cuda11.1.aarch64.rpm
- Export the image.
docker export -o pytorch_cuda.tar 4ad81495b7ef
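The backup, cleanup, and export steps above can be sketched as a single script run from the host. It is a dry run by default so the commands can be reviewed first; 4ad81495b7ef is the container ID reported by docker ps in this guide, so substitute your own:

```shell
# Dry-run sketch of the backup-and-export workflow (CID from `docker ps`).
# Leave DRY_RUN unset to only print the commands; set DRY_RUN= to execute them.
CID=4ad81495b7ef
run() { echo "+ $*"; if [ -z "${DRY_RUN-1}" ]; then "$@"; fi; }

run docker cp "$CID":/pytorch .               # back up the source tree to the host
run docker exec "$CID" rm -rf /pytorch        # remove it inside the container
run docker export -o pytorch_cuda.tar "$CID"  # export the slimmed-down container
```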
Parent topic: PyTorch 1.9 Porting Guide (CentOS 8.2)