Setting Up a Multi-User GPU Server with Docker

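Background

The goal is to let several people share the server's GPU resources without interfering with one another, while keeping the allocation of system resources flexible.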

Server configuration

CPU

48  Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz

Two CPUs with 24 logical cores each (48 logical cores in total)

(Commands:

cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c

cat /proc/cpuinfo | grep "physical id" | sort | uniq -c)
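
The two commands above can be combined into one summary that separates sockets, physical cores, and logical processors. This is a minimal sketch; the function name cpu_summary is made up for illustration, and it assumes the usual x86 /proc/cpuinfo layout with "processor", "physical id", and "core id" fields.

```shell
#!/bin/sh
# Summarize CPU topology from a cpuinfo-formatted file (default: /proc/cpuinfo).
# cpu_summary is a hypothetical helper name, not part of any tool.
cpu_summary() {
    file="${1:-/proc/cpuinfo}"
    # Logical processors: one "processor" stanza per hardware thread.
    logical=$(grep -c '^processor' "$file")
    # Sockets: distinct "physical id" values.
    sockets=$(grep '^physical id' "$file" | sort -u | wc -l)
    # Physical cores: distinct (physical id, core id) pairs.
    cores=$(awk -F': *' '/^physical id/ {p=$2} /^core id/ {print p "-" $2}' "$file" \
        | sort -u | wc -l)
    echo "sockets=$((sockets)) cores=$((cores)) logical=$((logical))"
}

# Print the summary for this machine when /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    cpu_summary
fi
```

If the logical count is twice the core count, hyper-threading is enabled, which is why a per-CPU figure read from the processor list can be larger than the number of physical cores.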

Install the GPU driver

cd into the directory containing the .run file
sudo apt-get purge 'nvidia*'
sudo vim /etc/modprobe.d/blacklist-nouveau.conf

Add the following:

blacklist nouveau
options nouveau modeset=0
sudo update-initramfs -u
sudo apt-get install build-essential freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
sudo chmod +x NVIDIA-Linux-x86_64-410.104.run
sudo ./NVIDIA-Linux-x86_64-410.104.run --no-opengl-files --no-x-check
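
The blacklist file can also be written without opening an editor, which helps when the same setup is repeated on several machines. The sketch below is one possible way to do it; the helper names write_nouveau_blacklist and nouveau_loaded are invented for illustration, and the default path is the file edited above.

```shell
#!/bin/sh
# Write the nouveau blacklist entries to the given file.
# Pass a different path as the first argument to try it out safely.
# (write_nouveau_blacklist is a hypothetical helper name.)
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Return success if the nouveau kernel module is currently loaded.
nouveau_loaded() {
    lsmod 2>/dev/null | grep -q '^nouveau'
}
```

Run write_nouveau_blacklist as root, then sudo update-initramfs -u and reboot. If nouveau_loaded still succeeds after the reboot, the blacklist has not taken effect. Once the driver installer has finished, nvidia-smi should list the GPUs.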

Install Docker CE and nvidia-docker

See https://www.cnblogs.com/journeyonmyway/p/10318624.html

If the Docker installation went wrong, uninstall Docker:

sudo apt-get purge docker

sudo apt-get purge docker-ce

sudo apt-get remove -y 'docker-*'

sudo rm -rf /var/lib/docker

Verify the installation with docker --version
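
Beyond the version check, it can be worth confirming that containers can actually see the GPUs. The following is a rough sketch, assuming the nvidia-docker wrapper is on PATH and reusing the image tag from the next step; the helper name have is made up for illustration.

```shell
#!/bin/sh
# Return success if the named command is available on PATH.
# (have is a hypothetical helper name.)
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report the presence of each tool the following steps rely on.
for tool in docker nvidia-docker; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

# Run the GPU check only when both tools are present, since it pulls the image
# and needs a working driver on the host.
if have docker && have nvidia-docker; then
    nvidia-docker run --rm nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04 nvidia-smi \
        || echo "GPU check failed"
fi
```

If the last command prints the same GPU table as nvidia-smi on the host, the driver and the container runtime are working together.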

Create the container

docker pull nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

(Available Ubuntu and CUDA versions: https://hub.docker.com/r/nvidia/cuda/tags)

nvidia-docker run -dit --net host --name=cuda1 -h=LAB_VM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

docker exec -it cuda1 /bin/bash

apt-get update

apt-get install net-tools -y
apt-get install inetutils-ping

apt-get install vim

cp /etc/apt/sources.list /etc/apt/sources.list.bak
rm /etc/apt/sources.list
vim /etc/apt/sources.list
Add the Tsinghua (TUNA) mirror entries, following https://mirrors.tuna.tsinghua.edu.cn/help/ubuntu/
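
As an alternative to editing the file by hand, the mirror entries can be generated. This is only a sketch: the helper name write_tuna_sources is invented, the list of suites is an assumption based on the usual Ubuntu layout, and plain http is used on the assumption that the minimal image may lack ca-certificates. The TUNA help page remains the authoritative source for the exact entries.

```shell
#!/bin/sh
# Write apt sources for the given Ubuntu codename using the TUNA mirror.
# Usage: write_tuna_sources CODENAME [TARGET_FILE]
# (write_tuna_sources is a hypothetical helper name.)
write_tuna_sources() {
    codename="$1"
    target="${2:-/etc/apt/sources.list}"
    mirror="http://mirrors.tuna.tsinghua.edu.cn/ubuntu/"
    # Start from an empty file, then add one line per suite.
    : > "$target"
    for suite in "$codename" "$codename-updates" "$codename-backports" "$codename-security"; do
        echo "deb $mirror $suite main restricted universe multiverse" >> "$target"
    done
}
```

Inside the ubuntu18.04 image used here the codename is bionic, so the call would be write_tuna_sources bionic.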

apt-get update

apt-get install openssh-server

In /etc/ssh/sshd_config, change   #PermitRootLogin prohibit-password   to   PermitRootLogin yes

passwd root

service ssh start

cd /home

vim startup.sh

#!/bin/bash
service ssh start
/bin/bash

chmod 755 startup.sh

exit

Package as an image
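
A configured container is typically turned into an image with docker commit, after which one container per user can be started from that image. The following is a minimal sketch under several assumptions: the image name cuda-ssh:v1, the starting port 2200, and the helper name plan_containers are all hypothetical, and it maps a separate host port to each container's SSH port instead of using --net host, since with host networking every container's sshd would compete for the same port. It only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print the commands that snapshot the configured container (cuda1) and start
# one container per user. Nothing is executed here; pipe the output to sh to
# run it once it looks right.
IMAGE="cuda-ssh:v1"     # hypothetical image name
BASE_PORT=2200          # hypothetical first host port for SSH

# Usage: plan_containers USER...
plan_containers() {
    echo "docker commit cuda1 $IMAGE"
    i=0
    for user in "$@"; do
        # Each user gets the next host port, forwarded to port 22 in the container.
        port=$((BASE_PORT + i))
        echo "nvidia-docker run -dit -p $port:22 --name=$user -h $user $IMAGE /home/startup.sh"
        i=$((i + 1))
    done
}
```

For example, plan_containers alice bob prints one commit command and two run commands. Each user would then connect with ssh -p PORT root@SERVER, where PORT is the host port assigned to their container. Running /home/startup.sh as the container command starts sshd and then keeps a shell open so the container stays alive.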

References:

https://blog.csdn.net/hangvane123/article/details/88639279

Original article: https://www.cnblogs.com/walker-lin/p/11200074.html