problem description
I run a private K8s cloud, and the kernel's OOM killer keeps being triggered because of the memory limits on the JVM and Docker. After the kernel OOM is triggered, dockerd may hang and become completely unresponsive:
- docker images still works
- docker ps still works
- docker run -it nginx sh hangs with no response
- service docker restart hangs with no response
- reboot hangs with no response
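Since read-only commands still answer while docker run, service docker restart, and reboot all hang, one thing that could be checked is whether some process is stuck in uninterruptible sleep (state D); such a process cannot be killed and can block a restart or shutdown. This is only a diagnostic sketch under that assumption, not a confirmed cause. The helper names d_state_procs and list_d_state are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep the header row plus every row whose STAT column
# starts with "D" (uninterruptible sleep).
# Expects "PID STAT WCHAN COMMAND" rows on stdin.
d_state_procs() {
  awk 'NR == 1 || $2 ~ /^D/'
}

# List D-state processes together with the kernel function they are waiting in.
list_d_state() {
  ps -eo pid,stat,wchan:32,comm | d_state_procs
}
```

Running list_d_state on the affected node right after an OOM event would show whether dockerd or a container process is blocked, and in which kernel function. Because kernel.sysrq = 1 is set on this host, echo w > /proc/sysrq-trigger followed by dmesg should also dump the stacks of blocked tasks.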
environment and what I have tried
- operating system
CentOS Linux release 7.4.1708 (Core)
Linux x 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 GNU/Linux
- Docker
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:23:03 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:25:29 2018
OS/Arch: linux/amd64
Experimental: false
I tried to turn off the kernel's OOM kill mechanism, but it didn't work.
sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
vm.swappiness = 0
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
kernel.sysrq = 1
vm.overcommit_memory = 2
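A note on the last line: as far as I understand, vm.overcommit_memory = 2 switches the kernel to strict commit accounting. It does not switch off the OOM killer, and kills caused by a container's cgroup memory limit are not affected by it, which may be why the change had no visible effect. The sketch below only reads the OOM-related settings so they can be compared across nodes; the helper names sysctl_value and show_oom_sysctls are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key from "key = value" lines
# on stdin, the format printed by `sysctl -p` and `sysctl -a`.
sysctl_value() {
  awk -F' *= *' -v key="$1" '$1 == key { print $2 }'
}

# Print the OOM-related kernel settings currently in effect on this host.
show_oom_sysctls() {
  for key in vm.overcommit_memory vm.overcommit_ratio \
             vm.panic_on_oom vm.oom_kill_allocating_task; do
    printf '%s = %s\n' "$key" "$(sysctl -n "$key")"
  done
}
```

For example, sysctl -p | sysctl_value vm.overcommit_memory extracts the single value from the output above, and show_oom_sysctls prints the related keys in the same key = value form.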