Why does a kernel OOM kill cause dockerd to die?

Problem description

When I run a K8s private cloud, the kernel's OOM killer is triggered repeatedly because of the memory limits on the JVM and Docker. After the kernel OOM kill is triggered, dockerd may die and become completely unresponsive:

  • docker images still works
  • docker ps still works
  • docker run -it nginx sh hangs with no response
  • service docker restart hangs with no response
  • reboot hangs with no response

Environment and what I have tried

  • operating system

CentOS Linux release 7.4.1708 (Core)
Linux x 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 GNU/Linux

  • Docker
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:25:29 2018
  OS/Arch:          linux/amd64
  Experimental:     false

I tried to turn off the kernel's OOM kill mechanism, but it didn't work.

sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
vm.swappiness = 0
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
kernel.sysrq = 1
vm.overcommit_memory = 2
Mar.17,2022

When an OOM kill occurs, the container process is forcibly terminated, and some of its resources may not be released in time, which can leave the Docker service in an abnormal state.

I don't think turning off the OOM killer is a good idea. It is better to find the cause of the memory leak, limit the container's memory size, or give the container swap space.
Please refer to https://docs.docker.com/confi.
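
As a rough sketch of the "limit the container's memory size" and "give the container swap" suggestions: docker run accepts --memory for the hard RAM limit and --memory-swap for the total of RAM plus swap. The helper name docker_limit_args and the 512 MB values below are illustrative assumptions, not something from the original post, and swap limits only take effect if the host has swap and swap accounting enabled.

```shell
#!/bin/sh
# Hypothetical helper: build memory-limit flags for docker run.
#   $1 = hard RAM limit in MB
#   $2 = extra swap allowed on top of RAM, in MB
docker_limit_args() {
    mem_mb=$1
    swap_mb=$2
    # --memory-swap is the TOTAL of RAM plus swap, so add the two values.
    total_mb=$((mem_mb + swap_mb))
    echo "--memory=${mem_mb}m --memory-swap=${total_mb}m"
}

# Dry run: print the command instead of executing it.
# Example values only; size them to the workload (for a JVM, keep the
# heap plus off-heap usage below the container limit).
echo "docker run -it $(docker_limit_args 512 512) nginx sh"
```

With a limit like this, a container that exceeds its cap is killed inside its own cgroup rather than pushing the whole host into a global OOM condition.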
