I am installing CDH6 on three virtual machines. While installing the agents through the web interface, the step that waits for a heartbeat from the newly installed agent ran for about 1 minute and then reported the following error:
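Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
Ensure that ports 9000 and 9001 are not in use on the host being added.
Check the agent logs in /var/log/cloudera-scm-agent/ on the host being added.
If "Use TLS Encryption for Agents" is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added.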
<h1>Deployment environment</h1>
Three virtual machines run on a 32GB host. Each VM is configured with a 2-core CPU, 8GB of memory and a 40GB disk. In each VM, the enp0s3 network card connects to the external network through NAT, and the enp0s8 network card forms a local area network with the host through bridging. The host and the three virtual machines can all reach each other. The operating system is CentOS 7.2 with GNOME installed.
<h1>Deployment planning</h1>
The hostnames of the three hosts are cdh102, cdh103 and cdh104. The plan is to install the Manager on cdh102 and the agent on cdh102/103/104.
<h1>Installation process</h1>
Download the offline installation packages of CDH6 and the Manager to cdh102, place them in the HTTP service directory, and configure the yum source on cdh102/103/104 to point to cdh102 so that the installation works offline. The database is the MySQL version recommended on the official website, and Auto-TLS is enabled.
<h1>Debugging process</h1>
Based on the error reported when installing the agent, I did the following checks:
1) Check the hostnames of the three virtual machines
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.101 vm101
192.168.56.102 cdh102.pcicdh.com cdh102
192.168.56.103 cdh103.pcicdh.com cdh103
192.168.56.104 cdh104.pcicdh.com cdh104
cat /etc/hostname
cdh103.pcicdh.com
cat /etc/sysconfig/network
HOSTNAME=cdh102.pcicdh.com
I did not find any mistakes in these files.
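The same check can be scripted so that it is repeatable on all three machines. The following is a minimal Python sketch, assuming the /etc/hosts content shown above. The helper names parse_hosts and check_hosts are hypothetical and not part of any Cloudera tooling, and the script only compares text, it does not query DNS.

```python
def parse_hosts(text):
    """Map every name in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def check_hosts(text, expected):
    """Return the names whose IP is missing or differs from `expected`."""
    mapping = parse_hosts(text)
    return {name: mapping.get(name) for name, ip in expected.items()
            if mapping.get(name) != ip}


# Content copied from the `cat /etc/hosts` output in this step
HOSTS = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.56.101 vm101
192.168.56.102 cdh102.pcicdh.com cdh102
192.168.56.103 cdh103.pcicdh.com cdh103
192.168.56.104 cdh104.pcicdh.com cdh104
"""

EXPECTED = {
    "cdh102.pcicdh.com": "192.168.56.102",
    "cdh103.pcicdh.com": "192.168.56.103",
    "cdh104.pcicdh.com": "192.168.56.104",
}

if __name__ == "__main__":
    # An empty dict means every expected name maps to the expected IP
    print(check_hosts(HOSTS, EXPECTED))
```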
2) Check whether port 7182 of the Manager is accessible
nc -w 1 192.168.56.102 7182
When run, the command shows a blank line and nothing else happens. Trying other ports gives an error instead: Ncat: Connection refused. Does the silent blank line mean that the port is accessible?
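For reference, a silent nc usually means the TCP connection was established and nc is waiting for input, while "Connection refused" means nothing is listening on that port. The same distinction can be made with a short Python sketch that uses only the standard socket module. The function name tcp_port_open is hypothetical; the address and port in the usage part are the ones from this step.

```python
import socket


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection completes the TCP handshake, like `nc host port`
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers "connection refused", timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    # Address and port taken from the nc command above
    print(tcp_port_open("192.168.56.102", 7182))
```

Note that this only tests the TCP layer. It says nothing about whether a TLS handshake on top of that connection would succeed.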
3) Check whether ports 9000 and 9001 are free on the hosts where the agent is installed
On cdh104 the output is as follows:
[root@cdh104 ~]# lsof -i:9000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python2 2505 root 4u IPv4 22121 0t0 TCP cdh104.pcicdh.com:cslistener (LISTEN)
[root@cdh104 ~]# ps -ef | grep 2505
root 2505 1068 1 09:24 ? 00:00:58 /usr/bin/python2 /opt/cloudera/cm-agent/bin/../bin/cm status_server
root 4905 2572 0 10:24 pts/0 00:00:00 grep --color=auto 2505
[root@cdh104 ~]# lsof -i:9001
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python2 4618 root 10u IPv4 31088 0t0 TCP localhost:etlservicemgr (LISTEN)
[root@cdh104 ~]# ps -ef | grep 4618
root 4618 1 0 10:01 ? 00:00:13 /usr/bin/python2 /opt/cloudera/cm-agent/bin/cm agent
root 4927 2572 0 10:25 pts/0 00:00:00 grep --color=auto 4618
The same is true for the other two machines, which indicates that ports 9000 and 9001 are being used by the agent itself.
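The lsof output can be summarised with a small Python sketch to see which program holds each port. The function name listeners is hypothetical, and the sample text is copied from the cdh104 output in this step. lsof prints service names from /etc/services here (cslistener for 9000, etlservicemgr for 9001); running it with -P would show the numeric ports instead.

```python
def listeners(lsof_text):
    """Parse `lsof -i` output into (command, pid, user, name) tuples for LISTEN sockets."""
    rows = []
    for line in lsof_text.splitlines():
        fields = line.split()
        # skip header lines and anything that is not a listening socket
        if len(fields) < 10 or fields[0] == "COMMAND" or fields[-1] != "(LISTEN)":
            continue
        command, pid, user = fields[0], int(fields[1]), fields[2]
        name = fields[-2]  # e.g. "cdh104.pcicdh.com:cslistener"
        rows.append((command, pid, user, name))
    return rows


# Output of `lsof -i:9000` and `lsof -i:9001` on cdh104, as shown above
SAMPLE = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python2 2505 root 4u IPv4 22121 0t0 TCP cdh104.pcicdh.com:cslistener (LISTEN)
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python2 4618 root 10u IPv4 31088 0t0 TCP localhost:etlservicemgr (LISTEN)
"""

if __name__ == "__main__":
    for row in listeners(SAMPLE):
        print(row)
```

According to the ps output, both PIDs belong to cm processes under /opt/cloudera/cm-agent, so the ports are held by the agent rather than by an unrelated program.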
4) Check /var/log/cloudera-scm-agent/cloudera-scm-agent.log on cdh104 for ERROR entries
[09/Oct/2018 10:27:24 +0000] 4618 MainThread agent ERROR Heartbeating to cdh102.pcicdh.com:7182 failed.
Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1362, in _send_heartbeat
self.cfg.max_cert_depth)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/https.py", line 139, in __init__
self.conn.connect()
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/httpslib.py", line 80, in connect
sock.connect((self.host, self.port))
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 304, in connect
ret = self.connect_ssl()
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 291, in connect_ssl
return m2.ssl_connect(self.ssl, self._timeout)
SSLError: sslv3 alert bad certificate
[09/Oct/2018 10:27:29 +0000] 4618 MainThread agent ERROR Heartbeating to cdh102.pcicdh.com:7182 failed.
Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1362, in _send_heartbeat
self.cfg.max_cert_depth)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/https.py", line 139, in __init__
self.conn.connect()
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/httpslib.py", line 80, in connect
sock.connect((self.host, self.port))
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 304, in connect
ret = self.connect_ssl()
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/M2Crypto/SSL/Connection.py", line 291, in connect_ssl
return m2.ssl_connect(self.ssl, self._timeout)
SSLError: sslv3 alert bad certificate
According to the log, cdh104 cannot send a heartbeat to port 7182 on cdh102, even though the nc command in step 2 could connect to that port. So this problem remains unsolved.
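One possible reason the two observations do not contradict each other: nc only completes the TCP handshake, while the traceback fails later, inside the TLS handshake (m2.ssl_connect). The Python sketch below separates the two layers using the standard socket and ssl modules. It is an illustration only and not how the agent connects (the agent uses M2Crypto, as the traceback shows). The function name probe is hypothetical, certificate verification is deliberately disabled, and no client certificate is sent, so a server that requires one may still reject the handshake.

```python
import socket
import ssl


def probe(host, port, timeout=3.0):
    """Return 'tcp_failed', 'tls_failed' or 'tls_ok' for host:port."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"  # nothing reachable at the TCP level

    # Diagnostic context only: skip certificate checks to test the layering.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with ctx.wrap_socket(sock, server_hostname=host):
            return "tls_ok"
    except (ssl.SSLError, OSError):
        # TCP worked, but the TLS handshake was rejected or aborted
        return "tls_failed"
    finally:
        sock.close()


if __name__ == "__main__":
    # Host and port taken from the heartbeat error in the log above
    print(probe("cdh102.pcicdh.com", 7182))
```

If TCP succeeds but the log still shows a certificate alert, the network path is probably fine and the place to look is the TLS setup (certificates and the use_tls setting), which would be consistent with "sslv3 alert bad certificate".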
5) Check /etc/cloudera-scm-agent/config.ini on cdh104
Auto-TLS was enabled during installation, and "Use TLS Encryption for Agents" is also enabled in the Manager under Administration -> Settings -> Security. The use_tls setting in config.ini on cdh104 was 0, so I changed it to 1. After saving, I restarted the agent on cdh104 with the systemctl restart cloudera-scm-agent.service command; the configuration on the other two virtual machines was left unmodified. Clicking retry for cdh104 in the web interface still reports the same error.
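To confirm what value is actually in the file after the edit, use_tls can be read with Python's standard configparser. This is a sketch: the helper name read_use_tls is hypothetical, and the sample text assumes the option lives in a [Security] section, which may differ in your file, so the function searches every section.

```python
import configparser


def read_use_tls(ini_text):
    """Return the use_tls value as an int, or None if it is not set."""
    # interpolation=None and strict=False keep the parser tolerant of
    # '%' characters and repeated keys in a hand-edited file
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    for section in parser.sections():
        if parser.has_option(section, "use_tls"):
            return parser.getint(section, "use_tls")
    return None


# Assumed layout, for illustration only; on a real host read the text of
# /etc/cloudera-scm-agent/config.ini instead.
SAMPLE = """\
[General]
server_host=cdh102.pcicdh.com
server_port=7182

[Security]
use_tls=1
"""

if __name__ == "__main__":
    print(read_use_tls(SAMPLE))
```

Running the same check on cdh102 and cdh103 would show whether their files still contain use_tls=0, since only cdh104 was modified.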