Last week I started working on the @GEMwrld HPC cluster at the Eidgenössische Technische Hochschule Zürich (ETH). The first task was to implement a kind of system monitoring and resource profiling. We already have a centralized graphical web console, built upon Observium, that collect resources statistics from all of our servers; usually we use SNMP v2c and, just for a few cases, the munin-node daemon (more on this will be in the part 2).
On the ETH cluster we have full access to every node, but the only incoming connections allowed by the ETH firewall are the SSH sessions (on port 22), so there’s a first problem: how can we transport SNMP data from the agent to the poller?
The answer is easy: SSH tunnel! But: usually SNMP uses UDP protocol; making an UDP SSH tunnel is a bit painful. To workaround this issue we had simply used the TCP protocol for SNMP.
First of all you need an SSH tunnel. In this case the tunnel is made by the monitoring server:
ssh -N monitoring@serverip -L 16000:localhost:161
The 161 is the remote TCP port to be forwarded and 16000 is the local port.
To use SNMP on TCP you have to modify the net-snmp daemon (snmpd) command line parameters (and not the snmpd config file); just edit /etc/default/snmpd (on ubuntu/debian, for RHEL the path is /etc/sysconfig/snmpd) as following:
SNMPDOPTS='-LF 6 /var/log/snmpd.log -u snmp -g snmp -I -smux -p /var/run/snmpd.pid TCP:161'
The important part is ‘TCP:161’. This will bind snmpd on every interface and on the TCP port 161 (the default SNMP port).
On the poller side you will need to configure your monitoring system to poll localhost:16000 with tcp protocol. For Observium you can add the host with:
./addhost.php localhost community v2c 16000 tcp
Now there’s another problem. The Zurich GEM cluster is made by eight nodes and I do not want to start an SSH tunnel on every node. Then my idea was to use a kind of proxy on the control node:
net-snmp helped me with the snmpd proxy capabilities: http://www.net-snmp.org/wiki/index.php/Snmpd_proxy