Zabbix version: 3.0.3 (zabbix-server-mysql)
OS: Ubuntu 14.04 Trusty
Number of hosts (enabled/disabled/templates): 28 / 0 / 57
Number of items (enabled/disabled/not supported): 1349 / 161 / 47
Number of triggers (enabled/disabled): 902 / 39
Required server performance, new values per second: 22.86
Zabbix server config:
StartPollers=5
StartPollersUnreachable=2
StartTrappers=5
StartDiscoverers=3
StartHTTPPollers=5
I have a template with 3 items like this: net.tcp.port[<IP>,3128]. The template is applied to 10 servers.
Here is the problem: when I enable these items, events like "zabbix-agent on <hostname> is not available for 2 minutes" start to appear randomly on the 10 hosts where the template is applied. On the "Zabbix Server Performance" graph, the values representing zabbix[wcache,values] drop from 19-19.5 to 16-17. The values representing zabbix[queue] stay at 0 as before.
When I disable the items, the problem disappears.
The Zabbix server is not overloaded by I/O or CPU, and there is plenty of free memory, so it doesn't look like a hardware performance issue. The Zabbix agents on the hosts are reachable; I checked with nc -vz <hostname> 10050.
Nothing abnormal appears in the server log or in the agent logs on these 10 hosts.
I tried increasing ulimit -n for the Zabbix server process. The limit was raised: cat /proc/<zabbix_worker_pid>/limits now shows "Max open files 10240 10240 files". It didn't help.
I tried increasing StartPollers to 10 and then 15; that didn't help either.
What is happening to the server?
UPD:
Items type: Zabbix agent
All systems are running Linux, Ubuntu 14.04 Trusty
Agents on hosts run 3 listeners, 1 collector and 1 active checks process.
For 7 of these 10 hosts, zabbix_get -s <host> -k net.tcp.port[<IP>,3128] returns instantly for all 3 items. On the other 3 hosts it takes about 3 seconds and returns 0 (the monitored IPs are not reachable from those 3 hosts).
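The comparison above can be repeated across hosts with a small timing helper. This is only a sketch: time_check is a hypothetical function name, and the host names and the 192.0.2.10 address in the commented example are placeholders, not values from this setup.

```shell
# Hypothetical helper: run a command, discard its output, and report
# the elapsed whole seconds together with the command's exit status.
time_check() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    echo "elapsed=$((end - start))s rc=$rc"
}

# Example loop (placeholders, adjust to the real hosts and monitored IP):
# for h in host1 host2 host3; do
#     printf '%s: ' "$h"
#     time_check zabbix_get -s "$h" -k 'net.tcp.port[192.0.2.10,3128]'
# done
```

Hosts where the elapsed time comes close to the agent's configured timeout are the ones worth looking at first.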
Best Answer
Finally:
If:
a host has a net.tcp.port[<IP>,<port>] item (and a trigger using it), and
[<IP>,<port>] is unavailable in a way that ends in a TCP timeout
Then:
"Zabbix-agent on {HOST.NAME} is unavailable" (trigger expression: {agent.ping.nodata(2m)} = 1) starts firing on the hosts that have this item. It is not the trigger for the specific item that fires, but the trigger for agent availability. This is a bug, but the Zabbix developers do not seem to agree: https://support.zabbix.com/browse/ZBX-10868
Zabbix version 3.0.3 for both server and agent.
Possible workaround: define a user parameter in zabbix_agentd.conf
UserParameter=tcp_connect_check[*], /bin/nc -z "$1" "$2" -w "$3"; echo $?
and create items that use it with a connect timeout (the third parameter) lower than the Timeout value in zabbix_agentd.conf. To avoid security problems, do not enable UnsafeUserParameters in zabbix_agentd.conf.
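If nc behaves differently between hosts, the same idea can be wrapped in a small script. The following is a minimal sketch, not a tested replacement: it assumes bash and coreutils timeout are installed on the agent hosts, and the function name and the default timeout of 1 second are my own choices. Like the nc one-liner, it prints 0 when the connection succeeds and 1 when it fails or times out, which is the opposite of what net.tcp.port returns.

```shell
#!/bin/bash
# Hypothetical helper: try a TCP connection with a hard time limit.
# Prints 0 if the port accepted the connection, 1 otherwise.
tcp_connect_check() {
    local host="$1" port="$2" secs="${3:-1}"
    # timeout(1) stops the attempt after $secs seconds; the inner bash
    # opens the socket through its /dev/tcp pseudo-device ($0=host, $1=port).
    if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo 0
    else
        echo 1
    fi
}

# When saved as a script and called with arguments, run the check directly.
if [ "$#" -gt 0 ]; then
    tcp_connect_check "$@"
fi
```

Saved under a path of your choosing, the script would be referenced from a UserParameter line in the same way as the nc command above, passing "$1" "$2" "$3" through. The time limit still has to stay below the agent's own timeout for the workaround to have any effect.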