Zabbix: How to achieve a 1s interval with active items


Is it even possible/recommended?

We are trying to set up a few active items configured with a 1-second interval. However, those items don't keep to the desired interval; instead we get a value roughly every 30 seconds (as seen on the corresponding graph).

We tested this with a simple "echo 1" as a user parameter on the client side, which should be sent every second without delay, but it is not. We also deployed a client on the server itself with an item configured the same way, and that one is successfully gathered every second.

Our Zabbix setup is relatively new, so the underlying MySQL DB is rather small and we don't have that many clients/items. The server is running in a Linux VM and the clients are on dedicated Linux hosts (not on a local network).

We looked at the configuration files on both the server and the client, but didn't see anything that could help us achieve this (apart from adding more trappers). This does not seem to be a connection issue, as the client-side buffer should compensate for that.

Can't post more links, so here are the things we looked at:

  • Performance tuning page in the Zabbix reference manual
  • Alexei Vladishev's Zabbix performance tuning slides (found on SlideShare)

Best Answer

Thanks to @Richlv, and after some tests of my own (see the comments), we found the issue. Active items are processed sequentially, and the commands behind our items could take a while to return. That delay added up across all the items, so the agent was doing its best just to loop through them and could not keep to the 1-second interval.
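To illustrate why the delays stack up, here is a minimal sketch of a simplified model (my assumption, not the agent's actual scheduler): if the items are run one after another, a given item cannot be collected again before every other item's command has returned. The function name and the numbers are made up for illustration and are not measurements from our setup.

```python
def effective_interval(durations, desired=1.0):
    """Seconds between two values of the same item in a simplified model.

    Assumes the checks run strictly one after another, so one full pass
    over all items takes the sum of the individual command durations.
    """
    return max(desired, sum(durations))


# Hypothetical example: 20 items whose commands each take 1.5 s to return.
slow_items = [1.5] * 20
print(effective_interval(slow_items))  # 30.0

# Hypothetical example: 5 items whose commands return in 10 ms each.
fast_items = [0.01] * 5
print(effective_interval(fast_items))  # 1.0
```

In this model the configured 1-second interval only holds while the sum of all command durations stays below one second.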

As it isn't possible to process active items in parallel, the possible solutions in this case are the following:

  • Increasing the time interval for active items
  • Using zabbix_sender to manually do the work (may need to also implement the client-side buffer that is provided with active items)
  • Perhaps another approach would be to use log file monitoring
  • Running 2 agents on the same client, thus setting up parallel processes, and spreading the active items wisely between them (not a great solution though)
  • Improving the performance of the command behind the active items and/or reducing its worst-case timing (with a timeout, for instance); this is what we did
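As an illustration of the last option, a command can be wrapped so that its worst-case run time is bounded. This is only a sketch of the general idea, not the exact wrapper we used: `run_with_timeout`, the timeout values and the fallback value are hypothetical, and the child commands are stand-ins for a real check.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s=0.5, fallback=""):
    """Run argv and return its stripped stdout.

    Returns `fallback` if the command exceeds timeout_s, exits with a
    non-zero status, or cannot be started at all.
    """
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed once this expires
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return fallback
    return result.stdout.strip()


# A fast command returns its output; a slow one is cut off after 0.2 s.
fast = run_with_timeout([sys.executable, "-c", "print(1)"], timeout_s=5.0)
slow = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(5)"],
    timeout_s=0.2,
    fallback="0",
)
print(fast, slow)
```

With a bound like this, one slow command can delay the other items by at most its timeout instead of by however long it happens to hang.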