Aug 26, 2011 2:44 AM
WARNING zen.zenprocess: Queue exceeded maximum length: 101476/100000. Trimming
Hi,
I have noticed that some events for several processes have remained in the Zenoss console without being removed. Watching the zenprocess log, I still get the Trimming warning even though I've increased the zenprocess queue up to 10.000.
2011-08-16 19:09:28,980 WARNING zen.zenprocess: Queue exceeded maximum length: 101476/100000. Trimming
Can anyone help me find a solution?
I have come to think it may be a queue management problem. I looked at PBDaemon.py, where the trimming is done, and found this part of the code:
# Set a maximum size on the eventQueue to avoid consuming all RAM.
queueLen = len(self.eventQueue)
if queueLen > self.options.maxqueuelen:
    self.log.warn('Queue exceeded maximum length: %d/%d. Trimming',
                  queueLen, self.options.maxqueuelen)
    diff = queueLen - self.options.maxqueuelen
    self.eventQueue = self.eventQueue[diff:]
With the last assignment, events are dropped from the queue before we are able to process them, so they get lost. Is this correct?
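To make the question concrete, here is a minimal sketch of what that slice does, using a plain list in place of the daemon's eventQueue. The trim_queue helper and the event names are hypothetical and only for illustration; they are not part of PBDaemon.py, and the sketch assumes new events are appended at the end of the list.

```python
def trim_queue(event_queue, max_len):
    """Mimic the trimming step on a plain list (hypothetical helper).

    Returns (kept, dropped) so the discarded events are visible.
    """
    queue_len = len(event_queue)
    if queue_len <= max_len:
        return event_queue, []
    diff = queue_len - max_len
    # Same slice as in the snippet above: the first `diff` entries
    # (the oldest ones, assuming events are appended at the end)
    # are discarded and never sent.
    return event_queue[diff:], event_queue[:diff]


events = ["event-%d" % i for i in range(12)]
kept, dropped = trim_queue(events, 10)
print("kept:", len(kept), "dropped:", dropped)
```

In this sketch the two oldest entries never leave the function in the kept list, which is the loss I am asking about.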
Thanks and best regards.
mgcarrete:
As you noted, you can increase the queue length. That length is just a buffer before events are simply discarded. High event queues can occur when daemons first start up, if daemons are reconfiguring too often for the load they carry, or if there is a severe bottleneck at zenhub or your mysqld. Can you post the collector's reconfiguration period? Can you also post the load on the collector, number of devices, type of devices, collection interval and hardware (CPU, RAM, disk platform) of the collector?
Best,
--Hackman238
Here are the data:
Event Log Cycle Interval (secs): 60
SNMP Performance Cycle Interval (secs): 60
Process Cycle Interval (secs): 60
Config Cycle Interval (mins): 2
Number of devices: 3
Type of devices: Server/Linux
Processes monitored: 2000
Hardware of the collector: 15 Intel Xeon processors and 24GB RAM
mgcarrete:
You're monitoring 2000 processes over 3 devices? As in like 650 processes on each? I just want to be sure I understand.
Best,
--Hackman238
Yes, that's right.
Best regards.
mgcarrete:
I'm not positive that's going to work. That's a pretty special situation. Does your disk IO look like it's a bottleneck?
Add the following to zenprocess.conf:
cacheconfigs true
maxqueuelen 500000
eventflushchunksize 200
Add the following to zenhub.conf (for the hub that zenprocess connects to, in case you have multiple collectors, etc.):
workers 6
cachesize 100000
pcachesize 2000
Restart the zenhub then restart zenprocess. Let me know how it turns out.
Best,
--Hackman238
Hi,
Thanks for your answer. Since we haven't got a machine with seven processors (workers 6), we haven't been able to carry out the test correctly. Besides, we haven't found the cacheconfigs field in zenprocess.conf.
What we would like to know is how to solve the problem we mentioned earlier:
I have noticed that some events for several processes have remained in the Zenoss console without being removed. Watching the zenprocess log, I still get the Trimming warning even though I've increased the zenprocess queue up to 10.000.
Events shouldn't be lost once the queue is full, as happens now:
diff = queueLen - self.options.maxqueuelen
self.eventQueue = self.eventQueue[diff:]
Best regards.
mgcarrete:
Sorry, I was under the impression you had 16 processors since you said 15 Xeons (I figured 0-15). I'd back the workers off to 3. Taking a quick peek, you're right: zenprocess doesn't have caching in v2/v3.
Anyway, if the event queue is being trimmed, it's because zenhub is being flooded. When the queue exceeds the limit it's trimmed and some events might be discarded. This often leads to lost clear events, which would be why some of your events persist. Zenhub floods very easily: it can happen if it's constantly reconfiguring large batches, if event chunking is too large, if caching is too small, if there are too few workers, or if mysqld is badly backed up.
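To illustrate the lost-clear case, here is a rough sketch, not Zenoss code. It assumes events are simple (component, severity) pairs, that severity 0 means clear, and that the alert for "httpd" was already sent in an earlier flush; apply_events and the process names are made up for the example.

```python
CLEAR = 0   # assumption: severity 0 is the clear severity
ERROR = 4   # assumption: any non-zero severity opens an alert


def apply_events(open_alerts, events):
    """Toy stand-in for the console: clears close alerts, others open them."""
    alerts = set(open_alerts)
    for component, severity in events:
        if severity == CLEAR:
            alerts.discard(component)
        else:
            alerts.add(component)
    return alerts


# The clear for "httpd" is the oldest entry still waiting in the queue.
queue = [("httpd", CLEAR)] + [("proc-%d" % i, ERROR) for i in range(5)]
max_len = 5

# Same trimming slice as in PBDaemon.py: the oldest entries go first.
diff = len(queue) - max_len
trimmed = queue[diff:]

without_trim = apply_events({"httpd"}, queue)
with_trim = apply_events({"httpd"}, trimmed)
print("httpd still open without trimming:", "httpd" in without_trim)
print("httpd still open with trimming:", "httpd" in with_trim)
```

Once the clear is trimmed away, nothing closes the "httpd" alert in this toy model, which matches the events you see persisting in the console.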
Best,
--Hackman238
For anybody who comes across this thread with similar problems, you might be interested in my post: Understanding Your Zenperfsnmp Event Queue.