Aug 13, 2012 12:55 PM
Can't keep zenhub running
I sure hope someone can help me figure this out. I've been through all the logs, forums, and anything else I can think of, so as a last resort I'm reaching out to the community. For background, I'm not a power Linux user by any means, but I know enough to be dangerous. All we need is simple monitoring and alerting, but without Zenhub we don't get alerts, which isn't good. Zenhub won't stay running for more than 2-3 days at this point, and it's been that way for months. There's nothing in the zenhub log. It just stops. I found this in the zencommand log, which is the most information I can find on the issue.
2012-08-10 21:35:52,849 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 704, in doFetchConfig
self.setPropertyItems(driver.next())
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-10 21:35:53,342 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 752, in start
driver.next()
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-11 03:35:53,484 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 704, in doFetchConfig
self.setPropertyItems(driver.next())
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-11 03:35:53,685 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 752, in start
driver.next()
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
We are running this on Ubuntu Linux 12.04.1 and Zenoss is the latest version: Zenoss 3.2.1.
I'd truly appreciate some help in getting this fixed. Please let me know if there is any other information I can provide. Thank you in advance.
sunnydave:
On the Zenoss master, su zenoss, then zenhub stop. Once it's stopped, run ps aux | grep zenhub and kill any remaining workers if there are any. Once done, run zenhub start -v10. Keep an eye on $ZENHOME/log/zenhub.log and post the log when zenhub dies. Can you also please post your $ZENHOME/etc/zenhub.conf?
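For the "find any leftover workers" step, here is a minimal Python sketch of the same check that ps aux | grep zenhub does by eye. It assumes the usual 11-column ps aux layout with the command line in the last column; the function names are made up for illustration and are not part of Zenoss.

```python
import subprocess


def find_zenhub_pids(ps_output):
    """Return PIDs of processes whose command line mentions zenhub.

    Assumes the standard `ps aux` layout: a header row followed by
    11 whitespace-separated columns, the last being the command line.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # keep the whole command in the last field
        if len(fields) < 11:
            continue
        pid, command = fields[1], fields[10]
        # Skip the grep process itself if the output came from a piped grep.
        if "zenhub" in command and "grep" not in command:
            pids.append(int(pid))
    return pids


def list_leftover_zenhub():
    """Run `ps aux` and return the PIDs of anything that still looks like zenhub."""
    output = subprocess.check_output(["ps", "aux"]).decode("utf-8", "replace")
    return find_zenhub_pids(output)
```

Calling list_leftover_zenhub() after zenhub stop should return an empty list. Any PIDs it does return are the candidates to kill before starting zenhub again. Note that the match is a plain substring test, so it also catches zenhubworker processes.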
Thanks!
Best,
--Shane Scott (Hackman238)
Hi Shane,
First off, I really appreciate the response and apologize for the delay in getting back. I've been on the road, traveling for work.
I've done the steps above and restarted zenhub with verbose level 10 debugging (-v10). I'll keep an eye out and post the log next time it fails.
As for $ZENHOME/etc/zenhub.conf, there's not much to it, it appears: the one line in the conf file is commented out.
#PARAMETER | VALUE |
I'll post again when I have the logs after zenhub quits. Thank you again for your time and assistance. Best.
-Dave
sunnydave:
Anytime.
I await the log. Thanks man!
Best,
--Shane Scott (Hackman238)
Hi Shane,
Zenhub quit at some point over the weekend, and I've captured the logs and downloaded the files. However, the log collection is quite large: I've got three files of 10 MB each, and another, which is the latest, of about 5 MB.
I looked through them briefly to see if I could find where the daemon quitting occurred, but honestly, nothing jumped out at me. How would you like me to get you the logs? Thank you again for your assistance.
Dave
Hi Shane,
Additionally, I found this in zencommand.log. Over the last few months of trying to track this down, I've looked through all the logs, and this is the one that references zenhub quitting with an error. I'm not sure if it's related, but I want you to have all the information, as well as a potential timestamp for when the problem occurred.
2012-08-26 03:01:47,676 INFO zen.zencommand: config: <Products.ZenRRD.zencommand.DeviceConfig instance at 0x40aca28>
2012-08-26 07:59:38,657 ERROR zen.zencommand: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 587, in processSchedule
c.start(self.pool).addBoth(self.finished)
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 342, in start
d = pr.start(self)
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 92, in start
reactor.spawnProcess(self, shell, self.cmdline, env=None)
File "/usr/local/zenoss/python/lib/python2.6/site-packages/Twisted-8.1.0-py2.6-linux-x86_64.egg/twisted/internet/posixbase.py", line 224, in spawnProcess
processProtocol, uid, gid, childFDs)
File "/usr/local/zenoss/python/lib/python2.6/site-packages/Twisted-8.1.0-py2.6-linux-x86_64.egg/twisted/internet/process.py", line 538, in __init__
self._fork(path, uid, gid, executable, args, environment, fdmap=fdmap)
File "/usr/local/zenoss/python/lib/python2.6/site-packages/Twisted-8.1.0-py2.6-linux-x86_64.egg/twisted/internet/process.py", line 373, in _fork
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
2012-08-26 09:10:49,799 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 704, in doFetchConfig
self.setPropertyItems(driver.next())
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-26 09:10:50,500 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 752, in start
driver.next()
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-26 15:10:50,796 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 704, in doFetchConfig
self.setPropertyItems(driver.next())
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
2012-08-26 15:10:51,229 ERROR zen.zencommand: ZenHub is down
Traceback (most recent call last):
File "/usr/local/zenoss/zenoss/Products/ZenRRD/zencommand.py", line 752, in start
driver.next()
File "/usr/local/zenoss/zenoss/Products/ZenUtils/Driver.py", line 64, in result
raise ex
HubDown: ZenHub is down
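With 35 MB of logs to go through, one way to narrow down when things went wrong is to pull out the first time each distinct ERROR message appears. This is only a rough sketch: it assumes the timestamp/level/logger layout shown in the excerpt above, and the function name is made up for illustration.

```python
import re
from collections import OrderedDict

# Matches lines such as:
# 2012-08-26 07:59:38,657 ERROR zen.zencommand: [Errno 12] Cannot allocate memory
ERROR_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ ERROR ([\w.]+): (.*)$"
)


def first_errors(lines):
    """Map each distinct ERROR message to the first timestamp it was logged at.

    Messages are kept in order of first appearance, so whatever was logged
    before the first "ZenHub is down" shows up ahead of it.
    """
    seen = OrderedDict()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match:
            timestamp, _logger, message = match.groups()
            seen.setdefault(message, timestamp)
    return seen
```

Passing it an open log file (for example first_errors(open(path)), with path pointing at zencommand.log) gives a short summary instead of pages of repeated tracebacks. On the excerpt above, the "Cannot allocate memory" entry comes out ahead of the first "ZenHub is down", which may or may not be a coincidence.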
Thank you.
Dave
Hi Shane,
I zipped up the files and sent them your way. Thank you again. I look forward to hearing back from you. Best regards.
Dave
Oh no! Somebody dump sacked my inbox!
Just kidding. I got your logs sunnydave. I'll take a look and get back to you.
Best,
--Shane Scott (Hackman238)
Hello Gentlemen,
First of all, I'm a newbie to Zenoss. I downloaded the 4.2 VM image and loaded it on an ESX server. After loading a few devices and a few MIBs, I too am seeing my zenhub crash. It will not stay up at all. It immediately shuts down.
When I attempt to restart the service I get the following in the zenhub log:
2012-08-28 09:14:01,227 INFO zen.HubService.RenderConfig: Starting graph retrieval listener on port 8090
2012-08-28 09:14:04,239 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
2012-08-28 09:14:07,242 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
2012-08-28 09:14:07,242 ERROR zen.ZenHub: Unable to send an event
Traceback (most recent call last):
File "/opt/zenoss/Products/ZenHub/zenhub.py", line 576, in sendEvent
self.zem.sendEvent(Event(**kw))
File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 60, in sendEvent
event = self._publishEvent(event)
File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 82, in _publishEvent
publisher.publish(event)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 283, in publish
self._publish("$RawZenEvents", routing_key, event, mandatory=mandatory, immediate=immediate)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 302, in _publish
mandatory, immediate)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 376, in publish
headers=headers, declareExchange=declareExchange)
File "/opt/zenoss/lib/python/zenoss/protocols/amqp.py", line 138, in publish
raise Exception("Could not publish message. Connection may be down")
Exception: Could not publish message. Connection may be down
2012-08-28 09:14:12,730 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
2012-08-28 09:14:15,734 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
I deleted all of the zenhubworker processes with the same results.
Any idea?
Chuck
Chuck:
Looks like your rabbitmq-server daemon might not be running. Check that and restart zenhub.
Best,
--Shane Scott (Hackman238)
Shane,
The rabbitmq-server appears to be running. I performed a restart and checked its status. I deleted the zenhubworker processes and performed a start from Zenoss:Daemons. Same result.
Chuck
Chuck:
Interesting. I would double-check that mysqld is running as well. Next, double-check that $ZENHOME/etc/globals.conf points to the local rabbit and mysqld instances with valid creds.
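A quick way to confirm that something is actually listening where zenhub expects it is a plain TCP connect test. The sketch below is an assumption-laden example, not anything shipped with Zenoss: the ports are just the standard AMQP (5672) and MySQL (3306) defaults and localhost is a guess, so swap in whatever host and port your globals.conf really names.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


# Assumed defaults: 5672 is the standard AMQP port, 3306 the standard MySQL port.
# Replace host and port with the values from your own globals.conf.
SERVICES = {
    "rabbitmq": ("localhost", 5672),
    "mysqld": ("localhost", 3306),
}


def check_services(services=SERVICES):
    """Return {service name: True/False} depending on whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, (host, port) in services.items()}
```

Running check_services() on the Zenoss box shows at a glance whether either service is refusing connections. An open port only proves the daemon is listening. It says nothing about whether the credentials or the vhost are valid, so those still need checking by hand.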
Best,
--Shane Scott (Hackman238)
Hi Shane,
I had our sysadmin load up the 4.2 virtual appliance yesterday on our ESX host. I figured that if we can't get the 3.2.1 version figured out, perhaps we can at least migrate upwards for some stability. I gave it a basic configuration with an IP, DNS, timezone, local SNMP, etc., and also installed the sendmail MTA for notifications. I didn't add any nodes at all for monitoring. I did remove an EC2 instance that was pre-populated. Once I rebooted the machine, I started getting the same error as Chuck. Zenhub won't start at all.
I looked at globals.conf, and it all appears correct. I certainly didn't make any changes there. Everything is running, including rabbitmq, mysqld, etc.
2012-08-29 10:49:24,997 ERROR zen.ZenHub: Unable to send an event
Traceback (most recent call last):
File "/opt/zenoss/Products/ZenHub/zenhub.py", line 576, in sendEvent
self.zem.sendEvent(Event(**kw))
File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 60, in sendEvent
event = self._publishEvent(event)
File "/opt/zenoss/Products/ZenEvents/MySqlSendEvent.py", line 82, in _publishEvent
publisher.publish(event)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 283, in publish
self._publish("$RawZenEvents", routing_key, event, mandatory=mandatory, immediate=immediate)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 302, in _publish
mandatory, immediate)
File "/opt/zenoss/Products/ZenMessaging/queuemessaging/publisher.py", line 376, in publish
headers=headers, declareExchange=declareExchange)
File "/opt/zenoss/lib/python/zenoss/protocols/amqp.py", line 138, in publish
raise Exception("Could not publish message. Connection may be down")
Exception: Could not publish message. Connection may be down
2012-08-29 10:49:47,766 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
2012-08-29 10:49:50,775 INFO zen.zenoss.protocols.amqp: amqp connection was closed [Errno 104] Connection reset by peer
That's the latest on my end. Thanks again for your time and assistance.
Dave
This sounds a lot like message/67159#67159. Have you looked at creating the rabbitmq-env.conf file?