Dec 4, 2009 2:31 PM
Need a hand deleting an OsProcess Instance
All, I have a couple OS Processes instanced in Zope that seem to be orphaned. Any ideas on how to delete them would be appreciated. I think something was corrupted. This could have happened during the 2.5.1 upgrade, or it could have happened before that and I just did not realize....
I seem to have an orphaned OS process in the Object database. It looks like this:
ToManyRelationship at /zport/dmd/Processes/osProcessClasses/asterisk/instances
Name rasterisk e1e1d3d40573127e9ee0480caf1283d6
Hi guyverix,
There is currently no asterisk ZenPack installed. One could have been installed at some point, but not that I had any knowledge of.
There is also an sshd process that I cannot seem to delete. The other 15-20 process classes are all behaving normally.
I still think this seems more like an orphaned object or something. If I could get it deleted, I am sure I can add them back in and be OK again.
Thanks for the ideas though.
--Randy
I am having the same (similar?) problem. I was experimenting and created a custom Process with a regex to look for an Oracle process, but somehow it got corrupted and now I'm trying to delete it. In the GUI I get an error if I click on it or attempt to delete it. I ran across this thread and tried the same commands to delete it from zendmd, but I get exactly the same result you do.
>>> for d in dmd.Processes.Oracle.osProcessClasses():
...     print d.id
...
oracle
>>>
>>> for d in dmd.Processes.Oracle.osProcessClasses():
...     d.manage_delObjects(d.id)
...
Traceback (most recent call last):
  File "<console>", line 2, in ?
  File "/opt/zenoss/lib/python/OFS/ObjectManager.py", line 529, in manage_delObjects
    raise BadRequest, '%s does not exist' % escape(ids[-1])
BadRequest: oracle does not exist
>>>
I’m a complete Python noob so any help would be greatly appreciated!
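One possible reading of the traceback above: manage_delObjects looks for the given ids among the children of the object it is called on, so d.manage_delObjects(d.id) asks the process class to delete a child named after itself. The sketch below illustrates that with a stand-in class. ToyContainer and its methods are hypothetical, not Zope or Zenoss APIs, and the code is meant for a plain Python interpreter rather than zendmd.

```python
class ToyContainer(object):
    """Hypothetical stand-in for a container; NOT a Zope or Zenoss class."""

    def __init__(self, id):
        self.id = id
        self._children = {}

    def add(self, child):
        self._children[child.id] = child
        return child

    def child_ids(self):
        return sorted(self._children)

    def delete_children(self, ids):
        # The ids are looked up among this object's own children only.
        for child_id in ids:
            if child_id not in self._children:
                raise KeyError('%s does not exist' % child_id)
            del self._children[child_id]


organizer = ToyContainer('Oracle')
oracle = organizer.add(ToyContainer('oracle'))

# Asking the object to delete itself fails: 'oracle' has no child 'oracle'.
try:
    oracle.delete_children([oracle.id])
    failed = False
except KeyError:
    failed = True

# Asking the parent that holds the object succeeds.
organizer.delete_children([oracle.id])
```

If that reading is right, the delete would have to be issued on whatever object holds the process class, not on the process class itself.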
In the Zope Management Interface at:
/zport/dmd/Processes/osProcessClasses/asterisk/instances
I see "normal" devices like these....
/zport/dmd/Devices/Server/Linux/devices/ply-astcc1.zayoms.net/os/processes/usr_sbin_asterisk
I have a similar issue with corruptions in the process table after a 2.4.x -> 2.5.x upgrade. Navigating to /Processes/Zenoss (where Zenoss is my suborganizer for defining Zenoss process monitors), I see that some processes have a count greater than 1. They should all be 1 except for zenhub (which matches both zenhub and zenhubworker). All those with counts greater than 1 are broken.
Using zendmd to look at the Zope database I get:
The broken ones, zeneventlog and zenwin, have 3 identical entries. zenprocess has 3 entries, including a weird view.
I have tried doing:
reindex()
commit()
and this doesn't resolve the errors; however it DID resolve the problem whereby zenprocess wouldn't start at all. It also looks like process data IS now being collected for all processes, including those processes that seem to have corrupt instances. I just can't see / modify the definition of the process.
Can any Zenoss gurus suggest commands / procedures to clean up the database non-destructively?
BTW, I suspect that similar reported errors with zenstatus after upgrade can be attributed to similar instance corruptions in Zope.
Cheers,
Jane
Are you working in a test environment, cpp-zen? Mine is production, and I am a bit loath to experiment until I have a quiet slot. I did get a command to remove a particular instance of a process, which did seem to work, but for me the problem just moved to another instance. I can't currently go back and try removing each instance.
To get all the instances of a Process, use:
for d in dmd.Processes.Zenoss.osProcessClasses.zenhub.instances():
    print d.id
where Zenoss is my process SubOrganizer - just omit this if all your process definitions are directly under /Processes,
and zenhub is the process you want to find instances for.
If you have a corruption, I think you will see more instances than actually exist, probably with identical hex number identifiers. From this output, the first line is index 0, the second is index 1, and so on.
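Counting the output lines by hand to find an index is easy to get wrong. As a rough sketch, the helper below pairs each id with the positions at which it appears and keeps only the ids that appear more than once. FakeInstance and the sample ids are made up for illustration. In zendmd the input would be the list returned by the instances() call above, and it is an assumption that those positions line up with the order in which the loop prints.

```python
from collections import defaultdict, namedtuple


def index_by_id(instances):
    """Map each id to the list of positions where it appears."""
    positions = defaultdict(list)
    for index, inst in enumerate(instances):
        positions[inst.id].append(index)
    return dict(positions)


def duplicate_ids(instances):
    """Return only the ids that appear more than once, with their positions."""
    return dict(
        (inst_id, where)
        for inst_id, where in index_by_id(instances).items()
        if len(where) > 1
    )


# Hypothetical sample data standing in for the result of ...instances().
FakeInstance = namedtuple('FakeInstance', 'id')
sample = [
    FakeInstance('zenhub_aaaa'),
    FakeInstance('zenwin_bbbb'),
    FakeInstance('zenwin_bbbb'),
]
dupes = duplicate_ids(sample)
```

With this sample, only the repeated id is reported, together with positions 1 and 2.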
To delete an instance (say, the third line), use:
dmd.Processes.Zenoss.osProcessClasses.zenhub.instances._remove(dmd.Processes.Zenoss.osProcessClasses.zenhub.instances._objects[2])
If anyone who has this problem can safely try removing spurious / all instances and report back here, that would be great.
You will also need to reindex() and commit() again, or things certainly will be messed up and you may go back to zenprocess not starting.
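When more than one spurious entry has to go, removing them one at a time changes the positions of the entries that remain. The sketch below looks up all targets before removing any of them. FakeRelationship is a hypothetical stand-in that only mimics the _objects and _remove names used in the command above, assuming _objects behaves like a Python list. It is not the Zenoss class.

```python
class FakeRelationship(object):
    """Hypothetical stand-in exposing only _objects and _remove."""

    def __init__(self, objects):
        self._objects = list(objects)

    def _remove(self, obj):
        # Assumption: removal drops one matching entry from the list.
        self._objects.remove(obj)


def remove_at(rel, indexes):
    """Remove the entries found at the given positions of rel._objects."""
    # Resolve every position to its object first, so that later removals
    # cannot shift the positions we were given.
    targets = [rel._objects[i] for i in sorted(set(indexes))]
    for obj in targets:
        rel._remove(obj)
    return len(targets)


rel = FakeRelationship(['keep', 'stale-a', 'stale-b'])
removed = remove_at(rel, [1, 2])
```

Against a real relationship in zendmd, the reindex() and commit() calls mentioned above would still need to follow.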
I guess the other thing to try is going back to the earlier post here and trying the little script with the d.manage_delObjects(d.id).
Cheers,
Jane
This is a production environment, but I have some leniency to work on it since things got wonky after the 2.5 upgrade.
However, I ran the code:
for d in dmd.Processes.Oracle.osProcessClasses.oracle.instances():
    print d.id
My output was simply:
bin_bash
ssh
Which makes sense, because I think I have my regex messed up, so I'm picking up the processes from Zenoss when it's running Oracle commands. But I'm not seeing any hex like you implied. Would I still be able to run the delete command on these two instances? If so, what would the command be? (Sorry, I'm a total zendmd/Python noob.)
Perhaps we should wait and see if anyone from development can expand a little on what these hex numbers are? On my test system (which doesn't have messed-up processes), when I use zendmd to list process instances for devices, some have a hex number after the name and some don't, and I can't see any pattern as to which ones have the hex.
Any gurus out there who can divulge more about the internals of the process table?
Cheers,
Jane
Hi Jane, thanks for the help with this. Here is what I have done. My issue was with a "stuck" asterisk process.
I went (in zenoss, of course) to all the asterisk servers, and deleted the osProcess instance on all of them (from the device OS tab).
I then ran the for loop in zendmd to list what was left (basically, only the stuck instance was left).
I then ran the _remove call that you provided. Thanks again for the help.
I committed it.
Things for that instance and the asterisk process class were now empty, and the UI was working fine again.
I then remodelled the asterisk devices, and the correct osProcess instances showed back up in the UI.
All is well for me.
I suspect there is/was some problem in the zenmodel routine we run nightly. But, I did model the devices involved (all the asterisk servers) and things seem fine. I'll let a full remodel run tonight, and report if the issue re-introduces itself.
I am not sure if these problems were caused by corrupt data, or if the zenmodel routine is somehow not working due to a deeper issue. But getting to a clean baseline should help. I'll post more after the weekend when I know more.
I am flagging your response as "correct" as it does correctly show how to remove an osProcess instance, which was the original request I made in the thread.
Thanks again for your help.
--Randy
Brilliant! I hoped that might work but couldn't really try it out currently on my production system. At an appropriate time, I will follow what you have done.
It would still be nice to hear from someone about the internals of the process table, though.
Cheers,
Jane
Copyright © 2005-2011 Zenoss, Inc.