Need a hand deleting an OsProcess Instance

166509 Views, 20 Replies. Latest reply: Dec 16, 2009 8:55 PM by rbilder

rbilder

36 posts since
Jun 25, 2008

Dec 4, 2009 2:31 PM

All, I have a couple of OS process instances in Zope that seem to be orphaned. Any ideas on how to delete them would be appreciated. I think something was corrupted. This could have happened during the 2.5.1 upgrade, or it could have happened earlier without my noticing.

I seem to have an orphaned OS process in the Object database. It looks like this:

ToManyRelationship at /zport/dmd/Processes/osProcessClasses/asterisk/instances

Name rasterisk e1e1d3d40573127e9ee0480caf1283d6

Normally the instances would be related to a specific device, but for some reason this one is not.

I can run this:

>>> for d in dmd.Processes.osProcessClasses.asterisk.instances():
...     print d.id
...
rasterisk e1e1d3d40573127e9ee0480caf1283d6
>>>

But I cannot seem to delete it. I get this:

>>> for d in dmd.Processes.osProcessClasses.asterisk.instances():
...     d.manage_delObjects(d.id)
...
Traceback (most recent call last):
  File "<console>", line 2, in ?
  File "/usr/local/zenoss/zenoss/lib/python/OFS/ObjectManager.py", line 529, in manage_delObjects
    raise BadRequest, '%s does not exist' % escape(ids[-1])
BadRequest: rasterisk e1e1d3d40573127e9ee0480caf1283d6 does not exist
>>>

Not sure if I am using the correct method though. Any ideas would be welcome.
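
One possible reading of the traceback, offered as a hedged sketch: manage_delObjects appears to look for the given id among the children of the object it is called on. In the loop above that object is the instance itself, so the call searches inside d for a child named d.id and finds nothing. The toy classes below are simplified stand-ins written for illustration, not the Zope or Zenoss API; only the method name manage_delObjects and the BadRequest message format come from the traceback.

```python
class BadRequest(Exception):
    """Stand-in for the exception type named in the traceback."""


class ToyContainer:
    """Hypothetical, simplified container; not the real OFS.ObjectManager."""

    def __init__(self, id):
        self.id = id
        self.children = {}

    def add(self, child):
        self.children[child.id] = child
        return child

    def manage_delObjects(self, ids):
        # Accept a single id or a list of ids.
        if isinstance(ids, str):
            ids = [ids]
        for child_id in ids:
            if child_id not in self.children:
                raise BadRequest("%s does not exist" % child_id)
            del self.children[child_id]


parent = ToyContainer("processes")
stuck = parent.add(ToyContainer("rasterisk e1e1d3d40573127e9ee0480caf1283d6"))

# Calling the method on the object itself searches its own (empty) children.
try:
    stuck.manage_delObjects(stuck.id)
    error = None
except BadRequest as exc:
    error = str(exc)

# Calling it on the container that holds the object succeeds.
parent.manage_delObjects(stuck.id)
remaining = list(parent.children)
```

Whether the orphaned instance still has a reachable container in the real database is a separate question; later in the thread it is removed through the relationship instead.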

Thanks again

--Randy

guyverix
846 posts since
Jul 10, 2007

1. Dec 7, 2009 4:15 AM (in response to rbilder)

Huh, /asterisk/instances. Looking at that makes me wonder if you had the Asterisk ZenPack installed at one point. If that is the case, and you did not install it during the upgrade, you might need to install it, move your asterisk device to a temporary class, and then uninstall the Asterisk ZenPack so that Zope can cleanly remove the references.

rbilder
36 posts since
Jun 25, 2008

2. Dec 7, 2009 11:09 AM (in response to guyverix)

Hi guyverix,

There is currently no asterisk ZenPack installed. One could have been installed at some point, but not that I have any knowledge of.

There is also an sshd process that I cannot seem to delete. The other 15-20 process classes are all behaving normally.

I still think this seems more like an orphaned object or something. If I could get it deleted, I am sure I can add them back in and be OK again.

Thanks for the ideas though.

--Randy

cpp_zen
10 posts since
Dec 7, 2009

3. Dec 7, 2009 7:48 PM (in response to rbilder)

I am having the same (similar?) problem. I was experimenting and created a custom process class with a regex to look for an Oracle process, but somehow it got corrupted and now I'm trying to delete it. In the GUI I get an error if I click on it or attempt to delete it. I ran across this thread and tried the same commands to delete it from zendmd, but I get exactly the same result you do.

>>> for d in dmd.Processes.Oracle.osProcessClasses():
...      print d.id
...
oracle
>>>

>>> for d in dmd.Processes.Oracle.osProcessClasses():
...      d.manage_delObjects(d.id)
...
Traceback (most recent call last):
  File "<console>", line 2, in ?
  File "/opt/zenoss/lib/python/OFS/ObjectManager.py", line 529, in manage_delObjects
    raise BadRequest, '%s does not exist' % escape(ids[-1])
BadRequest: oracle does not exist
>>>

I’m a complete Python noob, so any help would be greatly appreciated!

guyverix
846 posts since
Jul 10, 2007

4. Dec 7, 2009 8:26 PM (in response to cpp_zen)

Can you copy the process information from the GUI and put it in here like this?

Name     Regex          Monitor   Count
mysqld   sbin\/mysqld   True      4

cpp_zen
10 posts since
Dec 7, 2009

5. Dec 7, 2009 8:28 PM (in response to guyverix)

(Not trying to hijack the thread but here's mine)

Name     Regex      Monitor   Count
oracle   oracle.*   False     2

I set Monitor to False to see if that made a difference, but of course it didn't. I can't find which 2 servers it thinks have the process either.

guyverix
846 posts since
Jul 10, 2007

6. Dec 7, 2009 8:41 PM (in response to cpp_zen)

Go to hostname:8080/manage. Click Find >> Advanced. Ctrl-click OSProcess, OSProcessClass, and OSProcessOrganizer. Now put Oracle in the expr field and hit Find. It should return the device instances that are still looking at that process. I have never deleted entries this way before, so I don't know if it is safe to blow them away without doing bad things to the database. Having said that, if you drill down into OSProcessOrganizer >> osProcessClasses you should see your stuck OS process. However, it might be safer to delete and re-add the two devices that have this osProcess active on them, since you can see them in the database entry.

rbilder
36 posts since
Jun 25, 2008

7. Dec 8, 2009 10:59 AM (in response to guyverix)

In the Zope Management Interface at:

    /zport/dmd/Processes/osProcessClasses/asterisk/instances

I see "normal" devices like these....

    /zport/dmd/Devices/Server/Linux/Asterisk/devices/tmp-ast1.zayoms.net/os/processes/bin_bash 805ce91b69aa08ea4738563e41788860

    /zport/dmd/Devices/Server/Linux/devices/ply-astcc1.zayoms.net/os/processes/usr_sbin_asterisk

They seem to behave normally. You can click them and get to the device listed.

But, the instance in question shows:

    rasterisk e1e1d3d40573127e9ee0480caf1283d6

This does not look like any of the other Asterisk instances (or, for that matter, like any of the OSProcesses we monitor).

I suspect the device has been deleted, or this is some sort of corruption.

The question then, is how do I delete it? When you click this object you get a Zenoss site error. If you attempt to delete it (from the Zope Mgr), you get a message, "BadRequest: rasterisk e1e1d3d40573127e9ee0480caf1283d6 does not exist".

I have also tried zendmd, using the code shown in the original post and some variants, to get it deleted from the object database.

--Randy

jcurry
1,021 posts since
Apr 15, 2008

8. Dec 11, 2009 5:08 AM (in response to rbilder)

I have a similar issue with corruptions in the process table after a 2.4.x -> 2.5.x upgrade. Navigating to /Processes/Zenoss (where Zenoss is my suborganizer for defining Zenoss process monitors), I see that some processes have a count greater than 1. They should all be 1 except for zenhub (which matches both zenhub and zenhubworker). All those with counts greater than 1 are broken.

Using zendmd to look at the Zope database I get:
The broken ones, zeneventlog and zenwin, have 3 identical entries. zenprocess has 3 entries, including a weird view.

I have tried doing:
reindex()
commit()

and this doesn't resolve the errors; however it DID resolve the problem whereby zenprocess wouldn't start at all. It also looks like process data IS now being collected for all processes, including those processes that seem to have corrupt instances. I just can't see / modify the definition of the process.

Can any Zenoss gurus suggest commands / procedures to clean up the database non-destructively?

BTW, I suspect that similar reported errors with zenstatus after upgrade can be attributed to similar instance corruptions in Zope.

Cheers,
Jane

cpp_zen
10 posts since
Dec 7, 2009

9. Dec 11, 2009 11:42 AM (in response to jcurry)

Wow, good stuff! Running reindex() then commit() in zendmd finally got zenprocess started for me as well. I still get an error when I click on or try to delete that "oracle" process I created, but I can live with that if zenprocess will actually continue to function on everything else. Did this work for anyone else?

jcurry
1,021 posts since
Apr 15, 2008

10. Dec 11, 2009 12:46 PM (in response to cpp_zen)

Are you working in a test environment, cpp_zen? Mine is production and I am a bit loath to experiment until I have a quiet slot. I did get a command to remove a particular instance of a process, which did seem to work, but for me the problem just moved to another instance. I can't currently go back and try removing each instance.

To get all the instances of a Process, use:
for d in dmd.Processes.Zenoss.osProcessClasses.zenhub.instances():
    print d.id

where Zenoss is my process SubOrganizer - just omit this if all your process definitions are directly under /Processes,
and zenhub is the process you want to find instances for.
If you have a corruption, I think you will see more instances than actually exist, probably with identical hex number identifiers. From this output, the first line is index 0, the second is index 1, and so on.
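
As a rough illustration of the check described above, the sketch below groups a list of instance ids by value and reports the positions of any id that appears more than once. The ids are made-up sample data, and find_duplicate_ids is a hypothetical helper, not part of zendmd; in a real session the list would come from the print d.id loop.

```python
from collections import defaultdict


def find_duplicate_ids(ids):
    """Return {id: [indexes]} for every id that occurs more than once."""
    positions = defaultdict(list)
    for index, instance_id in enumerate(ids):
        positions[instance_id].append(index)
    return {k: v for k, v in positions.items() if len(v) > 1}


# Made-up sample output, in the order the loop printed it.
sample_ids = [
    "zenhub 0123456789abcdef0123456789abcdef",
    "zenhubworker",
    "zenhub 0123456789abcdef0123456789abcdef",
]

duplicates = find_duplicate_ids(sample_ids)
```

The indexes in the result use the same numbering as the text: the first printed line is index 0.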

To delete an instance, (say the third line), use:
dmd.Processes.Zenoss.osProcessClasses.zenhub.instances._remove(dmd.Processes.Zenoss.osProcessClasses.zenhub.instances._objects[2])

If anyone who has this problem can safely try removing spurious / all instances and report back here, that would be great.
You will also need to reindex() and commit() again, or things will certainly be messed up and zenprocess may go back to not starting.
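
To make the index-based removal concrete, here is a toy model of a relationship that keeps its members in an ordered list. ToyRelationship and ToyInstance are hypothetical stand-ins that only mimic the _objects list and the _remove call quoted above; the sketch makes no claim about what the real Zenoss relationship does internally.

```python
class ToyInstance:
    """Hypothetical stand-in for a process instance."""

    def __init__(self, id):
        self.id = id


class ToyRelationship:
    """Hypothetical stand-in with an ordered _objects list."""

    def __init__(self, objects):
        self._objects = list(objects)

    def __call__(self):
        # Mirrors the instances() call style used in the thread.
        return list(self._objects)

    def _remove(self, obj):
        # Remove exactly this object (by identity), not merely an equal one.
        for index, candidate in enumerate(self._objects):
            if candidate is obj:
                del self._objects[index]
                return
        raise ValueError("object is not in this relationship")


instances = ToyRelationship(
    [ToyInstance("zenhub"), ToyInstance("zenhubworker"), ToyInstance("zenhub")]
)

# The third printed line is index 2.
instances._remove(instances._objects[2])
remaining_ids = [d.id for d in instances()]
```

In a plain list like this one, indexes shift after each removal, so when removing several entries it is safer to list the instances again between removals or to work from the highest index down.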

I guess the other thing to try is going back to the earlier post here and trying the little script with d.manage_delObjects(d.id).

Cheers,
Jane

cpp_zen
10 posts since
Dec 7, 2009

11. Dec 11, 2009 1:07 PM (in response to jcurry)

This is a production environment, but I have some leeway to work on it since things got wonky after the 2.5 upgrade.

However, I ran the code:

for d in dmd.Processes.Oracle.osProcessClasses.oracle.instances():
    print d.id

My output was simply:

bin_bash
ssh

Which makes sense, because I think I have my regex messed up, so I'm picking up the processes from Zenoss when it's running oracle commands. But I'm not seeing any hex like you implied. Would I still be able to run the delete command on these two instances? If so, what would the command be? (Sorry, I'm a total zendmd/Python noob.)
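
A hedged sketch of how a pattern like oracle.* could end up matching shell and ssh processes, assuming the pattern is searched against the whole command line rather than only the executable name (the thread does not confirm how Zenoss applies it). The command lines are invented examples.

```python
import re

pattern = re.compile(r"oracle.*")

# Invented command lines for illustration only.
command_lines = [
    "/bin/bash /home/oracle/check_db.sh",
    "ssh oracle@dbhost uptime",
    "/usr/sbin/sshd -D",
]

# search() matches anywhere in the string, so any mention of "oracle" counts.
matches = [cmd for cmd in command_lines if pattern.search(cmd)]
```

Under that assumption, a tighter pattern tied to the actual Oracle executable name would avoid these matches; which name to use depends on the installation.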

jcurry
1,021 posts since
Apr 15, 2008

12. Dec 11, 2009 3:12 PM (in response to cpp_zen)

Perhaps we should wait and see if anyone from development can expand a little on what these hex numbers are? On my test system (which doesn't have messed-up processes), when I use zendmd to list process instances for devices, some have a hex number after the name and some don't, and I can't see any pattern as to which do.

Any gurus out there who can divulge more about the internals of the process table?

Cheers,
Jane

rbilder
36 posts since
Jun 25, 2008

13. Dec 11, 2009 3:19 PM (in response to jcurry)

Hi Jane, thanks for the help with this. Here is what I have done. My issue was with a "stuck" asterisk process.

I went (in Zenoss, of course) to all the asterisk servers and deleted the osProcess instance on each of them (from the device OS tab).
I then ran the for loop in zendmd to list what was left (basically, only the stuck instance).
I then ran the _remove call that you provided. Thanks again for the help.
I committed it.
The instance list for the asterisk process class was now empty, and the UI was working fine again.
I then remodelled the asterisk devices, and the correct osProcess instances showed back up in the UI.
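
The sequence above can be sketched against a toy model. Everything here is a hypothetical stand-in: ToyInstance.device is an invented attribute used to mark an instance with no owning device, ToyProcessClass.remove is not a Zenoss method, and the commit() and remodelling steps are only noted in comments because there is no database in this sketch.

```python
class ToyInstance:
    def __init__(self, id, device=None):
        self.id = id
        self.device = device  # None models an instance with no owning device


class ToyProcessClass:
    def __init__(self, instances):
        self._objects = list(instances)

    def instances(self):
        return list(self._objects)

    def remove(self, obj):
        self._objects = [o for o in self._objects if o is not obj]


process_class = ToyProcessClass([
    ToyInstance("usr_sbin_asterisk", device="ply-astcc1.zayoms.net"),
    ToyInstance("rasterisk e1e1d3d40573127e9ee0480caf1283d6"),
])

# Step 1: delete the instances that still belong to a device.
for inst in process_class.instances():
    if inst.device is not None:
        process_class.remove(inst)

# Step 2: list what is left; only the stuck instance should remain.
leftover_ids = [inst.id for inst in process_class.instances()]

# Step 3: remove the leftover entries directly from the relationship.
for inst in process_class.instances():
    process_class.remove(inst)

# In the thread, the next steps are commit() and remodelling the devices.
final_count = len(process_class.instances())
```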

All is well for me.

I suspect there is/was some problem in the zenmodel routine we run nightly. But, I did model the devices involved (all the asterisk servers) and things seem fine. I'll let a full remodel run tonight, and report if the issue re-introduces itself.

I am not sure if these problems were caused by corrupt data, or if the zenmodel routine is somehow not working due to a deeper issue. But getting to a clean baseline should help. I'll post more after the weekend when I know more.

I am flagging your response as "correct" as it does correctly show how to remove an osProcess instance, which was the original request I made in the thread.

Thanks again for your help.

--Randy

jcurry
1,021 posts since
Apr 15, 2008

14. Dec 11, 2009 3:28 PM (in response to rbilder)

Brilliant! I hoped that might work but couldn't really try it out currently on my production system. At an appropriate time, I will follow what you have done.

It would still be nice to hear from someone about the internals of the process table, though.

Cheers,
Jane
