[ic] Restart and stop problems (multiple processes)

Daniel Hutchison interchange-users@icdevgroup.org
Wed Feb 5 17:31:01 2003


On Wed, 2003-02-05 at 14:41, Mike Heins wrote:
> Quoting Dorothy Puma (dorothy@digilink.net):
> > Sach Jobb wrote:
> > >>Has anyone else seen this same behavior with restarting or stopping the
> > >>interchange process, and have an idea on how to fix it?
> > > 
> > > 
> > > Mine works okay but I'm using OpenBSD.
> > > 
> > > However, you should be able to do this manually, no? If it's Slowlaris you
> > > could do something like 'ps -ef | grep interchange'. Look at what the PID
> > > is and simply kill it ('kill $PID'). Then just start it normally.
> > > 
> > > I suppose you could automate this process using some sort of combination
> > > involving 'ps' and 'cut' (or just making your own pid file?), but since the
> > > program 'interchange' is itself simply a perl script, perhaps the solution
> > > lies in hacking it.
> > 
> > I have been doing the kill manually; it's just a pain. I've been running
> > interchange since the good old minivend days and never had this problem.
> > I would hack the perl script, but wouldn't know where to start :-)
> > 
> 
> I would like to fix this, and I imagine it is simple.
> 
> Unfortunately I have no more Solaris machines and have no clients
> who use them, so I cannot test this. Without some authoritative
> debug info and without a platform to test on, there is not much I can do.

FWIW, I did try to look into this issue a little more in depth. However,
I haven't had a whole lot of time. I haven't found anything definite yet,
just some suspicions.

From what I can tell, the problem seems to be in the locking of the pid
file.  That is, interchange attempts to lock the pid file when it starts up.
If it can't lock the pid file, it assumes another interchange process is
running.  What I suspect is that interchange locks the pid file before
it forks, and on Solaris, locks created with flock() aren't inherited
across forks.  As a result, when the parent process exits, the pid file
becomes unlocked.  When interchange is then run with the shutdown
command, it detects that the pid file is unlocked and thinks that there
isn't a running interchange process.
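
To make the check described above concrete, here is a rough sketch in
Python. It is not interchange's actual Perl code; the function name
pid_file_is_locked is made up for illustration, and it assumes a
platform where Python's fcntl.flock() is available.

```python
import fcntl


def pid_file_is_locked(path):
    """Return True if some other open handle holds a lock on the pid file."""
    with open(path, "a") as handle:
        try:
            # Non-blocking attempt: fails at once if the lock is already held.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True   # could not lock: assume a server is running
        # We got the lock ourselves, so nobody else was holding it.
        fcntl.flock(handle, fcntl.LOCK_UN)
        return False
```

A shutdown command built on a check like this would only work as long
as the running server still holds the lock.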

What I have done is verify that the default install of interchange on my
Solaris box uses the flock() function to lock the pid file.  I've also
created a small Perl program that just locks files, based on the code in
interchange.  The file locking works fine until I throw a fork() in
it...
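
For anyone who wants to try the lock-then-fork experiment, the
following is a sketch of the idea in Python. It is not the Perl test
program mentioned above. It assumes that fcntl.lockf() (an fcntl()-style
record lock, owned by the locking process) stands in for the behavior
suspected on Solaris, and that fcntl.flock() on Linux or BSD shows the
case where the lock is inherited. The name lock_survives_fork is
hypothetical.

```python
import fcntl
import os
import tempfile


def lock_survives_fork(lock_func):
    """Lock a file, fork, let the locking parent exit, and report whether
    the file still looks locked while the forked child is alive."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    release_r, release_w = os.pipe()  # closing release_w lets the child exit

    launcher = os.fork()
    if launcher == 0:
        try:
            os.close(release_w)
            pidfile = open(path, "w")
            lock_func(pidfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
            if os.fork() == 0:
                # "Server" child: keeps the inherited handle open and waits.
                os.read(release_r, 1)
        finally:
            os._exit(0)  # both the launcher and the child leave here

    os.close(release_r)
    os.waitpid(launcher, 0)  # the process that took the lock is now gone

    # Probe the lock the same way a shutdown command would.
    with open(path, "r+") as probe:
        try:
            lock_func(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
            held = False
        except OSError:
            held = True

    os.close(release_w)  # let the waiting child exit
    os.unlink(path)
    return held
```

On a Linux box, lock_survives_fork(fcntl.flock) should return True and
lock_survives_fork(fcntl.lockf) should return False; the second result
matches the symptom described above, where the pid file looks unlocked
even though a server is still running.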

Anyway, I hope this helps a bit.  

-daniel