Bug #3222: Problem with launching external process from Wt application deployed as FastCGI app - Wt - Redmine

Added by Alan Finley about 11 years ago. Updated almost 11 years ago.

Status: Closed
Priority: Normal
Assignee: Roel Standaert
Target version: 3.3.4
Start date: 05/28/2014

Description

I need to launch an external process from my Wt app and read its output.

Here is a simple piece of code that launches the pwd command using QProcess:

QProcess proc;
proc.start("pwd");
proc.waitForFinished(3000);

if (proc.state() == QProcess::NotRunning)
{
    if (proc.exitStatus() == QProcess::NormalExit)
    {
        qDebug() << "script output:" << proc.readAll().trimmed();
    }
    else
    {
        qWarning() <<  "script crashed";
    }
}
else
{
    qWarning() << "script TIMEOUT!";
    proc.terminate();
}

It works fine when I run my app with Wt's built-in server.

Then I try to launch my app as a FastCGI app under nginx:

export WT_APP_ROOT=/var/www/test/approot && spawn-fcgi -n -p 10118 -d /var/www/test/docroot -- /var/www/test/docroot/my_app.wt

If I enable the shared-process option in wt_config.xml, everything works fine too. But if I switch to dedicated-process, the external process never finishes its job. I can launch any external tool, and every time it hits the timeout in my code.

What is the problem with the dedicated-process option?

Files

respawn.patch (1.58 KB): Proper worker process respawn patch (Alan Finley, 06/30/2014 10:54 AM)
respawn.patch (1.89 KB): Proper worker process respawn patch (Alan Finley, 06/30/2014 12:30 PM)
bug_3222.patch (3.94 KB) (Roel Standaert, 07/04/2014 02:18 PM)

Updated by Wim Dumon about 11 years ago

Status changed from New to Feedback

Do you get any kind of error indication? Does the process even start?

Wim.

Updated by Koen Deforche about 11 years ago

Assignee set to Wim Dumon

Updated by Alan Finley about 11 years ago

Wim Dumon wrote:

Do you get any kind of error indication? Does the process even start?

I tried to launch 'touch /tmp/%session_id%' using QProcess, and the file was successfully created. However, QProcess doesn't finish correctly and terminates on timeout. After it terminates, my Wt app becomes unresponsive for about half a minute.

I don't get any error messages except for: QProcess: Destroyed while process is still running.

Updated by Alan Finley about 11 years ago

I've also figured out that processes spawned by QProcess become zombies after I destroy the QProcess object.

Updated by Alan Finley about 11 years ago

I've run some more tests.

I can reproduce the problem when Wt's dedicated-process option is on and my app is deployed with apache+mod_fcgid or nginx+spawn-fcgi.

I tried to add a custom SIGCHLD signal handler to catch a signal from a process launched by QProcess:

void my_sigchld_handler(int sig)
{
    qDebug() << "my_sigchld_handler" << sig;
}

struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = my_sigchld_handler;

sigaction(SIGCHLD, &sa, NULL);

With the dedicated-process option on, my handler is never called. If I switch to shared-process, everything works as expected.

Updated by Alan Finley about 11 years ago

It seems that this is actually not a Wt problem. I've managed to create a simple test program that performs the same actions as Wt's worker process management mechanism.

http://stackoverflow.com/questions/24424707/problems-with-qprocess-after-fork-and-execv

Updated by Wim Dumon about 11 years ago

Hi Alan,

Do you call waitpid() somewhere? A parent process must acknowledge the death of a child process; until it does, the child process is a zombie.

BR,

Wim.
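
To illustrate the point about waitpid(), here is a minimal sketch, not code from Wt or from the reporter's application. The helper name runAndReap is hypothetical. It forks a child that exits at once, then reaps it in the parent, which is the step that removes the zombie entry.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that exits with `code`, then reap it.
// Returns the child's exit code, or -1 on error.
int runAndReap(int code)
{
    assert(code >= 0 && code <= 255);   // exit statuses are 8 bits wide

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      // fork failed
    if (pid == 0)
        _exit(code);                    // child: terminate immediately

    // Parent: until waitpid() succeeds, the dead child stays in the
    // process table as a zombie.
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);   // blocks until the child has exited
    } while (r == -1 && errno == EINTR); // retry if a signal interrupted us

    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling runAndReap(7) should return 7 once the child has been collected. QProcess normally performs this reaping itself, so the sketch only shows the underlying mechanism.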

Updated by Alan Finley about 11 years ago

I've got a very useful comment on StackOverflow:

It's generally considered bad form to make glibc calls from within signal handlers. You're spawning from within your SIGCHLD handler, and I wonder if that's hosing up your signal handlers. Try setting a global flag and have main spawn new workers.

I moved the spawning of worker processes from the signal handler to the main function, using a global flag, and it fixed the problem. You should do the same in your Server::handleSigChld function for the shared-process configuration.

As for dedicated-process configuration, you have some problems with threading in Server::handleRequestThreaded function. If I disable WT_THREADED define (which turns on boost::asio::io_service usage), QProcess works fine too.

I hope this will be helpful for you.

Updated by Alan Finley about 11 years ago

A useful article: http://www.gnu.org/software/libc/manual/html_node/Nonreentrancy.html

Quote:

Handler functions usually don't do very much. The best practice is to write a handler that does nothing but set an external variable that the program checks regularly, and leave all serious work to the program.
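
The pattern described in the quote can be sketched roughly as follows. This is an illustration only, not the code from respawn.patch or from Wt. The names childExited, onSigChld and waitForChildViaFlag are hypothetical. The handler only sets a flag, and the regular control flow notices the flag and does the real work (here, reaping the child).

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Written by the handler, read by normal code. volatile sig_atomic_t is the
// type that may safely be written from a signal handler.
static volatile sig_atomic_t childExited = 0;

extern "C" void onSigChld(int)
{
    childExited = 1;   // nothing else: no logging, no allocation, no fork()
}

// Hypothetical demo: returns the child's exit code if the flag was seen,
// or -1 on error or if no SIGCHLD arrived within about five seconds.
int waitForChildViaFlag(int code)
{
    assert(code >= 0 && code <= 255);

    struct sigaction sa = {}, old = {};
    sa.sa_handler = onSigChld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) != 0)
        return -1;

    childExited = 0;
    int rc = -1;
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);                        // child: exit right away

    if (pid > 0) {
        // "Main loop": poll the flag instead of working in the handler.
        for (int i = 0; i < 5000 && !childExited; ++i)
            usleep(1000);
        bool sawSignal = (childExited != 0);

        // The serious work happens here, outside the handler.
        int status = 0;
        pid_t r;
        do {
            r = waitpid(pid, &status, 0);
        } while (r == -1 && errno == EINTR);
        if (sawSignal && r == pid && WIFEXITED(status))
            rc = WEXITSTATUS(status);
    }

    sigaction(SIGCHLD, &old, nullptr);      // restore the previous handler
    return rc;
}
```

In a server, the place where the flag is checked would be wherever the main loop already runs regularly. The polling loop above is only a stand-in for that.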

#10

Updated by Alan Finley about 11 years ago

File respawn.patch added

I've made a simple patch for Wt in our project. It fixes worker process respawn for the shared-process configuration.

#11

Updated by Alan Finley about 11 years ago

File respawn.patch added

I attached the wrong patch in the previous comment. This is the correct one.

#12

Updated by Koen Deforche about 11 years ago

Status changed from Feedback to InProgress
Assignee changed from Wim Dumon to Roel Standaert
Target version set to 3.3.4

#13

Updated by Roel Standaert about 11 years ago

Status changed from InProgress to Resolved

I moved the spawning of child processes that replace dead ones out of the signal handler. I piggybacked it into handleRequest, though, instead of having a separate thread for this.

#14

Updated by Alan Finley about 11 years ago

Roel Standaert wrote:

I moved the spawning of child processes that replace dead ones out of the signal handler. I piggybacked it into handleRequest, though, instead of having a separate thread for this.

And what about the dedicated-process configuration? When a new process is created after wt_.ioService().post(boost::bind(&Server::handleRequest, this, serverSocket)), it has the same problem.

#15

Updated by Roel Standaert about 11 years ago

I don't see the issue with dedicated-process? The only situation where a process is spawned from a signal handler is when using shared processes, and a child process died. With dedicated processes the process is always started from handleRequest, not from a signal handler.

#16

Updated by Alan Finley about 11 years ago

Roel Standaert wrote:

I don't see the issue with dedicated-process? The only situation where a process is spawned from a signal handler is when using shared processes, and a child process died. With dedicated processes the process is always started from handleRequest, not from a signal handler.

The initial problem was that QProcess could not finish correctly.

With the shared-process config, this problem is solved by moving the spawning of new processes out of the signal handler. With the dedicated-process config, this problem is not solved.

Here is what I wrote in the 8th comment:

As for dedicated-process configuration, you have some problems with threading in Server::handleRequestThreaded function. If I disable WT_THREADED define (which turns on boost::asio::io_service usage), QProcess works fine too.

#17

Updated by Roel Standaert about 11 years ago

Status changed from Resolved to InProgress

Ah yes, I see. That still needs to be fixed, indeed.

#18

Updated by Roel Standaert about 11 years ago

Status changed from InProgress to Resolved

The problem was this (from the sigprocmask man page):

A child created via fork(2) inherits a copy of its parent's signal mask; the signal mask is preserved across execve(2).

Unblocking all signals after the fork solved the issue.
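
A rough sketch of the effect described above, under stated assumptions: this is not bug_3222.patch, the names isBlocked and inheritedMaskDemo are hypothetical, and the caller is assumed to be single-threaded (sigprocmask() is only specified for that case; threaded code would use pthread_sigmask() in the parent). The parent blocks a signal, forks, and the child checks that it inherited the blocked signal before clearing its mask.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// True if `signo` is blocked in the caller's current signal mask.
static bool isBlocked(int signo)
{
    sigset_t current;
    sigemptyset(&current);
    sigprocmask(SIG_BLOCK, nullptr, &current);  // null set: query only
    return sigismember(&current, signo) == 1;
}

// Returns a bit mask computed in the child, or -1 on error:
//   bit 0 set: the child started with `signo` blocked (mask was inherited)
//   bit 1 set: `signo` is no longer blocked after the child reset its mask
int inheritedMaskDemo(int signo)
{
    // SIGKILL and SIGSTOP cannot be blocked, so they make no sense here.
    assert(signo > 0 && signo != SIGKILL && signo != SIGSTOP);

    // Parent: block the signal, standing in for a thread that runs with
    // signals blocked.
    sigset_t block, saved;
    sigemptyset(&block);
    sigaddset(&block, signo);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        int result = 0;
        if (isBlocked(signo))
            result |= 1;                        // inherited from the parent

        sigset_t none;
        sigemptyset(&none);
        sigprocmask(SIG_SETMASK, &none, nullptr); // unblock all signals
        if (!isBlocked(signo))
            result |= 2;

        // A real worker would call execv() here; the now-empty mask would
        // be preserved across the exec.
        _exit(result);
    }

    int rc = -1;
    if (pid > 0) {
        int status = 0;
        pid_t r;
        do {
            r = waitpid(pid, &status, 0);
        } while (r == -1 && errno == EINTR);
        if (r == pid && WIFEXITED(status))
            rc = WEXITSTATUS(status);
    }
    sigprocmask(SIG_SETMASK, &saved, nullptr);  // restore the parent's mask
    return rc;
}
```

With SIGCHLD as the argument, a result of 3 means the child did start with SIGCHLD blocked and could deliver it again after resetting the mask. That would match the symptom reported earlier, where the custom SIGCHLD handler was never called.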

#19

Updated by Alan Finley about 11 years ago

Roel, can I get your patch that fixes the whole problem?

#20

Updated by Roel Standaert about 11 years ago

File bug_3222.patch added

It will be on the master soon, but here's the full patch.

#21

Updated by Alan Finley about 11 years ago

Roel Standaert wrote:

It will be on the master soon, but here's the full patch.

Could you explain how it should work when a server receives several SIGCHLD signals at once?

What happens when I do this: kill -9 27443 27444 27445? I don't see any SIGCHLD counter in the code, only a flag handleSigChld_.

#22

Updated by Roel Standaert about 11 years ago

Good point. Seems like I overlooked that. I will fix that.

#23

Updated by Roel Standaert about 11 years ago

Ah no, the while loop in doHandleSigChld() will loop until all dead children are handled.
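
For readers without the patch at hand, the general idea of such a loop can be sketched like this. It is an assumption-laden illustration, not the body of doHandleSigChld(), and the names drainDeadChildren and spawnAndDrain are hypothetical. Because several SIGCHLD signals can be merged into one, a flag cannot count them. Calling waitpid() with WNOHANG until it reports no more dead children handles every one of them anyway.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reaps every child that has already exited, without blocking.
// Returns how many children were collected in this call.
int drainDeadChildren()
{
    int reaped = 0;
    for (;;) {
        int status = 0;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            ++reaped;          // a server would respawn a worker here
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;          // interrupted by a signal: try again
        break;                 // 0: none has exited yet; -1: no children left
    }
    return reaped;
}

// Hypothetical demo: start `n` children that exit at once, then drain
// until all of them are accounted for (or about five seconds pass).
int spawnAndDrain(int n)
{
    assert(n >= 0);

    for (int i = 0; i < n; ++i) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(0);
    }

    int total = 0;
    for (int tries = 0; total < n && tries < 5000; ++tries) {
        total += drainDeadChildren();
        if (total < n)
            usleep(1000);      // give the remaining children time to exit
    }
    return total;
}
```

In this sketch a single pass of drainDeadChildren() may collect one, two or all three children, depending on timing, but the total always reaches the number of children that died.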

#24

Updated by Alan Finley about 11 years ago

Roel Standaert wrote:

Ah no, the while loop in doHandleSigChld() will loop until all dead children are handled.

But that loop has break statements in both conditional branches. Shouldn't it quit after first child respawn?

#25

Updated by Alan Finley about 11 years ago

Alan Finley wrote:

Roel Standaert wrote:

> Ah no, the while loop in doHandleSigChld() will loop until all dead children are handled.

But that loop has break statements in both conditional branches. Shouldn't it quit after first child respawn?

Sorry, those breaks are in inner loops :)

#26

Updated by Koen Deforche almost 11 years ago

Status changed from Resolved to Closed
