Bug #1708

closed

WServer::stop() does not return in Wt-3.2.3

Added by Стойчо Стефанов Stoycho Stefanov almost 12 years ago. Updated almost 11 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 02/18/2013
Due date:
% Done: 0%
Estimated time:

Description

Hey,

it seems to me that WServer::stop() does not return when using Wt-3.2.3. As far as I can remember it worked fine with Wt-3.2.1.

I'm trying to restart my server by sending SIGINT from within my application. The logout() call (line 43 in the example) is just a workaround that provides a link which starts a new session.

I just verified it with Wt-3.2.1 and it works without the need to call logout(). WServer::stop() returns and the server starts again properly.

It would be nice to fix it before the Wt-3.3.0 release.

best regards,

Stoycho


Files

serverRestart.cpp (2.69 KB): server restart example, Стойчо Стефанов Stoycho Stefanov, 02/18/2013 09:14 AM
wt_config.xml (21.9 KB): config file, Стойчо Стефанов Stoycho Stefanov, 03/26/2013 04:45 PM
Actions #1

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Sorry,

but I haven't had time to test with the release candidate (Wt-3.3.0 RC1).

regards,

stoycho

Actions #2

Updated by Koen Deforche almost 12 years ago

  • Status changed from New to InProgress
  • Assignee set to Koen Deforche
  • Target version set to 3.3.0
Actions #3

Updated by Koen Deforche almost 12 years ago

Hey,

This works for me with Wt 3.3.0 ... Could you verify ?

Regards,

koen

Actions #4

Updated by Koen Deforche almost 12 years ago

  • Status changed from InProgress to Resolved
Actions #5

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hey,

sorry, but I can't. I haven't built it yet, but I'll give you feedback when I can verify it.

regards,

Stoycho

Actions #6

Updated by Koen Deforche almost 12 years ago

  • Status changed from Resolved to Closed
Actions #7

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hi Koen,

it still does not work for me, even with the latest git.

When I click the restart button, the last output I see is from http://redmine.webtoolkit.eu/attachments/1024/serverRestart.cpp#L83 and the server hangs until I press F5 in the browser.

I just tested it with wt-3.2.1 and it works as I expect. When I click the 'restart' button, after a short delay during which the loading indicator is shown, the server starts and a new session is loaded.

Could you look at this once again, please?

best regards,

Stoycho

Actions #8

Updated by Koen Deforche almost 12 years ago

  • Status changed from Closed to InProgress
Actions #9

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hi,

it could be browser dependent. I cannot see any pattern in it: sometimes it works with Safari (5.1.7), sometimes it gets stuck, and after a manual refresh (F5) it works correctly again. With Firefox (16.0.1) I never see the expected behaviour. I really have no idea what is wrong. Moreover, it seems that sometimes it does not work even with wt-3.2.1.

regards,

Stoycho

Actions #10

Updated by Koen Deforche almost 12 years ago

  • Status changed from InProgress to Feedback

Hey,

I couldn't reproduce it a single time, trying with various browsers. My Firefox version was 19, however (damned auto-updating ...).

Perhaps there are other things that matter ? How many threads are you using, which boost version, ... ?

Regards,

koen

Actions #11

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hey,

I'm a bit confused now :) Where can I see how many threads I use? I do not specify any number, so I suppose a default one is used. I'm using the built-in server and boost 1.46.0.

Regards,

Stoycho

Actions #12

Updated by Koen Deforche almost 12 years ago

Hey,

If you're not specifying a parameter then you probably use the default (10) too so that cannot be the difference.

Do you use a wt_config.xml file ?

Regards,

koen

Actions #13

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hey,

yes, I do use the wt_config.xml. Why?

regards,

stoycho

Actions #14

Updated by Koen Deforche almost 12 years ago

Hey Stoycho,

Perhaps a setting in there is what causes the problem (I still cannot reproduce it).

Could you attach it too ?

Regards,

koen

Actions #15

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hey Koen,

Here it is. I hope it helps.

regards,

stoycho

Actions #16

Updated by Koen Deforche almost 12 years ago

  • Target version changed from 3.3.0 to 3.3.1

Hey,

I tried and tried but to no avail, I cannot reproduce any problem.

When it hangs, could you then provide stack traces ?

Just to be sure: I ran your example unmodified and got output like this:

koen@vierwerf:~/project/wt/git/wt/test/interactive/test$ ../../../build-boost-1_52/test/interactive/test/test.wt --docroot . --http-address 0.0.0.0 --http-port 9090 -c wt_config.xml 
3.3.0
try start the server
10.10.0.6 - - [2013-Apr-08 22:10:07.816883] "GET / HTTP/1.1" 200 1941
10.10.0.6 - - [2013-Apr-08 22:10:07.881714] "GET /?wtd=eTQkfkEBkaPOQnOI&request=style HTTP/1.1" 200 0
10.10.0.6 - - [2013-Apr-08 22:10:07.882365] "GET /?&wtd=eTQkfkEBkaPOQnOI HTTP/1.1" 200 1941
10.10.0.6 - - [2013-Apr-08 22:10:07.922052] "GET /?wtd=eTQkfkEBkaPOQnOI&request=style HTTP/1.1" 200 91
10.10.0.6 - - [2013-Apr-08 22:10:07.929604] "GET /?wtd=eTQkfkEBkaPOQnOI&sid=677854565&htmlHistory=true&deployPath=%2F&request=script&rand=3154322972 HTTP/1.1" 200 34479
10.10.0.6 - - [2013-Apr-08 22:10:07.957904] "GET /resources/themes/default/wt.css HTTP/1.1" 304 0
10.10.0.6 - - [2013-Apr-08 22:10:08.047529] "GET /favicon.ico HTTP/1.1" 404 85
10.10.0.6 - - [2013-Apr-08 22:10:08.107396] "POST /?wtd=eTQkfkEBkaPOQnOI HTTP/1.1" 200 49
Return from waitForShutdown
server.stop()
10.10.0.6 - - [2013-Apr-08 22:10:11.405791] "POST /?wtd=eTQkfkEBkaPOQnOI HTTP/1.1" 200 49
continue
try start the server
10.10.0.6 - - [2013-Apr-08 22:10:13.086736] "POST /?wtd=eTQkfkEBkaPOQnOI HTTP/1.1" 200 72
10.10.0.6 - - [2013-Apr-08 22:10:13.115010] "GET /?&wtd=eTQkfkEBkaPOQnOI HTTP/1.1" 200 1941
10.10.0.6 - - [2013-Apr-08 22:10:13.172366] "GET /?wtd=ZhHcoXLwbhZGfhKP&request=style HTTP/1.1" 200 0
10.10.0.6 - - [2013-Apr-08 22:10:13.172895] "GET /?&wtd=ZhHcoXLwbhZGfhKP HTTP/1.1" 200 1941
10.10.0.6 - - [2013-Apr-08 22:10:13.210032] "GET /favicon.ico HTTP/1.1" 404 85
10.10.0.6 - - [2013-Apr-08 22:10:13.241330] "GET /?wtd=ZhHcoXLwbhZGfhKP&request=style HTTP/1.1" 200 91
10.10.0.6 - - [2013-Apr-08 22:10:13.248788] "GET /?wtd=ZhHcoXLwbhZGfhKP&sid=-263709915&htmlHistory=true&deployPath=%2F&request=script&rand=2210745258 HTTP/1.1" 200 34478
10.10.0.6 - - [2013-Apr-08 22:10:13.361526] "GET /favicon.ico HTTP/1.1" 404 85
10.10.0.6 - - [2013-Apr-08 22:10:13.420722] "POST /?wtd=ZhHcoXLwbhZGfhKP HTTP/1.1" 200 50
Return from waitForShutdown
server.stop()
10.10.0.6 - - [2013-Apr-08 22:10:13.957939] "POST /?wtd=ZhHcoXLwbhZGfhKP HTTP/1.1" 200 50
continue
try start the server
10.10.0.6 - - [2013-Apr-08 22:10:15.909405] "POST /?wtd=ZhHcoXLwbhZGfhKP HTTP/1.1" 200 72
10.10.0.6 - - [2013-Apr-08 22:10:15.940956] "GET /?&wtd=ZhHcoXLwbhZGfhKP HTTP/1.1" 200 1942
10.10.0.6 - - [2013-Apr-08 22:10:15.997660] "GET /?wtd=ENGyNrmZ4sYmBMnQ&request=style HTTP/1.1" 200 0
10.10.0.6 - - [2013-Apr-08 22:10:15.998187] "GET /?&wtd=ENGyNrmZ4sYmBMnQ HTTP/1.1" 200 1941
10.10.0.6 - - [2013-Apr-08 22:10:16.066404] "GET /?wtd=ENGyNrmZ4sYmBMnQ&request=style HTTP/1.1" 200 91
10.10.0.6 - - [2013-Apr-08 22:10:16.069888] "GET /favicon.ico HTTP/1.1" 404 85
10.10.0.6 - - [2013-Apr-08 22:10:16.073758] "GET /?wtd=ENGyNrmZ4sYmBMnQ&sid=-167724224&htmlHistory=true&deployPath=%2F&request=script&rand=2796697482 HTTP/1.1" 200 34476
10.10.0.6 - - [2013-Apr-08 22:10:16.192922] "GET /favicon.ico HTTP/1.1" 404 85
10.10.0.6 - - [2013-Apr-08 22:10:16.252283] "POST /?wtd=ENGyNrmZ4sYmBMnQ HTTP/1.1" 200 50
Return from waitForShutdown
server.stop()
10.10.0.6 - - [2013-Apr-08 22:10:16.582470] "POST /?wtd=ENGyNrmZ4sYmBMnQ HTTP/1.1" 200 50
continue
try start the server
10.10.0.6 - - [2013-Apr-08 22:10:18.178777] "POST /?wtd=ENGyNrmZ4sYmBMnQ HTTP/1.1" 200 72
10.10.0.6 - - [2013-Apr-08 22:10:18.207588] "GET /?&wtd=ENGyNrmZ4sYmBMnQ HTTP/1.1" 200 1942
...

Regards,

koen

Actions #17

Updated by Стойчо Стефанов Stoycho Stefanov almost 12 years ago

Hey,

1. Start web server

3.3.0
try start the server
10.0.2.80 - - [2013-Apr-09 16:58:40.349948] "POST /?wtd=eHI8wIcaDY30nM1v HTTP/1.1" 200 72
10.0.2.80 - - [2013-Apr-09 16:58:40.380889] "GET /?&wtd=eHI8wIcaDY30nM1v HTTP/1.1" 200 1941
10.0.2.80 - - [2013-Apr-09 16:58:40.431373] "GET /?&wtd=RJlXG9XKN3Dwaam5 HTTP/1.1" 200 1941
10.0.2.80 - - [2013-Apr-09 16:58:40.461339] "GET /?wtd=RJlXG9XKN3Dwaam5&request=style HTTP/1.1" 200 88
10.0.2.80 - - [2013-Apr-09 16:58:40.550618] "GET /?wtd=RJlXG9XKN3Dwaam5&sid=-72178433&htmlHistory=true&deployPath=%2F&request=script&rand=2087555690 HTTP/1.1" 200 34387
10.0.2.80 - - [2013-Apr-09 16:58:40.650366] "POST /?wtd=RJlXG9XKN3Dwaam5 HTTP/1.1" 200 49

2. Click on "restart" button

Return from waitForShutdown
server.stop()

It hangs here and the loading indicator is shown. Then:

3. Press F5

continue
try start the server
10.0.2.80 - - [2013-Apr-09 17:00:24.904557] "GET /?&wtd=8N2aFX6TIQBahR8r HTTP/1.1" 200 1943
10.0.2.80 - - [2013-Apr-09 17:00:24.923545] "POST /?wtd=8N2aFX6TIQBahR8r HTTP/1.1" 200 72
10.0.2.80 - - [2013-Apr-09 17:00:24.968529] "GET /?&wtd=H82QP4sRxLSY6B3X HTTP/1.1" 200 1942
10.0.2.80 - - [2013-Apr-09 17:00:25.010412] "GET /?wtd=H82QP4sRxLSY6B3X&request=style HTTP/1.1" 200 88
10.0.2.80 - - [2013-Apr-09 17:00:25.097637] "GET /?wtd=H82QP4sRxLSY6B3X&sid=-710642230&htmlHistory=true&deployPath=%2F&request=script&rand=4203343128 HTTP/1.1" 200 34390
10.0.2.80 - - [2013-Apr-09 17:00:25.204582] "POST /?wtd=H82QP4sRxLSY6B3X HTTP/1.1" 200 50

Just let me know how I could provide better stack traces if these are not accurate enough.

Regards,

Stoycho

Actions #18

Updated by W X over 11 years ago

Hi,

A similar issue happened to me, also with Wt 3.3.0. Calling WServer::stop() seems to block in WIOService::stop() when joining the Wt threads, which never seem to stop.

(using_timer is false in boost::thread, so join() on that thread will wait indefinitely for a thread that doesn't stop).

This is the stack trace:

    ntdll.dll!NtWaitForMultipleObjects()  + 0x15 bytes  
    [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] 
    kernel32.dll!WaitForMultipleObjectsEx()  + 0x8e bytes   
    kernel32.dll!WaitForMultipleObjects()  + 0x18 bytes 
>   .dll!boost::this_thread::interruptible_wait(void * handle_to_wait_for=0x0000132c, boost::detail::timeout target_time={...})  Line 454 + 0x34 bytes  C++
    .dll!boost::thread::join()  Line 271 + 0x2f bytes   C++
    .dll!Wt::WIOService::stop()  Line 91    C++
    .dll!Wt::WServer::stop()  Line 221  C++
    .dll!CServer::Stop()  Line 289  C++
    .dll!CController::stopWebServer()  Line 83  C++
    App.exe!Stop Server initiated from GUI button

WServer::stop() used to work seamlessly with Wt 3.2.1. I'm using boost 1.46.1.

Could this be related to this boost::asio fix in boost 1.53.0: https://svn.boost.org/trac/boost/ticket/7552 ?

Is there a recommended Boost version for each Wt release?

Thanks!

Actions #19

Updated by Wim Dumon over 11 years ago

Is there any thread that has Wt stuff on its stack?

We have no specific recommendation for a boost release; we recommend using a recent boost version. The Windows install instruction page lists a few boost versions in which we identified annoying (Windows-specific) problems.

Stoycho seems to be talking about a bug on Linux; you now report this on Windows. The issue is that while shutting down cleanly, we wait for the io_service to become idle (i.e. no asynchronous callbacks scheduled anymore). Any task, timer, ... that is still scheduled will prevent the io_service from shutting down. I'm not sure if we can guarantee memleak-free shutdown if we force the io_service to stop. If Wt leaves something on the io_service, it's a bug; if you posted something in Wt's io_service, it's a feature ;-)
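
To illustrate the idle-wait behaviour described above, here is a minimal sketch that uses only the C++ standard library. MiniService and runChain are hypothetical names made up for this illustration; this is not Wt or Boost code, only a model of a thread pool whose stop() joins its workers once nothing is queued or running.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an io_service-like thread pool (not Wt/Boost).
class MiniService {
public:
  explicit MiniService(int threads) {
    for (int i = 0; i < threads; ++i)
      workers_.emplace_back([this] { run(); });
  }
  ~MiniService() { stop(); }

  // Schedule a task; may be called from any thread, also from inside a task.
  void post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(task));
      ++outstanding_;
    }
    wake_.notify_one();
  }

  // Clean stop: workers only exit once nothing is queued or running,
  // so the join() calls block for as long as any work remains.
  void stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_all();
    for (auto& worker : workers_) worker.join();
    workers_.clear();
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      wake_.wait(lock, [this] {
        return !queue_.empty() || (stopping_ && outstanding_ == 0);
      });
      if (queue_.empty()) return;  // stopping and idle
      std::function<void()> task = std::move(queue_.front());
      queue_.pop_front();
      lock.unlock();
      task();  // run without holding the lock
      lock.lock();
      if (--outstanding_ == 0) wake_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  std::deque<std::function<void()>> queue_;
  std::vector<std::thread> workers_;
  int outstanding_ = 0;    // queued + currently running tasks
  bool stopping_ = false;
};

// Posts a chain of `length` tasks, each scheduling the next one, then stops.
// Returns how many tasks had run by the time stop() returned.
int runChain(int length) {
  assert(length >= 1);
  MiniService service(2);
  std::atomic<int> executed{0};
  std::function<void(int)> step = [&](int remaining) {
    ++executed;
    if (remaining > 1)
      service.post([&step, remaining] { step(remaining - 1); });
  };
  service.post([&step, length] { step(length); });
  service.stop();  // returns only after the whole chain has run
  return executed.load();
}
```

In this model runChain(5) returns 5, because stop() does not return until the last task of the chain has finished. A task that re-posts itself unconditionally would keep the outstanding count above zero, and stop() would then never return, which is the kind of hang a still-scheduled task or timer can cause.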

Wim.

Actions #20

Updated by W X over 11 years ago

Yes, for me it happens on Windows.

The stack trace is for the main thread. From this thread, some callbacks are indeed posted to Wt just before calling WServer::stop(). The callbacks do some internal cleanup, then update the UI by deleting and hiding the displayed widgets and by creating and showing a kind of disconnect page. The post is done with server.post(pApp->sessionId(), boost::bind(&MyWebApplication::m, pApp));

We keep pApp in an internal list because WApplication::instance() returns 0 on a non-Wt thread.

At this point I'm not sure what is happening, but besides the blockage, the disconnect widgets are not shown.

The same code worked well with Wt 3.2.1 (meaning that the disconnect page was shown and then the server stopped).

So I guess we fall into the second case: it's a feature (?!) What feature? :)

Should I open a new Wt issue?

Actions #21

Updated by Wim Dumon over 11 years ago

Hi WX,

Stack traces of the other threads may reveal if the application is in a deadlock.

It's better to keep a list of session IDs rather than pointers to applications, since otherwise you also have to manage a potential race between an application being deleted and another thread still dereferencing a pointer to it.

If it consistently doesn't work, it's worth submitting a bug report (test case much appreciated!), especially if it worked with 3.2.1.

BR,

Wim.

Actions #22

Updated by W X over 11 years ago

I'll try to come up with a test case, but so far I haven't been able to reproduce the blockage with a small one.

I could reproduce the situation where the widgets are not shown, using a sequence like this:

1. WServer.post(MyApp::closingroutine)

2. wait (without this, the disconnect widget is consistently not shown, this time with Wt 3.2.1 as well)

3. WServer.stop()

MyApp::closingroutine shows a disconnect page/widget. So maybe my expectation that the posted callback is executed before WServer::stop() ends is wrong.
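
If the posted callback has to finish before the server is stopped, one option is an explicit completion handshake instead of a fixed wait. The following is only a sketch under assumptions: std::async stands in for posting to another thread, and postTracked, finishedWithin and demoWaitThenStop are hypothetical names, not Wt API. In real code the posted callback itself would have to signal completion, for example through a std::promise.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <utility>

// Stand-in for posting a callback to another thread (hypothetical helper).
// The returned future becomes ready once the callback has finished.
std::future<void> postTracked(std::function<void()> callback) {
  return std::async(std::launch::async, std::move(callback));
}

// Waits up to `timeout` for the posted callback; true if it completed.
bool finishedWithin(std::future<void>& posted,
                    std::chrono::milliseconds timeout) {
  return posted.wait_for(timeout) == std::future_status::ready;
}

// Models the sequence "post, wait for completion, then stop".
bool demoWaitThenStop() {
  std::atomic<bool> disconnectShown{false};
  std::future<void> posted =
      postTracked([&disconnectShown] { disconnectShown = true; });
  bool ready = finishedWithin(posted, std::chrono::seconds(5));
  // Only at this point would the real code go on to stop the server.
  return ready && disconnectShown.load();
}
```

The timeout keeps the caller from blocking forever if the callback is never run, for instance because the session is already gone.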

In the meantime I used another approach:

  • replaced the post with acquiring the UpdateLock for the application and calling MyApp::closingroutine directly.
    Could this approach lead to other issues?
Actions #23

Updated by W X over 11 years ago

Regarding your statement that "It's better to keep a list of session IDs rather than pointers to applications": is it possible, then, to look up the pointer to the application using the session ID? I need it when posting the callback.

Actions #24

Updated by Wim Dumon over 11 years ago

We bind when we register the callback to the server (I believe the chat example does this).

You can also post to a static method, which then uses WApplication::instance() (or in short wApp) to forward the call to the appropriate application object.

Or you can post a lambda function that does dynamic_cast<MyApp *>(wApp)->myMethod().

Or maybe keep a reference to both the application pointer and the session id?
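
The idea of addressing sessions by ID and letting a static-style callback pick up the current application can be sketched without Wt as follows. Everything here is a hypothetical model: DemoApp stands in for the application class, the thread-local currentApp stands in for wApp, and SessionRegistry::post only imitates the behaviour discussed in this thread (run the callback in the context of the session if it still exists).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical application type, standing in for a WApplication subclass.
struct DemoApp {
  std::string sessionId;
  bool disconnectShown = false;
  void showDisconnectPage() { disconnectShown = true; }
};

// Stand-in for "the application bound to the current thread" (wApp).
thread_local DemoApp* currentApp = nullptr;

class SessionRegistry {
public:
  void add(std::shared_ptr<DemoApp> app) {
    const std::string id = app->sessionId;
    std::lock_guard<std::mutex> lock(mutex_);
    sessions_[id] = std::move(app);
  }

  void remove(const std::string& id) {
    std::lock_guard<std::mutex> lock(mutex_);
    sessions_.erase(id);
  }

  // Runs fn in the context of the session, if it is still alive.
  // Returns false when the session no longer exists.
  bool post(const std::string& id, const std::function<void()>& fn) {
    std::shared_ptr<DemoApp> app;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      auto it = sessions_.find(id);
      if (it == sessions_.end()) return false;
      app = it->second;  // keeps the application alive during the call
    }
    currentApp = app.get();
    fn();
    currentApp = nullptr;
    return true;
  }

private:
  std::mutex mutex_;
  std::map<std::string, std::shared_ptr<DemoApp>> sessions_;
};

// Static-style callback: captures no application pointer, only uses
// whatever application is current when it runs.
void closingRoutine() {
  if (currentApp) currentApp->showDisconnectPage();
}

bool demoPostBySessionId() {
  SessionRegistry registry;
  auto app = std::make_shared<DemoApp>();
  app->sessionId = "abc";
  registry.add(app);

  bool delivered = registry.post("abc", closingRoutine);
  registry.remove("abc");
  bool deliveredAfterRemoval = registry.post("abc", closingRoutine);

  return delivered && app->disconnectShown && !deliveredAfterRemoval;
}
```

Because the caller only holds a session ID, a callback aimed at a session that has already been removed is simply not run, instead of dereferencing a stale pointer.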

UpdateLocks are quite dangerous w.r.t. deadlocks: make sure that you don't hold any other related lock (including other session locks, like the one for an event you're currently handling) when you use them. post() avoids this mess.

BR,

Wim.

Actions #25

Updated by W X over 11 years ago

I moved this discussion to http://redmine.webtoolkit.eu/boards/2/topics/6637 to avoid polluting the initial issue.

W.r.t. comment [#21], I didn't say it is a deadlock. It looks to me more like an indefinite wait for the worker threads to finish (this is why I thought it might be related to the boost issue: https://svn.boost.org/trac/boost/ticket/7552). The worker threads' stacks seem to show them polling on I/O completion ports.

Thanks for the recommendations on using the wApp pointer. In my application we store both the sessionId and the wApp pointer in a map indexed by sessionId(), so I guess we should be safe (?).

If I use a static callback, I should get the same wApp that was used when post() was called (with a call to WApplication::instance()), because the callback is called on a Wt thread, right? (That seems more appealing than storing the app pointers.)

W.r.t. holding the UpdateLock, I don't take any other locks directly while using the UpdateLock (well, unless Wt does it internally, which is out of my control).

Actions #26

Updated by Koen Deforche over 11 years ago

  • Target version changed from 3.3.1 to 3.3.2
Actions #27

Updated by Koen Deforche almost 11 years ago

  • Target version deleted (3.3.2)

Since a lot of things have changed, for the better, in wthttpd, which also uses the thread pool, perhaps this has been solved?

We need some confirmation from you guys using the latest Wt 3.3.2 release candidate.

Actions #28

Updated by Стойчо Стефанов Stoycho Stefanov almost 11 years ago

Hey Koen,

it works for me at least w.r.t. http://redmine.webtoolkit.eu/issues/1708#note-17. Here is the output of the use-case:

3.3.2
try start the server
10.0.2.80 - - [2014-Feb-26 16:26:35.001342] "GET /?wtd=nByIPvecDoWr1ch4 HTTP/1.1" 200 2088
10.0.2.80 - - [2014-Feb-26 16:26:35.009657] "POST /?wtd=nByIPvecDoWr1ch4 HTTP/1.1" 200 72
10.0.2.80 - - [2014-Feb-26 16:26:35.039774] "GET /?wtd=GJetRcDlhnDR4dC1 HTTP/1.1" 200 2088
10.0.2.80 - - [2014-Feb-26 16:26:35.078957] "GET /?wtd=GJetRcDlhnDR4dC1&request=style HTTP/1.1" 200 88
10.0.2.80 - - [2014-Feb-26 16:26:35.082643] "GET /resources/themes/default/wt.css HTTP/1.1" 404 85
10.0.2.80 - - [2014-Feb-26 16:26:35.085732] "GET /resources/moz-transitions.css HTTP/1.1" 200 5405
10.0.2.80 - - [2014-Feb-26 16:26:35.149432] "GET /?wtd=GJetRcDlhnDR4dC1&sid=-2128662183&tz=60&htmlHistory=true&deployPath=%2F&request=script&rand=3233340224 HTTP/1.1" 200 34914
10.0.2.80 - - [2014-Feb-26 16:26:35.229145] "POST /?wtd=GJetRcDlhnDR4dC1 HTTP/1.1" 200 51
10.0.2.80 - - [2014-Feb-26 16:26:38.395269] "POST /?wtd=GJetRcDlhnDR4dC1 HTTP/1.1" 200 51
Return from waitForShutdown
server.stop()
continue
try start the server
10.0.2.80 - - [2014-Feb-26 16:26:44.273275] "GET /?wtd=GJetRcDlhnDR4dC1 HTTP/1.1" 200 2090
10.0.2.80 - - [2014-Feb-26 16:26:44.282975] "POST /?wtd=GJetRcDlhnDR4dC1 HTTP/1.1" 200 72
10.0.2.80 - - [2014-Feb-26 16:26:44.332651] "GET /?wtd=5OqFhafeKh7aHWXi HTTP/1.1" 200 2089
10.0.2.80 - - [2014-Feb-26 16:26:44.368124] "GET /?wtd=5OqFhafeKh7aHWXi&request=style HTTP/1.1" 200 88
10.0.2.80 - - [2014-Feb-26 16:26:44.379907] "GET /resources/themes/default/wt.css HTTP/1.1" 404 85
10.0.2.80 - - [2014-Feb-26 16:26:44.380875] "GET /resources/moz-transitions.css HTTP/1.1" 200 5405
10.0.2.80 - - [2014-Feb-26 16:26:44.442281] "GET /?wtd=5OqFhafeKh7aHWXi&sid=520071826&tz=60&htmlHistory=true&deployPath=%2F&request=script&rand=504284216 HTTP/1.1" 200 34914
10.0.2.80 - - [2014-Feb-26 16:26:44.525397] "POST /?wtd=5OqFhafeKh7aHWXi HTTP/1.1" 200 49

Regards,

Stoycho

Actions #29

Updated by Koen Deforche almost 11 years ago

  • Status changed from Feedback to Resolved
Actions #30

Updated by Koen Deforche almost 11 years ago

  • Status changed from Resolved to Closed
  • Target version set to 3.3.2