Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close websockets when app is terminated #269

Merged
merged 9 commits into from
May 18, 2022
Merged

Close websockets when app is terminated #269

merged 9 commits into from
May 18, 2022

Conversation

The-EDev
Copy link
Member

@The-EDev The-EDev commented Nov 6, 2021

As mentioned in #34. Crow doesn't properly close websocket connections when terminated, instead closing the socket directly. Which could cause a problem with websocket clients if they're not configured to handle unclean closures.

This Draft PR blocks the main thread until all websocket connections have been cleanly closed (which closes #34).

It does so by keeping track of the number of currently open websockets, which is incremented up once the connection object is created, and decremented whenever it is destroyed. The SIGINT or SIGTERM is sent to both the connection and the server. the server's do_accept method is halted and the thread is then blocked until the websocket_count reaches 0, the connections just call the close() method, which automatically decrements the count once complete.

There are a few things I'm not comfortable with in the PR:

  1. The fact that I'm sending the websocket count from the App through the router's handle_upgrade method (which makes little sense) and to the websocket connection.
  2. Continuing the subject of sending metadata, I've (for now) hardcoded the signals into the websocket::connection constructor, which makes Crow::signal_clear() and Crow::signal_add() methods completely useless, since they don't affect the websocket connections.
  3. There's a problem where deleting a websocket connection and attempting to call io_service.stop() on the related IO Context causes a SEGFAULT. (More details below)
  4. I seem to have added a bunch of spaces and tabs where they shouldn't be.

The fourth problem is no big deal. The other 3 however I do need help in.

To deal with numbers 1 & 2 my Idea is to create a new class named either crow::connection_context or crow::metadata which would contain the websocket count, signals, and potentially other data that the App needs to send to connections, adaptors, etc..
I didn't work on it because I need other opinions since this is an architectural change.

The third issue is quite a bit more complicated. Here's the text I wrote (on Gitter) while looking into it:

I got websockets to close when the app exits, problem is that I seem to have uncovered a bigger bug doing this. It seems that stopping io_contexts (io_services) references connection objects, and websockets get deleted once they're closed, therefore causing a segfault. I need to understand how io_contexts work and how to fix this issue before I even think of committing the rest of the code

One theory I had was that the adaptor(ASIO::TCP::SOCKET) was somehow being referenced by the io_service as it was getting stopped, and since it was owned by the websocket::Connection, deleting it would cause the io_service to attempt to dereference what is now a null pointer.
My hypothesis was proven correct when I separated the adaptor and only deleted it, leaving the connection intact. And got the same SEGFAULT.
However I was proven wrong when I did the opposite, deleting only the connection and leaving the adaptor intact. I still got SEGFAULT
It seems that for whatever reason both need to be in memory for things to close properly, and I have no idea why.

I've already worked around the issue using the should_close_ boolean. Which prevents the websocket object from being deleted after being closed. I particularly don't like this because A. it ignores the base issue that something that doesn't exist is being referenced. and B. It potentially causes a memory leak since there's no actual reference to the websocket connection:
Here's the code responsible for creating websocket connections (notice how it is created in the heap and left to deal with itself)

Crow/include/crow/routing.h

Lines 403 to 406 in b5137c5

void handle_upgrade(const request& req, response&, SocketAdaptor&& adaptor) override
{
new crow::websocket::Connection<SocketAdaptor>(req, std::move(adaptor), open_handler_, message_handler_, close_handler_, error_handler_, accept_handler_);
}

terminated.
This is incomplete and needs more work.
@The-EDev The-EDev added the help wanted Contributions are requested label Nov 6, 2021
@The-EDev
Copy link
Member Author

The-EDev commented Nov 6, 2021

CI build killed because more work needs to be done anyway.

@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 9, 2021
@crow-clang-format
Copy link

Actually, This isn't a bad place for a bot like me

@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 15, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 19, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 19, 2021
@The-EDev The-EDev added the discussion The viability / implementation of the issue is up for debate label Nov 21, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 24, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 24, 2021
@CrowCpp CrowCpp deleted a comment from crow-clang-format bot Nov 27, 2021
@The-EDev The-EDev removed help wanted Contributions are requested discussion The viability / implementation of the issue is up for debate labels Apr 27, 2022
@crow-clang-format
Copy link

--- include/crow/http_server.h	(before formatting)
+++ include/crow/http_server.h	(after formatting)
@@ -161,16 +161,16 @@
 
         void stop()
         {
-            should_close_ = false; //Prevent the acceptor from taking new connections
+            should_close_ = false;                                                 //Prevent the acceptor from taking new connections
             while (handler_->websocket_count.load(std::memory_order_release) != 0) //Wait for the websockets to close properly
             {
             }
-            for(auto& io_service : io_service_pool_)
+            for (auto& io_service : io_service_pool_)
             {
                 if (io_service != nullptr)
                 {
-                CROW_LOG_INFO << "Closing IO service " << &io_service;
-                io_service->stop(); //Close all io_services (and HTTP connections)
+                    CROW_LOG_INFO << "Closing IO service " << &io_service;
+                    io_service->stop(); //Close all io_services (and HTTP connections)
                 }
             }
 
--- include/crow/websocket.h	(before formatting)
+++ include/crow/websocket.h	(after formatting)
@@ -106,7 +106,7 @@
 
 
                 signals_.clear();
-                for (auto snum: handler_->signals())
+                for (auto snum : handler_->signals())
                     signals_.add(snum);
 
                 // Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
@@ -118,13 +118,14 @@
                 s.getDigestBytes(digest);
 
                 signals_.async_wait(
-                    [&](const boost::system::error_code& e, int /*signal_number*/){
-                        if (!e){
-                        CROW_LOG_INFO << "quitting " << this;
-                        do_not_destroy_ = true;
-                        close("Quitter");
-                        }
-                    });
+                  [&](const boost::system::error_code& e, int /*signal_number*/) {
+                      if (!e)
+                      {
+                          CROW_LOG_INFO << "quitting " << this;
+                          do_not_destroy_ = true;
+                          close("Quitter");
+                      }
+                  });
                 start(crow::utility::base64encode((unsigned char*)digest, 20));
             }
 
@@ -657,14 +658,14 @@
             bool is_close_handler_called_{false};
 
 
-                // **WARNING**
-                // SETTING THIS PREVENTS THE OBJECT FROM BEING DELETED,
-                // AND WILL ABSOLUTELY CAUSE A MEMORY LEAK!!
-                // ONLY USE IF THE APPLICATION IS BEING TERMINATED!!
-                bool do_not_destroy_{false};
-                // **WARNING**
-
-                std::atomic<int>& websocket_count_;
+            // **WARNING**
+            // SETTING THIS PREVENTS THE OBJECT FROM BEING DELETED,
+            // AND WILL ABSOLUTELY CAUSE A MEMORY LEAK!!
+            // ONLY USE IF THE APPLICATION IS BEING TERMINATED!!
+            bool do_not_destroy_{false};
+            // **WARNING**
+
+            std::atomic<int>& websocket_count_;
 
             std::function<void(crow::websocket::connection&)> open_handler_;
             std::function<void(crow::websocket::connection&, const std::string&, bool)> message_handler_;

@The-EDev
Copy link
Member Author

However I was proven wrong when I did the opposite, deleting only the connection and leaving the adaptor intact. I still got SEGFAULT

This makes sense because the adaptor is moved to be a member of the websocket connection, thus deleting the connection would also delete the adaptor, which is most likely the reason behind the segfault.

@The-EDev
Copy link
Member Author

The-EDev commented May 18, 2022

Adding adaptor_.shutdown_readwrite(); seems to resolve the segfault issue.. Will make a commit soon

@crow-clang-format
Copy link

--- include/crow/websocket.h	(before formatting)
+++ include/crow/websocket.h	(after formatting)
@@ -106,7 +106,7 @@
 
 
                 signals_.clear();
-                for (auto snum: handler_->signals())
+                for (auto snum : handler_->signals())
                     signals_.add(snum);
 
                 // Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
@@ -118,12 +118,13 @@
                 s.getDigestBytes(digest);
 
                 signals_.async_wait(
-                    [&](const boost::system::error_code& e, int /*signal_number*/){
-                        if (!e){
-                        CROW_LOG_INFO << "Quitting Websocket: " << this;
-                        close("Server Application Terminated");
-                        }
-                    });
+                  [&](const boost::system::error_code& e, int /*signal_number*/) {
+                      if (!e)
+                      {
+                          CROW_LOG_INFO << "Quitting Websocket: " << this;
+                          close("Server Application Terminated");
+                      }
+                  });
                 start(crow::utility::base64encode((unsigned char*)digest, 20));
             }
 

@The-EDev The-EDev marked this pull request as ready for review May 18, 2022 09:41
@The-EDev
Copy link
Member Author

@mrozigor @luca-schlecker I believe this PR is ready to be merged, Could you please take a look at it? Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Close WebSocket connections on exit
2 participants