Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apparent race condition with sync requests #92

Closed
alphapapa opened this issue Jul 9, 2018 · 14 comments
Closed

Apparent race condition with sync requests #92

alphapapa opened this issue Jul 9, 2018 · 14 comments
Milestone

Comments

@alphapapa
Copy link

alphapapa commented Jul 9, 2018

I have found that, somehow, it is possible to put request into a state in which synchronous curl requests return before the curl process has finished, so the request-response object has a data value of nil as soon as it returns.

For example, when request is in this state, a synchronous request returns immediately, and its data is nil:

(request-response-data (request "http://example.com" :sync t))  ;=> nil

But if I store the request object in a variable and then access its data, it works. Notice that the data element of the struct, the third element, is nil, but then when I access it with request-response-data after the setq returns, the data is available:

(setq argh (request "http://example.com" :sync t)) ;=> #s(request-response nil nil nil nil nil "http://example.com/" ...)
(request-response-data argh)                       ;=> data

But when I run another Emacs process and run this same code, it works correctly:

(request-response-data (request "http://example.com" :sync t)) ;=> data

Making this even more confusing, in the "broken" Emacs process, when I run the same code with edebug, everything works correctly. So when request is in this state, everything works correctly when run in edebug, but fails when run normally, which makes this very hard to debug. :(

I have not been able to figure out what happens to put request into this state. It might be connected with request-log-level, but even when setting that back to -1, it does not fix the problem.

Perhaps it's some kind of race condition?

Note: This library appears to be unmaintained now (Hello, @tkf?), so I'm filing this issue for future reference, as this is not the first time this has happened to me and wasted a few hours of time trying to debug it. Maybe someone else will encounter the same problem and be able to figure it out...

@alphapapa
Copy link
Author

After much digging I've found that when using the synchronous curl backend, the complete callback (which is used to prevent returning from request--curl-sync until the curl command finishes) is not passed through the chain of calls. It appears to be because the settings field is not set on the response object in the request--curl function. Adding it in that function at this position seems like the natural thing to do:

    (setf (request-response--buffer response) buffer)
    (setf (request-response-settings response) settings)  ; <-
    (process-put proc :request-response response)

This correctly propagates the :success callback.

However, this problem seems to remain unfixed. Adding timestamps to request-log shows that request--curl-sync (which should not return until all the callbacks have completed) returns before the callbacks run, which means that the request has not yet been parsed and set on the response object. Here's an except from the log:

08:52:43:094: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:322: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:327: [debug] `request--curl-sync' WAITING FOR CURL...
08:52:43:328: [debug] `request--curl-sync' done
08:52:43:328: [debug] `request' RETURNING: RESPONSE DATA: nil  ;; <- The (request ... :sync t) call returning before the callback, which is run on the next line.
08:52:43:352: [debug] REQUEST--CURL-CALLBACK event = finished

08:52:43:352: [debug] REQUEST--CURL-CALLBACK proc = #<process request curl>
08:52:43:352: [debug] REQUEST--CURL-CALLBACK buffer = #<buffer  *request curl*>
08:52:43:352: [debug] REQUEST--CURL-CALLBACK symbol-status = nil

@alphapapa
Copy link
Author

alphapapa commented Jul 9, 2018

Despite further digging I have been unable to figure out the problem. It is very strange, because when I start a new Emacs process, it works correctly at first, but at some point, it stops working correctly, and request--curl-sync returns before the callbacks have run.

I haven't been able to figure out what the catalyst is. Sometimes it seems to start happening after a period of time has passed. Other times it seems to happen immediately, even the first time I make a request.

  • I can't find any stale processes running in the background.
  • I've compared the debug output from a "good" session and a "bad" session, and the only difference is that the callbacks are run after request returns, with no apparent reason.
  • I've used the following code in an Org source block to check the values of all request variables, saving the outputs to files and using diff to compare their values in a "good" and a "bad" session, but I found no difference:
  (let ((result))
    (mapatoms (lambda (atom)
                (when (string-prefix-p "request-" (symbol-name atom))
                  (push (cond ((ignore-errors (symbol-value atom))
                               (list (symbol-name atom) (s-replace "|" "--" (format "%S" (symbol-value atom)))))
                              (t (list (symbol-name atom) nil)))
                        result))))
    result)

It's the same code running each time, but after some event happens, the callbacks start running at the wrong time, and AFAICT it never works again correctly in that Emacs session.

However, let-binding request-backend to the url-retrieve backend makes the sync requests work correctly. So maybe the workaround is to simply use the url-retrieve backend for synchronous requests (that is, when curl isn't required for some reason).

alphapapa added a commit to alphapapa/pocket-lib.el that referenced this issue Jul 9, 2018
alphapapa added a commit to alphapapa/pocket-lib.el that referenced this issue Jul 12, 2018
@alphapapa
Copy link
Author

For future reference, here's a patch with the debugging code I added:

diff --git a/emacs.d/elpa/request-20170131.1747/request.el b/emacs.d/elpa/request-20170131.1747/request.el
index 815745f..c6c084c 100644
--- a/emacs.d/elpa/request-20170131.1747/request.el
+++ b/emacs.d/elpa/request-20170131.1747/request.el
@@ -175,7 +175,7 @@ (defmacro request-log (level fmt &rest args)
          (log-level (request--log-level-as-int request-log-level))
          (msg-level (request--log-level-as-int request-message-level)))
      (when (<= level (max log-level msg-level))
-       (let ((msg (format "[%s] %s" ,level
+       (let ((msg (format "%s: [%s] %s" (format-time-string "%H:%M:%S:%3N") ,level
                           (condition-case err
                               (format ,fmt ,@args)
                             (error (format "
@@ -191,7 +191,7 @@ (defmacro request-log (level fmt &rest args)
                (goto-char (point-max))
                (insert msg "\n"))))
          (when (<= level msg-level)
-           (message "REQUEST %s" msg))))))
+           (message "%s: REQUEST %s" (format-time-string "%H:%M:%S:%3N") msg))))))
 
 
 ;;; HTTP specific utilities
@@ -579,6 +579,8 @@ (defun request-cookie-alist (host &optional localpart secure)
     (setf (request-response--timer response)
           (run-at-time timeout nil
                        #'request-response--timeout-callback response)))
+  ;; This is happening BEFORE the callback is called.
+  (request-log 'debug "`request' RETURNING: RESPONSE DATA: %s" (request-response-data response))
   response)
 
 (defun request--clean-header (response)
@@ -638,8 +640,7 @@ (defun request--parse-data (response parser)
 (cl-defun request--callback (buffer &key parser success error complete
                                     timeout status-code response
                                     &allow-other-keys)
-  (request-log 'debug "REQUEST--CALLBACK")
-  (request-log 'debug "(buffer-string) =\n%s"
+  (request-log 'debug "`request--callback' (buffer-string) =\n%s"
                (when (buffer-live-p buffer)
                  (with-current-buffer buffer (buffer-string))))
 
@@ -660,9 +661,9 @@ (defun request--parse-data (response parser)
     ;; Note: Try to do this even `error-thrown' is set.  For example,
     ;; timeout error can occur while downloading response body and
     ;; header is there in that case.
-
+    (request-log 'debug "`request--callback' RESPONSE DATA: %s" (request-response-data response))
     ;; Parse response body
-    (request-log 'debug "error-thrown = %S" error-thrown)
+    (request-log 'debug "`request--callback' error-thrown = %S" error-thrown)
     (condition-case err
         (request--parse-data response parser)
       (error
@@ -673,12 +674,13 @@ (defun request--parse-data (response parser)
          (setq error-thrown err)
          (request-log 'error "Error from parser %S: %S" parser err))))
     (kill-buffer buffer)
-    (request-log 'debug "data = %s" data)
+    (request-log 'debug "`request--callback' data = %s" (pp-to-string data))
 
     ;; Determine `symbol-status'
     (unless symbol-status
       (setq symbol-status (if error-thrown 'error 'success)))
-    (request-log 'debug "symbol-status = %s" symbol-status)
+    (request-log 'debug "`request--callback' symbol-status = %s" symbol-status)
+
 
     ;; Call callbacks
     (let ((args (list :data data
@@ -689,18 +691,20 @@ (defun request--parse-data (response parser)
              (cb (if success-p success error))
              (name (if success-p "success" "error")))
         (when cb
-          (request-log 'debug "Executing %s callback." name)
+          (request-log 'debug "`request--callback' Executing %s callback: %s.  DATA: %s" name success (request-response-data response))
           (request--safe-apply cb args)))
 
       (let ((cb (cdr (assq (request-response-status-code response)
                            status-code))))
         (when cb
-          (request-log 'debug "Executing status-code callback.")
+          (request-log 'debug "`request--callback' Executing status-code callback.")
           (request--safe-apply cb args)))
 
-      (when complete
-        (request-log 'debug "Executing complete callback.")
-        (request--safe-apply complete args)))
+      (if complete
+          (progn
+            (request-log 'debug "`request--callback' Executing complete callback.")
+            (request--safe-apply complete args))
+        (request-log 'debug "`request--callback' NO COMPLETE CALLBACK.")))
 
     (setq done-p t)
 
@@ -1039,7 +1043,8 @@          (default-directory (expand-file-name home-directory))
                     (apply #'request--curl-command url :files* files*
                            :response response settings)))
          (proc (apply #'start-file-process "request curl" buffer command)))
-    (request-log 'debug "Run: %s" (mapconcat 'identity command " "))
+    (request-log 'debug "`request--curl' Run: %s" (mapconcat 'identity command " "))
+    (request-log 'debug "`request--curl' Data: %s" (plist-get settings :data))
     (setf (request-response--buffer response) buffer)
     (process-put proc :request-response response)
     (set-process-coding-system proc 'binary 'binary)
@@ -1168,8 +1173,10 @@ (defun request--curl-callback (proc event)
                   :complete (lambda (&rest _) (setq finished t))
                   settings)
       (let ((proc (get-buffer-process (request-response--buffer response))))
-        (while (and (not finished) (request--process-live-p proc))
-          (accept-process-output proc))))))
+        (while (and (not finished) (process-live-p proc))
+          (request-log 'debug "`request--curl-sync' WAITING FOR CURL...")
+          (accept-process-output proc))
+        (request-log 'debug "`request--curl-sync' done")))))
 
 (defun request--curl-get-cookies (host localpart secure)
   (request--netscape-get-cookies (request--curl-cookie-jar)

@dickmao
Copy link
Collaborator

dickmao commented Oct 5, 2018

My reading of the code is that things are as they should be. The :sync t only ensures the curl process finishes before returning control, i.e.,

1171        (while (and (not finished) (process-live-p proc))

The request--curl-callback function only executes when emacs sees a change in process status.

1047    (set-process-sentinel proc #'request--curl-callback)))

That timing seems independent of whatever request.el is doing.

@alphapapa
Copy link
Author

@dickmao No, things are not as they should be:

  1. (request url :sync t) behaves differently depending on which backend is used. That's a bug.
  2. The behavior in question usually does not happen the first time a request is made. At some point, something is mutated in the Emacs process, and the behavior changes from that point. That's a bug.
  3. The point of synchronous requests is that, when the request completes, the response data is available. Were this not the case, the calling code would have to spin, waiting for the response data to become available. In that case, there would be no reason to use synchronous requests in the first place. This problem essentially defeats the purpose of the feature; it's a bug.

Your analysis of the code may have found the problem, though. The :sync t code should not return until the response has been processed.

alphapapa added a commit to alphapapa/matrix-client.el that referenced this issue Oct 10, 2018
It's impossible to upload files with url.el unless
url-http-create-request is overridden with a fixed version, and that
must be done with function advice (which applies to everything in
Emacs), because doing it with cl-letf doesn't work (who knows why--it
works fine for other things).  Obviously we don't want to override
that function globally, even though it works fine now, because when
url.el changes in the future, that might break something.  And it's
not a good idea, anyway.

So let's switch back to request.el.  Frankly, I don't even remember
why I switched to my url-with-retrieve-async function.  There is this
pesky bug at tkf/emacs-request#92, but that
only applies to synchronous requests, which we aren't using.
@titaniumbones
Copy link
Collaborator

@alphapapa is there a branch or forrk that fixes this problem? I'm using synchronous requests (because i din't understand how to do async code in elisp!) and seem to be encountering this bug. Thank you!

@dickmao
Copy link
Collaborator

dickmao commented Nov 5, 2018

If it's not a serious elisp project, I'd do something like:

(lexical-let (done-p)
  (with-local-quit  
    (request ... 
             :complete (lambda () setq done-p t)))
  (loop until done-p
        do (sleep-for 0 1000)))

Otherwise, you can follow the example in "Wrapping asynchronous function" in https://github.com/kiwanami/emacs-deferred . You don't need to fully understand what's going on upfront to get it working.

@titaniumbones
Copy link
Collaborator

@alphapapa would you be willing to prepare a PR from your patch? I can then try to test and merge. Thank you!

@alphapapa
Copy link
Author

@titaniumbones The patch above doesn't fix the problem, it just adds debugging code. @dickmao's analysis may help pinpoint where to fix the problem. I don't expect I'll be able to work on this issue anytime soon. Thanks for taking the initiative.

@titaniumbones
Copy link
Collaborator

I'm wondering if a starting point might be to devise a test that will (almost) always catch this issue, but pass if we can figure out a fix?

@alphapapa
Copy link
Author

Good luck. I wasn’t able to reliably reproduce it in hours of attempts. Sometimes it would happen immediately, sometimes much later, but once it happened, it happened until I restarted Emacs.

dickmao added a commit to dickmao/emacs-request that referenced this issue Nov 30, 2018
See dickmao's comment in tkf#92 regarding why `:sync` does not behave as
expected.

Previously, `request--curl-sync` would merely return as soon as the
curl process finished *without regard to
`#'request--curl-callback`.  This change attaches the semaphore to the
right place.

(Attempt) Fixes tkf#92
@dickmao
Copy link
Collaborator

dickmao commented Nov 30, 2018

My reading of the code is that things are as they should be. The :sync t only ensures the curl process finishes before returning control, i.e.,

1171        (while (and (not finished) (process-live-p proc))

Actually I'm wrong (and you're right) -- this should probably be an or instead of an and. I believe the while-loop exits when process-live-p becomes false -- the finished semaphore gets short shrift.

But my earlier comment about the timing of curl-callback being independent of whatever request.el is doing holds. I attempted a fix in #99 .

@titaniumbones titaniumbones added this to the 0.40 milestone Nov 30, 2018
dickmao added a commit to dickmao/emacs-request that referenced this issue Dec 6, 2018
See dickmao's comment in tkf#92 regarding why `:sync` does not behave as
expected.

Previously, `request--curl-sync` would merely return as soon as the
curl process finished *without regard to
`#'request--curl-callback`.  This change attaches the semaphore to the
right place.

(Attempt) Fixes tkf#92
@dickmao dickmao mentioned this issue Dec 6, 2018
@odanoburu
Copy link

can anyone confirm this is solved in the development branch? I was having a problem where I couldn't signal the error which occurred with a sync request, but now it seems to work.

if this is indeed closed, can we also make a new release?

@odanoburu
Copy link

gentle bump

dickmao added a commit to dickmao/emacs-request that referenced this issue Jun 21, 2019
See dickmao's comment in tkf#92 regarding why `:sync` does not behave as
expected.

Previously, `request--curl-sync` would merely return as soon as the
curl process finished *without regard to
`#'request--curl-callback`.  This change attaches the semaphore to the
right place.

(Attempt) Fixes tkf#92
dickmao added a commit to dickmao/emacs-request that referenced this issue Jun 21, 2019
See dickmao's comment in tkf#92 regarding why `:sync` does not behave as
expected.

Previously, `request--curl-sync` would merely return as soon as the
curl process finished *without regard to
`#'request--curl-callback`.  This change attaches the semaphore to the
right place.

(Attempt) Fixes tkf#92
dickmao added a commit to dickmao/emacs-request that referenced this issue Jun 21, 2019
See dickmao's comment in tkf#92 regarding why `:sync` does not behave as
expected.

Previously, `request--curl-sync` would merely return as soon as the
curl process finished *without regard to
`#'request--curl-callback`.  This change attaches the semaphore to the
right place.

(Attempt) Fixes tkf#92
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants