Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API call to pubsub/sub does not return HTTP header until first message arrives #3304

Closed
keks opened this issue Oct 14, 2016 · 33 comments · Fixed by #3366
Closed

API call to pubsub/sub does not return HTTP header until first message arrives #3304

keks opened this issue Oct 14, 2016 · 33 comments · Fixed by #3366

Comments

@keks
Copy link
Contributor

keks commented Oct 14, 2016

This way calling "go-ipfs-api".Request.Send() blocks until the first message is received. Which is not nice.

keks added a commit to keks/go-ipfs that referenced this issue Oct 15, 2016
keks added a commit to keks/go-ipfs that referenced this issue Oct 15, 2016
keks added a commit to keks/go-ipfs that referenced this issue Oct 15, 2016
keks added a commit to keks/go-ipfs that referenced this issue Nov 7, 2016
keks added a commit to keks/go-ipfs that referenced this issue Nov 7, 2016
keks added a commit to keks/go-ipfs that referenced this issue Nov 7, 2016
…yet. fixes ipfs#3304 (ipfs#3305)"

This reverts commit 68d8a29.

License: MIT
Signed-off-by: Jan Winkelmann <[email protected]>
keks added a commit to keks/go-ipfs that referenced this issue Nov 7, 2016
Revert "http api: makes sure header is sent even when r is not ready yet. fixes ipfs#3304 (ipfs#3305)"
This reverts commit 68d8a29.

License: MIT
Signed-off-by: Jan Winkelmann <[email protected]>

License: MIT
keks added a commit to keks/go-ipfs that referenced this issue Nov 7, 2016
Revert "http api: makes sure header is sent even when r is not ready yet. fixes ipfs#3304 (ipfs#3305)"
This reverts commit 68d8a29.

License: MIT
Signed-off-by: Jan Winkelmann <[email protected]>
@keks
Copy link
Contributor Author

keks commented Nov 7, 2016

I guess we can reopen this...

This is also related to #3367.

@keks
Copy link
Contributor Author

keks commented Nov 10, 2016

@whyrusleeping can you reopen? Thanks!

@Kubuxu Kubuxu reopened this Nov 10, 2016
@keks
Copy link
Contributor Author

keks commented Nov 10, 2016

Thanks for reopening.
I currently think the easiest way to fix this is to send a {"subscribed":true, "topics": [...]} message in the beginning. That way the problem just doesn't occur anymore. However, existing clients will probably be confused by that message, though I believe atm there is only JS and Go, right? So that's easy to address.
The alternative would be to send some whitespace character, but that seems quite hackish.

What do you think of this?

ping @haadcode

@whyrusleeping
Copy link
Member

@keks i'm open to changing the api here. the pubsub code is clearly marked as experimental, so if we need to change things to make it better, lets do it.

@keks
Copy link
Contributor Author

keks commented Nov 10, 2016

Sure.

If we're not going to make big changes here, I feel like this corner case should be well-documented so when someone stumbles into it they don't have to dig as deep as I did. It's cool doing it once, but already at the second time it's a waste of time. You get to know the code base though, which is good I guess :)

@whyrusleeping
Copy link
Member

haha, where do you think that documentation would best fit?

@keks
Copy link
Contributor Author

keks commented Nov 10, 2016

I hoped you knew that :)

Am 10. November 2016 21:49:09 MEZ, schrieb Jeromy Johnson [email protected]:

haha, where do you think that documentation would best fit?

You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#3304 (comment)

@whyrusleeping
Copy link
Member

Maybe we should write a small guide on 'how to write a new ipfs command' ?

@keks
Copy link
Contributor Author

keks commented Nov 14, 2016

Let's move this discussion to #3367 and concentrate on the the pubsub issue here.

@keks
Copy link
Contributor Author

keks commented Nov 14, 2016

I looked into how the PubsubSub command works and it seems kind of tricky to sneek in something that isn't a "floodsub".Message. Neither sending a newline nor a message that says {"subscribed":true} seems to be possible without either restructuring the command code big time or changing the floodsub message type.

I lean towards rewriting the pubsub command code but that seems like it will be some work. What do you think @whyrusleeping?

@whyrusleeping
Copy link
Member

@keks we could do something like send a message towards the user with no topic ID set, and then catch that as a 'hello, welcome' message. That seems unintrusive enough to me.

@keks
Copy link
Contributor Author

keks commented Nov 15, 2016

Yeah, but that seems clumsy from a protocol perspective. I had the idea of a "status" topic, similar to the status channel on irc. You would then receive a message saying "you are now subscribed to topic x". I think that is unintrusive and expressive, I'll proceed later today.

Another problem ist that I need to import floodsub/pb to create messages because "floodsub".Message is a struct that only embeds *"floodsub/pb".Message. Alternatively I can make a function in floodsub that generates this kind of messages. What do you think? Import the generated protobuf code or create these messages in the floodsub package?

@haadcode
Copy link
Member

Sorry to be late to the game. Yeah, this has been a problem in js-ipfs-api too, so getting a confirmation message/callback would be great!

@whyrusleeping
Copy link
Member

@keks we could have a different struct defined in the floodsub commands code that is basically the same as the floodsub.Message struct, but has an extra field that would allow us to patch in that "hello".

I'm hesitant to add a message that actually says something because then users have to ignore that every time. It needs to be very easily distinguishable from data (such that making the mistake of considering that message to be data becomes very difficult).

@ghost
Copy link

ghost commented Nov 16, 2016

This sounds like a commands/corehttp implementation detail leaking through. Can we fix this there before adding complexity in different place?

@keks
Copy link
Contributor Author

keks commented Nov 16, 2016

@whyrusleeping We can make *-ipfs-api ignore that message

@lgierth I think you mean commands/http? In that case I agree and I tried, but I didn't find a way. I believe the relevant function is sendResponse, and here we don't know whether the request body has been read completly or whether MultipartReader.NextPart has been called already. That means we don't know whether it's safe to Write to the output. I presume you know the http part pretty well so I'd love to stand corrected.

Edit: I guess we could add a ReadyToFlush() method to Response that is called by the command to indicate that from now on it's fine to flush. However it seems wrong to bloat interfaces for a rare corner case. Also I'm still not sure (a) what actually should happen in that method and (b) how to handle that in sendResponse.

Edit2: Maybe we could add a Flush method to Response that produces a FlushError (I made that up) on the Response.Read() that is caught by flushCopy, which goes on to call Flush on the http.ResponseWriter? Kind of ugly because it uses errors for control flow but well...

@keks
Copy link
Contributor Author

keks commented Nov 17, 2016

By the way, why don't we just write to the output Writer directly? Why have the extra copying step in flushCopy? Being able to just type assert the response to an http.Flusher would be super convenient and solve this problem elegantly.
I'm currently doing what I proposed in Edit2, but these changes don't go into commands/http either but into commands.response. So again we don't know whether the target is a flushable http.ResponseWriter or something else.

Another thing I did is dig into how the commands are called.
Apparently the code that decides whether a command is called locally or remotely is in main.go. It seems requests like add or pubsub/sub will be called using the HTTP API anyway, because commandShouldRunOnDaemon will return a client for them.
Is that enough to assume that these command's output will be processed by sendResponse, so we don't need to add checks for the DoFlushErr all over the IPFS code base?

I know this again does not seem very clean, but we simply can't do it in commands/http alone as long as we don't pass the actual io.Writer (be it os.Stdout or an http.ResponseWriter) to Command.Call.

What do you propose how to proceed @whyrusleeping @lgierth?

@haadcode
Copy link
Member

Actually, having tried to fix things on JS side, getting this fixed is required to make things work nicely in js-ipfs-api. Hope this is prioritized.

@keks
Copy link
Contributor Author

keks commented Nov 17, 2016

Yeah, I'm trying to wrap my head around this. I don't come up with an elegant solution though, the command interface is really weird.

@whyrusleeping
Copy link
Member

the command interface is really weird.

Yeah. Agreed.

@keks the reasoning for the flushCopy function was that things like ipfs ping, that send periodic small updates, were being buffered together. So ipfs ping would hang for a bit, then send me back a bunch of pings, then hang... So the solution we needed was to flush after each part of the response was written back.

@whyrusleeping
Copy link
Member

@keks

We can make *-ipfs-api ignore that message

Yeah, I think this is probably our best solution for now. We should be careful to make sure that api clients don't come to expect this hello message though. Once we fix the underlying problem, we will want to remove it

@keks
Copy link
Contributor Author

keks commented Nov 18, 2016

@whyrusleeping

the reasoning for the flushCopy function was that things like ipfs ping, that send periodic small updates, were being buffered together. So ipfs ping would hang for a bit, then send me back a bunch of pings, then hang... So the solution we needed was to flush after each part of the response was written back.

Well but why don't we just expose the output writer to the command? Sure, we couldn't set the channel as output, but what is the benefit of that? Why not simply write a Marshaller from the type e.g. like this:

type PubsubSubMarshaller struct {
  w io.Writer
}

func (m *PubsubSubMarshaller) Write(msg *PubsubMessage) error {
  ...
}

func NewPubsubSubMarshaller(w io.Writer) {
  ...
  if f, ok := w.(http.Flusher); ok {
    ...
  }
}

You would call all this in the Command.Run function. Not that you could also call Flush there. And then, instead of sending values to the output channel, just call PubsubSubMarshaller.Write().

I'd love to be convinced that the way we do things now are actually a good idea but this seems pretty over-engineered. E.g. the channel abstraction appears not to be helpful at all and you just lose information about what you are working with.

I tried to find a way to do this with the current structure but failed. I made a type that passes on the Reads and returns 0, nil in case we should flush. For this I needed a seperate goroutine that reads from the reader and returns the value in a channel so we can select it together with the flushCh channel that is written to when Response.Flush is being called. This didn't really work very well though.

Really, please. Let's just do the Marshalling and Unmarshalling "manually" (e.g. create a type that does this for us) and expose the http.ResponseWriter or os.Stdout directly. We can still flush in between calls to Marshall ourselves - or even in the Marshaller.

I am pretty tired and I hope you get what I'm getting at. Let's pass the response writer to Command.Call instead of copying the output to it. I even presume this will improve performance due to less copying.

@haadcode
Copy link
Member

I have an elegant workaround for this in my (test) code:

out <- string("{}")

Works beautifully ;)

@keks
Copy link
Contributor Author

keks commented Nov 18, 2016

I was trying to solve the more general problem as @lgierth suggested in #3304 (comment). Also, if what you post works, out <- "{}" should also do the trick :)

@keks
Copy link
Contributor Author

keks commented Nov 18, 2016

@haadcode where do you put this? I remember having tried to put something like that before this line, but the commands lib only expects *floodsub.Message because that's what the supplied Marshaler accepts.

Really, there is too much magic going on here. Let's keep it simple and pass the io.Writer to Command.Call and marshal manually. @whyrusleeping @lgierth what do you think about this? I'd love to go forward in the next couple of hours.

Also if I should proceed we should open a new issue that reflects the current take on things.

@keks
Copy link
Contributor Author

keks commented Nov 18, 2016

what do you think about this? I'd love to go forward in the next couple of hours.

Oh and also @Kubuxu

@keks
Copy link
Contributor Author

keks commented Nov 18, 2016

Here is a gist that gives an example for my approach: https://gist.github.com/keks/0f2412fa616129cfc1af3ff011f67d7c

Imagine the dashes in file names were slashes - gist doesn't allow directories

@whyrusleeping
Copy link
Member

@keks At first glance that seems fine to me, the only caveat is we have to make sure to handle the case where we arent using the API. In that case we arent marshaling to json, we're using the objects directly. The abstraction here is that the commands shouldnt care that they are going over an http transport, this commands lib was designed to be agnostic to how it was being run. (i'm not saying this is the best design, its just what we have and we can't break the api for users at this point)

If you can make the changes you propose without having to change the API of every other command, then go right ahead.

@haadcode
Copy link
Member

@haadcode where do you put this? I remember having tried to put something like that before this line, but the commands lib only expects *floodsub.Message because that's what the supplied Marshaler accepts.

I put the out <- string("{}") right there where you tried it too. Dunno why it works, but it does the trick.

@keks
Copy link
Contributor Author

keks commented Nov 19, 2016

@whyrusleeping

In that case we arent marshaling to json, we're using the objects directly.

Thanks for weighing in. I wasn't aware of this, but this definitely makes sense. That makes my proposal difficult, though.
I guess we could use different responder types that each have some methods specific to the actual transport, and you need to type-assert to access them.

If you can make the changes you propose without having to change the API of every other command, then go right ahead.

I'll try to find a design that also works for sending values directly (no API) and a good transitioning plan so we don't need to change all at once.

@haadcode

I put the out <- string("{}") right there where you tried it too. Dunno why it works, but it does the trick.

Ohhh, now I get what's happening. I was testing with ipfs pubsub sub, and the unmarshaling failed. The strings are actually being sent over the wire, but the client can't parse them. I'll try to find a fix, but I'm not yet sure where to look.

@whyrusleeping
Copy link
Member

@keks maybe we can expose some kind of "Start Response" method through the response object that different commands can call (optionally) as needed.

@keks
Copy link
Contributor Author

keks commented Nov 21, 2016

So I have a workaround ready in https://github.com/keks/go-ipfs/tree/feat/instaflush but it needs #3402 to merge first. I realized fixing changing the command API needs more time, planning, and people actively involved in the process. Would love to chat about this during 33C3!

@haadcode
Copy link
Member

I tested @keks' PRs above this morning and the tests I wrote for js-ipfs-api pass with flying colors. That is to say: what is PRed here fixes the problems we've been having on the JS side! \o/

👍 for getting these merged asap.

@keks keks closed this as completed Dec 9, 2016
liliuhai added a commit to peersafe/go-ipfs that referenced this issue Dec 26, 2016
* commit '6f3ae5da293f971c09e3e6fc0763aeca04ba98d3':
  rename DataSize to FileSize
  update HashOnRead validation to properly support cids
  http api: makes sure header is sent even when r is not ready yet. fixes ipfs#3304 (ipfs#3305)
  fix add/cat of small files
  raw dag: make raw nodes work in cat and get, add tests
  feat: make metrics injection log an error instead of warning
  test: check if metrics work in sharness
  Fix metrics being injected after node initalization
  unixfs: allow use of raw merkledag nodes for unixfs files
  ds-help: add helper func to convert from Cid to DsKey and the reverse
  cli: refactor to expose argument parsing functionality

# Conflicts:
#	core/commands/add.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@keks @whyrusleeping @Kubuxu @haadcode and others