Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update for the go-ipfs-cmds refactor #5035

Merged
merged 11 commits into from
Sep 19, 2018
Merged

update for the go-ipfs-cmds refactor #5035

merged 11 commits into from
Sep 19, 2018

Conversation

keks
Copy link
Contributor

@keks keks commented May 24, 2018

This branch contains changes to go-ipfs so we can use it with this branch of go-ipfs-cmds: ipfs/go-ipfs-cmds#112

@keks keks requested a review from Kubuxu as a code owner May 24, 2018 11:33
@ghost ghost assigned keks May 24, 2018
@ghost ghost added the status/in-progress In progress label May 24, 2018
@keks keks force-pushed the feat/cmds2 branch 7 times, most recently from 8d7381a to 2daf766 Compare May 25, 2018 11:14
@keks keks force-pushed the feat/cmds2 branch 2 times, most recently from 473884a to 9cd9bd4 Compare August 16, 2018 16:05
@keks
Copy link
Contributor Author

keks commented Aug 22, 2018

@Stebalien sharness test t0272-urlstore.shfails and I'm not sure how to fix it.

The issue is that after adding a file through the urlstore, the subsequent get call will successfully fetch everything and then error with failed to fetch all nodes, so it seems like some if the DAG nodes are not properly added. However, the resulting file matches the file that was added (i.e. file3 matches file3.actual in the trash directory). I traced the error through the code and it is returned by res.Emit(reader), where reader is a *"unixfs/io".PBDagReader.

The furthest I have traced the error is that it is returned by a promise in the PBDagReader, and now I'm somewhat lost because that can only return either an error or the data, and we have all the data, so where does the error come from? I was hoping it was some race condition and the nodes are not written yet when the call to get comes in, but sleeping for a second before doing that doesn't help either.

I'm pretty puzzled and meandering through unfamiliar code, so I thought you might have an idea where that issue might come from.

@Stebalien
Copy link
Member

I'm pretty puzzled and meandering through unfamiliar code, so I thought you might have an idea where that issue might come from.

The answer is: not your bug, your patch just makes us correctly report errors. I'll look into it.

@keks
Copy link
Contributor Author

keks commented Aug 23, 2018

That did the trick, thanks! 🌈

@keks keks force-pushed the feat/cmds2 branch 4 times, most recently from d1e73de to 57e7b62 Compare August 27, 2018 16:53
@keks
Copy link
Contributor Author

keks commented Aug 27, 2018

@Stebalien I removed the remaining debug output and rebased to master. This is now ready for review.

// Inject metrics before we do anything
err := mprome.Inject()
if err != nil {
log.Errorf("Injecting prometheus handler for metrics failed with message: %s\n", err.Error())
return fmt.Errorf("Injecting prometheus handler for metrics failed with message %s", err.Error())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't want to return here. This can fail for various reasons.

if ch, ok := out.(chan interface{}); ok {
out = (<-chan interface{})(ch)
// don't emit nil or Single{nil}
if out == nil || out == (cmds.Single{Value: nil}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not emitting nil correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I did this is that I had some command sometimes emit a nil at the beginning. This led to the http response emitter call the preamble with nil, which led to wrong state in the http response. I don't see a reason to actually send nils, so I figured it's easiest to just ignore them. I don't remember which command that was, though.

Do you think this is wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried we should be closing the output (if we're emitting a Single). However, this is probably fine (we'll close anyways).


_, err := statGetFormatOptions(req)
if err != nil {
res.SetError(err, cmdkit.ErrClient)
// REVIEW NOTE: We didn't return here before, was that correct?
return cmdkit.Errorf(cmdkit.ErrClient, err.Error())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a bug.

}

defer func() {
err := wfd.Close()
if err != nil {
re.SetError(err, cmdkit.ErrNormal)
re.CloseWithError(cmdkit.Errorf(cmdkit.ErrNormal, err.Error()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful not to swallow any errors that may have caused this condition. We should probably:

  1. Use a named return value.
  2. Store the close error in, e.g., err2.
  3. Replace the returned error if it's nil.

Sound reasonable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Is this what you had in mind?

if !ok {
// TODO or just return the error here?
log.Error(e.New(e.TypeErr(outReader, v)))
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return the error (IMO).

}
select {
case <-time.After(interval):
case <-req.Context.Done():
return
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return req.Context.Err()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would return an error and non-zero exit code to the user, even though everything was in order. Do you think this is a good idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. You're right.

}

cmds.EmitOnce(res, BlockStat{
err = cmds.EmitOnce(res, &BlockStat{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can return directly.

@@ -26,6 +26,7 @@ fi
# it's too late to pass in --verbose, and --verbose is harder
# to pass through in some cases.
test "$TEST_VERBOSE" = 1 && verbose=t
test "$TEST_IMMEDIATE" = 1 && immediate=t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, if you want we can remove this, but in that case I'd open a new PR for this.

When running single tests, I usually run them with ./txxxx-foo-bar.sh -v -i, which results in the test aborting immediately after an error. This makes finding the failing test in a large test file a lot easier. For some reason this behaviour could not be set using the env, meaning that it could not be used when invoked with make. This line allows enabling this mode using TEST_IMMEDIATE=1 make.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, sounds useful, let's keep it (although we should probably document it in the sharness readme).

@Stebalien
Copy link
Member

@keks: This has been unblocked. Once this is finished, we can merge ipfs/go-ipfs-cmds#112 and ship it.

@keks
Copy link
Contributor Author

keks commented Sep 17, 2018

@Stebalien The sharness test t0252 consistently fails:

expecting success: 
  ipfs refs $ADIR_HASH &&
  ipfs repo gc &&
  ipfs refs $ADIR_HASH

Error: failed to fetch all nodes
not ok 6 - gc okay after adding incomplete node
#	
#	  ipfs refs $ADIR_HASH &&
#	  ipfs repo gc &&
#	  ipfs refs $ADIR_HASH

I can't tell whether this is because of something else of my changes. I don't really see a connection. Also, sometimes the robust repo gc (t0087) fails, but I got used to that one.

@ghost ghost assigned Stebalien Sep 17, 2018
@Stebalien
Copy link
Member

The test was broken. I've fixed it.

}
r, ok := v.(*filestore.ListRes)
if !ok {
// TODO or just return that error? why didn't we do that before?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien What do you think should we do here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'm not sure why we didn't.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the gc change.

@@ -99,7 +99,7 @@ order to reclaim hard disk space.
}
}
if errs {
res.SetError(fmt.Errorf("encountered errors during gc run"), cmdkit.ErrNormal)
outChan <- &GcResult{Error: "encountered errors during gc run"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually want to return the error here. We spit out errors as we encounter them but this is supposed to cause the command to exit with an exit status of 1.

We should probably just switch this command away from the legacy commands system. Then we can drop the goroutine and just emit the values normally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure what I was thinking here...

@Stebalien
Copy link
Member

Fixing the gc command should fix travis. I'm not sure why jenkins didn't fail...

@keks
Copy link
Contributor Author

keks commented Sep 19, 2018

Fixing the gc command should fix travis. I'm not sure why jenkins didn't fail...

As far as I can see the issue is that it doesn't report all gc errors (some of the greps fail). That doesn't mean we shouldn't use SetError for the final error (we should), but I believe there is another error here, but I doubt it's in the commands library. I also had that test fail in other branches when reviewing PRs, so I'm confident it has nothing to do with this change set. I just ran the test with the race detector on and it didn't report anything 🤷‍♂️

@keks
Copy link
Contributor Author

keks commented Sep 19, 2018

Looks like making repo gc use cmds 2 fixed the instability. @Stebalien I think I addressed all your concerns, want to merge this?

@Stebalien Stebalien changed the title [WIP] branch to track and test several changes to cmds library branch to track and test several changes to cmds library Sep 19, 2018
@Stebalien
Copy link
Member

@keks LGTM but I'd like @magik6k to take a look.

@magik6k Any feedback and/or objections to these changes? I've done a pretty thorough code review (don't feel like you need to do a line-by-line audit) but I'd like to get a final sign-off from you before we actually merge this into go-ipfs.

Summary:

  1. We now return errors instead of using SetError. This alone ends up removing a bunch of code.
  2. We've removed support for multiple "error" responses. This was never really properly supported in, e.g., javascript, but go technically supported it.
  3. We've made a bunch of low-level changes to reduce race conditions in the commands lib (esp around error handling).

@Stebalien Stebalien changed the title branch to track and test several changes to cmds library update for the go-ipfs-cmds refactor Sep 19, 2018
keks and others added 10 commits September 19, 2018 14:35
excerpt of commit messages:
- update postrun functions in core/commands
- sharness: allow setting -i with TEST_IMMEDIATE=1
- cmds Run func returns error now
- gx update cmdkit to 1.1.2 and cmds to 2.0.0-beta1

License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
`ipfs refs` won't work because we don't have the referenced objects (ipfs refs
fetches everything it prints).

License: MIT
Signed-off-by: Steven Allen <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: keks <[email protected]>
License: MIT
Signed-off-by: Steven Allen <[email protected]>
License: MIT
Signed-off-by: Steven Allen <[email protected]>
@Stebalien Stebalien merged commit 3538257 into master Sep 19, 2018
@ghost ghost removed the status/in-progress In progress label Sep 19, 2018
@Stebalien Stebalien deleted the feat/cmds2 branch September 19, 2018 22:30
@Stebalien
Copy link
Member

🎉 🎉 🎉

@keks
Copy link
Contributor Author

keks commented Sep 19, 2018

🎉 🎊

hacdias pushed a commit to ipfs/boxo that referenced this pull request Jan 27, 2023
update for the go-ipfs-cmds refactor

This commit was moved from ipfs/kubo@3538257
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants