Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/utp #1789

Closed
wants to merge 2 commits into from
Closed

Feat/utp #1789

wants to merge 2 commits into from

Conversation

whyrusleeping
Copy link
Member

Implement utp support

@jbenet jbenet added the status/in-progress In progress label Oct 4, 2015
@whyrusleeping
Copy link
Member Author

still WIP btw

@jbenet
Copy link
Member

jbenet commented Oct 4, 2015

exciting! 👍 👏

@ghost ghost mentioned this pull request Oct 4, 2015
88 tasks
useLocalAddr := true
for _, p := range raddr.Protocols() {
if p.Name == "utp" {
useLocalAddr = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the lib problem? we need to fix it. in many situations people will not be able to open another port. they need to be able to specify the one port everything happens through.

@jbenet
Copy link
Member

jbenet commented Oct 4, 2015

  • problem: just adding a utp addr doesn't work: Fixed. wrong binary :/
> ipfs config Addresses.Swarm
[
  "/ip4/0.0.0.0/tcp/4001",
  "/ip4/0.0.0.0/udp/4003/utp"
]
> ipfs daemon
Initializing daemon...
Swarm listening on /ip4/10.0.1.12/tcp/4001
Swarm listening on /ip4/108.54.113.161/tcp/37304
Swarm listening on /ip4/127.0.0.1/tcp/4001
Swarm listening on /ip4/127.94.0.1/tcp/4001
API server listening on /ip4/127.0.0.1/tcp/5001
Gateway (readonly) server listening on /ip4/0.0.0.0/tcp/8080
Daemon is ready

no utp. :(

with debug logs i see:

DEBU[06:24:05:000] Config.Addresses.Swarm:[/ip4/0.0.0.0/tcp/4001 /ip4/0.0.0.0/udp/4003/utp]  module=core
DEBU[06:24:05:000] Config.Addresses.Swarm:[/ip4/0.0.0.0/tcp/4001] (filtered)  module=core

it's getting filtered out.

fixed. wrong binary. :(

@@ -305,7 +294,7 @@ func (s *Swarm) dial(ctx context.Context, p peer.ID) (*Conn, error) {
remoteAddrs = addrutil.Subtract(remoteAddrs, ila)
remoteAddrs = addrutil.Subtract(remoteAddrs, s.peers.Addrs(s.local))

log.Debugf("%s swarm dialing %s -- local:%s remote:%s", s.local, p, s.ListenAddresses(), remoteAddrs)
log.Errorf("%s swarm dialing %s -- local:%s remote:%s", s.local, p, s.ListenAddresses(), remoteAddrs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should stay as debug

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, its totally an error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a statement that is always executed. and it doesnt have an error, it's just informative. i already switched it back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, i guess you missed the sarcasm. I'll use tags next time

@jbenet
Copy link
Member

jbenet commented Oct 4, 2015

Perf is not good. i'm getting <1MB/s with ipfs cat on the same machine.

Also CPU is pegged and RAM is ~300MB, though the file is 200MB, and only have 50MB of it...

finished:

200.00 MB 4m20s

that's 0.77 MB/s

@jbenet
Copy link
Member

jbenet commented Oct 4, 2015

in another run, with an isolated network of 5 nodes:

> IPFS_PATH=4 ipfs get -o r4 QmXipyrjt2Vzjg2vV7uhgCi8GNEFhrEdXrirtDcYhqGGEc
Saving file(s) to r4
200.00 MB 1m7s

but

> IPFS_PATH=2 ipfs cat QmXipyrjt2Vzjg2vV7uhgCi8GNEFhrEdXrirtDcYhqGGEc >r2

is just completely stuck. daemon responds to other queries though.

>  IPFS_PATH=2 ipfs id -f='<id>\n'
QmdpL8zu4GTPvse64R2LFrxDZRC9gnj8hvirfnwRq4q3Xs
> IPFS_PATH=2 ipfs bitswap stat
bitswap status
    provides buffer: 0 / 256
    blocks received: 1
    dup blocks received: 0
    dup data received: 0
    wantlist [5 keys]
        QmZymseMQJKLVmzx1W8GyKzUL1zcgF9NEXVAwTxnWvSVWS
        QmetGENiJ9YCUyUuYyJDpGHXLYichh9hqZnsBACqUaFc5F
        QmYVkwVgmtbWKdfAjNrVTbytaKUTF3ixXP5YbBMfHrhGQ6
        QmS3JfCbMmydsh1TtPiDGE39eimoP5nsej8JVjjjpEjcAp
        QmSgusHjqd3pT79FUWv8TmGwUWA2E4F4tHmBV5Dn7CbqhL
    partners [2]
        QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
        QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b
  • (3) QmNRSUw5jiDmzb4xq33wgHC5acRibJ8kZiv6iUfnsfPBDb is creator.
  • (4) QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b has it (one of the partners)

> IPFS_PATH=3 ipfs id -f='<id>\n'
QmNRSUw5jiDmzb4xq33wgHC5acRibJ8kZiv6iUfnsfPBDb
> IPFS_PATH=3 ipfs bitswap stat
bitswap status
    provides buffer: 0 / 256
    blocks received: 0
    dup blocks received: 0
    dup data received: 0
    wantlist [0 keys]
    partners [2]
        QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
        QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b
> IPFS_PATH=4 ipfs id -f='<id>\n'
QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b
> IPFS_PATH=4 ipfs bitswap stat
bitswap status
    provides buffer: 0 / 256
    blocks received: 1120
    dup blocks received: 314
    dup data received: 82317612
    wantlist [0 keys]
    partners [2]
        QmNRSUw5jiDmzb4xq33wgHC5acRibJ8kZiv6iUfnsfPBDb
        QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59

so neither of these list (2) QmdpL8zu4GTPvse64R2LFrxDZRC9gnj8hvirfnwRq4q3Xs as a partner!


and yet they're all connected.

> IPFS_PATH=2 ipfs swarm peers
/ip4/10.0.1.12/udp/63558/utp/ipfs/QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
/ip4/127.0.0.1/udp/4100/utp/ipfs/QmTGQs7w852bh43JA44YKg2t3Gar4KRBk4fiswL9zgFXuj
/ip4/127.0.0.1/udp/63548/utp/ipfs/QmNRSUw5jiDmzb4xq33wgHC5acRibJ8kZiv6iUfnsfPBDb
/ip4/127.0.0.1/udp/63555/utp/ipfs/QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
/ip4/127.94.0.1/udp/63557/utp/ipfs/QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
/ip4/127.94.0.1/udp/63561/utp/ipfs/QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b
> IPFS_PATH=3 ipfs swarm peers
/ip4/10.0.1.12/udp/4101/utp/ipfs/QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59
/ip4/127.0.0.1/udp/4100/utp/ipfs/QmTGQs7w852bh43JA44YKg2t3Gar4KRBk4fiswL9zgFXuj
/ip4/127.0.0.1/udp/4102/utp/ipfs/QmdpL8zu4GTPvse64R2LFrxDZRC9gnj8hvirfnwRq4q3Xs
/ip4/127.0.0.1/udp/4104/utp/ipfs/QmdTUjEc5UtJ97aMReJTYGqA2Uvp6LG5VEzxfxm2S3485b
> IPFS_PATH=4 ipfs swarm peers
/ip4/127.0.0.1/udp/4100/utp/ipfs/QmTGQs7w852bh43JA44YKg2t3Gar4KRBk4fiswL9zgFXuj
/ip4/127.0.0.1/udp/63551/utp/ipfs/QmNRSUw5jiDmzb4xq33wgHC5acRibJ8kZiv6iUfnsfPBDb
/ip4/127.94.0.1/udp/4102/utp/ipfs/QmdpL8zu4GTPvse64R2LFrxDZRC9gnj8hvirfnwRq4q3Xs
/ip4/127.94.0.1/udp/63556/utp/ipfs/QmZtLnbWEnYyyNiVt7wobVdhXEPKxuhYEapPywF4zGbg59

this appears to be either bitswap being incredibly stupid, or the transport totally failing.


and yes, an iptb restart fixed the problem -- (2) succeeds. but the problem remains-- that shouldn't ever happen.

@jbenet
Copy link
Member

jbenet commented Oct 4, 2015

  • in another run, stalled at 170MB for 20s, then finished.

@anacrolix
Copy link
Contributor

Are the nodes that seem to stall, purely receiving at that time on utp. That is to say, are they only calling utp.Conn.Read? No writes?

@whyrusleeping
Copy link
Member Author

@anacrolix There may be interleaved reads and writes. we do duplex communication over the same pipe

@jbenet
Copy link
Member

jbenet commented Oct 5, 2015

@anacrolix no, they would be writing too, at least at some point. other things happening would cause samll amts of data to be written to all nodes (dht random queries for bootstrapping, bitswap wantlist updates, etc)

@jbenet
Copy link
Member

jbenet commented Oct 5, 2015

this stalling may be more a problem in ipfs than in utp. i'm not sure.

@whyrusleeping whyrusleeping force-pushed the feat/utp branch 2 times, most recently from 430b9df to e75c69a Compare October 5, 2015 05:41
@@ -265,3 +209,100 @@ func MultiaddrNetMatch(tgt ma.Multiaddr, srcs []ma.Multiaddr) ma.Multiaddr {
}
return nil
}

type ProtoDialer interface {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbenet I've taken and made the dialers an interface, just like you always wanted!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might belong somewhere better than here though, and the interface probably could use some tweaks, but this is the composition that i'm wanting.

@whyrusleeping
Copy link
Member Author

I added a utp option to iptb so you can easily generate a testnet using utp:

$ iptb init -n 10 --utp

@whyrusleeping
Copy link
Member Author

@jbenet how did you set up the iptb cluster when you reproduced the hangs?

@jbenet
Copy link
Member

jbenet commented Oct 6, 2015

@whyrusleeping i just set them up normally, and then changed all the addresses manually.

@jbenet
Copy link
Member

jbenet commented Oct 6, 2015

they were all able to talk to each other though

@whyrusleeping
Copy link
Member Author

so you set them to bootstrap to eachother?

}

// get the local net.Addr manually
la, err := manet.ToNetAddr(laddr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do this before the Listen. if errors, saves the listener. also this is currently wrong as the listener will not be closed if this errors here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, i do not think the transport should start listening just by being created. reuseport should allow listening after, (yes after a dial has opened on the same port).

@jbenet
Copy link
Member

jbenet commented Oct 27, 2015

@whyrusleeping this changeset is not viable. it has errors, reintroduces bugs, and reverts a ton of work that went into getting this interfaces to be well layered. (e.g. the swarm SHOULD NOT care at all about specific transports).

License: MIT
Signed-off-by: Jeromy <[email protected]>
@whyrusleeping
Copy link
Member Author

@jbenet
#1789 (comment)
#1789 (comment)

We can call the things something other than transports. Otherwise, how do you want the interface to look?

@whyrusleeping
Copy link
Member Author

@jbenet instead of going back and forth in comments for this, lets just have a hangout and hash it out

@whyrusleeping whyrusleeping mentioned this pull request Nov 2, 2015
47 tasks
@rht
Copy link
Contributor

rht commented Nov 25, 2015

It looks like the CR got stalled.

Will 0.4.0 ship first? The sooner this happens, the sooner pending modularizations (commands/files, gateway, go-libp2p, cmd/*) can happen? Even if modularization is bottlenecked by gx pathing, this has still smaller decision tree than utp interface's?

@jbenet
Copy link
Member

jbenet commented Nov 25, 2015

I think we did hash it out-- just needs rebasing over the new transports?

@whyrusleeping
Copy link
Member Author

yeah, this is going to go into libp2p, i have a PR ready, doesnt look like i've pushed it yet.

@rht
Copy link
Contributor

rht commented Nov 25, 2015

Which means, the remaining bottleneck for 0.3.10 is ipfs-update?

@whyrusleeping
Copy link
Member Author

yeap. ipfs-update is all thats in the way of 0.3.10. i'm pushing pretty much everything else into 0.4.0

@jbenet jbenet removed the status/in-progress In progress label Dec 9, 2015
@Kubuxu Kubuxu deleted the feat/utp branch February 27, 2017 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants