dgraphloader not loading schema quickly, or client should expose a way to wait for schema update to finish #1184

Closed
srh opened this issue Jul 13, 2017 · 5 comments
Labels
investigate Requires further investigation

@srh

srh commented Jul 13, 2017

Suppose you do this:

$ ~/go/bin/dgraphloader -r ~/1million.rdf.gz -s data/21million.schema 

Processing data/21million.schema

Processing /home/srh/1million.rdf.gz
[Request:      1] Total RDFs done:    72024 RDFs per second:   35878 Time Elapse
[Request:      1] Total RDFs done:    72024 RDFs per second:   17973 Time Elapse
[Request:      1] Total RDFs done:    72024 RDFs per second:   11989 Time Elapse
[Request:      1] Total RDFs done:    72024 RDFs per second:    8995 Time Elapse
[Request:      1] Total RDFs done:    72024 RDFs per second:    7197 Time Elapse
^C 10s 

Then when I run `curl localhost:8080/query -XPOST -d 'schema { }'` I get back `{"schema":null}`.

You would think that it would have the schema in place before loading the data.

It looks like the cause of this is that in client/mutations.go we have AddSchema being defined as

func (d *Dgraph) AddSchema(s protos.SchemaUpdate) error {
	if err := checkSchema(s); err != nil {
		return err
	}
	d.schema <- s
	return nil
}

It just pushes the schema onto a channel.

Is the reason for this that schema updates iterate over the db, so we want to apply them out-of-band? It looks like there's no way in the client to wait for a particular schema update to complete.
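To illustrate why the schema can still be missing right after AddSchema returns, here is a minimal, self-contained sketch of the fire-and-forget pattern. It is not the real client: `schemaUpdate`, `client`, `newClient`, and `drain` are hypothetical stand-ins, and the background worker is replaced by an explicit `drain` call so the ordering is easy to see.

```go
package main

import "fmt"

// schemaUpdate is a simplified, hypothetical stand-in for protos.SchemaUpdate.
type schemaUpdate struct {
	Predicate string
	Type      string
}

// client is a hypothetical stand-in for the Dgraph client: schema updates
// are queued on a channel and applied later by some other piece of code.
type client struct {
	schema  chan schemaUpdate
	applied []schemaUpdate
}

func newClient() *client {
	return &client{schema: make(chan schemaUpdate, 16)}
}

// AddSchema mirrors the shape of the method quoted above: it only enqueues
// the update and returns, without waiting for anything to be applied.
func (c *client) AddSchema(s schemaUpdate) error {
	c.schema <- s
	return nil
}

// drain applies everything currently queued. In a real client this work
// would happen in a background goroutine at some unspecified later time.
func (c *client) drain() {
	for {
		select {
		case s := <-c.schema:
			c.applied = append(c.applied, s)
		default:
			return
		}
	}
}

func main() {
	c := newClient()
	if err := c.AddSchema(schemaUpdate{Predicate: "name", Type: "string"}); err != nil {
		fmt.Println("AddSchema failed:", err)
		return
	}
	// AddSchema has returned, but nothing has been applied yet.
	fmt.Println("applied right after AddSchema:", len(c.applied))

	c.drain()
	fmt.Println("applied after the queue is processed:", len(c.applied))
}
```

A nil error from AddSchema in this sketch only means the update was queued, which matches the behaviour described above.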

@manishrjain
Contributor

It used to wait. @janardhan1993 ?

@srh
Author

srh commented Jul 13, 2017

It seems like it does

	// wait for schema changes to be done before starting mutations
	time.Sleep(1 * time.Second)

which I guess wasn't enough.

janardhan1993 self-assigned this Jul 13, 2017
janardhan1993 added the investigate (Requires further investigation) and kind/bug (Something is broken.) labels Jul 13, 2017
janardhan1993 added this to the v0.8 milestone Jul 13, 2017
@srh
Author

srh commented Jul 13, 2017

As I discussed with @janardhan1993, maybe this is more of a feature request. It would be nice if dgraphloader waited for the schema update to be applied before loading data. More generally, it would be useful if clients could call AddSchema and then wait for the result, because subsequent queries might need the indexes it specifies.

janardhan1993 modified the milestones: v0.8Maybe, v0.8 Jul 17, 2017
manishrjain modified the milestones: v0.8Maybe, v0.8.1 Jul 26, 2017
manishrjain removed the kind/bug (Something is broken.) label Jul 27, 2017
@manishrjain
Contributor

We can add an API, SetSchemaBlocking, which waits until the schema is applied. The user can then start sending data once it has returned.

janardhan1993 pushed a commit that referenced this issue Jul 28, 2017
* Exit on pressing Ctrl-c thrice, change raft maxmsgsize to 256 kb
* Fixes #1193 #1184
@janardhan1993
Contributor

We now set the schema, wait until it is applied, and only then start loading data.

manishrjain added the investigate (Requires further investigation) label Mar 22, 2018
jarifibrahim pushed a commit that referenced this issue Mar 16, 2020
Important changes
```
 - Changes to overlap check in compaction.
 - Remove 'this entry should've been caught' log.
 - Changes to write stalling on levels 0 and 1.
 - Compression is disabled by default in Badger.
 - Bloom filter caching in a separate ristretto cache.
 - Compression/Encryption in background.
 - Disable cache by default in badger.
```

The following new changes are being added from badger
`git log ab4352b00a17...91c31ebe8c22`

```
91c31eb Disable cache by default (#1257)
eaf64c0 Add separate cache for bloom filters (#1260)
1bcbefc Add BypassDirLock option (#1243)
c6c1e5e Add support for watching nil prefix in subscribe API (#1246)
b13b927 Compress/Encrypt Blocks in the background (#1227)
bdb2b13 fix changelog for v2.0.2 (#1244)
8dbc982 Add Dkron to README (#1241)
3d95b94 Remove coveralls from Travis Build(#1219)
5b4c0a6 Fix ValueThreshold for in-memory mode (#1235)
617ed7c Initialize vlog before starting compactions in db.Open (#1226)
e908818 Update CHANGELOG for Badger 2.0.2 release. (#1230)
bce069c Fix int overflow for 32bit (#1216)
e029e93 Remove ExampleDB_Subscribe Test (#1214)
8734e3a Add missing package to README for badger.NewEntry (#1223)
78d405a Replace t.Fatal with require.NoError in tests (#1213)
c51748e Fix flaky TestPageBufferReader2 test (#1210)
eee1602 Change else-if statements to idiomatic switch statements. (#1207)
3e25d77 Rework concurrency semantics of valueLog.maxFid (#1184) (#1187)
4676ca9 Add support for caching bloomfilters (#1204)
c3333a5 Disable compression and set ZSTD Compression Level to 1 (#1191)
0acb3f6 Fix L0/L1 stall test (#1201)
7e5a956 Support disabling the cache completely. (#1183) (#1185)
82381ac Update ristretto to version  8f368f2 (#1195)
3747be5 Improve write stalling on level 0 and 1
5870b7b Run all tests on CI (#1189)
01a00cb Add Jaegar to list of projects (#1192)
9d6512b Use fastRand instead of locked-rand in skiplist (#1173)
2698bfc Avoid sync in inmemory mode (#1190)
2a90c66 Remove the 'this entry should've caught' log from value.go (#1170)
0a06173 Fix checkOverlap in compaction (#1166)
0f2e629 Fix windows build (#1177)
03af216 Fix commit sha for WithInMemory in CHANGELOG. (#1172)
23a73cd Update CHANGELOG for v2.0.1 release. (#1181)
465f28a Cast sz to uint32 to fix compilation on 32 bit (#1175)
ea01d38 Rename option builder from WithInmemory to WithInMemory. (#1169)
df99253 Remove ErrGCInMemoryMode in CHANGELOG. (#1171)
8dfdd6d Adding changes for 2.0.1 so far (#1168)
```

3 participants