Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack Overflow when I tried to save my stuff #85

Closed
freddycct opened this issue Apr 12, 2014 · 18 comments
Closed

Stack Overflow when I tried to save my stuff #85

freddycct opened this issue Apr 12, 2014 · 18 comments

Comments

@freddycct
Copy link

ERROR: stack overflow
in h5d_write at ~/.julia/v0.3/HDF5/src/plain.jl:1778

@ihnorton
Copy link
Member

Please provide some context. Otherwise this is not a bug report that we can do very much about.

@freddycct
Copy link
Author

type Edge
    speed::Float64
    distance::Float64

    function Edge(distance::Float64, speed::Float64)
        edge = new()
        edge.distance = distance
        edge.speed = speed
        edge
    end
end

type Bus_Stop
    id::Int64
    name::ASCIIString
    latitude::Float64
    longitude::Float64
    edges::Dict{Bus_Stop, Edge}

    Bus_Stop(id::Int64) = (bs = new(); bs.id = id; bs)
end

bus_stops = Dict{Int64, Bus_Stop}()

/* after doing some operations, creating Bus_Stop(s) in the dictionary bus_stops */

@save "file.jld" bus_stops

@timholy
Copy link
Member

timholy commented Apr 14, 2014

There's still not enough detail to replicate the problem. I entered all the code that is quoted above (note I made some edits to the order, and added quote marks---three backticks in a row), but I don't get any errors.

@freddycct
Copy link
Author

Have you created Edges too? Just to let you know there can be a situation like this...

A -> B -> C -> A

Cyclical graph, I think that's where the stack overflow comes from. And my graph is not that huge, 4000 nodes with 6000 Edges.

@timholy
Copy link
Member

timholy commented Apr 14, 2014

It's best if you can post a complete example that demonstrates the problem; it sometimes depends on details that I may never guess. (Bonus points if the example is also fairly minimal, although sometimes that's not straightforward, and any bug report is better than none at all.) In an open-source community where people are contributing code for free, it's also just good citizenship to avoid expecting that busy academics with priorities of their own will go to the trouble to try to read your mind 😄.

Nevertheless in this case I decided to try, and managed to come up with an example that demonstrates a problem even if it's not perhaps exactly what you're seeing. Here's a complete example:

type Edge
    speed::Float64
    distance::Float64

    function Edge(distance::Float64, speed::Float64)
        edge = new()
        edge.distance = distance
        edge.speed = speed
        edge
    end
end

type Bus_Stop
    id::Int64
    name::ASCIIString
    latitude::Float64
    longitude::Float64
    edges::Dict{Bus_Stop, Edge}

    Bus_Stop(id::Int64) = (bs = new(); bs.id = id; bs)
end

A = Bus_Stop(1)
B = Bus_Stop(2)
A.edges = [B => Edge(1.0,1.0)];
B.edges = [A => Edge(1.0,1.0)];

using HDF5, JLD

@save "/tmp/test.jld" A B

As far as fixing it goes, I will need to think this through a little bit. We need to make sure that any solution also works for the case where somebody starts from a pre-existing file, loads in the graph, adds some new nodes, and appends the new variables to the file.

@freddycct
Copy link
Author

Cool, so this simple test case didn't terminate quickly, I think this is what happens too. Thanks !!!

@freddycct
Copy link
Author

using HDF5, JLD

type Edge
    a::Float64
end

type Foo
    id::Int64
    edges::Dict{Foo, Edge}

    Foo(id::Int64) = (foo = new(); foo.id = id; foo)
end

begin
    A = Foo(1)
    B = Foo(2)
    A.edges = [B => Edge(1.0)]
    B.edges = [A => Edge(1.0)]

    @save "test.jld" A
end

Simplified the test case. I wonder is there a solution yet?

@timholy
Copy link
Member

timholy commented Apr 19, 2014

Not one that I've had time to implement, or really even think about. (Sorry. I have way too many packages I maintain, and open issues in many of them, as well as papers to write, grants to write, my own code to write, etc. HDF5 would benefit greatly from a new maintainer; I'd be happy to transfer this package to a JuliaLang location.) If you care to tackle this yourself and submit a PR, you're likely to get a solution faster.

As I'm sure you understand, the problem is handling cyclic dependencies. I suspect the best solution would be for the JLDFile to maintain a Set of known objects in the file. Before writing a new object, check to see if it's already there; if so, don't bother writing it again. You'd have to think through how to populate the Set; for a pre-existing file, I suspect it would be a bad idea to have to read the entire file when it's opened just to populate the Set. I suspect one could do something more clever, because of course you're not going to be able to add a reference to an object in the file unless you've read that object. So the Set could be a list of objects that have either been read or written within the current session. I think that would solve the problem, but of course it's only by implementation that one can be sure.

@freddycct
Copy link
Author

I will try to fix it, what's "PR" ? How do I pass you the solution, if I have one? Can I commit and git push?

@timholy
Copy link
Member

timholy commented Apr 19, 2014

Sorry, PR=pull request. Here's the workflow:

  1. Inside your package directory (probably ~/.julia/v0.3 if you're on Linux), go into Grid/src and make the changes that fix the problem.
  2. Run the runtests.jl script from within Grid/test to see if you've broken anything else 😄. If so, fix it.
  3. Add a test for your case to one of the test files (here, probably jld.jl). That will ensure that other people don't break your fix. "A bug fix is for today; a test is forever!"
  4. When you're happy, commit your changes to your own repository. For example, git commit -a from the command line, or use git gui. Write a commit comment describing your changes.
  5. Type Pkg.submit("HDF5"). Open the link that it gives you, and submit your pull request.
  6. Once I (or @simonster) has had a chance to look at your submission, you should expect to receive suggestions for improvement; the spirit of these is (1) to teach you more about issues that your own learning may not have yet led you to, and (2) to ensure the quality of code for other users. We'll walk you through how to make revisions when the time comes (it's not hard).

If you're new to git, it is awesome but can be intimidating at first. Fortunately the Julia package manager handles most of the issues in "simple" cases. If you encounter trouble, there's a lovely book, and a lot of online help. Nevertheless, if you have any questions, don't hesitate to ask---I'm happy to try to help shorten your learning curve. Three git-related points:

  • If you get into trouble, you can always say git checkout src/jld.jl to get a fresh copy of a file you've messed up. That will discard any changes you've made that have not been committed.
  • If you get to an intermediate stage that works and doesn't break anything else, you can make a commit. That way you "save" your work in case you get into trouble and have to revert.
  • The Pkg.submit command will send me all the commits you have made; you don't have to put them all in a single commit (although that's fine, too).

@timholy
Copy link
Member

timholy commented Apr 19, 2014

Oh, I should have also said: more detail about workflow is in the Package manager docs. You might be especially interested in the part about configuring git. Once it starts on the example of creating a new package, that's probably less helpful to you, and I didn't find a detailed description of how to make changes to an existing package. In particular, Pkg.submit() doesn't seem to be documented anywhere (CC @StefanKarpinski). Still, the documentation of the functions may also be quite helpful.

@StefanKarpinski
Copy link
Member

Pkg.submit() is still kind of experimental, but it does seem to work pretty well. I should probably figure out how to fix JuliaLang/julia#5998 and then start using the ability to make pull requests on GitHub a bit more universally and document these GitHub-interaction methods.

@freddycct
Copy link
Author

Can't do it, I tried to use a global Set variable to store the objects that have been written. But ended up with this error which involves the symbol table.

HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
#000: H5O.c line 246 in H5Oopen(): unable to open object
major: Symbol table
minor: Can't open object
#1: H5O.c line 1357 in H5O_open_name(): object not found
major: Symbol table
minor: Object not found
#2: H5Gloc.c line 430 in H5G_loc_find(): can't find object
major: Symbol table
minor: Object not found
#3: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#4: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
major: Symbol table
minor: Callback failed
#5: H5Gloc.c line 385 in H5G_loc_find_cb(): object '1' doesn't exist
major: Symbol table
minor: Object not found
ERROR: Error opening object /_refs/A/2g/1
in h5o_open at ~/.julia/v0.3/HDF5/src/plain.jl:1875

@timholy
Copy link
Member

timholy commented Apr 19, 2014

Not exactly sure what you tried, but a couple of points:

  • If you try to write to a file that's already open in a process, it will fail. When in doubt quit your Julia session and start a fresh one.
  • Simply adding a behind-the-scenes Set object shouldn't cause errors in the HDF5 library, since its only use is to check (on the Julia side) whether an object needs to be written.
  • It seems like you were trying to read an object from the file that doesn't yet exist. If you were only testing writing, you probably don't need to read anything from the file.

@freddycct
Copy link
Author

type Bar
    id::Int64
    edges::Array{Bar}

    Bar(id::Int64) = (bar = new(); bar.id = id; bar)
end

begin
    A = Bar(5)
    B = Bar(2)

    A.edges = Array(Bar, 1)
    A.edges[1] = B

    @save "test.jld" A B
end

When I start another julia session,

using HDF5, JLD
begin
       @load "test.jld"
end

The problem is that the Bar composite type does is not recognised. Another issue is

A.edges[1] == B

returns false. Seems to me that @save does not save the type definition and object references. That's why I cannot easily fix the circular dependency problem too.

@timholy
Copy link
Member

timholy commented Apr 20, 2014

The problem is that the Bar composite type does is not recognised

Modulo one issue which I'm (slowly) working on to finish resolving #81, there's enough information in the file to reconstruct the type. h5dump is your friend; see the info in the _types group.

However, in typical uses I'm not sure it would be all that useful: if you construct a Bar type in Main, but if in your code the Bar type is defined inside MyModule, then while they would both appear to be named Bar the two would not be equivalent and would not dispatch properly. Hence for "real" use cases I've decided (up for debate, of course) not to try to rebuild the type. Instead, the addrequire function is probably the most relevant. By happenstance, I improved that aspect of the documentation just yesterday, so if you're puzzled about not having noticed it previously, that's why 😄.

So if you create a file called Bar.jl, you can get your types back by saving this way:

begin
    A = Bar(5)
    B = Bar(2)

    A.edges = Array(Bar, 1)
    A.edges[1] = B

    jldopen("test.jld", "w") do file
        addrequire(file, "Bar")
        write(file, "A", A)
        write(file, "B", B)
    end
end

Another issue is A.edges[1] == B returns false

This is going to be the trickier one. With the writing command above, here's what's actually happening (I encourage you to follow along with h5dump):

  • The addrequire statement is handled, adding an item to the _require group of the file
  • A is written. In the "A" dataset, it makes a reference to _refs/A/1 and _refs/A/2. _refs/A/1 holds the integer value (5). _refs/A/2 refers to a whole new group 2g: this stores the instantiation of B, but not by the name of "B". (Julia doesn't store the fact that A.edges[1] is named B.)
  • dataset B is written. It makes a reference to _refs/B/1 (holding the integer value 2) and NULL (since edges is undefined for B). While this has the same data as _refs/A/2g, it's a separately-constructed object, and hence == returns false.

Note that if you defined isequal for the Bar type to compare values, rather than the default pointer addresses, you would get true for this statement, but that would still not be what you want because you'd actually have two different objects in memory; updating one of them would not change the other.

The Set idea should still be your friend:

  • If B is written first (creating dataset "B" that refers to _refs/B/1 and NULL), then when A is written, rather than creating the new group _refs/A/2g it should refer to the data in "B"
  • If A is written first, then when B is written it should refer to the data in _refs/A/2g.

However, I'm now skeptical that this will suffice; all it would do would be to fix up the references in the file, and reading would still create two separate objects.

Hmm. It seems like we need to introduce some type that indicates a Julia cross-reference, not to be evaluated by HDF5. When your Set construct detects that a referred object is already present in the file, it should trigger writing as a Julia cross-reference. Reading the B dataset should also trigger the reading (and naming) of A, so that all this works together.

@freddycct
Copy link
Author

I have been thinking of how to address this issue. It seems that I need to read the entire HDF5 specifications. A simple fix to the current jld.jl does not seem sufficient because the current code save values instead of references. I tried Julia built-in serialization too and it turns out that it also does not store references. Refer to this JuliaLang/julia#6590

@timholy
Copy link
Member

timholy commented Apr 20, 2014

I certainly hope you don't have to read the entire HDF5 spec, I wouldn't wish that on anyone! I've had to read much too much of it myself. I didn't even have prior experience with HDF5 before starting this package, it just seemed like someone had to implement a good saving format so I bit the bullet.

For your problem, I'm not certain of the right way forward either, but one option would seem to be to define a type CrossReference inside JLD. In the file you'd write the fields of the CrossReference type as values, but upon reading a CrossReference object would be decoded as a reference. I haven't thought through the precise details of making that work, but it seems like it should be doable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants