??- returns only the last tuple of a sequence #294
Comments
Which Cascading and Cascalog versions are you using? This reminds me of an iterator bug we fixed a while ago. — On Sat, Oct 31, 2015 at 6:15 PM, Timothy Galebach
|
I'm using cascalog 2.1.1. I haven't explicitly declared anything wrt cascading; I've just been following the project's readme to get started. Relevant portion of project.clj below:
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [cascalog "2.1.1"]]
  :profiles {:dev {:dependencies [[org.apache.hadoop/hadoop-core "1.2.1"]]}}
  :jvm-opts ["-Xms768m" "-Xmx768m"]) |
Yeah, this is fixed in 3.0.0-SNAPSHOT, which I think is the latest version off of master. Want to give that a shot? We're due for a new release for sure. — On Sat, Oct 31, 2015 at 5:22 PM, Timothy Galebach
|
Same issue occurs with these dependencies:
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [cascalog/cascalog-core "3.0.0-SNAPSHOT"]]
Is there a working project.clj I could take a look at? Once this gets resolved I'm guessing it will come down to a documentation issue, and I'm happy to submit a pull request for that. I also had some initial frustrations because the documentation doesn't mention needing to run (bootstrap-emacs) in CIDER, so that should probably be fixed as well. |
For some reason my internet connection is preventing me from launching a REPL (it blocks dependency downloads in Leiningen), but based on a different bug I THINK I have a guess about what's causing this. Can you give this branch a try? For some more background on the issue, check out the discussion in #251 along with the fix in #280. Also, any updates on documentation you want to send over would be huge. |
Trying that branch now, trying to build it and put in the local repo, but running into the issue that the sub-modules (cascalog-checkpoint, midje, etc) depend on cascalog-core, so I'm not able to compile them initially. I don't usually structure projects like this--how do you compile this structure? |
Ah, sorry- first, run "lein sub install" in the base directory. Thanks for trying this out! — On Sun, Nov 1, 2015 at 12:45 PM, Timothy Galebach
|
OK, that works for compilation/local repo installation. Unfortunately the bug still persists. If it's helpful, the log output in the repl says that Cascading 2.5.3 is being used currently. Thanks for the help so far! Have a project I'm transitioning over to hadoop as it's grown a lot, and I'd really like to go with cascalog on it, so hopefully can sort this out. |
This looks very related to #292. The folks over at that ticket figured out that this issue only shows up with Clojure 1.7.0. |
OK, I'll try going back to 1.6, thanks! |
That fixed it. I'm going to submit a pull request for docs that are a bit more current in a bit. |
This just bit me as well. I can confirm that switching to 1.6 fixes the issue, but it would be nice to have a 1.7-compatible fix. |
@metasoarous totally hear you. I'm happy to review any pull requests from folks who want to take this on! I'm not using Cascalog for my work these days, so I don't have time to fix bugs like this myself, but I am available on a consulting basis to fix bugs or add features. |
Hi @sritchie: I appreciate the offer. Right now, 1.7 isn't critical for us, but if it becomes necessary we'll keep that in mind. I mostly just wanted to add a second data point for posterity's sake :-) |
http://dev.clojure.org/jira/browse/CLJ-1738 (1.7 Compatibility Notes: iterator-seq change). Could this help? From the ticket:
Direction of this ticket changed at Rich's request. Prior description captured here: Clojure code that uses iterator-seq to wrap Java iterators that return the same mutable object on every call is broken by the chunked iterator-seq changes from CLJ-1669. Some examples where this occurs: Hadoop ReduceContextImpl$ValueIterator.
Approach: Switch iterator-seq back to non-chunked and change eduction to use the chunking iterator-seq strategy, as that was the original target. Retain the use of the chunked iterator seq in sequence over the TransformerIterator. |
Only ??- and ??<- use iterator-seq.
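The failure mode described in the CLJ-1738 excerpt can be reproduced without Hadoop. The sketch below is only an illustration: `reusing-iterator` and `drain` are hypothetical names, not part of Cascalog or Cascading, and the iterator is a stand-in for one that hands back the same mutable object on every call. Reading several elements ahead before looking at any of them (which is what chunking amounts to) leaves every element pointing at the last value, matching the "only the last tuple" symptom in this issue.

```clojure
;; Hypothetical stand-in for an iterator that reuses one mutable object.
(defn reusing-iterator
  "Returns a java.util.Iterator over `values` that refills and returns the
  SAME ArrayList on every call to .next."
  [values]
  (let [remaining (atom (seq values))
        holder    (java.util.ArrayList.)]
    (reify java.util.Iterator
      (hasNext [_] (boolean (seq @remaining)))
      (next [_]
        (.clear holder)
        (.add holder (first @remaining))
        (swap! remaining rest)
        holder))))

;; Hypothetical helper: eagerly pulls every element, applying `f` to each
;; element as soon as it is read and before .next is called again.
(defn drain
  [^java.util.Iterator iter f]
  (loop [acc []]
    (if (.hasNext iter)
      (recur (conj acc (f (.next iter))))
      acc)))

;; Without a copy, all three entries are the same ArrayList, which ends up
;; holding only the final value.
(def aliased (mapv vec (drain (reusing-iterator [1 2 3]) identity)))
;; aliased => [[3] [3] [3]]

;; Copying each element immediately preserves every value.
(def copied (drain (reusing-iterator [1 2 3]) vec))
;; copied => [[1] [2] [3]]
```

The key point is where the copy happens: it has to run before the next call to `.next`, which a read-ahead strategy cannot guarantee.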
@Nightlord this is really interesting, and probably the reason for the bug. Looks like a change like this may work:
(defn iter-seq [^java.util.Iterator iter f]
  (when (.hasNext iter)
    (lazy-seq
      (cons (f (.next iter))
            (iter-seq iter f))))) |
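As a rough check of the `iter-seq` idea suggested in the comment above, the sketch below runs it against a hypothetical iterator that reuses a single StringBuilder. `shared-builder-iterator` is an assumption made up for this illustration, and this is not necessarily how the eventual Cascalog fix was written. The helper only helps when `f` takes a copy of each element; with `identity` the aliasing remains.

```clojure
;; iter-seq as suggested in the thread: an unchunked lazy seq that applies
;; `f` to each element before the iterator is advanced again.
(defn iter-seq [^java.util.Iterator iter f]
  (when (.hasNext iter)
    (lazy-seq
      (cons (f (.next iter))
            (iter-seq iter f)))))

;; Hypothetical iterator that overwrites and returns one shared StringBuilder.
(defn shared-builder-iterator [words]
  (let [remaining (atom (seq words))
        sb        (StringBuilder.)]
    (reify java.util.Iterator
      (hasNext [_] (boolean (seq @remaining)))
      (next [_]
        (.setLength sb 0)
        (.append sb (str (first @remaining)))
        (swap! remaining rest)
        sb))))

;; `str` snapshots the builder's contents, so each element keeps its value.
(def with-copy
  (doall (iter-seq (shared-builder-iterator ["a" "b" "c"]) str)))
;; with-copy => ("a" "b" "c")

;; `identity` keeps the shared builder, so every element reads as the last one.
(def without-copy
  (map str (doall (iter-seq (shared-builder-iterator ["a" "b" "c"]) identity))))
;; without-copy => ("c" "c" "c")
```

Because each `lazy-seq` step realizes exactly one element, `f` always runs before the next `.next` call, which is the property a chunked read-ahead loses.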
@sritchie I fixed ??- and the CI build problem, and added profiles for Clojure 1.6 and 1.7. The build succeeds.
The following input on cascalog.playground:
returns
gives the correct result (10 unique names and ages).