Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sortedTakeBy and sortedReverseTakeBy to Aggregator.scala #527

Merged
merged 1 commit into from
Jun 7, 2016
Merged

Add sortedTakeBy and sortedReverseTakeBy to Aggregator.scala #527

merged 1 commit into from
Jun 7, 2016

Conversation

joshualande
Copy link
Contributor

You can accomplish the same thing by passing Ordering.by(fn) directly to the sortedTake function, but I think adding these new functions makes the code a bit easier to use and will make downstream code more readable.

I didn’t see any existing unit tests on Aggregator.sortedTake, so I didn’t add any tests to this code. But would be willing to add some additional tests if we feel it is necessary.

@joshualande
Copy link
Contributor Author

Fixing a merging/pushing issue. I think the PR should be good now.

@@ -183,11 +183,21 @@ object Aggregator extends java.io.Serializable {
def sortedTake[T: Ordering](count: Int): MonoidAggregator[T, PriorityQueue[T], Seq[T]] =
new mutable.PriorityQueueToListAggregator[T](count)
/**
* Same as sorteTake, but using a function that returns a value that has an Ordering.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ("sorteTake")

@avibryant
Copy link
Contributor

These seem fine, but I wonder if the most usable API would be to add sort and sortBy to Preparer - I'm imagining that this would return some SortedPreparer object that had methods like head, last, reverse, and take (each of which would produce an Aggregator).

@joshualande
Copy link
Contributor Author

@avibryant

I fixed the typo! Great catch.

With your alternate API suggestion, is the idea that the code to build an equivalent function would be:

Preparer.sortBy[Type](_.timestamp).take(10)

vs

Aggregator.sortedTakeBy[Type, Long](10)(_.timestamp)

If so, would we then make Aggregator.sortedTakeBy a thin wrapper around the Preparer code?

One last question is, wouldn't it make more sense to call the functions in SortedPreparer 'max' and 'min' instead of 'head' and tail for clarity?

Thanks for your help!!!

@@ -183,11 +183,21 @@ object Aggregator extends java.io.Serializable {
def sortedTake[T: Ordering](count: Int): MonoidAggregator[T, PriorityQueue[T], Seq[T]] =
new mutable.PriorityQueueToListAggregator[T](count)
/**
* Same as sortedTake, but using a function that returns a value that has an Ordering.
*/
def sortedTakeBy[T, U: Ordering](count: Int)(fn: T => U): MonoidAggregator[T, PriorityQueue[T], Seq[T]] =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call this sortByTake mirroring the style of using scala names? This is like sortBy(fn).take(count)

@johnynek
Copy link
Collaborator

johnynek commented Jun 6, 2016

+1 to add to the Preparer, but I would leave your implementations as is (just call the corresponding sortedTake method).

@johnynek
Copy link
Collaborator

johnynek commented Jun 7, 2016

@joshualande can you branch this diff off develop? I think you have master here, which is why you have all these tags and merges mentioned

@joshualande
Copy link
Contributor Author

@johnynek great catch. I messed it the merge, then fixed it, then messed it up again, but I think its all good now :) Still getting the hang of github PRs!

I updated the function names and comments. I would be happy to try the Preparing approach later when I've got more free time, but do you think it would make more sense to ship this first since the API seems pretty straightforward?

@johnynek johnynek merged commit 0ba730e into twitter:develop Jun 7, 2016
@avibryant
Copy link
Contributor

@joshualande I think it would actually be Preparer[Type].sortBy(_.timestamp).take(10) but yes, that's what I was thinking.

Re: min vs head. I guess once I think of it as a sorted list, then head makes more sense and is more consistent with what we do on TypedPipe. But I can see preferring maxBy to sortBy followed by head.

@joshualande
Copy link
Contributor Author

@avibryant @johnynek thanks so much for your help!!! I really appreciate your suggestions, and it is great to have a chance to contribute to the codebase!

I'll give this Preparer code a try when I've got a bit more free time!!

@johnynek
Copy link
Collaborator

johnynek commented Jun 7, 2016

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants