
@shahidki31
Contributor

When MatrixFactorizationModel.recommendProducts(Int, Int) is invoked with a nonexistent user, a java.util.NoSuchElementException is thrown:

java.util.NoSuchElementException: next on empty iterator
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:63)
at scala.collection.IterableLike$class.head(IterableLike.scala:107)
at scala.collection.mutable.WrappedArray.scala$collection$IndexedSeqOptimized$$super$head(WrappedArray.scala:35)
at scala.collection.IndexedSeqOptimized$class.head(IndexedSeqOptimized.scala:126)
at scala.collection.mutable.WrappedArray.head(WrappedArray.scala:35)
at org.apache.spark.mllib.recommendation.MatrixFactorizationModel.recommendProducts(MatrixFactorizationModel.scala:169)
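
The trace ends in `.head` being called on an empty lookup result. A minimal sketch of the same failure without Spark follows. The `lookup` helper and the in-memory `userFeatures` are hypothetical stand-ins for `RDD.lookup` and the model's feature RDD, not the library's code.

```scala
import scala.util.Try

object EmptyHeadDemo {
  // Hypothetical stand-in for userFeatures: (id, feature vector) pairs.
  val userFeatures: Seq[(Int, Array[Double])] = Seq(1 -> Array(0.1, 0.2))

  // Hypothetical helper with the same shape as RDD.lookup:
  // returns every value stored under `key`, or an empty Seq if there is none.
  def lookup(features: Seq[(Int, Array[Double])], key: Int): Seq[Array[Double]] =
    features.collect { case (id, vec) if id == key => vec }

  // User 42 is absent, so the lookup is empty and .head fails.
  val missing: Try[Array[Double]] = Try(lookup(userFeatures, 42).head)
}
```

The failure carries no information about which ID was missing, which is what the rest of the thread sets out to fix.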

What changes were proposed in this pull request?

Throw a more informative exception, such as "user-id/product-id not found in the model", for a nonexistent user or product.

How was this patch tested?

Added a unit test.

@shahidki31 changed the title from "[SPARK-18230][MLLib]Throw better exception,for a non-existing user/product" to "[SPARK-18230][MLLib]Throw better exception, if the user/product doesn't exist" on Jul 9, 2018
@shahidki31 changed the title from "[SPARK-18230][MLLib]Throw better exception, if the user/product doesn't exist" to "[SPARK-18230][MLLib]Throw better exception, if the user or product doesn't exist" on Jul 9, 2018
@shahidki31 changed the title from "[SPARK-18230][MLLib]Throw better exception, if the user or product doesn't exist" to "[SPARK-18230][MLLib]Throw a better exception, if the user or product doesn't exist" on Jul 9, 2018
@shahidki31 force-pushed the checkInvalidUserProduct branch from b0c31a7 to b3ef34f on July 10, 2018 07:02
@jianran left a comment

why is the IllegalArgumentException better than SparkException?

@shahidki31
Contributor Author

@jianran please refer to PR #15809. In this PR, I check whether 'userFeatures.lookup(user)' is empty.

Member

@srowen left a comment

Also see the JIRA for a discussion. The main reason is that NoSuchElementException doesn't really make sense semantically and has no useful message attached.

Member

I don't think it's necessary to assert about the exact message string. Maybe assert it contains the user ID. But just checking for the exception also seems close enough.
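
A sketch of the looser assertion suggested here: check the exception type and that the message mentions the ID, not the full string. `recommendFor` is a hypothetical stand-in for the model call, and the message text is an assumption, not necessarily the wording the patch uses.

```scala
import scala.util.{Failure, Try}

object MessageCheckDemo {
  // Hypothetical stand-in for a model call that rejects an unknown user.
  def recommendFor(user: Int, knownUsers: Set[Int]): Seq[Int] = {
    require(knownUsers.contains(user), s"userId: $user not found in the model")
    Seq.empty
  }

  // True only if the call fails with IllegalArgumentException
  // and the message mentions the offending ID.
  def failsMentioning(user: Int): Boolean =
    Try(recommendFor(user, Set(1, 2))) match {
      case Failure(e: IllegalArgumentException) => e.getMessage.contains(user.toString)
      case _ => false
    }
}
```

Matching on a substring keeps the test stable if the surrounding wording of the message changes later.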

Contributor Author

Thanks for the review. Done.

Member

The problem is you now look up the user and product twice. Instead, the callers should probably check whether userFeatures.lookup(user) is empty before calling .head and otherwise throw the exception.
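
To make the cost concrete, here is a small sketch that counts lookups under both approaches. The `Features` class is a hypothetical in-memory stand-in for the feature RDD; on a real RDD each `lookup` is a job, so the second call is not free.

```scala
object LookupCountDemo {
  // Hypothetical stand-in for an RDD-backed feature store that counts lookups.
  final class Features(data: Map[Int, Array[Double]]) {
    var lookups: Int = 0
    def lookup(key: Int): Seq[Array[Double]] = {
      lookups += 1
      data.get(key).toList
    }
  }

  // Validate first, then fetch again: two lookups per call.
  def validateThenGet(f: Features, user: Int): Array[Double] = {
    require(f.lookup(user).nonEmpty, s"userId: $user not found")
    f.lookup(user).head
  }

  // Fetch once, check the result, then take the head: one lookup per call.
  def getChecked(f: Features, user: Int): Array[Double] = {
    val vec = f.lookup(user)
    require(vec.nonEmpty, s"userId: $user not found")
    vec.head
  }
}
```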

Contributor Author

I have renamed the method to 'validateAndGetUser'. It checks whether the user exists and returns the corresponding user feature. The same applies to the product.
Please let me know if any more changes are required.

@shahidki31 force-pushed the checkInvalidUserProduct branch 4 times, most recently from d7315c4 to b07e07e on July 15, 2018 08:49

I'm new to the code base. I understand that Either and Option aren't used much in Spark's public APIs, but shouldn't the function be annotated to say that it throws a certain type of exception (to be more explicit about the exception)?

Like @throws(classOf[IllegalArgumentException])?

Member

Returning Option could be reasonable, but I think the caller would end up giving up with an exception anyway. There isn't an obviously better thing to do.

Exceptions all behave as if unchecked in Scala and don't get expressed in method signatures in the byte code. @throws is really for Java compatibility where it's important for the Java language to express that a checked exception is thrown (it makes a compile-time difference in Java).

That's not necessary here, but scaladoc'ing exceptions like this is indeed a good idea.
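
A sketch of the two forms being compared: a scaladoc `@throws` tag, which only documents the behavior, and the `@throws` annotation, which exists for Java callers. The method names and the `Map`-based feature store are hypothetical and used only for illustration.

```scala
object ThrowsDocDemo {
  /**
   * Returns the feature vector stored for `user`.
   *
   * @throws IllegalArgumentException if `user` is not present in `features`
   */
  def userVector(features: Map[Int, Array[Double]], user: Int): Array[Double] =
    features.getOrElse(
      user,
      throw new IllegalArgumentException(s"userId: $user not found"))

  // The annotation form declares the exception for Java callers.
  // Scala callers compile the same way with or without it.
  @throws(classOf[IllegalArgumentException])
  def userVectorForJava(features: Map[Int, Array[Double]], user: Int): Array[Double] =
    userVector(features, user)
}
```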

Thanks for the explanation Sean :)

Member

Why the pattern match? .lookup should already be known to return a Seq of the type of the RDD's values which is already known to be Array[Double].

The name of the method isn't really accurate either; it gets a user vector, not user.

val vec = userFeatures.lookup(user)
require(vec.nonEmpty, ...)
vec.head

How about one method instead of two, taking the RDD as an argument? That might cut down on the code expansion here. I'm on the fence about whether this warrants a new method since there are only two call sites, but it's reasonable.
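
One way the single-method idea could look, sketched without Spark. The `featureVector` helper, its `kind` parameter, and the `Map`-based stores are assumptions for illustration; they are not taken from the patch.

```scala
object SharedHelperDemo {
  // Hypothetical single helper shared by both call sites.
  // `kind` ("user" or "product") keeps the error message distinct.
  def featureVector(features: Map[Int, Array[Double]], id: Int, kind: String): Array[Double] = {
    val vec = features.get(id).toList // stand-in for features.lookup(id)
    require(vec.nonEmpty, s"${kind}Id: $id not found in the model")
    vec.head
  }

  val userFeatures: Map[Int, Array[Double]] = Map(1 -> Array(1.0, 0.0))
  val productFeatures: Map[Int, Array[Double]] = Map(10 -> Array(0.5, 0.5))

  // Prediction-style call site: dot product of the two vectors.
  def predict(user: Int, product: Int): Double = {
    val u = featureVector(userFeatures, user, "user")
    val p = featureVector(productFeatures, product, "product")
    u.zip(p).map { case (a, b) => a * b }.sum
  }
}
```

Passing a label is only one option. Inlining the check at each call site avoids the extra parameter at the cost of some repetition.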

Contributor Author

@shahidki31 Jul 15, 2018

Thank you for the comment.
I have removed both methods, because each one has to throw a distinct exception: "user not found" in the first and "product not found" in the second. So it is better to remove both methods than to merge them into one.
I have added the validation code inside the calling methods, such as predict and recommendProducts.

Contributor Author

@srowen Please let me know if any more changes are required.

@shahidki31 force-pushed the checkInvalidUserProduct branch from b07e07e to 01d43c8 on July 15, 2018 16:23
@SparkQA

SparkQA commented Jul 15, 2018

Test build #4211 has finished for PR 21740 at commit 01d43c8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shahidki31
Contributor Author

Hi @srowen. The build has passed.

@srowen
Member

srowen commented Jul 16, 2018

Merged to master. Your JIRA handle is "shahid" right?

@asfgit closed this in cf97045 on Jul 16, 2018
@shahidki31
Contributor Author

Thanks @srowen. yes, my JIRA handle is "shahid".

@shahidki31 deleted the checkInvalidUserProduct branch on July 16, 2018 15:16