Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1399] multiclass-mcc metric enhancements #14874

Merged
merged 3 commits into from
Aug 29, 2019
Merged

Conversation

tlby
Copy link
Contributor

@tlby tlby commented May 3, 2019

Description

multiclass-mcc metric enhancements

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Rename metric from "PCC" to "MMCC" because though the math is derived
    from Pearson CC, it's utility is as a multiclass extension of Mathews CC.
  • Harden mx.metric.MMCC.update to more variations of input format, similar to
    mx.metric.Accuracy.update.

Comments

  • If this change is a backward incompatible change, why must this change be made.
    The name change from mx.metric.PCC() to mx.metric.MMCC() for this metric has not yet been included in any releases or tagged branches, and for anyone following master the code update would be trivial.

@tlby tlby requested a review from szha as a code owner May 3, 2019 23:01
 * Rename metric from "PCC" to "mMCC" because though the math is derived
   from Pearson CC, it's utility is as a multiclass extension of Mathews CC.
 * Harden mx.metric.mMCC.update to more variations of input format, similar to
   mx.metric.Accuracy.update.
@tlby
Copy link
Contributor Author

tlby commented May 4, 2019

verified that the additional checks in mx.metric.MMCC.update() are not a bottleneck.
bench.py

@tlby
Copy link
Contributor Author

tlby commented May 4, 2019

Oh, I should clarify this here; I discovered that use cases like the fastText example from gluon-nlp expect metrics to be comfortable taking predictions already .argmax() resolved. mx.metric.Accuracy has a special case for that and this patch makes mx.metric.MMCC flexible in the same way.

@szha
Copy link
Member

szha commented May 5, 2019

@tlby thanks for the patch! Since PCC has already been merged for almost a month, there might be people who already depend on PCC. If there is, then renaming PCC to MMCC will likely break them. Since PCC hasn't made into a release yet, it's technically not breaking a released API, but with that in mind, do you feel necessary to rename the class?

@tlby
Copy link
Contributor Author

tlby commented May 5, 2019

I don't think it's absolutely necessary, but I definitely regret the name choice because while the implementation came from a discrete derivation of the Pearson correlation coefficient, all a user cares about is that it's a multiclass extension of Matthews correlation coefficient. When it comes to using these metrics Pearson is for continuous and Matthews is for discrete predictions. This metric is discrete so I feel it was misnamed. I later discovered that the formula I used is equivalent to one mentioned in Wikipedia as an extension of Matthews.

Perhaps we should leave "pcc" as an alias for the sake of reverse compatibility?

@anirudhacharya
Copy link
Member

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label May 6, 2019
@karan6181
Copy link
Contributor

@szha Could you please review it again? Thanks!

@piyushghai
Copy link
Contributor

@szha Gentle ping...

@vandanavk
Copy link
Contributor

Is this PR good to go?

@szha
Copy link
Member

szha commented Jun 16, 2019

@karan6181 @piyushghai @vandanavk the class rename needs to be reverted

@karan6181
Copy link
Contributor

@szha I think @tlby addressed your review comments by renaming the class. Could you please review it again? Thanks!

@szha
Copy link
Member

szha commented Jul 21, 2019

Thanks @tlby! The only thing left is that we need test cases to cover both code paths.

@piyushghai
Copy link
Contributor

@szha Is this PR good to merge now ?

@lanking520 lanking520 merged commit 196d1f4 into apache:master Aug 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants