Algos visibility #1542

Merged
miguelgfierro merged 8 commits into staging from miguelgfierro-patch-2
Nov 5, 2021

miguelgfierro
Collaborator

Description

I was talking with a colleague who is interested in graph recommenders. I realized that it is not straightforward for people who don't know the repo to find out that there are sometimes quickstart notebooks and sometimes deep dive notebooks.

I'm not sure what the best way is to give visibility to all the content. Maybe we could consider adding the deep dives to the main README? Another option would be to add a comment in the markdown of each deep dive notebook referencing the quickstart, and vice versa.

@anargyri what do you think?

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@anargyri
Collaborator

anargyri commented Oct 8, 2021

How about including both deep dive and quick start for each algorithm in the table?
It seems that increasingly people use this table as a comprehensive list of the notebooks that are available.

@miguelgfierro
Collaborator Author

> How about including both deep dive and quick start for each algorithm in the table?

Makes sense

@miguelgfierro
Collaborator Author

miguelgfierro commented Oct 21, 2021

I'm going to put here some options:

Option 1 (Original):

| Algorithm | Environment | Type | Description |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | PySpark | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability |

Option 2 (add extra column):

| Algorithm | Environment | Type | Description | Example |
| --- | --- | --- | --- | --- |
| Alternating Least Squares (ALS) | PySpark | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability | Quick start / Deep dive |

Option 3 (remove environment column and add it in description):

| Algorithm | Type | Description | Example |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability | Quick start in PySpark / Deep dive in PySpark |

Option 3.5 (remove environment column and add it in description):

| Algorithm | Type | Description | Example |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability | Quick start / Deep dive |

Option 4 (option 3 with reordered columns):

| Algorithm | Example | Type | Description |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | Quick start in PySpark / Deep dive in PySpark | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability |

Option 5 (variation of 4 removing the word "Python" to save space):

| Algorithm | Python Notebook | Type | Description |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | Quick start in Spark / Deep dive in Spark | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability |

@laserprec
Contributor

laserprec commented Oct 21, 2021

> I'm going to put here some options:

I would vote for option 3, as our eyes are naturally drawn to either end of the page. I am not sure how important it is to provide the environment information to users. As a first-time user, I didn't find the environment info that helpful, because I would try out an interesting algorithm regardless of the environment anyway 😄.
I would prefer [image] without the clutter.

Contributor

@laserprec laserprec left a comment

My preference (modified option 3) 😄

Option 3.5 (remove environment column and add it in description):

| Algorithm | Type | Description | Example |
| --- | --- | --- | --- |
| Alternating Least Squares (ALS) | Collaborative Filtering | Matrix factorization algorithm for explicit or implicit feedback in large datasets, optimized by Spark MLLib for scalability and distributed computing capability | Quick start / Deep dive |

@anargyri
Collaborator

I would vote for option 2 or option 3.5. I think each algorithm we have maps one-to-one to an environment, so both notebooks use the same environment. You can specify it in a separate column or omit it. I lean towards including the environment info because, for example, some people want to try the repo on a machine without a GPU, and it is convenient to see which methods may be slower because of this.

@miguelgfierro
Collaborator Author

@laserprec I added option 3.5 to the list.

@yueguoguo @loomlike @gramhagen @wutaomsft any preference?

@miguelgfierro
Collaborator Author

OK, I'll implement option 3.5, as suggested by @anargyri and @laserprec.

@miguelgfierro
Collaborator Author

@anargyri @laserprec, please feel free to review again. If you find a typo or a small thing that needs to be changed, feel free to change it directly, as on other occasions.

@codecov-commenter

Codecov Report

Merging #1542 (69cbc64) into staging (2306b2b) will decrease coverage by 0.23%.
The diff coverage is n/a.

❗ Current head 69cbc64 differs from pull request most recent head 7ee14fb. Consider uploading reports for the commit 7ee14fb to get more accurate results

@@             Coverage Diff             @@
##           staging    #1542      +/-   ##
===========================================
- Coverage    62.36%   62.13%   -0.24%     
===========================================
  Files           85       84       -1     
  Lines         8503     8437      -66     
===========================================
- Hits          5303     5242      -61     
+ Misses        3200     3195       -5     
| Flag | Coverage Δ |
| --- | --- |
| pr-gate | 62.13% <ø> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ |
| --- | --- |
| recommenders/datasets/movielens.py | 68.44% <0.00%> (-5.47%) ⬇️ |
| recommenders/models/newsrec/newsrec_utils.py | 66.66% <0.00%> (-3.34%) ⬇️ |
| recommenders/models/geoimc/geoimc_data.py | 39.13% <0.00%> (-2.54%) ⬇️ |
| ...ommenders/models/deeprec/io/sequential_iterator.py | 14.34% <0.00%> (-1.51%) ⬇️ |
| recommenders/models/lightfm/lightfm_utils.py | 62.50% <0.00%> (-1.02%) ⬇️ |
| recommenders/models/geoimc/geoimc_algorithm.py | 85.50% <0.00%> (-0.80%) ⬇️ |
| ...ders/models/deeprec/models/sequential/sum_cells.py | 44.75% <0.00%> (-0.77%) ⬇️ |
| recommenders/models/surprise/surprise_utils.py | 79.31% <0.00%> (-0.69%) ⬇️ |
| recommenders/models/newsrec/models/base_model.py | 30.48% <0.00%> (-0.43%) ⬇️ |
| recommenders/models/newsrec/models/layers.py | 52.99% <0.00%> (-0.40%) ⬇️ |

... and 29 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 448982d...7ee14fb. Read the comment docs.
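
For readers who want to sanity-check the Codecov numbers, the percentages in the coverage diff can be re-derived from the hit and line counts. This is only a minimal sketch in Python, assuming coverage is simply hits divided by lines. The helper name `coverage_pct` is hypothetical, and Codecov's exact rounding is not stated in this report, so the last digit may differ from what it displays.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Return line coverage as a percentage (assumed: hits / lines * 100)."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    return 100.0 * hits / lines


# Counts copied from the Codecov report in this thread.
base = {"hits": 5303, "misses": 3200, "lines": 8503}  # staging
head = {"hits": 5242, "misses": 3195, "lines": 8437}  # PR #1542

base_cov = coverage_pct(base["hits"], base["lines"])
head_cov = coverage_pct(head["hits"], head["lines"])
delta = head_cov - base_cov  # negative means coverage went down

print(f"base {base_cov:.2f}%  head {head_cov:.2f}%  delta {delta:+.2f}%")
```

In both columns, hits plus misses equals the line count, which is a quick check that the numbers were copied correctly.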

@miguelgfierro
Collaborator Author

@laserprec @anargyri when you are available, can you please review again?

@laserprec
Contributor

> @laserprec @anargyri when you are available, can you please review again?

I double checked all the links again. Looks good!

@miguelgfierro miguelgfierro merged commit 095f382 into staging Nov 5, 2021
@miguelgfierro miguelgfierro deleted the miguelgfierro-patch-2 branch November 5, 2021 10:26
4 participants