Bio.Tools metrics: quality, quantity, progress, and contribution indicators #113
Comments
One more note: "2. # of operations (with at least one EDAM data concept ≠ 0006, as input or output, and at least one EDAM operation concept ≠ 0004; summed over all entries)" means the # of DISTINCT operations within a Bio.Tools entry, where each operation can have multiple functions, i.e. EDAM operation concepts ≠ 0004. That leads to another SIMPLE and relevant metric, 2.5:
2.5. # of functions (i.e. # of EDAM operation concepts ≠ 0004, in operations with at least one EDAM data concept ≠ 0006 as input or output). That means that useless operations with neither inputs nor outputs are ignored, just like in 2.
Notably, both 2. and 2.5 are relevant and SIMPLE, and each is important and separately motivating for good annotations: 2. for well-annotated tools with multiple operations (e.g. toolkits), and 2.5 for well-annotated tools with integrated functionality (e.g. workflows). A corresponding quality metric (C.) should be added: 12.5. 2.5. ÷ 3. -- i.e. # of functions per # of entries, as well as a corresponding progress metric (D.) and a corresponding per-entry quality metric (E.). |
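To make the difference between 2. and 2.5 concrete, here is a minimal sketch of counting functions for a single entry. The entry layout (`operations`, `edam_operations`, `inputs`, `outputs`, with bare EDAM IDs as strings) is a simplified, hypothetical structure for illustration, not the actual bio.tools schema. It also assumes that functions are counted per operation, without de-duplicating across operations.

```python
ROOT_OPERATION = "operation_0004"  # EDAM root concept "Operation"
ROOT_DATA = "data_0006"            # EDAM root concept "Data"


def count_functions(entry):
    """Metric 2.5 for one entry: non-root EDAM operation concepts,
    counted only in operations that have at least one non-root
    EDAM data concept as input or output."""
    total = 0
    for op in entry.get("operations", []):  # hypothetical field name
        data = op.get("inputs", []) + op.get("outputs", [])
        if not any(d != ROOT_DATA for d in data):
            continue  # no meaningful input/output: ignored, as in metric 2.
        total += sum(1 for c in op.get("edam_operations", [])
                     if c != ROOT_OPERATION)
    return total


# Made-up example: one well-annotated operation with two real functions,
# and one operation with neither inputs nor outputs.
workflow = {
    "operations": [
        {"edam_operations": ["operation_0292", "operation_0324", "operation_0004"],
         "inputs": ["data_2044"], "outputs": []},
        {"edam_operations": ["operation_0292", "operation_0324"],
         "inputs": [], "outputs": []},
    ]
}
n_functions = count_functions(workflow)
```

Under these assumptions the first operation contributes two functions and the second contributes none, whereas metric 2. would count the same entry as one operation.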
Very useful - thanks a million for this proposal Matus. Enhanced content reporting is in the roadmap (http://biotools.readthedocs.io/en/latest/changelog_roadmap.html) for Dec 16 and could include much of this. ps. that more-or-less empty entry you pointed out was intentional: the BioExcel partners will be adding details in due course. We just needed to add them to bio.tools to give them a means to make edits. Really they should be in the "staging area" / marked as "beta", and this is in the roadmap for 2017 Q1. |
@matuskalas - we should def. pick up on this later in the year once other higher priority things are out the way. I label as "complex" because while each individual thing is easy enough to do, there are lots of them |
From #25:
Also stats for each annotation: that's the key ones right now (given current content) I think? |
@matuskalas - on Monday me and @ekry will finalise which of the above ideas will make it into the next revision of bio.tools/stats : do you have any more ideas to add? Thanks! |
I'd like to hear some suggestions about what metrics we could get in light of the groupings in the information standard (https://github.com/bio-tools/biotoolsSchemaDocs/blob/master/information_requirement.rst), see bio-tools/biotoolsSchema#77. Specifically, aggregated metrics to capture things like project maturity & community, as evidenced by things like repos, documentation, mailing list etc. |
Additional such metrics are an issue for biotoolsLint (https://github.com/bio-tools/biotoolsLint) to calculate, potentially. |
Please keep us (OpenEBench) in the loop, as we have implemented, are
implementing, and/or are looking for mechanisms for computing those metrics by
consulting several entries.
Cheers,
Salva
…On Tue, Oct 25, 2016 at 1:18 PM Matúš Kalaš ***@***.***> wrote:
As a preamble to this topic, a great example:
https://bio.tools/tool/Galaxy/version/none 👎
Here comes a list of metrics/indicators that are EXTREMELY EASY TO
IMPLEMENT, while at the same time *excellent indicators of quality,
quantity, contribution, and progress*.
Notes: The SIMPLEST and *most relevant* indicators are in *bold*. The
rest are additional that are similarly SIMPLE and *relevant*, but less
general (more specific). All the following has been mentioned and discussed
regularly since the EMBRACE Registry times, repeatedly in various meetings
in Amsterdam and Lyngby, including Kristoffer, @ekry
<https://github.com/ekry>, @joncison <https://github.com/joncison>,
@hmenager <https://github.com/hmenager>, Łukasz, me, Manchester folks,
and Gert.
A. Basic (=) quantity metrics
*1. # of attributes* (nodes or leaves in the JSON/XML tree; summed over
all entries)
*2. # of operations* (with at least one EDAM data concept ≠ 0006, as
input or output, and at least one EDAM operation concept ≠ 0004; summed
over all entries)
*3. # of entries*
These 3 should certainly be shown also on the top of the Bio.Tools "home
page", and then also on the top of each list/table of search results (then
of course *per* the found entries).
B. Community (=) contribution metrics
*4. # of updates of an entry* (summed over all entries)
*5. # of individual registrants/curators* (especially nice after the
anonymous registrant groups a.k.a. "affiliations" are split into real users)
6. # of registrant/curator institutions
7. # of authors/developers/contributors in *Credits*
8. # of institutions in *Credits*
9. # of publications (with distinct DOIs)
C. Quality metrics (for the whole registry)
*10. 1. ÷ 3. -- i.e. # of attributes per # of entries*
*11. 2. ÷ 3. -- i.e. # of operations per # of entries*
*12. 4. ÷ 3. -- i.e. # of all entry updates per # of entries*
13. 5. ÷ 3. -- *i.e.* # of registrants/curators *per* # of entries
14. 7. ÷ 3. -- *i.e.* # of authors/developers/contributors *per* # of
entries (possibly *etc.* with 6., 8., 9.)
D. Progress visualisation
- *The growth of ALL THE METRICS ABOVE over time* (especially *1. - 5.*
and *10. - 12.*; with *per*-day resolution)
- Note: A separate report should be published where the above growth
curves are plotted on the time-line together with hackathons' and
workshops' dates marked.
Note:
All the indicators A. - D. can also be internally (within
ELIXIR-EXCELERATE WP1) reported *PER* PARTNER plus *per* "the rest of the
contributors (*i.e.* non-EL-EX-WP1)". The only required dependency is to
first manually split all registrants into the "outreach and support
spheres" *per* EL-EX-WP1 partner. Some registrants can fall under
multiple partners, *e.g.* all de.NBI ones are supported by DK+NO+FR.
E. Quality metrics (for one entry)
- *The same as 1. - 2. and 4. - 9., BUT FOR THE GIVEN ENTRY*
- *The same as above, BUT PER CURRENT AVERAGES (i.e. per 10. - 14.)*
- Note: Both of these can be beautifully visualised with some tiny
icons in the entry cards/rows, and even in the future taken into account
when sorting search results.
|
will do @scapella ... any metrics we calculate internally will be strictly in scope of what data we have in bio.tools. Stuff above is obviously only a small slice of all the different metrics we (== ELIXIR) have been considering. |
As a preamble to this topic, a great example: https://bio.tools/tool/Galaxy/version/none 👎
Here comes a list of metrics/indicators that are EXTREMELY EASY TO IMPLEMENT, while at the same time excellent indicators of quality, quantity, contribution, and progress.
Notes: The SIMPLEST and most relevant indicators are in bold. The rest are additional that are similarly SIMPLE and relevant, but less general (more specific). All the following has been mentioned and discussed regularly since the EMBRACE Registry times, repeatedly in various meetings in Amsterdam and Lyngby, including Kristoffer, @ekry, @joncison, @hmenager, Łukasz, me, Manchester folks, and Gert.
A. Basic (=) quantity metrics
1. # of attributes (nodes or leaves in the JSON/XML tree; summed over all entries)
2. # of operations (with at least one EDAM data concept ≠ 0006, as input or output, and at least one EDAM operation concept ≠ 0004; summed over all entries)
3. # of entries
These 3 should certainly be shown also on the top of the Bio.Tools "home page", and then also on the top of each list/table of search results (then of course per the found entries).
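As a rough illustration of how little code metrics 1. - 3. need, here is a minimal sketch over a list of entries already parsed from JSON. The field names (`operations`, `edam_operations`, `inputs`, `outputs`) and the use of bare EDAM IDs are hypothetical simplifications, not the real bio.tools schema. "Attributes" is interpreted here as scalar leaves of the tree, which is only one possible reading of "nodes or leaves".

```python
ROOT_OPERATION = "operation_0004"  # EDAM root concept "Operation"
ROOT_DATA = "data_0006"            # EDAM root concept "Data"


def count_leaves(node):
    """Metric 1 (assumed reading): number of scalar leaves in a JSON tree."""
    if isinstance(node, dict):
        return sum(count_leaves(v) for v in node.values())
    if isinstance(node, list):
        return sum(count_leaves(v) for v in node)
    return 1


def is_annotated(operation):
    """True if the operation has a non-root data concept as input or
    output AND a non-root operation concept (the condition in metric 2)."""
    data = operation.get("inputs", []) + operation.get("outputs", [])
    has_data = any(d != ROOT_DATA for d in data)
    has_op = any(o != ROOT_OPERATION
                 for o in operation.get("edam_operations", []))
    return has_data and has_op


def basic_metrics(entries):
    """Metrics 1. - 3., summed over all entries."""
    return {
        "attributes": sum(count_leaves(e) for e in entries),
        "operations": sum(
            sum(1 for op in e.get("operations", []) if is_annotated(op))
            for e in entries
        ),
        "entries": len(entries),
    }


# Two made-up entries: one annotated properly, one with only root concepts.
entries = [
    {"name": "toolA",
     "operations": [{"edam_operations": ["operation_0292"],
                     "inputs": ["data_2044"], "outputs": []}]},
    {"name": "toolB",
     "operations": [{"edam_operations": ["operation_0004"],
                     "inputs": ["data_0006"], "outputs": []}]},
]
totals = basic_metrics(entries)
```

The same `basic_metrics` call could be applied to the subset of entries returned by a search, to get the per-result figures suggested above.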
B. Community (=) contribution metrics
4. # of updates of an entry (summed over all entries)
5. # of individual registrants/curators (especially nice after the anonymous registrant groups a.k.a. "affiliations" are split into real users)
6. # of registrant/curator institutions
7. # of authors/developers/contributors in Credits
8. # of institutions in Credits
9. # of publications (with distinct DOIs)
10. # of public repositories (GitHub etc.), and similar useful & non-mandatory attributes
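For 9., "distinct DOIs" needs a little normalisation, because DOIs are case-insensitive and are often written with a resolver prefix. A minimal sketch, assuming each entry carries a hypothetical `publications` list of DOI strings (not the real schema), with made-up DOIs in the example:

```python
# Common ways a DOI is prefixed in free-text fields.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(doi):
    """Lower-case a DOI and strip a leading resolver/scheme prefix."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def count_distinct_dois(entries):
    """Metric 9: number of distinct DOIs across all entries."""
    seen = set()
    for entry in entries:
        for doi in entry.get("publications", []):  # hypothetical field
            if doi and doi.strip():
                seen.add(normalise_doi(doi))
    return len(seen)


# Made-up DOIs: the first two differ only in case and prefix.
entries = [
    {"publications": ["10.1234/ABC", "https://doi.org/10.5678/xyz"]},
    {"publications": ["doi:10.1234/abc"]},
    {},
]
n_publications = count_distinct_dois(entries)
```

The same set-based counting would work for 5. - 8. and 10., given a suitable normalisation for names, institutions, or repository URLs.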
C. Quality metrics (for the whole registry)
11. 1. ÷ 3. -- i.e. # of attributes per # of entries
12. 2. ÷ 3. -- i.e. # of operations per # of entries
13. 4. ÷ 3. -- i.e. # of all entry updates per # of entries
14. 5. ÷ 3. -- i.e. # of registrants/curators per # of entries
15. 7. ÷ 3. -- i.e. # of authors/developers/contributors per # of entries (possibly etc. with 6., 8. - 10.)
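The quality metrics in C. are plain ratios over the totals from A. and B. A minimal sketch, assuming the totals are available as a dictionary with an `entries` key (the key names are illustrative only) and that an empty registry should yield 0.0 rather than a division error:

```python
def per_entry(total, n_entries):
    """Ratio of a registry-wide total to the number of entries."""
    return total / n_entries if n_entries else 0.0


def quality_metrics(totals):
    """Derive the 'X ÷ 3.' ratios for every total except the entry count."""
    n_entries = totals["entries"]
    return {
        f"{name}_per_entry": per_entry(value, n_entries)
        for name, value in totals.items()
        if name != "entries"
    }


# Made-up totals for illustration.
ratios = quality_metrics(
    {"attributes": 6, "operations": 1, "updates": 4, "entries": 2}
)
```

Storing these ratios with a date stamp each day would be enough to draw the growth curves described under D.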
D. Progress visualisation
Note:
All the indicators A. - D. can also be internally (within ELIXIR-EXCELERATE WP1) reported PER PARTNER plus per "the rest of the contributors (i.e. non-EL-EX-WP1)". The only required dependency is to first manually split all registrants into the "outreach and support spheres" per EL-EX-WP1 partner. Some registrants can fall under multiple partners, e.g. all de.NBI ones are supported by DK+NO+FR.
E. Quality metrics (for one entry)