Demo / lightning talk for plankton image data flow #8
Comments
Did a small rendering of k-means clustering of the plankton embeddings, which had visually similar outcomes to the similarity search; this is on the … It's outgrowing a notebook, wondering if … Focus of this is to show naively minimal output to plankton researchers and enlist their help, either in finding flaws or in working out which path is actually useful to them. Should be quite timeboxed: ideally no more than a day, two at most.
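For anyone reading along who hasn't seen the notebook, here is a minimal, dependency-free sketch of the kind of k-means grouping described above. It is not the notebook's code: the `kmeans` and `farthest_first` helpers and the `toy_embeddings` data are hypothetical stand-ins, and the farthest-first initialisation is chosen only to keep the sketch deterministic. In practice a library implementation such as scikit-learn's `KMeans` would be the sensible choice for real embeddings.

```python
import math


def farthest_first(points, k):
    """Deterministic initialisation: start at points[0], then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length float vectors."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data standing in for image embeddings:
# two well-separated groups of 4-dimensional vectors.
toy_embeddings = [
    [0.0, 0.1, 0.0, 0.2], [0.1, 0.0, 0.1, 0.1], [0.2, 0.1, 0.0, 0.0],
    [5.0, 5.1, 4.9, 5.0], [5.1, 5.0, 5.0, 4.9], [4.9, 5.0, 5.1, 5.1],
]
labels, centroids = kmeans(toy_embeddings, k=2)
```

The cluster labels can then be used to lay out image thumbnails group by group, which is roughly what a "small rendering" for researchers would need.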
Note to self that embeddings_app assumes some data that's generated by methods in discoverability. This shows use of UMAP to do dimensionality reduction on embeddings, which is probably worth trying in the notebook to see whether it stops DBSCAN from treating everything as noise.
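A rough sketch of how that check might look, assuming the `umap-learn` and `scikit-learn` packages are installed. The function names and every parameter value here (`eps`, `min_samples`, `n_components`, `min_dist`, `random_state`) are untuned placeholders, not settings taken from the notebook or from embeddings_app.

```python
def noise_fraction(labels):
    """Share of points that DBSCAN marked as noise (label -1)."""
    labels = list(labels)
    if not labels:
        return 0.0
    return sum(1 for lab in labels if lab == -1) / len(labels)


def dbscan_noise_before_and_after_umap(
    embeddings, eps=0.5, min_samples=5, n_components=5
):
    """Return (noise fraction on raw embeddings, noise fraction after UMAP).

    Hypothetical helper: all parameter defaults are assumptions.
    """
    # Imported lazily so noise_fraction() works without these packages.
    import umap  # provided by the umap-learn package
    from sklearn.cluster import DBSCAN

    # DBSCAN directly on the high-dimensional embeddings.
    raw_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embeddings)

    # Reduce dimensionality first, then cluster the reduced vectors.
    reducer = umap.UMAP(n_components=n_components, min_dist=0.0, random_state=42)
    reduced = reducer.fit_transform(embeddings)
    reduced_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(reduced)

    return noise_fraction(raw_labels), noise_fraction(reduced_labels)
```

Distances in the raw and the reduced space are on different scales, so `eps` would likely need tuning separately for each; the single shared value is only for brevity. A noise fraction close to 1.0 on the raw embeddings and noticeably lower after reduction would support the idea above.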
Another note to self that while it's not necessary now, the next visit to this should involve
Updated the issue title to reflect this has grown some extra dimensions! Come back here after some shared discussion and outline what it is we'd like to show
The work in #5 and #6 serves as a proof of concept of minimal-effort approaches to learning from image collections without undertaking model training or costly labelling; but it's at the edge of what's meant to be a deeper investigation of pipelines and workflows that can apply to related projects - most immediately AMI-system. This Discussion on DataLabs computer vision needs for a combination physical sample / imaging field site shows likely demand.
Putting together a short show-and-tell / demo that can be presented to the Environmental Data Science group and the research group is a nice motivator to draw a line under the low-hanging ML parts and shift focus to architecture choices and cross-project common ground.
Of these, 2. needs expanding a bit to become more visually interesting and to probe for areas where the approach is weak. 3. we haven't tried at all; it got lost in the wash between pipeline/workflow #9 on the one hand and experimental model choice #10 on the other, but it should be quick to try (DBSCAN etc.).
See also the section on transfer learning / feature extraction in this workshop paper:
https://aslopubs.onlinelibrary.wiley.com/doi/full/10.1002/lno.12101#lno12101-sec-0025-title