This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Kernelogic - Post Slingshot 2.8 continuation on NEXRAD dataset #594

Closed
kernelogic opened this issue Aug 7, 2022 · 13 comments

@kernelogic

kernelogic commented Aug 7, 2022

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: Fei Yan - Kernelogic
  • Website / Social Media: https://slingshot.kernelogic.ca/ Slack: Fei Yan
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 15 PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 1 PiB
  • On-chain address for first allocation: f1yy7riqoc3vm7jv6nawupnytj4m6sajfuq7kqn6q

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Similar to this LDN comment https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/432#issuecomment-1204902669 from @dkkapur, now that Slingshot 2.8 has ended, I'd like to continue storing the whole dataset to completion under a new LDN, following the same rules as before.

I have participated in every Slingshot phase and am probably the best-performing "small individual client".

I have successfully completed a few LDNs on other datasets, and I have a record showing that I have followed the decentralization rules and have zero self-dealing.

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/60
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/59
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/46
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/297
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/298
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/304

What is the primary source of funding for this project?

Self-funded, BigD exchange.

What other projects/ecosystem stakeholders is this project associated with?

enterprise-sp-wg, BigD exchange.

Use-case details

Describe the data being stored onto Filecoin

Real-time and archival data from the Next Generation Weather Radar (NEXRAD) network.

Where was the data in this dataset sourced from?

https://registry.opendata.aws/noaa-nexrad/

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

The data is primarily compressed binary data. The site below demonstrates how to consume and render the data:
https://nbviewer.org/gist/dopplershift/356f2e14832e9b676207

s3://noaa-nexrad-level2/2021/01/01/TSDF/TSDF20210101_235417_V08
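The sample path above encodes the date, the radar station, and the scan time in the object key. A minimal Python sketch of pulling those fields apart follows. The function name `parse_nexrad_key` and the regular expression are illustrative assumptions based only on this one sample path; other keys in the bucket may use different suffixes and would need a looser pattern.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the single sample key above:
#   YYYY/MM/DD/SSSS/SSSSYYYYMMDD_HHMMSS_Vnn
KEY_RE = re.compile(
    r"^(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<station>[A-Z0-9]{4})/"
    r"(?P=station)(?P<date>\d{8})_(?P<time>\d{6})_(?P<version>V\d{2})$"
)


def parse_nexrad_key(s3_uri):
    """Split an s3:// URI into bucket, station, scan time, and version."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("expected an s3:// URI")
    bucket, _, key = s3_uri[len("s3://"):].partition("/")
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError("key does not match the assumed layout: " + key)
    # The timestamp is kept naive; no time zone is asserted here.
    scan_time = datetime.strptime(
        match["date"] + match["time"], "%Y%m%d%H%M%S"
    )
    return {
        "bucket": bucket,
        "station": match["station"],
        "scan_time": scan_time,
        "version": match["version"],
    }


info = parse_nexrad_key(
    "s3://noaa-nexrad-level2/2021/01/01/TSDF/TSDF20210101_235417_V08"
)
print(info["station"], info["scan_time"].isoformat(), info["version"])
```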

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

AWS open dataset

What is the expected retrieval frequency for this data?

Infrequent. However, all details are available in my dataset browser: https://slingshot.kernelogic.ca/nexrad.html?v=2.8

For how long do you plan to keep this dataset stored on Filecoin?

Between 365 and 520 days.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

All regions.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

I will upload my prepared CAR files to a web server and coordinate with providers to download and propose offline deals.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

Besides the SPs I have worked with previously, I also use BigD exchange to further decentralize the storage.

To name a few from the community that I deal with regularly: PIKNIK, Holon, CabrinaHuang, HarryM, BigBear, j1v, XinAn Xu, WillTechMusing.

From BigD exchange: Mog Li, Devin Chen, DSS Nathanial Marsh, Rabinovitch, Vin K, arockpool Tony

How will you be distributing deals across storage providers?

Evenly across all providers I propose deals to, if they can handle the volume. If a miner is itself a notary, that notary will receive no more than 20% of the total granted DataCap.
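The rule above (an even split, with any notary-operated miner capped at 20% of the total) can be sketched in Python as follows. This is only an illustration: the provider names, the `allocate_evenly` helper, and the choice to redistribute a capped notary's excess evenly among the remaining providers are assumptions, not part of the application.

```python
from fractions import Fraction


def allocate_evenly(total_tib, providers, notaries, notary_cap=Fraction(1, 5)):
    """Split total_tib evenly; notary-run providers get at most notary_cap of it.

    Any amount a capped notary cannot take is re-split evenly among the
    providers that are still uncapped (an assumed policy for this sketch).
    """
    cap = total_tib * notary_cap
    capped = {}
    pool = list(providers)
    remaining = Fraction(total_tib)
    while pool:
        share = remaining / len(pool)
        over = [p for p in pool if p in notaries and share > cap]
        if not over:
            # Nobody exceeds the cap, so the even split stands.
            return {**capped, **{p: share for p in pool}}
        for p in over:
            capped[p] = cap
            pool.remove(p)
            remaining -= cap
    # Only reached if every provider is a capped notary; the rest stays unallocated.
    return capped


# Hypothetical example: 15 PiB = 15360 TiB, four providers, one of them a notary.
plan = allocate_evenly(15360, ["sp-a", "sp-b", "sp-c", "sp-d"], {"sp-a"})
for name, tib in plan.items():
    print(name, float(tib))
```

With five or more providers the even share is already at or below 20%, so the cap only changes the result when the provider list is short.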

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

I have all I need to start making deals.

@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@kernelogic
Author

kernelogic commented Aug 7, 2022

To emphasize my advantages:

  1. A very decentralized, transparent list of SPs.
  2. A usable dataset browser containing file details and how to retrieve them.

@kernelogic
Author

Updating this request to potentially utilize the new proposal #594

This dataset is 2 PB+. I already have one 5 PB allocation, so in this application I am requesting 15 PB so that I can store approximately 10 replicas in total.
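The replica estimate is simple division of the combined DataCap by the dataset size. A small Python check follows, treating "2 PB+" as exactly 2 PB (an assumption; a larger dataset gives fewer replicas). The helper name `approx_replicas` is illustrative.

```python
def approx_replicas(datacap_grants_pb, dataset_size_pb):
    """Total DataCap divided by dataset size gives the replica count."""
    return sum(datacap_grants_pb) / dataset_size_pb


# 5 PB already held plus the 15 PB requested here, over a 2 PB dataset.
replicas = approx_replicas([5, 15], 2)
print(replicas)  # 10.0
```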

@Sunnyiscoming
Collaborator

There are some existing issues related to noaa-nexrad.
As you said, this dataset is 2 PB+, but more than 20 PB of DataCap has already been requested in the following large dataset applications.
Slingshot v2 has ended, so there should be no more Slingshot-related LDNs.
So I think this issue should probably be closed.

#483
#432
#398
#340
#312
#80

@kernelogic
Author

@Sunnyiscoming I'm creating this to see if I can finish what I prepared. Up to the RKH to decide.

@raghavrmadya
Collaborator

Hi @kernelogic, we expect the outcome of #594 will take some time, so it's best to open 3 separate applications for the 15 PiB request.

@kernelogic
Author

kernelogic commented Sep 23, 2022

Sorry @raghavrmadya I just saw your reply. Closing this to open new split ones.
#1004
#1005
#1006
