This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] <RongYIn Open Data Project 1> #1579

Closed
1 of 2 tasks
datalove2 opened this issue Feb 2, 2023 · 45 comments

@datalove2

Data Owner Name

RongYIn Open Data Project 1

Data Owner Country/Region

China

Data Owner Industry

IT & Technology Services

Website

https://www.qcc.com/firm/3380acbb3101bd58394d1ba4be51e877.html

Social Media

https://www.qcc.com/firm/3380acbb3101bd58394d1ba4be51e877.html

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

Custom multisig

  • Use Custom Multisig

Identifier

No response

Share a brief history of your project and organization

RongYin was established in 2019 in Hong Kong. We have been provided with a total storage capacity of 100PiB. We now plan to onboard data that is useful to humanity and to the network. <RongYin Open Data Project> is estimated to onboard 10PiB of storage capacity to the network, which corresponds to about 1.5P of raw data.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

We are going to onboard open agriculture data from AWS.
The agriculture category covers 86 matching datasets, about 5PiB in total.
These include USGS Landsat, NOAA, ESA WorldCover, Digital Earth Africa, Rice Genomes Project, SILO Climate data, and so on.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, singularity, graphsplit

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/deafrica-geomad/
https://registry.opendata.aws/deafrica-wofs/
https://registry.opendata.aws/afsis/
https://registry.opendata.aws/3kricegenome/
https://registry.opendata.aws/isdasoil/
https://registry.opendata.aws/noaa-goes

Confirm that this is a public dataset that can be retrieved by anyone on the Network

  • I confirm

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, South America, Europe, Australia (continent)

How will you be distributing your data to storage providers

HTTP or FTP server, IPFS, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Big data exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

Still in communication with storage providers

How do you plan to make deals to your storage providers

Boost client, Lotus client, Singularity

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@lvschouwen

Yesterday this same application was filed under #1577 and #1578 by GitHub user ledecaevi.
Both applications have since been deleted.
@datalove2, can you please explain what the relation is here?

image

@datalove2
Author

datalove2 commented Feb 3, 2023 via email

@Sunnyiscoming
Collaborator

Data samples you provided are mentioned in the following applications. How many duplicates are there?
#1123
#1130
#3
#46
#136
#1112
#1488
#1489
#1490
#1491

@datalove2
Author

datalove2 commented Feb 6, 2023 via email

@cryptowhizzard

Dear applicant,

Thank you for applying for DataCap. As a Filecoin FIL+ notary, I am screening your application and conducting due diligence.

Can you show us visible proof of the size of your data and the storage systems you have there?

As a last question, I would like you to fill out this form so that we have the information needed to make an educated decision on whether to support your LDN request.

Thanks!

@simonkim0515 simonkim0515 self-assigned this Feb 13, 2023
@simonkim0515
Collaborator

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

@large-datacap-requests

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

DataCap allocation requested

250TiB

Id

e01dc661-fcec-415b-b14b-58818b8665d0
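
For context on the figures above, here is a minimal arithmetic sketch in Python. It assumes binary units (1 PiB = 1024 TiB) and a hypothetical tranche rule in which an allocation is the smaller of a share of the total request and a multiple of the weekly rate. The 5% and 50% parameters are assumptions chosen for illustration, not values stated in this thread.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def weeks_to_consume(total_pib: float, weekly_tib: float) -> float:
    """Weeks needed to use the full request at the stated weekly rate."""
    return total_pib * TIB_PER_PIB / weekly_tib


def tranche_tib(total_pib: float, weekly_tib: float,
                total_fraction: float, weekly_multiple: float) -> float:
    """Hypothetical rule: the smaller of a share of the total request
    and a multiple of the weekly rate (both parameters are assumptions)."""
    share_of_total = total_pib * TIB_PER_PIB * total_fraction
    multiple_of_weekly = weekly_tib * weekly_multiple
    return min(share_of_total, multiple_of_weekly)


# Figures from the request: 5PiB total, 500TiB per week.
weeks = weeks_to_consume(5, 500)
first = tranche_tib(5, 500, total_fraction=0.05, weekly_multiple=0.5)
print(f"{weeks:.2f} weeks at the weekly rate, first tranche {first:.0f} TiB")
```

Under these assumptions the first tranche comes out at 250 TiB, which agrees with the allocation requested above. Whether later tranches follow the same pattern is not confirmed anywhere in this thread.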

@herrehesse

Dear Filecoin+ GitHub applicant,

We have noticed that some of you are submitting merged DataCap requests for datasets that are already (partly) on the chain. While we appreciate your enthusiasm to contribute to the Filecoin network, we want to remind you that this behaviour may not be beneficial to the network in the long run. In fact, this behaviour has been questioned and discussed in issue #832 on the Filecoin notary-governance GitHub repository.

We encourage you to review the discussions in issue #832. It's important to ensure that your datacap requests are valid, necessary, and add value to the network. By doing so, you can help to maintain the integrity and sustainability of the Filecoin network.

You can find the link to issue #832 here: filecoin-project/notary-governance#832

Thank you for your understanding and cooperation.

@kernelogic

@datalove2 Although there is currently no rule against merged datasets, how do you plan to make the data available for the public to use? Do you have your own website that indexes these different datasets?

@datalove2
Author

datalove2 commented Feb 28, 2023 via email

@kernelogic

For example, if I want to retrieve 3kricegenome, how do I know which CIDs belong to this dataset? Will you provide a list?

@datalove2
Author

datalove2 commented Feb 28, 2023 via email

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceael6yjkgibwgsm2hwlz3gnsvhtsezype7ylslfjlkcacx5a7bcsw

Address

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

Datacap Allocated

250.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

e01dc661-fcec-415b-b14b-58818b8665d0

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceael6yjkgibwgsm2hwlz3gnsvhtsezype7ylslfjlkcacx5a7bcsw

@sxxfuture-official

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

  • Overall Graphsync retrieval success rate: 0.01%
  • Overall HTTP retrieval success rate: 0.00%
  • Overall Bitswap retrieval success rate: 0.00%

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 89.90% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the CID Checker report.
Click here to view the Retrieval report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
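
As an illustration of how the retrieval warning in this summary could be checked programmatically, here is a minimal Python sketch. It parses the three "Overall ... retrieval success rate" lines exactly as they are worded in the report and tests them against the 1% level named in the warning. The function names and the regular expression are hypothetical and are not part of the checker bot.

```python
import re

# The three summary lines, copied from the report above.
REPORT = """\
Overall Graphsync retrieval success rate: 0.01%
Overall HTTP retrieval success rate: 0.00%
Overall Bitswap retrieval success rate: 0.00%
"""

# Captures the protocol name and the percentage value on each line.
RATE_LINE = re.compile(r"Overall (\w+) retrieval success rate: ([\d.]+)%")


def parse_rates(text: str) -> dict[str, float]:
    """Map each protocol name to its success rate in percent."""
    return {name: float(pct) for name, pct in RATE_LINE.findall(text)}


def all_below(rates: dict[str, float], threshold_pct: float = 1.0) -> bool:
    """True when at least one rate was parsed and every rate is under the threshold."""
    return bool(rates) and all(rate < threshold_pct for rate in rates.values())


rates = parse_rates(REPORT)
print(rates, all_below(rates))
```

With the values reported here, all three rates fall under 1%, so the check agrees with the warning shown in the summary.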

@sxxfuture-official

@datalove2
Although sealing speed varies between SPs, the current data distribution problem needs to be resolved as soon as possible.
Some of the data can be retrieved, but the proportion of retrievable data needs to increase.
image

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecnjuzoovihnkkq7yauyfm5ll7tt7xc67zazcojnr4p5axwbg3x7o

Address

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

Datacap Allocated

1.95PiB

Signer Address

f1foiomqlmoshpuxm6aie4xysffqezkjnokgwcecq

Id

1348b904-e877-4545-a29f-b477b4b6cb58

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecnjuzoovihnkkq7yauyfm5ll7tt7xc67zazcojnr4p5axwbg3x7o

@zcfil

zcfil commented Jun 9, 2023

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

  • Overall Graphsync retrieval success rate: 0.01%
  • Overall HTTP retrieval success rate: 0.00%
  • Overall Bitswap retrieval success rate: 0.00%

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 89.90% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the CID Checker report.
Click here to view the Retrieval report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

@mikezli

mikezli commented Jun 9, 2023

lotus client retrieve --provider f01854080 --pieceCid baga6ea4seaqggenwkzclg2dtnp6jijwicbqi4llt4aslo75kuhstoizfh3j6mcq bafykbzacebmkzuqtlib52xe5mvojlqzmb3ovjjztjncrzqez33qqutpuntwq4 ~/
ca6e22a29657696312b707c39ef5b01
The customer contacted me and explained the reasons for the CID sharing and the retrieval results. I am willing to support the customer in this round for the time being, and I will keep monitoring the application.
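
For anyone who wants to repeat a spot check like the one quoted in this comment against other providers or CIDs, here is a minimal Python sketch that assembles the same command line. It uses only the flags and argument order visible in the command above. The helper name `build_retrieve_command` is hypothetical, and the sketch only builds the argument list; it does not run `lotus`.

```python
def build_retrieve_command(provider: str, piece_cid: str,
                           data_cid: str, output_path: str) -> list[str]:
    """Assemble the argv for the retrieval command as it appears in the comment above."""
    return [
        "lotus", "client", "retrieve",
        "--provider", provider,
        "--pieceCid", piece_cid,
        data_cid,       # positional argument: the data CID to retrieve
        output_path,    # positional argument: where to write the result
    ]


# Values copied from the command in the comment above.
argv = build_retrieve_command(
    provider="f01854080",
    piece_cid="baga6ea4seaqggenwkzclg2dtnp6jijwicbqi4llt4aslo75kuhstoizfh3j6mcq",
    data_cid="bafykbzacebmkzuqtlib52xe5mvojlqzmb3ovjjztjncrzqez33qqutpuntwq4",
    output_path="~/",
)
print(" ".join(argv))
```

If the list were passed to `subprocess.run` rather than pasted into a shell, the `~/` path would not be expanded automatically, so it would likely need `os.path.expanduser` first.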

mikezli commented Jun 9, 2023

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedv3ap5kvtpkb62icr4q7vswvfljeyc66serjksqhk6xtwdmmowpw

Address

f1es3jnh7ivhvc32s23mro7wuktehjkxlc7yjac6a

Datacap Allocated

1.95PiB

Signer Address

f1dnb3uz7sylxk6emti3ififcvu3nlufnnsjui6ea

Id

1348b904-e877-4545-a29f-b477b4b6cb58

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedv3ap5kvtpkb62icr4q7vswvfljeyc66serjksqhk6xtwdmmowpw

Client f02048944 does not follow the datacap usage rules. More info here.
This application has been failing the requirements for 7 days.
Please take appropriate action to fix the following DataCap usage problems.

Criteria | Threshold | Reason
CID Checker score | > 25% | The client has a CID checker score of 11%. This should be greater than 25%. To find out more about the CID checker score, please look at this issue: filecoin-project/notary-governance#986
