This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Baikal Seal Storage Technology #1212

Closed
salstorage opened this issue Nov 7, 2022 · 57 comments
Assignees
Labels
efil+ designates that an application is going through additional upfront ID checks granted kyc verified User has passed KYC check Stale validated verified client

Comments

@salstorage

salstorage commented Nov 7, 2022

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: Seal Storage Technology
  • Website / Social Media: www.sealstorage.io
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 2PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 400TiB
  • On-chain address for first allocation: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida
  • Type: Custom Notary
  • Identifier: E-fil

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Because our previous application, LDN #325, was deprecated due to inactivity, FF has requested that we resubmit this LDN as a new application.

Our customer is a Dark Matter Group within UC Berkeley, and Seal is involved in a project with them to store the outputs of their scientific experiments. They would like to upload data to a distributed platform so that other globally based researchers can access it. We kicked off the project in early March 2022, with ingestion estimated to begin in mid-April 2022 via portable disk unit. The customer’s data will not be encrypted, but access controls will be implemented. They are looking for storage for at least the next three years.

Seal is a carbon-neutral, decentralized cloud storage provider. Seal's technical leadership brings decades of experience from traditional enterprise storage companies including Seagate and Oracle, as well as world-class experience on the Filecoin Network. Today, Seal operates data centers across the US and Canada with enterprise-grade infrastructure and data policies.

What is the primary source of funding for this project?

Seal is funding the project.

What other projects/ecosystem stakeholders is this project associated with?

None at this time.

Use-case details

Describe the data being stored onto Filecoin

The data sets are the original outputs of scientific experiments.

Where was the data in this dataset sourced from?

The data sets have been created by dark matter-related experiments and instrumentation.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes. A link will be added shortly.

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

The current data set requires permission-based access.

A goal of the pilot project is for Seal to work with our customer to provide a permission-based model for accessing data. Staged data for access will be supported using the open-source tools IPFS and SeaweedFS.

What is the expected retrieval frequency for this data?

Archival is the primary use. The data will be accessed by external collaborators and researchers.

For how long do you plan to keep this dataset stored on Filecoin?

Three years, at least.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We plan to store five copies of the 400 TiB data set [total of 2 PiB] in five different cities, in three different countries and across two continents.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Seal Storage has dual 100 Gbps internet connections. SPs will download data from Seal. Offline data transfer may be possible.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are currently discussing capabilities and performing due diligence with several SPs and have chosen three SPs for this project. We chose these based on their current storage capacity, compute capabilities, enterprise-grade DCs and bandwidth.

How will you be distributing deals across storage providers?

Holon, 400 TiB
ElioVP, 400 TiB
PiKNiK, 400 TiB
Seal, 800 TiB
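
As a quick sanity check on the figures above, the per-provider amounts can be summed and compared with the five-copy plan. This is only an illustrative sketch: the variable names are made up here, and the conversion assumes binary units (1 PiB = 1024 TiB).

```python
# Planned allocation per storage provider in TiB, copied from the list above.
planned_tib = {"Holon": 400, "ElioVP": 400, "PiKNiK": 400, "Seal": 800}

copies = 5          # five copies of the data set
dataset_tib = 400   # size of one copy in TiB

total_tib = sum(planned_tib.values())

# The per-provider plan should add up to copies * dataset size.
assert total_tib == copies * dataset_tib

# Convert to PiB, assuming 1 PiB = 1024 TiB.
total_pib = total_tib / 1024
print(f"{total_tib} TiB = {total_pib:.2f} PiB")
```

Under that assumption the plan totals 2000 TiB, slightly under the 2 PiB requested.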

Seal will also keep a hot copy (400 TB) available for the customer to access.

The data ingestion will follow this approximate schedule:

55 TB right away
by the end of year 1: 5 TB
by end of year 2: 50 TB
by end of year 3: 290 TB
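
The schedule above can be read as increments that add up to the full data set. The sketch below assumes that reading (each line is an additional amount, not a running total); the names are illustrative only.

```python
# Ingestion schedule in TB, copied from the list above.
# Assumption: each entry is an increment, not a cumulative total.
schedule_tb = [
    ("right away", 55),
    ("end of year 1", 5),
    ("end of year 2", 50),
    ("end of year 3", 290),
]

running = 0
cumulative = []
for milestone, added in schedule_tb:
    running += added
    cumulative.append((milestone, running))
    print(f"{milestone}: +{added} TB, {running} TB ingested so far")
```

Read this way, the increments reach 400 TB by the end of year 3, which matches the size of one copy of the data set.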

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have the resources/funding to begin making deals once we receive DataCap. 

We currently have the support we need.
@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and get back to you soon.

@salstorage
Author

As requested by @dkkapur, @galen-mcandrew, and @simonkim0515, the original LDN #325 has been deprecated due to inactivity.
The delay on #325 was caused by an internal 'deal API application' that Seal Storage Technology is currently developing for this project, to be shared with the other SPs assigned to it.
The project is ready; we had already sealed the 50 TiB DC allocation from application #325.
Seal Storage Technology has been asked to resubmit the LDN as a new application.

@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and get back to you soon.

@salstorage
Author

Paging @raghavrmadya
The DC allocation request has been approved.
However, checking DC on wallet ID f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida shows status: unverified.
How do we get this resolved?

[Image: Screen Shot 2022-11-09 at 3 38 35 PM]

@galen-mcandrew
Collaborator

Seeing exit code 16 here: https://filfox.info/en/message/bafy2bzacea7konkezsnvvqezyu5mtbfgwfxbl7khr6uiamcxd4ilamtx6ztcu

I think that means the notary (in this case the v3 msig) does not have enough DataCap available to fill the proposal, even though the proposal was correctly approved by a second signature. The RKH are working to approve more DataCap to the v3 notary. Until that happens, proposals and approvals over the remaining notary balance will fail, and after the v3 gets more DataCap it will require two new notary messages (there is currently no way to point at the messages above and say "please accept those now").
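
The failure mode described above can be illustrated with a toy model. This is a sketch only: the `Notary` class, its methods, and all amounts are hypothetical and do not reflect the actual on-chain actor or the real balances involved.

```python
class InsufficientDataCap(Exception):
    """Raised when a notary cannot cover a requested allocation."""


class Notary:
    """Toy model of a notary's DataCap balance (hypothetical, not the on-chain actor)."""

    def __init__(self, balance_tib: float) -> None:
        self.balance_tib = balance_tib

    def allocate(self, amount_tib: float) -> None:
        # Even a correctly approved proposal fails if it exceeds the remaining balance.
        if amount_tib > self.balance_tib:
            raise InsufficientDataCap(
                f"requested {amount_tib} TiB, only {self.balance_tib} TiB left"
            )
        self.balance_tib -= amount_tib

    def top_up(self, amount_tib: float) -> None:
        # Models the root key holders granting the notary more DataCap.
        self.balance_tib += amount_tib


notary = Notary(balance_tib=10)   # hypothetical remaining balance
try:
    notary.allocate(50)           # hypothetical request larger than the balance
    first_attempt_ok = True
except InsufficientDataCap:
    first_attempt_ok = False

notary.top_up(100)                # the notary receives more DataCap
notary.allocate(50)               # a fresh message is needed; the failed one is not replayed
```

The point of the sketch is the ordering: the failed attempt leaves no pending state, so the allocation only goes through when it is submitted again after the top-up.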

simonkim0515 self-assigned this Nov 10, 2022
@kevzak
Collaborator

kevzak commented Jan 23, 2023

Could you explain why these two nodes cannot be connected? f01923554 and f01886710

@salstorage can you reply to @UnionLabs2020?

@salstorage
Author

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report [1]

  • Organization: Seal Storage Technology
  • Client: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida

Approvers

1 cryptowhizzard
1 Fenbushi-Filecoin
1 flyworker

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • A storage provider should not exceed 30% of total DataCap.
  • A storage provider should not store more than 20% duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ f01886710 has sealed 35.80% of total datacap.

⚠️ f01886710 has unknown IP location.

⚠️ f01923554 has unknown IP location.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f01886710 | Unknown (Unknown) | 33.25 TiB | 35.80% | 32.53 TiB | 2.16% |
| f01154023 | Melbourne, Victoria, AU (Anycast Global Backbone) | 3.23 TiB | 3.48% | 3.23 TiB | 0.00% |
| f01345523 | Antwerpen, Flanders, BE (Cogent Communications) | 23.21 TiB | 24.99% | 23.21 TiB | 0.00% |
| f01392893 | Amsterdam, North Holland, NL (Fusix Networks B.V.) | 22.78 TiB | 24.52% | 22.78 TiB | 0.00% |
| f01873432 | Las Vegas, Nevada, US (PiKNiK & Company Inc.) | 9.30 TiB | 10.01% | 9.30 TiB | 0.00% |
| f01923554 | Unknown (Unknown) | 1.11 TiB | 1.19% | 1.11 TiB | 0.00% |
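
The 30% and 20% restrictions listed above can be re-checked against the table with a few lines of code. This is a rough sketch, not the checker's own logic: the duplicate ratio is assumed to be (sealed - unique) / sealed, and percentages recomputed from the rounded TiB values may differ from the report in the last digit.

```python
# (provider, total deals sealed in TiB, unique data in TiB), copied from the table above.
providers = [
    ("f01886710", 33.25, 32.53),
    ("f01154023", 3.23, 3.23),
    ("f01345523", 23.21, 23.21),
    ("f01392893", 22.78, 22.78),
    ("f01873432", 9.30, 9.30),
    ("f01923554", 1.11, 1.11),
]

MAX_SHARE = 0.30      # a provider should not exceed 30% of total DataCap
MAX_DUPLICATE = 0.20  # a provider should not store more than 20% duplicate data

total = sum(sealed for _, sealed, _ in providers)

# Providers whose share of all sealed data is above the 30% limit.
over_share = [p for p, sealed, _ in providers if sealed / total > MAX_SHARE]

# Providers whose duplicate ratio (assumed definition) is above the 20% limit.
over_duplicate = [
    p for p, sealed, unique in providers
    if (sealed - unique) / sealed > MAX_DUPLICATE
]

print("over 30% share:", over_share)
print("over 20% duplicates:", over_duplicate)
```

With these inputs only f01886710 is flagged for its share, which agrees with the warning in the report, and no provider exceeds the duplicate limit.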

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

  • No more than 30% of unique data should be stored with fewer than 4 providers.

⚠️ 99.73% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 8.59 TiB | 8.62 TiB | 1 | 9.28% |
| 30.74 TiB | 61.88 TiB | 2 | 66.63% |
| 7.28 TiB | 22.13 TiB | 3 | 23.82% |
| 64.00 GiB | 256.00 GiB | 4 | 0.27% |
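
The 99.73% figure in the warning can be reproduced from the table by adding up the deal percentages of every row with fewer than four providers. The sketch below assumes the warning is computed on deal percentage, as its wording suggests; the names are illustrative.

```python
# (number of providers, deal percentage), copied from the table above.
replication = [(1, 9.28), (2, 66.63), (3, 23.82), (4, 0.27)]

MIN_PROVIDERS = 4            # data should be replicated across at least 4 providers
MAX_UNDER_REPLICATED = 30.0  # percent allowed below that replication level

# Share of deals whose data sits with fewer than MIN_PROVIDERS providers.
under = round(sum(pct for n, pct in replication if n < MIN_PROVIDERS), 2)
flagged = under > MAX_UNDER_REPLICATED

print(f"{under}% of deals are under-replicated; flagged: {flagged}")
```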

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
|---|---|---|---|---|
| f1usscfxtogr5v4jmi32uzkckeql2mgvun72q37ga | Seal Storage Technology | 31.32 TiB | 495 | 1 dannyob, 1 TimWilliams00 |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@salstorage
Author

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report [1]

  • Organization: Seal Storage Technology
  • Client: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida

Approvers

1 cryptowhizzard
1 Fenbushi-Filecoin
1 flyworker

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • A storage provider should not exceed 30% of total DataCap.
  • A storage provider should not store more than 20% duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ f01886710 has sealed 35.80% of total datacap.

⚠️ f01923554 has unknown IP location.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f01154023 | Melbourne, Victoria, AU (Anycast Global Backbone) | 3.23 TiB | 3.48% | 3.23 TiB | 0.00% |
| f01345523 | Antwerpen, Flanders, BE (Cogent Communications) | 23.21 TiB | 24.99% | 23.21 TiB | 0.00% |
| f01392893 | Amsterdam, North Holland, NL (Fusix Networks B.V.) | 22.78 TiB | 24.52% | 22.78 TiB | 0.00% |
| f01886710 | Las Vegas, Nevada, US (GTT Communications Inc.) | 33.25 TiB | 35.80% | 32.53 TiB | 2.16% |
| f01873432 | Las Vegas, Nevada, US (PiKNiK & Company Inc.) | 9.30 TiB | 10.01% | 9.30 TiB | 0.00% |
| f01923554 | Unknown (Unknown) | 1.11 TiB | 1.19% | 1.11 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

  • No more than 30% of unique data should be stored with fewer than 4 providers.

⚠️ 99.73% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 8.59 TiB | 8.62 TiB | 1 | 9.28% |
| 30.74 TiB | 61.88 TiB | 2 | 66.63% |
| 7.28 TiB | 22.13 TiB | 3 | 23.82% |
| 64.00 GiB | 256.00 GiB | 4 | 0.27% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
|---|---|---|---|---|
| f1usscfxtogr5v4jmi32uzkckeql2mgvun72q37ga | Seal Storage Technology | 31.32 TiB | 495 | 1 dannyob, 1 TimWilliams00 |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@salstorage
Author

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report [1]

  • Organization: Seal Storage Technology
  • Client: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida

Approvers

1 cryptowhizzard
1 Fenbushi-Filecoin
1 flyworker

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • A storage provider should not exceed 30% of total DataCap.
  • A storage provider should not store more than 20% duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ f01886710 has sealed 35.80% of total datacap.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f01154023 | Melbourne, Victoria, AU (Anycast Global Backbone) | 3.23 TiB | 3.48% | 3.23 TiB | 0.00% |
| f01345523 | Antwerpen, Flanders, BE (Cogent Communications) | 23.21 TiB | 24.99% | 23.21 TiB | 0.00% |
| f01392893 | Amsterdam, North Holland, NL (Fusix Networks B.V.) | 22.78 TiB | 24.52% | 22.78 TiB | 0.00% |
| f01886710 | Las Vegas, Nevada, US (GTT Communications Inc.) | 33.25 TiB | 35.80% | 32.53 TiB | 2.16% |
| f01923554 | Las Vegas, Nevada, US (GTT Communications Inc.) | 1.11 TiB | 1.19% | 1.11 TiB | 0.00% |
| f01873432 | Las Vegas, Nevada, US (PiKNiK & Company Inc.) | 9.30 TiB | 10.01% | 9.30 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

  • No more than 30% of unique data should be stored with fewer than 4 providers.

⚠️ 99.73% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 8.59 TiB | 8.62 TiB | 1 | 9.28% |
| 30.74 TiB | 61.88 TiB | 2 | 66.63% |
| 7.28 TiB | 22.13 TiB | 3 | 23.82% |
| 64.00 GiB | 256.00 GiB | 4 | 0.27% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
|---|---|---|---|---|
| f1usscfxtogr5v4jmi32uzkckeql2mgvun72q37ga | Seal Storage Technology | 31.32 TiB | 495 | 1 dannyob, 1 TimWilliams00 |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@salstorage
Author

Could you explain why these two nodes cannot be connected? f01923554 and f01886710

@salstorage can you reply to @UnionLabs2020?

@UnionLabs2020 @Kevin-FF-USA

We have resolved the location IDs. We believe it was a bug in Lotus; we had to run actor set-addrs on the miner to fix it.
Can we please get this signed?
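
For readers unfamiliar with the fix mentioned above, `lotus-miner actor set-addrs` takes the miner's public multiaddr as its argument. The sketch below only builds and prints such a command; it does not run it, and the IP address and port are placeholders rather than values from this thread. The helper name is hypothetical.

```python
import ipaddress
import shlex


def set_addrs_command(ip: str, port: int) -> list[str]:
    """Build a `lotus-miner actor set-addrs` command for a public IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    multiaddr = f"/ip4/{ip}/tcp/{port}"
    return ["lotus-miner", "actor", "set-addrs", multiaddr]


# Placeholder values: 203.0.113.10 is a documentation-only address.
cmd = set_addrs_command("203.0.113.10", 24001)
print(shlex.join(cmd))
```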

@UnionLabs2020

OK, I hope to see better results in your next round.


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network: bafy2bzacecmb3muxy6vxuicaiejotdtbxwfnzdk7jtteute7biho7qnwba2f4
Address: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida
Datacap Allocated: 819.19TiB
Signer Address: f17xdri3wunqgld7dm23e4f3eqsntjakwc47xjo6i
Id: b7465293-b348-4f58-b269-bb64a7f9cef1

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecmb3muxy6vxuicaiejotdtbxwfnzdk7jtteute7biho7qnwba2f4

@filplus-checker-app

DataCap and CID Checker Report [1]

  • Organization: Seal Storage Technology
  • Client: f1rovtu5m3gq7q5vu4kfh4oiiif643gqq7voi4ida

Approvers

1 cryptowhizzard
1 Fenbushi-Filecoin
1 flyworker
1 UnionLabs2020

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

  • A storage provider should not exceed 30% of total DataCap.
  • A storage provider should not store more than 20% duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ f01886710 has sealed 36.06% of total datacap.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f01154023 | Melbourne, Victoria, AU (Anycast Global Backbone) | 3.23 TiB | 3.47% | 3.23 TiB | 0.00% |
| f01886710 | Arcadia, California, US (Cogent Communications) | 33.64 TiB | 36.06% | 32.92 TiB | 2.14% |
| f01345523 | Antwerpen, Flanders, BE (Cogent Communications) | 23.21 TiB | 24.87% | 23.21 TiB | 0.00% |
| f01392893 | Amsterdam, North Holland, NL (Fusix Networks B.V.) | 22.82 TiB | 24.46% | 22.82 TiB | 0.00% |
| f01923554 | Las Vegas, Nevada, US (GTT Communications Inc.) | 1.11 TiB | 1.19% | 1.11 TiB | 0.00% |
| f01873432 | Las Vegas, Nevada, US (PiKNiK & Company Inc.) | 9.30 TiB | 9.96% | 9.30 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

  • No more than 30% of unique data should be stored with fewer than 4 providers.

⚠️ 99.73% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 8.24 TiB | 8.27 TiB | 1 | 8.87% |
| 31.13 TiB | 62.66 TiB | 2 | 67.15% |
| 7.28 TiB | 22.13 TiB | 3 | 23.71% |
| 64.00 GiB | 256.00 GiB | 4 | 0.27% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
|---|---|---|---|---|
| f1usscfxtogr5v4jmi32uzkckeql2mgvun72q37ga | Seal Storage Technology | 31.32 TiB | 495 | 1 dannyob, 1 TimWilliams00 |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

kevzak added the efil+ label (designates that an application is going through additional upfront ID checks) Apr 17, 2023
@kevzak
Collaborator

kevzak commented Jul 3, 2023

Hello @salstorage, we have a new KYC ID check available on filplus.storage. Please complete it when you have a chance; it is a requirement for E-Fil+ that adds an additional layer of trust and verifies this GitHub account. See details: LINK

data-programs added the kyc verified label (User has passed KYC check) Jul 19, 2023
@data-programs
Collaborator

KYC

This user’s identity has been verified through filplus.storage

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with the Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions bot added the Stale label Jul 30, 2023
@github-actions

github-actions bot commented Aug 4, 2023

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Aug 4, 2023

14 participants