Proposal: Project Antarctic - 10 PiB Data Set / 50 PiB DataCap #489
Comments
Thank you for this proposal. The rules of Fil+ are fair and clear at the moment: https://github.com/filecoin-project/filecoin-plus-large-datasets/ An excerpt from the rules: "the dataset should be public, open, and mission aligned with Filecoin and Filecoin Plus. This also means that the data should be accessible to anyone in the network, without requiring any special permissions or access requirement". Given this, I think it would make more sense to apply for regular DataCap, combined with a grant proposal to PL, rather than submit an LDN application. Storing encrypted data on an LDN is not a path forward, in my opinion.
I'd note that we're in a strange place with this, because there's no explicit prohibition on encryption in the rules. In this case, we can seek ways to ascertain that the dataset (separate from the encryption) is public, open and mission-aligned. The data itself would be accessible. I feel like I'm lawyering a bit over data and dataset here, but I think that's because the rules weren't devised with this kind of scenario in mind. Given that our scoping rules are prefaced with the proviso that this is "still an evolving conversation, so the scope is subject to change", I think we can entertain whether this proposal fits the spirit of what we're trying to achieve, versus the methods we use to achieve it. I think it does.
If we look at the storage terms and conditions of Google, AWS, and Microsoft, all of them offer solutions for individuals and enterprises. In addition to public data, a considerable part of what they store is non-public data belonging to individuals and enterprises. I hope that this case can break through the Filecoin network's limitation to storing public data and open a new path for storing enterprise data on the network.
From what I understand, there isn't a need for companies, enterprises, or organizations to apply for DataCap to store their data. Nobody is holding a gun to their heads and saying that they must apply for DataCap. They can simply use Filecoin as it is and still store their data without DataCap.
As discussed in the Notary governance call today, there is a need for an enterprise/private program, of which this proposal has some elements. 80% of the world's data is not public, and we are seeing demand from businesses with encrypted or non-public data that would like to onboard and that are KYC'd and trusted. The question is: can the due diligence to establish trust be performed on the client rather than on the data (as it is in this case)?
For me, I still don't understand why we need a 10 PiB onboarding POC. Also, some of the storage providers do not even have experience onboarding 1 PiB of deals; shouldn't we let the SPs with more experience do the test first? As I mentioned before in the governance call and at SXSW, I am still not convinced by this big encrypted data case. I would also like to know the timeline and process if some notaries want to push it through anyway.
Yep, those are reasonable questions, Charles, for @scharfstein
Really looking forward to your proposal @MegTei
Trying to make sure the 5 notaries are all signed off on these, since we'll be going outside the normal tooling process to generate these LDNs with only those signers. Can I get a thumbs-up emoji from the five notaries confirming you are ready to support these and that the following address is accurate for putting on these 10 LDNs?
I'd note though that I'd like to go forward with a bit of KYC with Seal, @scharfstein and the customer just to confirm the identity and plan to my satisfaction before I sign off.
Just as an update to this -- I've spoken with @scharfstein and obtained more details about the customer, as well as the (confidential) scope of work, which matches this public request. I'm waiting on the answer to a couple of questions for the customer, which I anticipate receiving at the beginning of next week.
Confirming under NDA I have discussed the project and client with Alex and Gregory @ Seal Storage, citing the statement of work between the client and Seal and the comms trail around the deal and have verified to the best of my ability. I have requested a checkpoint once the client is happy with the proof of concept later in 2022.
Hey folks, so my questions were answered, and I'm ready to go ahead on this.
Hey @dkkapur, @galen-mcandrew, is there an explanation anywhere as to why we're doing this in 10 separate dataset applications? I'd like to understand this better.
Is it necessary to store non-public data using Fil+ DataCap? Even among LDN applications whose stated goal was storing public data, we have found cases where the allocation was filled with meaningless dummy data to earn the 10x reward. If non-public data is stored under an LDN, there will be more cases of meaningless dummy data being stored through collusion between SPs and customers. I think it is reasonable to pay to store non-public data. That way, there is no incentive to fill it with dummy data.
@scharfstein wondering if the team could do a check-in report at the next governance call? The DataCap allocations went out to clients around the end of April, approximately 10 weeks ago, right? As more and more enterprise and encrypted datasets are getting proposed, it seems important to keep reporting on the status and success of these proof-of-concept projects.
We would be happy to report out. I will attend and give an update on 12 Jul at the 4pm Pacific call.
Hi Gregory, @scharfstein, how is your project going? It's been a while since I've seen Sal ask for multisig. As the supporting notary for Project Antarctic, I did not verify the data samples after the NDA was signed. Now I am asking for a review of the data sample. I don't know when is convenient for you, but you can contact me on Slack to set up an appointment. Thanks!
Eric - thanks for checking in. We presented an update on the July 12 Governance call; you can find the recording here: https://youtu.be/yqPc-0Wd75M?t=5023
In discussions with @dkkapur, it was recommended that we increase the weekly allocation. The original request was 100 TiB; we have increased this to 1 PiB. This was done for all ten LDN applications.
Curious, @swatchliu: did you get the data?
Frankly, I haven't seen the data, just the correspondence from the client and the reason why Seal has this client.
For the purposes of community transparency, Seal Storage Technology would like to present a collaborative plan for onboarding a total of 50 PiB to the Filecoin Network. This will represent 5 full replicas of a 10 PiB data set. We would like to present details here about the project and plan to attend the Tuesday, April 5, 2022 Governance Call to discuss with the community.
Project Description
The project is a 10 PiB project to prove out the value propositions of decentralized data storage. A set of 10 DataCap Applications has been submitted by Seal Storage on behalf of our Customer, who wishes to remain confidential for the duration of this project. Our Customer is a world-class scientific research organization and the data sets are outputs of scientific experiments.
The Customer has been working with large data sets (PiBs) for decades and is interested in pursuing the Filecoin Network as a solution to some of their exabyte-scale storage problems. This is why starting with a 10 PiB project makes sense to them as it represents a small portion of their complete archive. Due to the perceived risk associated with cryptocurrency projects, our Customer feels it is best to delay public announcement of our collaboration until we have successfully completed the project.
Data Set
The data set contains outputs of scientific experiments. The data itself is not of use to anyone besides our Customer due to the post-processing required to create a useful result. However, our Customer would like the data to remain private, as viewing it could reveal the name of the Customer. Therefore, the data will be encrypted as per Customer requirements.
It should be noted that scope for the project includes creating a publicly available data set.
Transparency in KYC
Our Customer is a world-class scientific organization and we have completed a KYC process to verify this customer including meeting the Customer Lead face-to-face, numerous meetings with the broader technical team, data transfer tests with the Customer Team and their collaborators, and verification of sample data.
We understand that a confidential customer and encrypted data complicate the ability of the Filecoin Community to verify this project.
Seal is committed to transparency and we have completed NDAs with notaries and storage providers, including Filecoin Foundation, and disclosed the name of the Customer along with evidence of the project such as email communications, a statement of work document, data transfer details and sample data.
Transparency in Filecoin Plus Guidelines
We would like to show our appreciation to the community for allowing this project to move forward.
In working closely with Protocol Labs and The Filecoin Foundation, Seal is following these recommendations as a path forward to make the project a success for the Customer and the Network.
Data Storage Plan
Five full replicas, total of 50 PiB of Datacap
Primary SP Partner:
DLTX, receiving 10 PiB of Datacap for one full replica
Supporting Seal with compute to meet Customer milestones
Location: Omaha, Nebraska
SP Cohort, each receiving 5 PiB, for a total of two full replicas
Holon, location Sydney, Australia
ElioVP, location Antwerp, Belgium
W3b Cloud, Washington State, USA
PikNik, San Diego, USA
Seal Storage: receives 20 PiB of Datacap for two full replicas
Locations: Las Vegas, USA and Montreal, Canada
For Customer project, Seal must also keep a full unsealed replica (10 PiB)
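As a quick sanity check on the plan above, the per-provider allocations can be summed and compared against the stated totals (50 PiB of DataCap, five replicas of a 10 PiB data set). This is only an illustrative sketch: the dictionary and function names are hypothetical, and the figures are copied from the list above.

```python
# Illustrative check of the storage plan arithmetic.
# All names here are hypothetical; the figures come from the plan above.

DATASET_SIZE_PIB = 10  # size of one full replica, per the proposal

# DataCap per storage provider, in PiB, as listed in the plan.
allocations_pib = {
    "DLTX": 10,          # primary SP partner, one full replica
    "Holon": 5,          # SP cohort
    "ElioVP": 5,         # SP cohort
    "W3b Cloud": 5,      # SP cohort
    "PikNik": 5,         # SP cohort
    "Seal Storage": 20,  # two full replicas
}


def total_datacap_pib(allocations):
    """Return the total DataCap requested across all providers, in PiB."""
    return sum(allocations.values())


def replica_count(allocations, dataset_size_pib=DATASET_SIZE_PIB):
    """Return how many full replicas the total DataCap covers.

    Raises ValueError if the total is not a whole number of replicas.
    """
    total = total_datacap_pib(allocations)
    if total % dataset_size_pib != 0:
        raise ValueError("total DataCap is not a whole number of replicas")
    return total // dataset_size_pib


if __name__ == "__main__":
    print("Total DataCap (PiB):", total_datacap_pib(allocations_pib))
    print("Full replicas:", replica_count(allocations_pib))
```

The unsealed 10 PiB replica that Seal keeps for the Customer is left out of the sum, on the assumption that it does not consume DataCap.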
Notaries that Support the Project [person / org / region / Github app]
DataCap Applications
LDN-01-DLTX
filecoin-project/filecoin-plus-large-datasets#274
LDN-02-DLTX
filecoin-project/filecoin-plus-large-datasets#313
LDN-03-Holon
filecoin-project/filecoin-plus-large-datasets#314
LDN-04-W3b
filecoin-project/filecoin-plus-large-datasets#315
LDN-05-PikNik
filecoin-project/filecoin-plus-large-datasets#316
LDN-06-ElioVP
filecoin-project/filecoin-plus-large-datasets#317
LDN-07-Seal
filecoin-project/filecoin-plus-large-datasets#318
LDN-08-Seal
filecoin-project/filecoin-plus-large-datasets#319
LDN-09-Seal
filecoin-project/filecoin-plus-large-datasets#320
LDN-10-Seal
filecoin-project/filecoin-plus-large-datasets#321