Modification: LDN process - Increasing limit for DataCap requests #594
Comments
I agree that there is a risk of abuse, but the possible efficiency increase for various actors is worth a test run. In addition to DC removal, I would like to propose checks and balances such as having clients who benefit from this increase present updates at a governance call, as well as provide a detailed SP allocation plan.
I've shared some of my ideas on Slack and am reposting them here. Quoting the LDN DataCap stages provided by @dkkapur here, thanks for the precise classification. I think we can all agree that our actual purpose is to raise (iii) as soon as possible within the community compliance framework. BUT
The vision of Filecoin is to help more clients store datasets on the network, and for clients with mega datasets, we can absolutely welcome them by raising the KYC audit criteria without having to set 25 PiB as the upper limit for all clients. Certainly, some clients with very large storage needs do require 50 PiB/100 PiB or even more DataCap, but this does not suit all clients, which is the primary reason this should not become a universal standard. Increasing the upper limit of applications will invite a lot of data abuse, and if the goal is to better serve a small portion of larger storage needs, we could develop a dedicated, non-general application review method instead. And of course, if PL must pass this, I would love to know why it is 25 PiB and not 30 or 50 PiB. How was this number determined? Thanks!
Hi @BobbyChoii, thank you for adding your thoughts on this proposal. I'm RG, and I focus on preventing DataCap abuse. The proposal is to increase (ii) based on the ratio of (iii)/(ii), which is above 75% at the moment. This means that for deal-making clients / actively onboarding clients, DataCap is actually getting used up pretty quickly. I agree that (i) and (iii) aren't significantly correlated, and to my understanding, we don't expect much correlation there. If anything, we expect (i) and (ii) to be correlated, as more allocation means the need to commit/provision more DC through the RKH/community nexus. Checkpoints at (i) and (ii) will continue to remain active and are consistently being improved.
As things stand today, clients who need more than 5 PiB open multiple applications, often with different client addresses, contributing to more overhead for both the governance team and the notaries, who have to keep track, sign, and repeat the due diligence to abide by the standards we have set in place. Having a client be upfront about the total amount of DC they need and apply in a single application is overall more efficient, and I do agree it comes with the likelihood of increased abuse. My own strategy for achieving the vision of the network as you've outlined it is to create more penalties rather than barricades and red tape. As all allocations are tranched, the community is always welcome and encouraged to monitor DataCap usage and deal-making behavior on dashboards such as https://filplus.d.interplanetary.one/clients
To answer why 25 PiB: it's based on the volume we are seeing here - https://github.com/filecoin-project/filecoin-plus-large-datasets/issues. On average, a client needing more than 5 PiB opens 3-5 applications (i.e., roughly 15-25 PiB in aggregate), which is why the proposal is for 25 PiB. If the community feels this should be 30 or 15 or anything else, we can assess that based on incoming plus historical application data.
I'm happy to share some efforts and projects that are in the works and will eventually be presented at a governance call with regard to preventing DC abuse, if it helps the community gain confidence.
Adding to this discussion, here are the kinds of clients and applications that are vetted and would benefit from this proposal - filecoin-project/filecoin-plus-large-datasets#483. We will bring it up again at the governance call to get more community feedback. @BobbyChoii, it would be great if you are able to make it to one of the gov calls this coming week.
Adding examples of clients that would benefit from this proposal -
This modification would decrease the overhead of applying for and tracking DataCap. Reasons why I am in favor of it:
Flags brought up in the notary call yesterday:
I would argue that this would result in less work for notaries because
For our team, this issue persisted with our NEXRAD dataset. We did not know the status of the project as a whole, which left us uncertain about beginning the project.
@raghavrmadya appreciate all the efforts. I had a busy week and didn't make it to the governance call. Are there any notes from the last meeting, or could you let me know where to get the updates? Thanks!
@BobbyChoii we spent some time talking about this at the governance call; you should check out the recording here. The main takeaway was that we should look into proposing changes to the tranche limits to add some safer upper bounds for initial allocations. Opening up the floor here for proposals!
@dkkapur thanks for sharing. Following the governance call and some recent updates from the community, there are a couple of questions I'd like to confirm.
Quoting from the Slack #fil-plus channel here.
Thanks!
I just went through the other two applications quickly. I would like to share some parts that I think require due diligence; please correct me if I'm wrong.
Sounds like fancy words for taking data from individuals without any permission. Blog owners, posters, and journalists are the real owners of the data; it's their intellectual property. We all have the right to read it, save it in folders, and share it with friends, and that's it. Taking it for financial benefit is definitely not among the rights granted by these platforms. Let's take Quora as an example. Here's their policy about copyright: https://help.quora.com/hc/en-us/articles/360052494012-How-does-Quora-intend-to-enforce-the-Not-for-Reproduction-feature- If the applicants want to store this content on Filecoin for business interests, then at the very least they should provide licenses from all these platforms. If they can't, I don't think we should support such an application. Have we really thought through the need for 25 PiB? It seems like it will only increase friction and pose more challenges to community fairness...
Hi @BobbyChoii, thanks for the comments.
To your second comment @BobbyChoii: the first issue you cited has gone through KYB with the client growth team. We have not conducted the KYC yet, and there has been no trigger. For the second issue, we have had the SP driving BD for this project reach out. They also might have private data and will be held to the same process as any other client. Of course, as you are aware, a trigger from the governance team is not DataCap approval, and notaries have the final say. If you have questions about these specific issues, please comment on the issue itself.
Finally, to respond to this: "Trust and transparency update - Effective today, any application requesting the maximum limits (5 PiB) and a 100 TiB weekly allocation without proper justification in the application, and that has fewer than 10 data samples, will be flagged and will most likely be asked to open a new application with proper justification. Could you be more specific on the 'most likely' part? If it is what you mentioned in Slack before, I think all applications that do not meet the criteria should be closed and all applicants should open new issues. Fairness is nowhere to be found in the case comparison I mentioned above. If consistency is not maintained, this will cause endless confusion in the approval order at a later stage. The new LDN application rules need to be published through a modification here, just like this one proposed by Deep, and synced to Slack for discussion. This is a community-driven place and no decision should be made unilaterally by one person."
As the trust and transparency lead, it is my mandate to prevent DataCap abuse. More recently, I've seen rampant abuse, and the message posted is merely a flag, not a rule. If an application does not provide justification for the amounts being requested, I will be inclined to close it and request that the client open a new application instead of going back and forth requesting information. There is enough precedent on GitHub, and we have made the rules very clear. A client can also reach out to the client growth team, get notary support, and/or build a working relationship with SPs to gain trust. This is a governance team process choice to be more efficient. If you disagree, please open a discussion on the governance repo.
@raghavrmadya thanks for sharing.
ACK. If you're just using them as examples, then I have no more doubts about that.
Does the client growth team conduct KYB for all participating businesses in the community? What are the requirements for businesses? As an SP, if there are companies that are suitable and want to participate in Filecoin, how can I help them make contact?
Thanks for the heads up; I will comment below the issue. May I know what percentage of applications were contacted through this non-public channel, though? I haven't seen any public notices on GitHub or Slack about this. In addition, I think there should be some level of official disclosure about these communications that are not done publicly. Either the governance team or a member of the teams you mentioned before should provide clarification under the issue itself to reduce the confusion.
Appreciate your efforts in preventing DC abuse. I believe this will make Filecoin a better place to store useful data. But based on the way the issues were closed recently (#483, as I mentioned before), I still don't understand why it is still open with a weekly 1 PiB allocation. This proposal is still under discussion and has not been approved yet. Whether it is through the client growth team or BD or any other channel, why can't applications that don't conform to the max limit and weekly allocation be closed? They could have opened new issues just like the others. Standards are meant to be universal for all. If there are special cases, they can be handled according to consensus rules, such as asking for the support of notaries on governance. But at the very least the benchmark should be the same. As in application #840, I think it is against the rules of democracy if only your approval is needed.
If you still don't think it is necessary to follow a uniform rule, that is, to apply the same 5 PiB / 100 TiB requirement to everyone, I would also like to know whether you have any prevention methods for this. Thanks!
Hi @raghavrmadya @dkkapur, is there any update?
As an SP, we currently have multiple large 25+ PiB deals in the pipeline.
Issue Description
In the current scope of the LDN process (see https://github.com/filecoin-project/filecoin-plus-large-datasets#current-scope), clients are able to apply for up to 5 PiB of DataCap. This was initially instituted for a couple of reasons:
With v3 - this becomes substantially easier to manage, especially for applications for open/public datasets (see #509 for more details on the v3 changes).
In the last 6+ months, usage of the LDN process has increased substantially. Several data owners and client representatives are now filing second applications, or multiple applications up front, for projects that need >5 PiB of DataCap. This creates additional management overhead and also interrupts a client's onboarding flow, since they not only need to get DataCap granted through a second application, but can also have their allocation tranches reset and go back to receiving a lower amount of DataCap initially.
As a result - proposing that we increase the maximum amount of DataCap per application. Initially, we can move up to 25 PiB, with the explicit intent to index much higher in the future on actual onboarding rates and the size of raw data onboarded + replication needs.
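For intuition on the overhead this removes, below is a minimal sketch comparing the two flows for a hypothetical client that needs 20 PiB. The 5 PiB and 25 PiB caps, the multiple-application pattern, and the tranche-reset behavior come from the paragraphs above; the specific doubling schedule and the 0.25 PiB first tranche are assumptions for illustration only, not the actual LDN allocation rules.

```typescript
// Hypothetical sketch of the overhead difference. Only the 5 PiB and 25 PiB
// caps come from this proposal; the tranche schedule here (a small first
// tranche that doubles each round) is assumed purely for illustration.
function applicationsNeeded(totalNeedPiB: number, capPiB: number): number {
  return Math.ceil(totalNeedPiB / capPiB);
}

// Count how many signing rounds (tranches) it takes to fully grant one application.
function tranchesPerApplication(appSizePiB: number, firstTranchePiB: number): number {
  let granted = 0;
  let next = firstTranchePiB;
  let rounds = 0;
  while (granted < appSizePiB) {
    granted += Math.min(next, appSizePiB - granted);
    next *= 2; // assumed: each subsequent tranche doubles
    rounds += 1;
  }
  return rounds;
}

// A client needing 20 PiB today files ceil(20 / 5) = 4 applications, and the
// tranche ladder restarts from its small first allocation for each of them.
const appsToday = applicationsNeeded(20, 5);     // 4 applications
const appsProposed = applicationsNeeded(20, 25); // 1 application
console.log(appsToday * tranchesPerApplication(5, 0.25));     // 20 signing rounds
console.log(appsProposed * tranchesPerApplication(20, 0.25)); // 7 signing rounds
```

Under these illustrative assumptions, a single larger application cuts both the number of applications notaries must track and the number of small restarted tranches the client must wait through.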
(Note, this was initially suggested a while back in #227.)
Impact
This change reduces overhead for clients, notaries, and the governance team. It also increases safety in some cases, since applications are more likely to stay associated with a client/project long-term and can serve as a single source of truth for client needs and data. Finally, it gives entities doing business development in the network more confidence to pursue projects with larger DataCap needs.
Proposed Solution(s)
Increase LDN scope to support applications up to 25 PiB.
Tactically, this means:
Screenshots of current status:
Timeline
The proposed solution will likely take at least 1 week to implement.
Technical dependencies
The validation bot has to be updated.
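As a rough, hypothetical illustration of this dependency (the real bot's code structure, names, and checks will differ), the core of the update could be as small as raising a per-application limit in the bot's request validation:

```typescript
// Hypothetical sketch only; the actual validation bot is structured differently.
// Conceptually, the change is bumping the per-application cap from 5 to 25 PiB.
const MAX_DATACAP_PER_APPLICATION_PIB = 25; // previously 5

function validateRequestedDataCap(requestedPiB: number): string[] {
  const errors: string[] = [];
  if (!Number.isFinite(requestedPiB) || requestedPiB <= 0) {
    errors.push("Requested DataCap must be a positive amount.");
  }
  if (requestedPiB > MAX_DATACAP_PER_APPLICATION_PIB) {
    errors.push(
      `Requested DataCap exceeds the ${MAX_DATACAP_PER_APPLICATION_PIB} PiB per-application limit.`
    );
  }
  return errors;
}

console.log(validateRequestedDataCap(25)); // [] (now allowed)
console.log(validateRequestedDataCap(30)); // [limit error]
```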
End of POC checkpoint (if applicable)
Recommending that we check in after 6 weeks and 12 weeks to look for potential abuse of this change. See the risks outlined below.
Risks and mitigations
Risk: By increasing the total amount of DataCap requestable without adjusting the tranche sizes, this change enables people to apply for a theoretical maximum of 1.25 PiB of DataCap in their first tranche (25 PiB of DataCap requested with a claimed onboarding rate of 2.5+ PiB/week).
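To make the arithmetic in this risk concrete, here is a small sketch. The min(5% of total, 50% of weekly rate) first-tranche rule is inferred from the 1.25 PiB figure quoted above, so treat it as an assumption rather than the authoritative allocation schedule:

```typescript
// Assumed first-tranche rule, inferred from the numbers in this risk:
// first tranche = min(5% of total requested, 50% of claimed weekly rate).
function firstTranchePiB(totalRequestedPiB: number, weeklyRatePiB: number): number {
  return Math.min(0.05 * totalRequestedPiB, 0.5 * weeklyRatePiB);
}

// 25 PiB requested with a claimed onboarding rate of 2.5 PiB/week:
// min(1.25, 1.25) = 1.25 PiB in the first tranche, as stated above.
console.log(firstTranchePiB(25, 2.5)); // 1.25
```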
Though this is a concern from a safety standpoint if untrustworthy projects are able to get DataCap, we're also simultaneously investing in and improving the due diligence process as a community. Efforts include KYC learnings from E-Fil+, improved application templates, better monitoring and risk analysis tools, and more engaged community members helping with due diligence.
Separately, we should also look at putting together a FIP as a community to remove DataCap from sealed sectors (IIRC this was already pitched in the past), to ensure that projects which end up sealing verified deals that should not have been verified can still be adjusted down.
Related Issues
#227