Modification: Combatting Fraud and Maintaining Integrity: Proposals for Ensuring a Valuable Filecoin Network #813

Closed
herrehesse opened this issue Jan 18, 2023 · 26 comments

@herrehesse

herrehesse commented Jan 18, 2023

Issue Description

Dear Filecoin Community,

In the current Filecoin ecosystem, datacap is a highly sought-after commodity. As a result, it is not at all surprising that entities try to amass this commodity without adhering to the program rules. This is particularly true for entities that are in a financial bind or operate large cloud computing operations with mainly CC sectors. These entities will often go to great lengths to convert their sectors to ones that contain datacap deals. Although this behavior is not surprising, it is noteworthy that after extensive research on the LDNs, only a small percentage of large miners actually engage in fair practice. Most of them have chosen a more fraudulent path to growth, largely due to the lack of adequate supervision.

The responsibilities and expectations of a storage provider (SP) who operates with good intentions include:

  • Clearly and accurately disclosing their physical location and operation details.
  • Investing in and upgrading their hardware and network infrastructure to improve the capacity and speed of data transfer.
  • Forming partnerships with reputable data preparers to ensure the quality and integrity of stored data.
  • Willingness to cover the costs associated with packaging, bandwidth and storing data deals.
  • Maintaining accessibility and uptime of their miners for at least 540 days.
  • Ensuring the retrievability of stored data deals for a minimum of 540 days.
  • Being transparent and open in their dealings with the Filecoin community and network.

It is a fair and equitable exchange for storage providers who meet the established requirements and actively fulfill their responsibilities to receive support and increased revenue from the community in the form of a multiplier.

However, the majority of storage providers (SPs) do not adhere to the responsibilities and requirements mentioned above, particularly when it comes to maintaining high standards of data integrity and availability. This trend needs to be addressed and rectified. It is important to keep in mind that the scope of data storage and network power being discussed is not limited to mere terabytes or petabytes, but rather extends to an exabyte and beyond. It is a more significant issue than one may initially realize.

It has been made alarmingly clear that certain large storage providers have been colluding to make hundreds of datacap requests for their own benefit, effectively converting their entire mining operations to verified sectors with the multiplier. This is not only a flagrant violation of the rules, but a brazen act of fraud and theft, stealing revenue from all the entities in Filecoin who are working hard to play by the rules. This behavior is unacceptable and must be addressed immediately.

Impact

It is disheartening to see that while many genuine companies and businesses are working tirelessly to build and contribute to the Filecoin ecosystem, there are bad actors who are exploiting the system for their own gain. Take PiKNiK for example, a company that has invested millions to recruit new storage providers and make the network more valuable. Or SEAL Storage, a company that has always been committed to storing humanity's most important information at storage providers around the world, with a focus on high-quality research data. And let's not forget the countless software companies that are investing in the protocol to add value and make the ecosystem more accessible for businesses and individuals.

These examples illustrate the hard work and dedication of many players in the Filecoin ecosystem. However, it is a harsh reality that not everyone shares these values. We must stop assuming that everyone has good intentions and start looking at the facts. The fact is that many datacap requests are not being made with the best intentions, and it is the responsibility of the applicants, data preparers and storage providers to prove otherwise.

Datacap is a valuable and scarce resource that should be used to benefit the entire Filecoin network. Its abuse not only undermines the integrity of the blockchain, but also undermines the hard work of those who are trying to build a better, more valuable ecosystem. It is our duty to hold those who abuse the system accountable, and ensure that the value of datacap is protected for the benefit of all.

Proposed Solution(s)

It is our recommendation that moving forward, datacap should only be allocated to applicants that have been proven to operate with good intentions and that strict adherence to the following rules should be upheld:

  • Adhering to a fair and impartial process for all entities seeking datacap, with no exceptions or special treatment.
  • Allowing all entities a (single) opportunity to prove their intentions and qualifications.
  • Implementing a thorough Know Your Customer (KYC) process for both the applicant and the data preparer, conducted by multiple independent notaries.
  • Requiring storage providers to provide their company name and proof of location to prevent the use of virtual private networks (VPNs).
  • Storage providers who are found to be using consecutive IDs must provide proof that they are separate and distinct entities in order to avoid suspicion of fraud. The burden of proof will be on the miner to demonstrate that they are operating under different identities, and not simply attempting to cheat the system.
  • Developing a ranking system for known applicants and notaries to promote trust and transparency within the network.
  • Applicants must provide comprehensive details and plans for the distribution and use of the datacap.
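The consecutive-ID rule above could be screened for automatically before any manual review. A minimal sketch, assuming miner IDs are written as `f0` followed by a number (for example `f01234`); the function name and the sample IDs are hypothetical, and a detected run is only a signal that triggers the burden of proof, not evidence of fraud by itself:

```python
import re

def consecutive_id_runs(miner_ids, min_run=2):
    """Return groups of miner IDs whose numeric parts are consecutive."""
    numbers = set()
    for miner_id in miner_ids:
        # Assumed format: "f0" followed by digits; anything else is skipped.
        match = re.fullmatch(r"f0(\d+)", miner_id)
        if match:
            numbers.add(int(match.group(1)))

    runs, current = [], []
    for n in sorted(numbers):
        if current and n == current[-1] + 1:
            current.append(n)  # extends the current run
        else:
            if len(current) >= min_run:
                runs.append(current)
            current = [n]      # start a new run
    if len(current) >= min_run:
        runs.append(current)
    return [[f"f0{n}" for n in run] for run in runs]

# Hypothetical IDs from a single datacap application
sample = ["f01234", "f01235", "f01236", "f02000", "f03001", "f03002"]
print(consecutive_id_runs(sample))
```

Any run returned here would simply be a list of providers to ask for proof that they are separate entities.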

In our view, it is essential that there is a clear and defined set of expectations and requirements for entities seeking datacap. Currently, many of these expectations are left ambiguous in an effort to be inclusive, however, it has become evident that this approach has led to widespread abuse and is detrimental to the growth and success of the Filecoin network. The time has come to establish clear and specific guidelines to ensure that only well-intentioned entities are able to acquire datacap and contribute to the network's progress.

Proposed guidelines for the Filecoin+ program:

  • A minimum of six data replicas must be maintained to ensure data redundancy and availability.
  • Data must be distributed across a minimum of three continents to ensure geographical redundancy and accessibility.
  • A maximum of two copies of data may be stored by a single company or entity, if hardware is located in different cities.
  • The sharing of CIDs (Content Identifiers) is strictly prohibited, with the exclusive exception of duplicate or identical datasets, if the packing process can be independently verified.
  • Miners must be reachable at all times, with a minimum uptime of 98%, a maximum of 6 days of annual downtime.
  • Deals must be retrievable at all times, to ensure data integrity and accessibility.
  • Data stored must match the original content as declared in the LDN datacap application.
  • The use of Virtual Private Networks (VPNs) is permitted, as long as it is not utilized for fraudulent activities with intent. When abuse is suspected, the burden of proof is on the respective storage provider and cooperating applicant.
  • The data stored under the FIL+ program should be considered valuable and enhance the overall quality of the Filecoin network.
  • (potentially) Using Boost-software should be mandatory to make and receive Filecoin+ deals due to the improved retrievability.
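For reference, the two figures in the uptime guideline (98% and 6 days) are close but not identical. A quick check, assuming a 365-day year and helper names chosen only for this sketch:

```python
DAYS_PER_YEAR = 365  # assumption: non-leap year

def max_downtime_days(uptime_pct):
    """Days of downtime per year allowed by a given uptime percentage."""
    return (100 - uptime_pct) / 100 * DAYS_PER_YEAR

def required_uptime_pct(downtime_days):
    """Uptime percentage implied by a yearly downtime budget in days."""
    return (1 - downtime_days / DAYS_PER_YEAR) * 100

print(round(max_downtime_days(98), 1))   # 7.3
print(round(required_uptime_pct(6), 2))  # 98.36
```

So 98% uptime allows about 7.3 days of downtime per year, while a 6-day cap corresponds to roughly 98.4% uptime; the final wording should pick one of the two.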

A separate but critical aspect to consider is the definition of "valuable" data. The FIL+ program is intended to support the storage of humanity's most important information, as stated in its description. However, the determination of what constitutes "valuable information" is subjective and can vary among individuals and organisations. Therefore, it is imperative that clear and specific rules, rather than guidelines, are established to define what data is and is not acceptable for the program. This will ensure that the program's purpose is upheld and that all entities understand the expectations and requirements for participating in the FIL+ program.

We believe that the FIL+ program should have strict requirements for applications, as the datacap provided in return is extremely valuable. It is only fair that in exchange for this valuable resource, the Filecoin community should expect a significant contribution or added value to the network from the applicant, data preparer, and storage provider. If the applicant is not able to meet these expectations, the community is more than willing to assist with regular paid deals.
Allowing non-valuable data to be stored through the FIL+ program would dilute the program's purpose and undermine the efforts of storage providers who are truly committed to this mission.

Moreover, by requiring other data to move through regular paid deals, it ensures that the scarce and valuable datacap is being utilized efficiently and effectively. This helps to ensure that the program is being used for its intended purpose and that the resources such as datacap are being directed towards preserving and storing important information. In addition, it also ensures that the network is not overcrowded with non-valuable data, which can bring down the overall quality of the network.

Examples of data that can be considered valuable and contribute to the Filecoin network include:

  • Large-scale medical research data, including genomic data and clinical trial data.
  • Historical documents and artifacts that are important for preserving human culture and history, such as manuscripts, photographs, and artifacts.
  • Cultural and artistic works, such as literature, music, and film, that are important for preserving human creativity and expression.
  • Climate data and environmental research that is important for understanding and addressing global challenges such as climate change and biodiversity loss.
  • Financial and economic data that is important for understanding and analyzing financial markets and economic systems.
  • Educational resources and academic research that are important for advancing knowledge and understanding in various fields of study.
  • Scientific research data, including data from experiments and simulations, that are important for advancing understanding in various scientific fields.
  • Social and demographic data that is important for understanding and analyzing various social and demographic phenomena.
  • Digital archives and libraries that are important for preserving and providing access to a wide range of digital content such as books and articles.

It is with great disappointment that we have discovered, after thorough research, that a majority of applications submitted are not genuinely focused on storing valuable data as defined above, but are instead primarily focused on gaining access to the multiplier benefits provided by the FIL+ program.

We have found instances of the following, none of which should be allowed:

  • Fake websites created specifically to deceive the community and gain access to datacap.
  • Impersonations of legitimate companies, with the aim of obtaining datacap.
  • Completely fabricated and false applications, containing no relevant information or data.
  • Applications that claim to store data, but in reality, the data is completely non-existent or empty.
  • Unauthorised or intentional misuse of CIDs for the sole purpose of consuming datacap.
  • Data that is stored without the proper permissions or consent from the rightful owner.
  • Random and irrelevant videos, photos or documents, which have no value to the Filecoin network.

These fraudulent activities are not only a violation of the integrity of the network but also have no intention of contributing to the Filecoin network's goal of storing humanity's most important information. It is imperative that these types of applications are not supported and strict measures are taken to prevent them from accessing datacap and diminishing the value of the network.

This situation is unacceptable and cannot be allowed to continue. The purpose of datacap is not to enable fraud, but rather to reward storage providers who are committed to storing humanity's most important information and contribute to the overall integrity and quality of the network. Fraud should be kept to a minimum, and all entities should be held accountable for their actions.

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently. This would prevent them from receiving the long term rewards associated with their actions and prevent them from continuing to harm the integrity of the network and the broader Filecoin community. It is essential that this type of behavior is not tolerated and that the network is protected from those who seek to exploit it for their own gain.

In conclusion, it is important to remember that we are all here for the same goal: to make decentralized storage a reality and create a more valuable Filecoin network. Any actions that undermine this potential, whether it is through fraud or other unethical behavior, must be held accountable. We must work together to ensure that the network is protected and that the scarce resources of datacap are used efficiently and effectively.

Let's remain civil in our discussions and be transparent in our intentions and actions. In this way, we can work towards achieving the shared goal of creating a more valuable and sustainable Filecoin network for everyone. We welcome and value all feedback and opinions regarding the proposed rules and guidelines outlined above.

@NSC-FIL

NSC-FIL commented Jan 18, 2023

This proposal is well defined and touches on many issues discussed in the early days of the Filecoin+ program development in particular:

  1. How to incentivize the storage and availability of "humanity's most important information".
  2. How to audit and punish violators of the mission and defined rules regarding Filecoin+ (i.e. a tribunal of notaries / ecosystem stakeholders that determine if infractions have occurred and how to penalize the violators).
  3. A "credit system" of clients, SPs and notaries that are participating in the Filecoin+ program to determine "good actors" and "bad actors". We have several for overall SP reliability, but not specifically for all Filecoin+ participants.

Many of the points made in this FIP proposal are valid and worth moving to specific FIP proposals and phasing in each step while prominently announcing each to the wider community (preferably in translated native languages). This would help alleviate any shocks to members of our ecosystem. Preparing to move Filecoin+ to L2 after the FVM is launched could be a later phase IMO, but immediate attention to guiding notaries, clients and SPs, particularly developing an audit/penalty mechanism, is needed ASAP.

One suggestion that I would modify is the requirement to have boost for retrievals. Any boost-like method that allows retrievals at reasonable levels (which would need to be clearly defined like in MB/sec ranges) should be sufficient IMO.

@hyunmoon

hyunmoon commented Jan 18, 2023

The use of Virtual Private Networks (VPNs) is strictly forbidden, as it makes it impossible to track and verify data distribution.

As long as one's VPN server is placed within their country, I think it should be considered OK so they can defend themselves against DDoS attacks.

Other than that, I really like the specific examples provided.

@flyworker

Cancel the FIL+, user need to pay, problem resolved

@hyunmoon

Cancel the FIL+, user need to pay, problem resolved

That has been my point of view all along but seeing an improvement attempt like this makes me hopeful about the program for the first time.

@xinaxu

xinaxu commented Jan 18, 2023

Requiring storage providers to provide their company name and proof of location to prevent the use of virtual private networks (VPNs)

A VPN is okay if the storage provider discloses their actual location and that location complies with the distribution policy as defined in this thread. The goal is not to forbid VPNs, but to prevent storage providers from using a VPN to disguise their actual location and violate the rules. There can be valid use cases for a VPN, such as encrypting traffic.

@herrehesse
Author

@xinaxu agreed. Will edit.

@xinaxu

xinaxu commented Jan 18, 2023

Is this proposal already approved within the T&T working group? Or are you proposing this to be reviewed by the T&T working group? How do you plan to drive this to some degree of consensus across the community and notaries?
The proposal contains lots of bullet points and the detail of each one is up for debate. (i.e. why 6 replicas not 4)

@Reiers

Reiers commented Jan 18, 2023

Hi @herrehesse !
I think it would get more 👀 , feedback - if this was opened in:
https://github.com/filecoin-project/FIPs/discussions
Having this under issues in between all the requests, it's not optimal.

Please post there, whenever you are ready - and I will join in then 👍

@herrehesse
Author

Hello @Reiers friend!
A formal FIP draft will be finished by Friday and I will definitely post it in the /discussions channel!

Thank you for the reminder.

@herrehesse
Author

@xinaxu

Is this proposal already approved within the T&T working group or are you proposing this to be reviewed by the T&T working group?

  • The proposal has been drafted and discussed, incorporating input from the T&T working group, but has not yet been formally approved. We would like to discuss it further and hear everyone's opinions before making a final proposal.

How do you plan to drive this to some degree of consensus across the community and notaries?

  • Based on feedback from multiple teams within Protocol Labs, we anticipate revisions to this draft and potential voting before final implementation. It's important to note that this is not related to a protocol update, but rather establishing rules and guidelines for the Filecoin+ program to prevent fraudulent applications and misuse by bad actors. The actual FIP for datacap removal from storage providers is a separate discussion.

The proposal contains lots of bullet points and the detail of each one is up for debate. (i.e. why 6 replicas not 4)

  • Debate is a crucial part of the proposal process, as it allows for community input and the consideration of different perspectives. Each point can be debated and refined before a final decision is made.

@panges2
Collaborator

panges2 commented Jan 19, 2023

@herrehesse

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently.

From a tooling standpoint, the removal of datacap from clients is already being developed right now. We're putting greater priority on this, seeing now its importance.

@Reiers, I agree this should be in discussions. Also if you post there, there won't be spam from the LDN bot 😂

@NSC-FIL, we are trying to build exactly those things. Expect more discussion posts about this soon. Building incentive and credit systems for notaries has been top of mind for me, and I'm open to discussing the ideas you have so far, if you want to reach out on FF slack: @philippe Pangestu

@alchemypunk

#813 (comment)

Cancel the FIL+, user need to pay, problem resolved

See through the essence at a glance.
But the community will not let you do this, they need short-term benefits to survive.

@kernelogic

I agree with most of the points, and I also agree with the comments on VPNs: it's not all evil. A VPN can be used:

  1. To have a firewall
  2. To have better retrieval speed
  3. To escape government sanctions

And since we are talking about numbers here, I'd like to see max 3 copies per city allowance. The 1 copy per city is too restrictive.

Also we need to consider the number of SPs in each continent. Continents containing higher number of SPs should be allowed to allocate more copies there. It is not practical to allocate equal copies in every continent.

@SBudo

SBudo commented Jan 19, 2023

Agree on the proposal with exception to:

  • The use of VPN: as mentioned, this should be refined to "the use of VPN to deliberately conceal your location or identity."
  • The uptime requirements: as we all know, lotus and boost, while improving, are still undergoing some major bug fixes and improvements (including high availability, which is lacking at the moment). With such a rapid release cycle and a lack of proper testing at large scale, I don't believe anyone can today commit to a maximum of 6 days of downtime per year.

@xinaxu

xinaxu commented Jan 19, 2023

Proposed guidelines for the Filecoin+ program

I suppose this guideline is a recommendation, not mandatory. Lots of the points below can easily have exceptions.

A minimum of six data replicas must be maintained to ensure data redundancy and availability.

Originally, this was 4 replicas. Is there a reason to increase it to 6? Are we talking about 6 different minerIds, organizations or locations? We also need to define what 6 replicas means, e.g.

  1. Each unique data (pieceCid) must be sealed by 6 different entities(minerId/organization/location)
  2. Each entity(minerId/organization/location) cannot be using more than 20 % of total datacap

Data must be distributed across a minimum of three continents to ensure geographical redundancy and accessibility.

I think that's too strict. Some clients require data to be stored inside their country. IMO, a single continent is fine as long as it is stored in different locations.

Miners must be reachable at all times, with a minimum uptime of 98%, a maximum of 6 days of annual downtime

IMO that is really up to the client to decide but we can provide a minimum bar. Since you already mentioned 6 replicas, 90% up time for a single miner will give you 99.9999% up time so I'd either remove this requirement or relax it.
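The 99.9999% figure follows from treating the six replicas as failing independently, which is an assumption (correlated outages, for example several replicas run by one operator, would lower it). A minimal sketch of the calculation, with a function name chosen only for illustration:

```python
def combined_availability(single_uptime, replicas):
    """Probability that at least one replica is reachable,
    assuming each replica is down independently of the others."""
    return 1 - (1 - single_uptime) ** replicas

print(f"{combined_availability(0.90, 6):.4%}")  # 99.9999%
print(f"{combined_availability(0.90, 4):.4%}")  # 99.9900%
```

With the original 4 replicas the same 90% per-miner uptime would already give 99.99% under the same assumption.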

We have found and should not allow instances of the following

Lots of great examples. Right now we are relying on notaries to do due diligence, and each notary is doing it differently. Should we instead agree on a datacap application verification workflow so that all of the below can be avoided (like E-FIL+)?

In addition to your proposal, I think a notary voting system would be really useful for reaching agreement on questions like whether to revoke a notary or client, or whether 1PiB of open garage video is considered valuable.

@DLTX-Github

I have to say that the amount of effort and time you have put into the investigations and into formulating all communications, including this proposal, is nothing short of legendary. Kudos for that, and thank you for pushing so hard to help make the Filecoin network better.

@herrehesse
Author

@SBudo @xinaxu - Edited the VPN usage part.

@herrehesse
Author

@Kevin-FF-USA @raghavrmadya Can someone disable the bot here?

@herrehesse
Author

@xinaxu

I suppose this guideline is a recommendation, not mandatory. Lots of the points below can easily have exceptions.

  • We will discuss the distinction between guidelines and mandatory points.

Originally, this was 4 replicas. Is there a reason to increase it to 6? Are we talking about 6 different minerIds, organizations or locations? We also need to define what 6 replicas means, e.g.

  • Each unique data (pieceCid) must be sealed by 6 different minerIDs
  • Each unique data (pieceCid) must be sealed in 6 different cities
  • Each unique data (pieceCid) must be sealed by at least 3 different entities or companies.
  • Each minerID cannot be using more than 20 % of total datacap
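To make these four conditions easier to debate, here is a minimal sketch of how they could be checked against a list of deals. The `Deal` record, its field names and the default thresholds are assumptions for illustration only; they do not describe any existing tool, and the numbers are the ones proposed above, which are still up for discussion:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Deal:
    # Hypothetical record; a real checker would need verified entity/city data.
    piece_cid: str
    miner_id: str
    entity: str
    city: str
    size: int  # datacap consumed by this deal, in bytes

def check_distribution(deals, min_miners=6, min_cities=6,
                       min_entities=3, max_miner_share=0.20):
    """Return a list of human-readable rule violations (empty if none)."""
    violations = []

    # Rules 1-3: per-pieceCid replica counts
    by_piece = defaultdict(list)
    for deal in deals:
        by_piece[deal.piece_cid].append(deal)
    for piece_cid, piece_deals in by_piece.items():
        miners = {d.miner_id for d in piece_deals}
        cities = {d.city for d in piece_deals}
        entities = {d.entity for d in piece_deals}
        if len(miners) < min_miners:
            violations.append(f"{piece_cid}: {len(miners)} minerIDs, need {min_miners}")
        if len(cities) < min_cities:
            violations.append(f"{piece_cid}: {len(cities)} cities, need {min_cities}")
        if len(entities) < min_entities:
            violations.append(f"{piece_cid}: {len(entities)} entities, need {min_entities}")

    # Rule 4: no minerID may hold more than max_miner_share of total datacap
    total = sum(d.size for d in deals)
    per_miner = defaultdict(int)
    for deal in deals:
        per_miner[deal.miner_id] += deal.size
    for miner_id, used in per_miner.items():
        if total and used / total > max_miner_share:
            violations.append(f"{miner_id}: {used / total:.0%} of datacap, max {max_miner_share:.0%}")
    return violations
```

An empty result means the allocation satisfies all four conditions as written; each string in a non-empty result names the pieceCid or minerID that breaks a rule.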

I think that's too strict. Some clients require data to be stored inside their country. IMO, a single continent is fine as long as it is stored in different locations.

  • Clients who only wish to store data on a single continent may not be suitable for this program, and should consider using regular paid deals instead.

IMO that is really up to the client to decide but we can provide a minimum bar. Since you already mentioned 6 replicas, 90% up time for a single miner will give you 99.9999% up time so I'd either remove this requirement or relax it.

  • The final decision on granting datacap lies with the client, but it is up to us to determine if the client is worthy of receiving it. We may request a minimum level of commitment in return for valuable datacap from the community. If the client does not want to meet these requirements, they may opt for regular paid deals instead.

Let's not forget, we can set high standards for this program as the incentive we give away as a community is extremely valuable and profitable. We CAN ask for things that are not easily done or difficult to execute. If entities do not want to adhere, then opt for regular paid deals? If you want free data storage and a revenue stream, then follow the high standards and rules set for the program.

Valuable, retrievable, usable and distributed data storage in exchange for datacap. That is the tradeoff and nothing less.

@cbtan21

cbtan21 commented Jan 19, 2023

  • we should take regular deals out of this conversation. the data shows that less than 1% of the deals are regular deals, suggesting no one is doing that

  • what is the primary objective of the Fil+ program? Is it just to incentivize onboarding of new data communities? if that's the case, then do we measure how many non-filecoin data communities we have onboarded? or are the primary objectives multi-fold - encourage storage providers to start storing data and encourage movement to onboard more data into the ecosystem?

  • think most here will agree that Fil+ is a valuable resource, and if the objectives are multi-fold then presumably the value of this scarce resource should be distributed accordingly

  • if this scarce resource is distributed, i.e. shared, then what is a fair distribution among the participants, namely the data clients and the storage providers? i think we can agree that both sps and data clients stand to benefit from Fil+

  • i think the root of the problem or rather the question is - is the current set up a fair distribution among all the participants, which are contributing to the ecosystem

  • i also hope we can agree that Fil+ deals primarily consist of public datasets, which are usually downloaded from somewhere

  • then at play here may be one's bandwidth speed and the proximity of datasets

  • for certain countries, these two factors may not come hand in hand and so to these participants - is the current set up a fair distribution?

  • if it is perceived to be unfair or not within reach, then perhaps participants are motivated to engage in fraudulent behaviors, which of course is not acceptable. just want to highlight what might potentially be the root of the problem. my 2 cents

  • now, onto the more important point - let's just address the elephant in the room - a group of 4/6/10 different SPs that know one another can easily pass many of the supposed Fil+ rules/guidelines. One can apply for Fil+, store one copy with itself, and distribute the rest among the group, and this can go on in perpetuity. I have no opinion on this since it can be considered well distributed, but i actually do not know if such behavior is encouraged.

  • if it is encouraged, then how can some SP new to the ecosystem even compete?

  • if it isn't encouraged, then the question at heart isn't just about the authenticity of the ldn application (of course application should be authentic), but also what happens to the downstream flow of DataCap

  • There can perhaps be more traceability and analysis of how datacap is being spent and whether there are repeated patterns of behavior that are not encouraged / approved

  • We should have more measures to check applications and also more tools to track datacap flows

  • At BDE (https://www.bigdataexchange.io/), we strive to make all data storage/verified deal transactions transparent - one can see which wallet is bidding on verified deals, what the data client's current datacap balance is, which storage provider the data client chooses, whether the data client chose based on price or other factors, the deal making process between the data client and the winning storage provider, along with the SP IDs, etc. Before the usual suspects come in to start criticizing that not everything on BDE is traceable, I am going to frontrun this time by stating it isn't the perfect product yet, but we are starting somewhere and we provide more traceability and transparency on the deals that take place on BDE than many of the deals that take place away from BDE. Moreover, there's no transaction fee. So if you are sourcing verified deals, check out BDE; if you can find data clients that pay to store verified deals, that's awesome. BDE is just another option, particularly useful for SPs with no consistent supply of verified deals.

@DaYouGroup

As a libertarian, I don't think more demands are beneficial. This would also restrict anyone who wants to join, which is probably bad for filecoin. I hope that filecoin can be like the early Bitcoin and ETH, and anyone who is interested can participate and contribute with a very low threshold. I think by making filecoin retrievable, these problems will naturally disappear.

@flyworker

  • we should take regular deals out of this conversation. the data shows that less than 1% of the deals are regular deals, suggesting no one is doing that

the reason the regular deal is less than 1% is because of Fil+, people will pay if there is no Fil+, in 2021 98% are regular deals, to me the current economic model is unhealthy, SP pays for storing data.

@cbtan21

cbtan21 commented Jan 20, 2023

precisely my point of taking it out of the conversation because now we have to debate regular deals vs Fil+, instead of the governance of Fil+

the reason the regular deal is less than 1% is because of Fil+, people will pay if there is no Fil+, in 2021 98% are regular deals, to me the current economic model is unhealthy, SP pays for storing data.

I am not in a good position to debate regular deals vs Fil+, as I have too many unanswered questions:

  1. 98% of how much data? is that a large enough sample size? my understanding is deals grew around 20x to 500 PiB in the past year. Are we talking about 98% of 25 PiB vs 99% of 500 PiB?
  2. of the 98% - are they all valuable / actual data? was there checking involved? (this discussion thread centers around the checking aspect.)
  3. why was Fil+ rolled out if regular deal was working?
  4. regular deal less than 1% currently because Fil+ works as intended or because of other reasons?
  5. On "people will pay if there is no Fil+" --> they can still pay even if there's Fil+ (nothing stopping them now); if they were paying and currently are not, does that suggest the driver behind their actions is primarily economic, rather than other reasons like security, stability, etc.? If SPs were attracting these target users with lower prices than centralized solutions in the first place, then isn't it even more attractive to target the same users with negative pricing, which is happening / doable now because of cryptoeconomic incentives? Also, if decisions were made based on economics, then the market decides on the pricing, which is a result of demand and supply. Hypothetically, if Fil+ is at 10E now, i suspect the market may price it very differently
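To make question 1 concrete, the shares can be converted to absolute volumes. This sketch uses only the rough figures quoted in this thread (about 20x growth to 500 PiB, 98% regular deals in 2021, under 1% now); none of them are verified here, and the variable names are illustrative:

```python
total_now_pib = 500   # approximate current deal volume, as quoted above
growth_factor = 20    # "grew around 20x"
regular_share_2021 = 0.98
regular_share_now = 0.01  # upper bound: "less than 1%"

total_2021_pib = total_now_pib / growth_factor
regular_2021_pib = regular_share_2021 * total_2021_pib
regular_now_pib = regular_share_now * total_now_pib

print(f"2021: {regular_2021_pib:.1f} PiB regular of {total_2021_pib:.0f} PiB total")
print(f"now:  at most {regular_now_pib:.1f} PiB regular of {total_now_pib} PiB total")
```

Under these assumed inputs the comparison is between roughly 24.5 PiB of regular deals then and at most 5 PiB now, which is why the size of each sample matters when comparing percentages.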

@dkkapur
Collaborator

dkkapur commented Jan 20, 2023

@herrehesse thanks for putting this together, great to see something concrete that we can build off of as a community.

As you know, I personally have tried to set a higher standard for data onboarding through policy modification and adjustment on the Slingshot side, but have stayed away from pushing the same constructs on all verified deals. The main reason for this, as you alluded to, is that it is very hard to generalize and apply a standard set of rules across every case deemed to be useful across the world. The point of having regionally distributed notaries in the Fil+ world is so that decisions can be nuanced in whatever way makes sense, and there is tolerance in the system for edge cases. This specific piece is what I disagree with:

[...] however, it has become evident that this approach has led to widespread abuse and is detrimental to the growth and success of the Filecoin network.

It has led to widespread abuse, but I think the net impact is closer to neutral or positive at the moment rather than outright detrimental. Every system with competition where there is room for abuse will result in rational actors looking to abuse it. Every economy in the world faces this issue. We will continue to face this issue. We need to curb the abuse to a tolerable amount, i.e., < 5%. However, given the current scale of the program, the network, the implied slices and size of the pie, we also have so much room to grow that we need to reduce abuse while still scaling up, to ensure Filecoin actually delivers value to the whole world (the whole of it, not just the slices that make sense to you or to me).

A great example is the CID-checker-tool that was released in Dec. A lot of that type of analysis was done by me and others in the community individually throughout 2022 and used to identify cases of potential abuse of the system for notaries applying in the Q2 election cycle. However, not setting policies in stone enabled us to identify edge cases and follow up with them, which then led to (1) behavior correction from clients that put them down a much safer path with replica distribution and (2) notaries that were trigger happy becoming significantly less so. This continues to be the case at a much larger scale now, thanks to the automation.

The core takeaway I'd like to push is this: instead of setting hard rules and policies, we should be setting examples of what good looks like, and then measure each application against them. Anything that deviates substantially is worth digging deeper into. As the de facto entry point for onboarding data into Filecoin, this community gets everything from a first-time client who has no idea what deals even are, all the way to the most sophisticated SP operation impersonating arbitrary companies around the world to onboard useless bits. We need to handle it all with grace. Not just for this program's sake, but for the network itself, all its stakeholders, and the rest of the web3/crypto/blockchain community.

However, we can and should get a lot more sophisticated about what we measure against. Your list is a great starting point. With that in mind, I would like to recommend that we take the list of defined expectations you have established, work with the notary and broader Fil+ community to finalize them, and then build tools (as the T&T WG has been doing for a narrower scope) that help keep DataCap applicants and their progress in check. I do have thoughts on the list you have proposed as well, but wanted to first align philosophically. Nothing I've said is "program policy" or set in stone, these are my opinions and I will stay open minded to the best of my ability.

Separately, the implication in the current status quo and what I stated above is that the network will continue to place a lot of trust with Notaries. The list of things you flagged as types/examples of abuse were not surprising to me. Lots of applications come in, but not everyone gets DataCap either, and not everyone should. I'd love to hear from notaries on things we can collect from clients or on behalf of clients that reduce the friction to making more accurate determinations on the trustworthiness of clients faster.

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently. This would prevent them from receiving the long term rewards associated with their actions and prevent them from continuing to harm the integrity of the network and the broader Filecoin community. It is essential that this type of behavior is not tolerated and that the network is protected from those who seek to exploit it for their own gain.

Definitely in favor of exploring this conversation. Economics seem complex, but let's work through it 🦾 and see what is reasonable.

@herrehesse
Author

@dkkapur Thank you for your detailed response to our proposal. Although we don't agree on every point, we share common goals and have a clear understanding of the issues at hand. We would appreciate it if you could give your opinion on our most recent update, which can be found in our Slack channel.

https://filecoinproject.slack.com/archives/C01DLAPKDGX/p1674206285725749

Effective immediately, while investigations into potential non-compliance by storage providers and notaries are ongoing, I am requesting the direct implementation of certain requirements for participation in the Filecoin+ program.

  • All data stored through the Filecoin+ program and acquired datacap must be retrievable. Data that is not retrievable has no value and should not be rewarded with datacap or the Filecoin+ multiplier.
  • If a storage provider that received datacap is offline for an extended period of time (days to weeks), they will not be granted additional datacap until the issue is resolved. Applicants will also be held accountable for the behaviour of the storage provider.
  • If the applicant's content is retrievable from one of their selected storage providers, it must match the data from the relevant application (LDN). If this is not the case, the applicant will not be granted additional datacap until the issue is resolved and there is clarity on the situation.
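
The three requirements above could be expressed as a simple eligibility check. The following Python sketch is only an illustration of the proposed rules, not an existing Fil+ tool: the `ProviderStatus` fields, the function names, and the `max_days_offline` threshold are all assumptions (the proposal only says "days to weeks").

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderStatus:
    """Hypothetical snapshot of one storage provider selected by an applicant."""
    provider_id: str
    days_offline: int                  # consecutive days the provider was unreachable
    data_retrievable: bool             # whether a sample retrieval succeeded
    content_matches_application: bool  # whether retrieved data matches the LDN


def blocking_reasons(providers: List[ProviderStatus],
                     max_days_offline: int = 3) -> List[str]:
    """Return the reasons why additional datacap should be withheld."""
    reasons = []
    for p in providers:
        # Rule 2: extended downtime blocks further datacap (threshold is assumed).
        if p.days_offline > max_days_offline:
            reasons.append(f"{p.provider_id}: offline for {p.days_offline} days")
        # Rule 1: stored data must be retrievable.
        if not p.data_retrievable:
            reasons.append(f"{p.provider_id}: data not retrievable")
        # Rule 3: if retrievable, the content must match the application.
        elif not p.content_matches_application:
            reasons.append(f"{p.provider_id}: content does not match application")
    return reasons


def may_receive_more_datacap(providers: List[ProviderStatus],
                             max_days_offline: int = 3) -> bool:
    """An applicant qualifies only if no selected provider has a blocking issue."""
    return not blocking_reasons(providers, max_days_offline)
```

For example, an applicant whose providers are all online, retrievable, and serving matching content would pass, while a single provider that has been offline longer than the assumed threshold would hold back further datacap until the issue is resolved.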

While we may hope for individuals to exhibit positive behaviour, there is no guarantee that this will prevent misuse. Instead, let's establish clear guidelines for the program and focus on verifiable facts rather than relying on trust.

Let's minimise trust as much as possible and rely on verifiable facts and proof.

@dkkapur
Collaborator

dkkapur commented Jan 23, 2023

ACK - posted a response in Slack. @herrehesse can we pick one location to continue the conversation? it seems to me like it is still at a Discussion phase, at which point, I think it makes sense to continue either in Slack or in a Discussion topic, and return to this Issue with a summary before the next governance call or major action.
