Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterate on the new verification_jobs table #1826

Open
Tracked by #1367
marcocastignoli opened this issue Jan 6, 2025 · 4 comments
Open
Tracked by #1367

Iterate on the new verification_jobs table #1826

marcocastignoli opened this issue Jan 6, 2025 · 4 comments
Assignees

Comments

@marcocastignoli
Copy link
Member

marcocastignoli commented Jan 6, 2025

Iterate on the new table definition, making sure that it contains all the need information. In particular focus on the foreign key verification_id in the case of a successful verification and understand what information to include in the case of a failed verification:

  • jobId
  • endpoint: /v2/verify/etherscan/
  • chainId
  • address
  • status
  • server hardware
  • compilation time
  • errors (compiler failures)
  • createdAt
  • finishedAt (success or fail)
  • when successful: verification-id
  • Do we want to store information when the compilation succeeds but match fails?
    • onchain bytecodes
    • recompiled bytecodes
@marcocastignoli marcocastignoli moved this from Triage to Sprint - Candidates in Sourcify Public Jan 6, 2025
@marcocastignoli marcocastignoli moved this from Sprint - Candidates to Sprint - Up Next in Sourcify Public Jan 6, 2025
@manuelwedler manuelwedler moved this from Sprint - Up Next to Sprint - In Progress in Sourcify Public Jan 28, 2025
@manuelwedler
Copy link
Contributor

manuelwedler commented Jan 30, 2025

I mostly agree with above fields. Errors need multiple fields due to the API design in Stoplight. I propose to use exactly these names for the columns:

Table: verification_jobs

Needed for the API response:

  • id: bigserial, primary key, incremental
  • status: varchar, not null
  • started_at: timestamptz, not null
  • completed_at: timestamptz
  • chain_id: numeric, not null
  • contract_address: bytea, not null
  • verified_contract_id: bigserial, foreign key to verified_contracts.id
  • error_code: varchar
  • error_message: varchar
  • error_id: uuid

These are only needed for stats:

  • verification_endpoint: varchar, not null
  • server_hardware: jsonb, not null
  • compilation_time: interval

Regarding server_hardware, I am not sure if we can get helpful information for it, since we run it in Cloud Run and CPU and memory allocations are just limits. We can try to use the Node.js functions for those (os.cpus(), os.totalmem(), ...). Let's see what they return under this environment. I think it makes sense to store this information for each verification job in the following json format:

{
  "cpu": string,
  "memory": string
}

Anything else that should be added to the server_hardware?

Additional information?

Do we want to store information when the compilation succeeds but match fails?

  • onchain bytecodes
  • recompiled bytecodes

In my opinion, we should not store more information than necessary, because I don't want to blow the database. The error_code with the error_message should be enough to tell the user what went wrong. If we want to debug something, we should be able to find the corresponding logs by using the error_id.

@marcocastignoli marcocastignoli moved this from Sprint - In Progress to Sprint - Needs Review in Sourcify Public Jan 30, 2025
@kuzdogan
Copy link
Member

Small comment on chain_id, I think it should be bigInt instead of numeric, see verifier-alliance/database-specs#23

@kuzdogan
Copy link
Member

kuzdogan commented Jan 31, 2025

Additionally this was what's recommended by Claude. I don't know to what extent this is useful but I see we can go beyond the minimal cpu+memory info:

const os = require('os');
const v8 = require('v8');

const getHardwareInfo = () => {
  const memoryUsage = process.memoryUsage();
  const heapStats = v8.getHeapStatistics();
  
  return {
    cpu: {
      model: os.cpus()[0].model,
      cores: os.cpus().length,
      speed: os.cpus()[0].speed,
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      cpu_quota: process.resourceUsage().maxCPU || null
    },
    memory: {
      total_allocated: process.env.MEMORY_LIMIT || os.totalmem(),
      available: os.freemem(),
      process_usage: {
        heapTotal: memoryUsage.heapTotal,
        heapUsed: memoryUsage.heapUsed,
        external: memoryUsage.external,
        rss: memoryUsage.rss
      }
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local',
      node_version: process.version,
      container_id: process.env.HOSTNAME || null
    }
  };
};

Edit: When I asked for only the essential info Claude gave me these:

const os = require('os');

const getEssentialHardwareInfo = () => {
  return {
    cpu: {
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      model: os.cpus()[0].model
    },
    memory: {
      allocated_mb: Math.floor((process.env.MEMORY_LIMIT || os.totalmem()) / (1024 * 1024))
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local'
    }
  };
};

@kuzdogan
Copy link
Member

Regarding storing the bytecodes, I'd really like to show a nice UI for the users in the future how these bytecodes diff for failed jobs. Like this:

Image

We find ourselves debugging into people's failed verifications a lot of times and it would be useful to have this information without doing everything locally, even better if we can convey this information to the user and they don't have to contact us at all. IMO this is an essential part of building an overall better contract verification experience.

However this is not a requisite from my side and this can be added later by extending the table. We can even add another table like failed_verification_bytecodes that only retains the failed bytecodes of the last e.g. 2 weeks. So that the database won't blow. The bytecodes are only useful immediately to see what went wrong but we can keep the verification_jobs table minimal and store it longer term.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Sprint - Needs Review
Development

No branches or pull requests

3 participants