Iterate on the new `verification_jobs` table #1826

marcocastignoli · 2025-01-06T12:12:12Z

Iterate on the new table definition, making sure that it contains all the need information. In particular focus on the foreign key verification_id in the case of a successful verification and understand what information to include in the case of a failed verification:

jobId
endpoint: /v2/verify/etherscan/
chainId
address
status
server hardware
compilation time
errors (compiler failures)
createdAt
finishedAt (success or fail)
when successful: verification-id
Do we want to store information when the compilation succeeds but match fails?
- onchain bytecodes
- recompiled bytecodes

The text was updated successfully, but these errors were encountered:

manuelwedler · 2025-01-30T11:44:31Z

I mostly agree with above fields. Errors need multiple fields due to the API design in Stoplight. I propose to use exactly these names for the columns:

Table: `verification_jobs`

Needed for the API response:

id: bigserial, primary key, incremental
status: varchar, not null
started_at: timestamptz, not null
completed_at: timestamptz
chain_id: numeric, not null
contract_address: bytea, not null
verified_contract_id: bigserial, foreign key to verified_contracts.id
error_code: varchar
error_message: varchar
error_id: uuid

These are only needed for stats:

verification_endpoint: varchar, not null
server_hardware: jsonb, not null
compilation_time: interval

Regarding server_hardware, I am not sure if we can get helpful information for it, since we run it in Cloud Run and CPU and memory allocations are just limits. We can try to use the Node.js functions for those (os.cpus(), os.totalmem(), ...). Let's see what they return under this environment. I think it makes sense to store this information for each verification job in the following json format:

{
  "cpu": string,
  "memory": string
}

Anything else that should be added to the server_hardware?

Additional information?

Do we want to store information when the compilation succeeds but match fails?

onchain bytecodes

recompiled bytecodes

In my opinion, we should not store more information than necessary, because I don't want to blow the database. The error_code with the error_message should be enough to tell the user what went wrong. If we want to debug something, we should be able to find the corresponding logs by using the error_id.

kuzdogan · 2025-01-31T12:04:32Z

Small comment on chain_id, I think it should be bigInt instead of numeric, see verifier-alliance/database-specs#23

kuzdogan · 2025-01-31T12:06:29Z

Additionally this was what's recommended by Claude. I don't know to what extent this is useful but I see we can go beyond the minimal cpu+memory info:

const os = require('os');
const v8 = require('v8');

const getHardwareInfo = () => {
  const memoryUsage = process.memoryUsage();
  const heapStats = v8.getHeapStatistics();
  
  return {
    cpu: {
      model: os.cpus()[0].model,
      cores: os.cpus().length,
      speed: os.cpus()[0].speed,
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      cpu_quota: process.resourceUsage().maxCPU || null
    },
    memory: {
      total_allocated: process.env.MEMORY_LIMIT || os.totalmem(),
      available: os.freemem(),
      process_usage: {
        heapTotal: memoryUsage.heapTotal,
        heapUsed: memoryUsage.heapUsed,
        external: memoryUsage.external,
        rss: memoryUsage.rss
      }
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local',
      node_version: process.version,
      container_id: process.env.HOSTNAME || null
    }
  };
};

Edit: When I asked for only the essential info Claude gave me these:

const os = require('os');

const getEssentialHardwareInfo = () => {
  return {
    cpu: {
      allocated_cores: process.env.CPU_LIMIT || os.cpus().length,
      model: os.cpus()[0].model
    },
    memory: {
      allocated_mb: Math.floor((process.env.MEMORY_LIMIT || os.totalmem()) / (1024 * 1024))
    },
    environment: {
      type: process.env.K_SERVICE ? 'cloud-run' : 'local'
    }
  };
};

kuzdogan · 2025-01-31T12:12:05Z

Regarding storing the bytecodes, I'd really like to show a nice UI for the users in the future how these bytecodes diff for failed jobs. Like this:

We find ourselves debugging into people's failed verifications a lot of times and it would be useful to have this information without doing everything locally, even better if we can convey this information to the user and they don't have to contact us at all. IMO this is an essential part of building an overall better contract verification experience.

However this is not a requisite from my side and this can be added later by extending the table. We can even add another table like failed_verification_bytecodes that only retains the failed bytecodes of the last e.g. 2 weeks. So that the database won't blow. The bytecodes are only useful immediately to see what went wrong but we can keep the verification_jobs table minimal and store it longer term.

github-project-automation bot added this to Sourcify Public Jan 6, 2025

marcocastignoli mentioned this issue Jan 6, 2025

[Milestone] APIv2 #1367

Open

github-project-automation bot moved this to Triage in Sourcify Public Jan 6, 2025

manuelwedler mentioned this issue Jan 6, 2025

Iterate on the new verification_jobs table #1825

Closed

marcocastignoli moved this from Triage to Sprint - Candidates in Sourcify Public Jan 6, 2025

marcocastignoli mentioned this issue Jan 6, 2025

Verification endpoints #1828

Open

marcocastignoli moved this from Sprint - Candidates to Sprint - Up Next in Sourcify Public Jan 6, 2025

marcocastignoli assigned manuelwedler Jan 20, 2025

manuelwedler assigned manuelwedler and unassigned manuelwedler Jan 27, 2025

manuelwedler moved this from Sprint - Up Next to Sprint - In Progress in Sourcify Public Jan 28, 2025

marcocastignoli moved this from Sprint - In Progress to Sprint - Needs Review in Sourcify Public Jan 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Iterate on the new `verification_jobs` table #1826

Iterate on the new `verification_jobs` table #1826

marcocastignoli commented Jan 6, 2025 •

edited

Loading

manuelwedler commented Jan 30, 2025 •

edited

Loading

kuzdogan commented Jan 31, 2025

kuzdogan commented Jan 31, 2025 •

edited

Loading

kuzdogan commented Jan 31, 2025

Iterate on the new verification_jobs table #1826

Iterate on the new verification_jobs table #1826

Comments

marcocastignoli commented Jan 6, 2025 • edited Loading

manuelwedler commented Jan 30, 2025 • edited Loading

Table: verification_jobs

Additional information?

kuzdogan commented Jan 31, 2025

kuzdogan commented Jan 31, 2025 • edited Loading

kuzdogan commented Jan 31, 2025

Iterate on the new `verification_jobs` table #1826

Iterate on the new `verification_jobs` table #1826

marcocastignoli commented Jan 6, 2025 •

edited

Loading

manuelwedler commented Jan 30, 2025 •

edited

Loading

Table: `verification_jobs`

kuzdogan commented Jan 31, 2025 •

edited

Loading