Merged
@@ -10,6 +10,7 @@
 import { readFileSync } from 'fs';
 import globby from 'globby';
 import * as path from 'path';
+import type ts from 'typescript';
 import { parseUsageCollection } from './ts_parser';
 import type { TelemetryRC } from './config';
 import { createKibanaProgram, getAllSourceFiles } from './ts_program';
@@ -40,8 +41,7 @@ export async function getProgramPaths({
   );
 
   if (filePaths.length === 0) {
-    return []; // Temporarily accept empty directories while https://github.com/elastic/kibana-team/issues/1066 is completed
-    // throw Error(`No files found in ${root}`);
+    return [];
   }
 
   const fullPaths = filePaths
@@ -55,11 +55,12 @@ export async function getProgramPaths({
   return fullPaths;
 }
 
+export function filterCollectorPaths(fullPaths: string[]): string[] {
+  return fullPaths.filter((p) => COLLECTOR_RE.test(readFileSync(p, 'utf-8')));
Member

I think this function could be made async, which might cut even more time.

Contributor Author

Good call. filterCollectorPaths currently calls readFileSync on all 36,308 globbed files and takes 1.79s. Making it async with fs.promises.readFile and Promise.all would overlap the I/O and could cut that roughly in half.

That said, it's ~3% of the total task time (53s is createKibanaProgram), so the absolute savings would be ~1s. Happy to do it in a follow-up for cleanliness!
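
For reference, the async variant discussed above could look roughly like the sketch below. It is not code from this PR: the function name `filterCollectorPathsAsync`, the `pattern` parameter (standing in for `COLLECTOR_RE`, whose definition is not shown in this diff), and the `concurrency` default are all assumptions. It caps the number of in-flight reads rather than passing all 36K paths to a single `Promise.all`, which may help avoid exhausting file descriptors.

```typescript
import { promises as fs } from 'node:fs';

// Hypothetical async counterpart to filterCollectorPaths.
// `pattern` stands in for COLLECTOR_RE and is assumed to be a non-global
// RegExp (a /g regex keeps state in lastIndex between test() calls).
async function filterCollectorPathsAsync(
  fullPaths: string[],
  pattern: RegExp,
  concurrency = 64 // assumed limit; tune for the host's open-file limit
): Promise<string[]> {
  const matches: boolean[] = new Array(fullPaths.length).fill(false);
  let next = 0;

  // Each worker claims the next unread index until the list is exhausted.
  // Claiming is safe without locks because JS runs one callback at a time.
  async function worker(): Promise<void> {
    while (next < fullPaths.length) {
      const i = next++;
      const content = await fs.readFile(fullPaths[i], 'utf-8');
      matches[i] = pattern.test(content);
    }
  }

  const workerCount = Math.min(concurrency, fullPaths.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Filtering by index preserves the input order.
  return fullPaths.filter((_, i) => matches[i]);
}
```

A caller would then write `const collectorPaths = await filterCollectorPathsAsync(paths, COLLECTOR_RE);`, which also means the calling function has to be async.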

+}
+
 export function* extractCollectors(fullPaths: string[], tsConfig: any) {
-  // Pre-filter to only files that reference collector APIs so TS doesn't
-  // parse thousands of unrelated source files (36K → ~70 root files).
-  // TS still resolves transitive imports needed for type-checking.
-  const collectorPaths = fullPaths.filter((p) => COLLECTOR_RE.test(readFileSync(p, 'utf-8')));
+  const collectorPaths = filterCollectorPaths(fullPaths);
 
   if (collectorPaths.length === 0) {
     return;
@@ -72,3 +73,13 @@ export function* extractCollectors(fullPaths: string[], tsConfig: any) {
     yield* parseUsageCollection(sourceFile, program);
   }
 }
+
+export function* extractCollectorsWithProgram(collectorPaths: string[], program: ts.Program) {
+  if (collectorPaths.length === 0) {
+    return;
+  }
+  const sourceFiles = getAllSourceFiles(collectorPaths, program);
+  for (const sourceFile of sourceFiles) {
+    yield* parseUsageCollection(sourceFile, program);
+  }
+}
@@ -10,39 +10,65 @@
 import ts from 'typescript';
 import * as path from 'path';
 import type { TaskContext } from './task_context';
-import { extractCollectors, getProgramPaths } from '../extract_collectors';
+import {
+  extractCollectorsWithProgram,
+  filterCollectorPaths,
+  getProgramPaths,
+} from '../extract_collectors';
+import { createKibanaProgram } from '../ts_program';
 
 export function extractCollectorsTask(
   { roots }: TaskContext,
   restrictProgramToPath?: string | string[]
 ) {
-  return roots.map((root) => ({
-    task: async () => {
-      const tsConfig = ts.findConfigFile('./', ts.sys.fileExists, 'tsconfig.json');
-      if (!tsConfig) {
-        throw new Error('Could not find a valid tsconfig.json.');
-      }
-      const programPaths = await getProgramPaths(root.config);
-
-      if (typeof restrictProgramToPath !== 'undefined') {
-        const restrictProgramToPaths = Array.isArray(restrictProgramToPath)
-          ? restrictProgramToPath
-          : [restrictProgramToPath];
-
-        const fullRestrictedPaths = restrictProgramToPaths.map((collectorPath) =>
-          path.resolve(process.cwd(), collectorPath)
-        );
-        const restrictedProgramPaths = programPaths.filter((programPath) =>
-          fullRestrictedPaths.includes(programPath)
+  return [
+    {
Member

This is awesome! I wonder if we can make it even faster by defining concurrent tasks:

  1. One task that gets the program
  2. A set of parallel tasks that run extractCollectorsWithProgram with the shared program created in 1.

Contributor Author

Thanks for the suggestion! I benchmarked this: extractCollectorsWithProgram across all 10 roots takes 0.06s total (most roots have 0-5 collectors; the largest two have 26 and 30). Since it's purely CPU-bound synchronous work (AST traversal plus type-checker lookups), Promise.all in single-threaded Node.js wouldn't actually parallelize it; the synchronous generators would simply run one after another. True parallelism would need worker_threads, but a ts.Program can't be serialized across threads.

The bottleneck is createKibanaProgram at 53s (95.9% of the task). Extraction is negligible by comparison, so I'll leave this as-is.
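
To illustrate the point about Promise.all and synchronous work, here is a small standalone sketch. It does not use the PR's code: `fakeExtract`, `extractRoot`, and `runAll` are hypothetical names, and the generator only stands in for a synchronous extraction like extractCollectorsWithProgram. Each async wrapper contains no `await`, so its body runs to completion at call time, before the next wrapper is even invoked.

```typescript
// Hypothetical stand-in for a synchronous, CPU-bound generator.
function* fakeExtract(label: string, items: number, log: string[]): Generator<number> {
  log.push(`${label}:start`);
  for (let i = 0; i < items; i++) {
    yield i;
  }
  log.push(`${label}:end`);
}

// Wrapping synchronous work in an async function does not defer it:
// the spread drains the whole generator before the promise is returned.
async function extractRoot(label: string, items: number, log: string[]): Promise<number[]> {
  return [...fakeExtract(label, items, log)];
}

async function runAll(): Promise<string[]> {
  const log: string[] = [];
  await Promise.all([extractRoot('rootA', 3, log), extractRoot('rootB', 2, log)]);
  return log;
}

// The log records rootA finishing before rootB starts:
// ['rootA:start', 'rootA:end', 'rootB:start', 'rootB:end']
runAll().then((log) => console.log(log));
```

The two "concurrent" extractions never overlap, which matches the reasoning above for leaving the per-root loop sequential.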

+      task: async () => {
+        const tsConfig = ts.findConfigFile('./', ts.sys.fileExists, 'tsconfig.json');
+        if (!tsConfig) {
+          throw new Error('Could not find a valid tsconfig.json.');
+        }
+
+        const rootPathsMap = new Map<number, string[]>();
+        await Promise.all(
+          roots.map(async (root, idx) => {
+            const programPaths = await getProgramPaths(root.config);
+            rootPathsMap.set(idx, programPaths);
+          })
         );
-        if (restrictedProgramPaths.length) {
-          root.parsedCollections = [...extractCollectors(restrictedProgramPaths, tsConfig)];
+
+        const rootCollectorMap = new Map<number, string[]>();
+        let allCollectorPaths: string[] = [];
+
+        for (const [idx, programPaths] of rootPathsMap) {
+          let paths = programPaths;
+
+          if (typeof restrictProgramToPath !== 'undefined') {
+            const restrictProgramToPaths = Array.isArray(restrictProgramToPath)
+              ? restrictProgramToPath
+              : [restrictProgramToPath];
+            const fullRestrictedPaths = restrictProgramToPaths.map((collectorPath) =>
+              path.resolve(process.cwd(), collectorPath)
+            );
+            paths = paths.filter((p) => fullRestrictedPaths.includes(p));
+          }
+
+          const collectorPaths = filterCollectorPaths(paths);
+          rootCollectorMap.set(idx, collectorPaths);
+          allCollectorPaths = allCollectorPaths.concat(collectorPaths);
+        }
+
+        if (allCollectorPaths.length === 0) {
+          return;
         }
-        return;
-      }
 
-      root.parsedCollections = [...extractCollectors(programPaths, tsConfig)];
+        const program = createKibanaProgram(allCollectorPaths, tsConfig);
+
+        for (const [idx, collectorPaths] of rootCollectorMap) {
+          roots[idx].parsedCollections = [...extractCollectorsWithProgram(collectorPaths, program)];
+        }
+      },
+      title: 'Extracting collectors across all roots',
     },
-    title: `Extracting collectors in ${root.config.root}`,
-  }));
+  ];
 }