Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions packages/kbn-es-query/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@
* Side Public License, v 1.
*/

import { compact } from 'lodash';
import { KueryNode } from './src/kuery/types';

export type {
BoolQuery,
DataViewBase,
Expand Down Expand Up @@ -127,3 +130,21 @@ export {
} from './src/utils';

export type { ExecutionContextSearch } from './src/expressions/types';

export function getKueryFields(nodes: KueryNode[]): string[] {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a function I just copied from

export function getKueryFields(nodes: KueryNode[]): string[] {

it should be sufficient to extract fields from a KQL query

Copy link
Copy Markdown
Contributor

@mattkime mattkime May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lukas is addressing this here - #183573 - its a bit more thorough and has a couple more tests

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's great!

const allFields = nodes
.map((node) => {
const {
arguments: [fieldNameArg],
} = node;

if (fieldNameArg.type === 'function') {
return getKueryFields(node.arguments);
}

return fieldNameArg.value;
})
.flat();

return compact(allFields);
}
Original file line number Diff line number Diff line change
Expand Up @@ -33,18 +33,29 @@ export const createSearchSource = (
indexPatterns: DataViewsContract,
searchSourceDependencies: SearchSourceDependencies
) => {
const createFields = async (searchSourceFields: SerializedSearchSourceFields = {}) => {
const createFields = async (
searchSourceFields: SerializedSearchSourceFields = {},
useDataViewLazy = false
) => {
const { index, parent, ...restOfFields } = searchSourceFields;
const fields: SearchSourceFields = {
...restOfFields,
};

// hydrating index pattern
if (searchSourceFields.index) {
if (typeof searchSourceFields.index === 'string') {
fields.index = await indexPatterns.get(searchSourceFields.index);
if (!useDataViewLazy) {
if (typeof searchSourceFields.index === 'string') {
fields.index = await indexPatterns.get(searchSourceFields.index);
} else {
fields.index = await indexPatterns.create(searchSourceFields.index);
}
} else {
fields.index = await indexPatterns.create(searchSourceFields.index);
if (typeof searchSourceFields.index === 'string') {
fields.index = await indexPatterns.getDataViewLazy(searchSourceFields.index);
} else {
fields.index = await indexPatterns.createDataViewLazy(searchSourceFields.index);
}
Comment on lines +47 to +58
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattkime let me shortly and hopefully explain in a better way what needs to be considered to get rid of the field_caps requests for alerting execution and why this code part is important

Originally, when a search source is created and an index is given, which is usually the case, it leads to a field caps request. I was interested what happens when I switch DataView with DataViewLazy, so no data view field caps request is triggered, and those fields, that are necessary, are fetched when they are needed. with a bit of copying code around, so DataViewLazy worked more similar to DavaView (except that no automatic field caps request is triggered), this worked. So in the alerting case just the fields that were necessary were fetched.

However, when a a saved search with an id is fetched in Discover, then this lazy data view with just the necessary fields, is being used as the main data view, which mean, just a few fields were mapped, rest unmapped. When a new saved search was created, unpersisted, this was not the case, because the index / data view was already there. Works as expected. This triggered my "Rabbit Hole Detection Alarm", because changing the behavior generally of SearchSource, high likely means e.g. changing the way SavedSearch are handled in Discover, and maybe more. That's why I tried to handle SearchSource differently depending on context, which in alerting rule context meant, DataViewLazy FTW, just fetch what's needed. while better let Discover etc search source consumption to be the same .. for now ... interested in hear your thought or how you intended to work around this. because, of course, I could be wrong

}
}

Expand All @@ -55,8 +66,11 @@ export const createSearchSource = (
return fields;
};

const createSearchSourceFn = async (searchSourceFields: SerializedSearchSourceFields = {}) => {
const fields = await createFields(searchSourceFields);
const createSearchSourceFn = async (
searchSourceFields: SerializedSearchSourceFields = {},
useDataViewLazy: boolean
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. we need to handle the hydration of searchSourceFields index differently for alerting, if we don't do this, e.g. in Discover and there are just the fields loaded, that are need, those are missing when a saved search is being loaded. We can work around of this, but I'd say, focus on the alerting relevant part

Copy link
Copy Markdown
Contributor

@mattkime mattkime May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to create and make available both DataView and DataViewLazy objects during the transition.


Ack, no, that wouldn't work since the DataView would load all the fields...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's what I was thinking. That's why I was thinking, focusing on solving one problem, the more important one (alerting), keep the current way it works for the other one, so there are no tears

) => {
const fields = await createFields(searchSourceFields, useDataViewLazy);
const searchSource = new SearchSource(fields, searchSourceDependencies);

// todo: move to migration script .. create issue
Expand All @@ -65,6 +79,9 @@ export const createSearchSource = (
if (typeof query !== 'undefined') {
searchSource.setField('query', migrateLegacyQuery(query));
}
if (useDataViewLazy) {
await searchSource.loadDataViewFields();
}

return searchSource;
};
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
*/

import { estypes } from '@elastic/elasticsearch';
import type { DataView } from '@kbn/data-views-plugin/common';
import { DataView, DataViewLazy } from '@kbn/data-views-plugin/common';
import { EsQuerySortValue } from './types';

type FieldSortOptions = estypes.FieldSort &
Expand All @@ -19,7 +19,7 @@ type FieldSortOptions = estypes.FieldSort &

export function normalizeSortRequest(
sortObject: EsQuerySortValue | EsQuerySortValue[],
indexPattern: DataView | string | undefined,
indexPattern: DataView | DataViewLazy | string | undefined,
defaultSortOptions: FieldSortOptions | string = {}
) {
const sortArray: EsQuerySortValue[] = Array.isArray(sortObject) ? sortObject : [sortObject];
Expand All @@ -35,14 +35,14 @@ export function normalizeSortRequest(
*/
function normalize(
sortable: EsQuerySortValue,
indexPattern: DataView | string | undefined,
indexPattern: DataView | DataViewLazy | string | undefined,
defaultSortOptions: FieldSortOptions | string
) {
const [[sortField, sortOrder]] = Object.entries(sortable);
const order = typeof sortOrder === 'object' ? sortOrder : { order: sortOrder };

if (indexPattern && typeof indexPattern !== 'string') {
const indexField = indexPattern.fields.find(({ name }) => name === sortField);
const indexField = indexPattern.fields?.find(({ name }) => name === sortField);
if (indexField && indexField.scripted && indexField.sortable) {
return {
_script: {
Expand Down
109 changes: 97 additions & 12 deletions src/plugins/data/common/search/search_source/search_source.ts
Original file line number Diff line number Diff line change
Expand Up @@ -59,20 +59,32 @@
*/

import { setWith } from '@kbn/safer-lodash-set';
import { difference, isEqual, isFunction, isObject, keyBy, pick, uniqueId, concat } from 'lodash';
import {
difference,
isEqual,
isFunction,
isObject,
keyBy,
pick,
uniqueId,
concat,
compact,
} from 'lodash';
import { catchError, finalize, first, last, map, shareReplay, switchMap, tap } from 'rxjs';
import { defer, EMPTY, from, lastValueFrom, Observable } from 'rxjs';
import type * as estypes from '@elastic/elasticsearch/lib/api/typesWithBodyKey';
import {
buildEsQuery,
Filter,
fromKueryExpression,
isOfQueryType,
isPhraseFilter,
isPhrasesFilter,
KueryNode,
} from '@kbn/es-query';
import { fieldWildcardFilter } from '@kbn/kibana-utils-plugin/common';
import { getHighlightRequest } from '@kbn/field-formats-plugin/common';
import type { DataView } from '@kbn/data-views-plugin/common';
import { DataView, DataViewLazy } from '@kbn/data-views-plugin/common';
import {
ExpressionAstExpression,
buildExpression,
Expand Down Expand Up @@ -135,6 +147,24 @@ interface ExpressionAstOptions {
asDatatable?: boolean;
}

export function getKueryFields(nodes: KueryNode[]): string[] {
const allFields = nodes
.map((node) => {
const {
arguments: [fieldNameArg],
} = node;

if (fieldNameArg.type === 'function') {
return getKueryFields(node.arguments);
}

return fieldNameArg.value;
})
.flat();

return compact(allFields);
}

/** @public **/
export class SearchSource {
private id: string = uniqueId('data_source');
Expand Down Expand Up @@ -681,9 +711,15 @@ export class SearchSource {
* flat representation (taking into account merging rules)
* @resolved {Object|null} - the flat data of the SearchSource
*/
private mergeProps(root = this, searchRequest: SearchRequest = { body: {} }): SearchRequest {
private mergeProps(
root = this,
searchRequest: SearchRequest = { body: {} },
filter?: string[]
): SearchRequest {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit that I don't really understand what these mergeProps changes are doing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also didn't
but now

What do I do with it? It's needed in loadDataViewFields

    const request = this.mergeProps(this, { body: {} }, ['query', 'filter']);

for getting fields to fetch I need merged props, if mergeProps without filtering down to those needed, it runs into code that needs fields.

If I just merge query & filter, I get what's needed for the field retrieval.

Object.entries(this.fields).forEach(([key, value]) => {
this.mergeProp(searchRequest, value, key as keyof SearchSourceFields);
if (!filter || filter.includes(key)) {
this.mergeProp(searchRequest, value, key as keyof SearchSourceFields);
}
});
if (this.parent) {
this.parent.mergeProps(root, searchRequest);
Expand All @@ -696,7 +732,7 @@ export class SearchSource {
}

private readonly getFieldName = (fld: SearchFieldValue): string =>
typeof fld === 'string' ? fld : (fld.field as string);
typeof fld === 'string' ? fld : (fld?.field as string);

private getFieldsWithoutSourceFilters(
index: DataView | undefined,
Expand Down Expand Up @@ -762,17 +798,64 @@ export class SearchSource {
return field;
}

public async loadDataViewFields() {
const dataView = this.getField('index');
if (!dataView || !(dataView instanceof DataViewLazy)) {
return;
}
// eslint-disable-next-line no-console
console.log('loadDataViewFields because of DataViewLazy');
const request = this.mergeProps(this, { body: {} }, ['query', 'filter']);
let fields = dataView.timeFieldName ? [dataView.timeFieldName] : [];
const sort = this.getField('sort');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I find this code very useful although I need to see how it compares to the various computations in the SearchSource.flatten method. I think it likely covers all cases where fields need to be loaded but I need to verify.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, of course this needs to be evaluated, I might have missed something, however, in local testing, adding a alert from discover with query, sorting, filters, it seemed to work (I just created the alert executing every few seconds, watching kibana log output)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really only concerns scripted fields, so it doesn't need to be fetched. That said, skipping the field fetch for this is an optimization that can be left for later.

if (sort) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was poking through the code to determine if fields referenced in sort need to be loaded. Did you figure this out? I dug for a bit but didn't come to a conclusion.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, if you remove it, it breaks. I think it's because of this:

case 'sort':
const sort = normalizeSortRequest(
val,
this.getField('index'),
getConfig(UI_SETTINGS.SORT_OPTIONS)
);
return addToBody(key, sort);

which leads to this:

if (indexPattern && typeof indexPattern !== 'string') {
const indexField = indexPattern.fields.find(({ name }) => name === sortField);
if (indexField && indexField.scripted && indexField.sortable) {
return {
_script: {
script: {
source: indexField.script,
lang: indexField.lang,
},
type: castSortType(indexField.type),
...order,
},
};
}
}

it seems to be necessary for sorting by a scripted field

const sortArr = Array.isArray(sort) ? sort : [sort];
for (const s of sortArr) {
const keys = Object.keys(s);
fields = fields.concat(keys);
}
}
for (const query of request.query) {
if (query.query) {
const nodes = fromKueryExpression(query.query);
const queryFields = getKueryFields([nodes]);
fields = fields.concat(queryFields);
}
}
const filters = request.filters;
if (filters) {
const filtersArr = Array.isArray(filters) ? filters : [filters];
for (const f of filtersArr) {
fields = fields.concat(f.meta.field);
}
}
fields = fields.filter((f) => Boolean(f));

if (dataView.getSourceFiltering() && dataView.getSourceFiltering().excludes.length) {
// if source filtering is enabled, we need to fetch all the fields
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How was this arrived at? I just don't see why its needed based on existing search source code but I might be overlooking something.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know exactly where we do this in search source, didn't look, but for sure it needs to be done

then there is data view source filtering enabled, we need to send all other fields that we don't need to filter out ... using the fields API (default), works differently in using _source. You could configure a data view with a field excluded, add a debugger in the search source code, and watch what and where it happens in the code

however, the solution is pretty simple. if there are fields to be filtered, fetch all fields. here we can't work around until ES supports excluding fields in this request

await dataView.getFields({ fieldName: ['*'] });
} else if (fields.length) {
const fieldSpec = await dataView.getFields({
fieldName: fields,
// fieldTypes: ['date', 'date_nanos'],
});
const spec = Object.values(fieldSpec.getFieldMap()).map((field) => field.spec);
dataView.setFields(spec);
}
}

Comment on lines +801 to +846
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, this is new, and just needed for the alerting execution, where in this POC a DataViewLazy is being used. in this code we figure out when fields we need to load, that's the timefield, sort field, query fields, filter fields. once we know which fields we need, we get those, and they can build the query to ES

private flatten() {
const { getConfig } = this.dependencies;
const searchRequest = this.mergeProps();
searchRequest.body = searchRequest.body || {};
const { body, index, query, filters, highlightAll, pit } = searchRequest;

searchRequest.indexType = this.getIndexType(index);
const metaFields = getConfig(UI_SETTINGS.META_FIELDS) ?? [];

// get some special field types from the index pattern
const { docvalueFields, scriptFields, runtimeFields } = index
? index.getComputedFields()
? index.getComputedFields({})
: {
docvalueFields: [],
scriptFields: {},
Expand Down Expand Up @@ -802,11 +885,11 @@ export class SearchSource {

const filter = fieldWildcardFilter(body._source.excludes, metaFields);
// also apply filters to provided fields & default docvalueFields
body.fields = body.fields.filter((fld: SearchFieldValue) => filter(this.getFieldName(fld)));
fieldsFromSource = fieldsFromSource.filter((fld: SearchFieldValue) =>
body.fields = body.fields?.filter((fld: SearchFieldValue) => filter(this.getFieldName(fld)));
fieldsFromSource = fieldsFromSource?.filter((fld: SearchFieldValue) =>
filter(this.getFieldName(fld))
);
filteredDocvalueFields = filteredDocvalueFields.filter((fld: SearchFieldValue) =>
filteredDocvalueFields = filteredDocvalueFields?.filter((fld: SearchFieldValue) =>
filter(this.getFieldName(fld))
);
}
Expand All @@ -820,7 +903,7 @@ export class SearchSource {
// filter down script_fields to only include items specified
body.script_fields = pick(
body.script_fields,
Object.keys(body.script_fields).filter((f) => uniqFieldNames.includes(f))
Object.keys(body.script_fields)?.filter((f) => uniqFieldNames.includes(f))
);
}

Expand All @@ -829,7 +912,7 @@ export class SearchSource {
const remainingFields = difference(uniqFieldNames, [
...Object.keys(body.script_fields),
...Object.keys(body.runtime_mappings),
]).filter((remainingField) => {
])?.filter((remainingField) => {
if (!remainingField) return false;
if (!body._source || !body._source.excludes) return true;
return !body._source.excludes.includes(remainingField);
Expand All @@ -848,7 +931,7 @@ export class SearchSource {
// already set in docvalue_fields
body.fields = [
...body.fields,
...filteredDocvalueFields.filter((fld: SearchFieldValue) => {
...filteredDocvalueFields?.filter((fld: SearchFieldValue) => {
return (
fieldsFromSource.includes(this.getFieldName(fld)) &&
!(body.docvalue_fields || [])
Expand Down Expand Up @@ -894,6 +977,8 @@ export class SearchSource {
} else {
body.fields = filteredDocvalueFields;
}
// @ts-ignore
body.fields = body.fields.filter((field) => Boolean(field));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it this is filtering out empty values?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just made it work, I had undefined coming from somewhere, and filtered it out. should be solved differently, but I didn't have so much time I was like DataViewLazy, didn't do all the work :)


// If sorting by _score, build queries in the "must" clause instead of "filter" clause to enable scoring
const filtersInMustClause = (body.sort ?? []).some((sort: EsQuerySortValue[]) =>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,10 @@ export class SearchSourceService {
* creates searchsource based on serialized search source fields
*/
create: createSearchSource(indexPatterns, dependencies),
createLazy: (searchSourceFields: SerializedSearchSourceFields = {}) => {
const fn = createSearchSource(indexPatterns, dependencies);
return fn(searchSourceFields, true);
},
Comment on lines +64 to +67
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if tried to add just a configuration to the create function which uses a DataViewLazy then, for whatever reason it didn't work. with a dedicated function it did. it's a POC, it doesn't need to be polished

/**
* creates an enpty search source
*/
Expand Down
6 changes: 4 additions & 2 deletions src/plugins/data/common/search/search_source/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ import { SerializableRecord } from '@kbn/utility-types';
import { PersistableStateService } from '@kbn/kibana-utils-plugin/common';
import type { Filter } from '@kbn/es-query';
import { ISearchOptions } from '@kbn/search-types';
import type { DataView, DataViewSpec } from '@kbn/data-views-plugin/common';
import { DataView, DataViewLazy, DataViewSpec } from '@kbn/data-views-plugin/common';
import type { AggConfigSerialized, IAggConfigs } from '../../../public';
import type { SearchSource } from './search_source';

Expand All @@ -34,6 +34,8 @@ export interface ISearchStartSearchSource
* @param fields
*/
create: (fields?: SerializedSearchSourceFields) => Promise<ISearchSource>;

createLazy: (fields?: SerializedSearchSourceFields) => Promise<ISearchSource>;
/**
* creates empty {@link SearchSource}
*/
Expand Down Expand Up @@ -110,7 +112,7 @@ export interface SearchSourceFields {
/**
* {@link IndexPatternService}
*/
index?: DataView;
index?: DataView | DataViewLazy;
Copy link
Copy Markdown
Contributor

@mattkime mattkime May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This affects any code that reads a data view from search source, causing typescript errors. I'm going to examine using DataViews internally with a custom field list.

It seems this wasn't immediately clear since type check often only reports a subset of type problems.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick note without having the ability to go into depth, this change would be redundant if for alerting no data view would be assigned initially , and just for the query a DataView with just the necessary fields that were fetched before would be assigned before the query is built .

timeout?: string;
terminate_after?: number;
searchAfter?: estypes.SortResults;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,12 @@ import type {
SourceFilter,
TypeMeta,
RuntimeField,
DataViewFieldMap,
} from '../types';
import { removeFieldAttrs } from './utils';
import { metaUnitsToFormatter } from './meta_units_to_formatter';

import type { DataViewAttributes, FieldAttrs, FieldAttrSet } from '..';
import type { DataViewAttributes, FieldAttrs, FieldAttrSet, IIndexPatternFieldList } from '..';

import type { DataViewField } from '../fields';

Expand Down Expand Up @@ -68,6 +69,10 @@ export abstract class AbstractDataView {
* Only used by rollup indices, used by rollup specific endpoint to load field list.
*/
public typeMeta?: TypeMeta;
/**
* Field list, in extended array format
*/
public fields?: IIndexPatternFieldList & { toSpec: () => DataViewFieldMap };
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a change for DataViewLazy for the alerting case. we just load the field we need for the alert, and then it's used for the alert ES query execution. works

Copy link
Copy Markdown
Contributor

@mattkime mattkime May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I think the general technique is good, adding a field list to DataViewLazy of the same sort as DataView is likely to lead to confusion and problems elsewhere. The DataViewLazy instances are cached so its possible that two different code paths could be setting the field list with different content. Instead, we could create a DataView locally (without the data view service!) and populate the field list. I think you suggested this elsewhere.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this could be an approach, instead of DataViewLazy. Just keep in mind in the alerting context caching ain't a problem, it because fields are just loaded once on every execution. That's why the given code already works. It's handling Alerting execution differently than other executions. This allows us to fix the main problem without having to worry about other usages, they continue working


/**
* Timestamp field name
Expand Down
Loading