-
Notifications
You must be signed in to change notification settings - Fork 8.5k
[ML] Adding ability to override various settings when importing data #24346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Adding ability to override various settings when importing data #24346
Conversation
|
Pinging @elastic/ml-ui |
💔 Build Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember I had issues with creating euiDescriptionList__* overrides in a BEM style, maybe it's worth adding a comment that this nested structure and direct overwrites of styles was necessary because BEM (something like analysisSummaryList__title) didn't work because the elements themselves are not accessible or the override doesn't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Error is native to JavaScript maybe better to rename this so something like ErrorAccordion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be a duplicate of .analysis-summary-list in _analysis_summary.scss, maybe this can be consolidated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd quite like to keep these styles separate as it keeps the components independent.
Also, I can see in the future the title width in this component needing to be increased once we get some real user indices being created with long names.
walterra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just added some minor notes.
💔 Build Failed |
peteharverson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments, but otherwise LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is putting this EuiDescriptionList inside a div necessary? Can you just add a className prop to the EuiDescriptionList?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, is the div necessary here, or could you just use a className prop on the EuiDescriptionList?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's unlikely, but is it possible that the file import could only contain one line? If so, should do a plural check on documents in the text here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've been unable to produce a situation where 1 out of 1 document fails to be ingested.
With 1 line in a semi structured text file, the file structure endpoint fails to produce results.
I'm going to leave this as is, as I don't want to change it without testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the backend needs to see 3 lines with timestamps in the same format to detect semi-structured text. Two complete messages plus the last one that gets ignored by the structure finder in case it’s partial (but proves the one above is complete).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a delay in ms? We often add _MS to these constants to make it explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - should be the rather than he!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - recheckTimeout rather than camel-casing the Check!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - should be and end rather than and and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking you meant to leave this comment out code in? Is it for a future bit of work?
💔 Build Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is state being used anywhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, switching to a function
💔 Build Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a clarifying question - for the isInvalid prop - seeing this specific check for empty string a lot. Does this mean indexNameError is getting initially defaulted to an empty string somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in the default state of the ImportView component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit 🕸 - debugger comment left in.
alvarezmelissa87
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - just left a couple of nit/question/comments.
💔 Build Failed |
|
retest |
2321aef to
a26105a
Compare
💔 Build Failed |
💚 Build Succeeded |
* [ML] File datavisualizer initial commit (#22828) * [ML] File datavisualizer initial commit * removing mocked data and adding initial stats * adding card styling to fields * Revert "". accidentally added with no commit message This reverts commit d762d20b706e6a770e631f863b9e7d8879bb7ee6. * adding date type to timestamp field * renaming FileStats to FieldsStats * code clean up * changes based on review * changes to error handling * [ML] Adding file datavisualizer overrides (#23194) * [ML] Adding file datavisualizer overrides * improvements to overrides * removing comment * small refactor * removing accidentally added file * updates based on review * fixing broken test * adding missing grok pattern override * fixing test * [ML] Refactoring override option lists (#23424) * [ML] Refactoring override option lists * moving lists out of class * updating test snapshot * [ML] Fixing field editing (#23500) * [ML] Changes to timestamp formats (#23498) * [ML] Changes to timestamp formats * updating test snapshot * [ML] Allow Datavisualizer use on basic license (#23748) * [ML] Allow ML use on basic license * removing timeout change * adding permission checks * updating tests * removing unnecessary checks * [ML] Adds new page for choosing file or index based data visualizer (#23763) * [ML] Adding license check to datavisualizer landing page (#23809) * [ML] Adding license check to datavisualizer landing page * removing comments * updating redirect to landing page * [ML] Adding ability to upload data to elasticsearch from datavisualizer (#24042) * [ML] Initial work for delimited file upload * adding results links cards * adding nav menu * removing accidental debugger * initial work for importing semi structured text * using ingest pipeline for import * adding json importer and better error reporting * better progress steps * time range added to results links * first import only creates index and pipeline * adding status constants * using status constants * adding explanation comment * updating yarn.lock * changes based on review * fixing space * fixing space again, stort it out git * removing oversized background container causing constant scrollbar * [ML] Adding basic license check when loading privileges (#24173) * [ML] Adding basic license check * missing import * [ML] Adds an About panel to the file data visualizer landing page (#24260) * [ML] Adds an About panel to the file data visualizer landing page * [ML] Remove unnecessary style from file data visualizer scss * [ML] Adding better error reporting for reading and importing data (#24269) * [ML] Adding better error reporting for reading and importing data * changes to endpoint errors * displaying errors * step logic refactor * removing log statements * [ML] Switch file data visualizer to use Papa Parse for CSV parsing (#24329) * [ML] Fixes layout of Data Visualizer selector page for IE (#24387) * [ML] Adding ability to override various settings when importing data (#24346) * [ML] Adding ability to override various settings when importing data * second commit with most of the outstanding code * improving index pattern name validation * better index pattern matching * adding comments * adding empty index pattern check * changes based on review * fixing test
* [ML] File datavisualizer initial commit (elastic#22828) * [ML] File datavisualizer initial commit * removing mocked data and adding initial stats * adding card styling to fields * Revert "". accidentally added with no commit message This reverts commit d762d20b706e6a770e631f863b9e7d8879bb7ee6. * adding date type to timestamp field * renaming FileStats to FieldsStats * code clean up * changes based on review * changes to error handling * [ML] Adding file datavisualizer overrides (elastic#23194) * [ML] Adding file datavisualizer overrides * improvements to overrides * removing comment * small refactor * removing accidentally added file * updates based on review * fixing broken test * adding missing grok pattern override * fixing test * [ML] Refactoring override option lists (elastic#23424) * [ML] Refactoring override option lists * moving lists out of class * updating test snapshot * [ML] Fixing field editing (elastic#23500) * [ML] Changes to timestamp formats (elastic#23498) * [ML] Changes to timestamp formats * updating test snapshot * [ML] Allow Datavisualizer use on basic license (elastic#23748) * [ML] Allow ML use on basic license * removing timeout change * adding permission checks * updating tests * removing unnecessary checks * [ML] Adds new page for choosing file or index based data visualizer (elastic#23763) * [ML] Adding license check to datavisualizer landing page (elastic#23809) * [ML] Adding license check to datavisualizer landing page * removing comments * updating redirect to landing page * [ML] Adding ability to upload data to elasticsearch from datavisualizer (elastic#24042) * [ML] Initial work for delimited file upload * adding results links cards * adding nav menu * removing accidental debugger * initial work for importing semi structured text * using ingest pipeline for import * adding json importer and better error reporting * better progress steps * time range added to results links * first import only creates index and pipeline * adding status constants * using status constants * adding explanation comment * updating yarn.lock * changes based on review * fixing space * fixing space again, stort it out git * removing oversized background container causing constant scrollbar * [ML] Adding basic license check when loading privileges (elastic#24173) * [ML] Adding basic license check * missing import * [ML] Adds an About panel to the file data visualizer landing page (elastic#24260) * [ML] Adds an About panel to the file data visualizer landing page * [ML] Remove unnecessary style from file data visualizer scss * [ML] Adding better error reporting for reading and importing data (elastic#24269) * [ML] Adding better error reporting for reading and importing data * changes to endpoint errors * displaying errors * step logic refactor * removing log statements * [ML] Switch file data visualizer to use Papa Parse for CSV parsing (elastic#24329) * [ML] Fixes layout of Data Visualizer selector page for IE (elastic#24387) * [ML] Adding ability to override various settings when importing data (elastic#24346) * [ML] Adding ability to override various settings when importing data * second commit with most of the outstanding code * improving index pattern name validation * better index pattern matching * adding comments * adding empty index pattern check * changes based on review * fixing test
* [ML] File datavisualizer initial commit (#22828) * [ML] File datavisualizer initial commit * removing mocked data and adding initial stats * adding card styling to fields * Revert "". accidentally added with no commit message This reverts commit d762d20b706e6a770e631f863b9e7d8879bb7ee6. * adding date type to timestamp field * renaming FileStats to FieldsStats * code clean up * changes based on review * changes to error handling * [ML] Adding file datavisualizer overrides (#23194) * [ML] Adding file datavisualizer overrides * improvements to overrides * removing comment * small refactor * removing accidentally added file * updates based on review * fixing broken test * adding missing grok pattern override * fixing test * [ML] Refactoring override option lists (#23424) * [ML] Refactoring override option lists * moving lists out of class * updating test snapshot * [ML] Fixing field editing (#23500) * [ML] Changes to timestamp formats (#23498) * [ML] Changes to timestamp formats * updating test snapshot * [ML] Allow Datavisualizer use on basic license (#23748) * [ML] Allow ML use on basic license * removing timeout change * adding permission checks * updating tests * removing unnecessary checks * [ML] Adds new page for choosing file or index based data visualizer (#23763) * [ML] Adding license check to datavisualizer landing page (#23809) * [ML] Adding license check to datavisualizer landing page * removing comments * updating redirect to landing page * [ML] Adding ability to upload data to elasticsearch from datavisualizer (#24042) * [ML] Initial work for delimited file upload * adding results links cards * adding nav menu * removing accidental debugger * initial work for importing semi structured text * using ingest pipeline for import * adding json importer and better error reporting * better progress steps * time range added to results links * first import only creates index and pipeline * adding status constants * using status constants * adding explanation comment * updating yarn.lock * changes based on review * fixing space * fixing space again, stort it out git * removing oversized background container causing constant scrollbar * [ML] Adding basic license check when loading privileges (#24173) * [ML] Adding basic license check * missing import * [ML] Adds an About panel to the file data visualizer landing page (#24260) * [ML] Adds an About panel to the file data visualizer landing page * [ML] Remove unnecessary style from file data visualizer scss * [ML] Adding better error reporting for reading and importing data (#24269) * [ML] Adding better error reporting for reading and importing data * changes to endpoint errors * displaying errors * step logic refactor * removing log statements * [ML] Switch file data visualizer to use Papa Parse for CSV parsing (#24329) * [ML] Fixes layout of Data Visualizer selector page for IE (#24387) * [ML] Adding ability to override various settings when importing data (#24346) * [ML] Adding ability to override various settings when importing data * second commit with most of the outstanding code * improving index pattern name validation * better index pattern matching * adding comments * adding empty index pattern check * changes based on review * fixing test
Summary
User can make changes to the settings, mappings and ingest pipeline in an advanced tab before importing

Existing Indices and index patterns are checked for

Index names are checked to match naming rules specified here

https://www.elastic.co/guide/en/elasticsearch/reference/6.4/indices-create-index.html
Index patterns are checked to match index

Removing ability to override the charset as an existing issue prevents it from being anything other than UTF-8
When messing up an override, such as the grok pattern which causes the file structure finder endpoint to fail completely. The endpoint is run again with the previous settings allowing the user to try again.

Better displaying of hard errors from the bulk ingest call.

Better import summary information.

Better displaying of docs which failed to be ingested.

Semi-structured text imports now includes the last line. This may not be ingested if the line is actually incomplete. But there is no way of knowing without having a following line starting with the
multiline_start_patternFixes issue where an index patten is created too quickly for the results links. causing the
toandfromtimes to benull. If they arenull, a second attempt to load them happens a few seconds later.