Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn on unrecognized configuration properties and on configured pages with unusual names #1265

Merged
merged 8 commits into from
May 22, 2023

Conversation

mint-thompson
Copy link
Collaborator

Fixes #1225 and completes task CIMPL-1084.

An unrecognized configuration property is likely to represent something like a misspelled property, so emit a warning to bring it to the author's attention. Similarly, a mis-indented property may end up in the list of pages, so emit a warning when an entry in the list of pages has a URL that doesn't have .md or .xml at the end.

The giant array of property names is, admittedly, a little bit awkward. But in my research, I couldn't find a clean way to generate the array.

Try this change out locally, and you may find some old, leftover configuration properties that are hanging around in the FSH projects you use for testing. I was surprised to discover I still had ignoreWarnings in my configuration.

An unrecognized configuration property is likely to represent something
like a misspelled property, so emit a warning to bring it to the
author's attention. Similarly, a mis-indented property may end up in the
list of pages, so emit a warning when an entry in the list of pages has
a URL that doesn't have .md or .xml at the end.
Copy link
Member

@cmoesel cmoesel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a great start, but it might be even more helpful if it handled more than top-level properties. Several of the properties have nested properties and it's certainly possible to get those wrong too. In fact, I think a fairly easy mistake is to indent a property too much -- which would potentially make it nested under another property and go undetected.

I've made this a comment rather than requesting changes, because I'd like your assessment on how much it would take to go this next step. There are some gotchas since some of our sub-properties allow arbitrary names -- but then require specific subproperties under them.

A couple of quick ideas off the top of my head:

  • I think we have code somewhere that flattens objects so nested keys end up like foo.bar.baz. We could use that w/ regexes to determine allowable keypaths.
  • We could create a JSON schema and then use a JSON schema validator against it. I forget if JSON schema allows for arbitrary key names in places, but if so, this might be a good bet and it also better documents what is allowed.

@cmoesel cmoesel self-assigned this May 1, 2023
Copy link
Collaborator

@jafeltra jafeltra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image
I uncommented a whole block I had in an old config file and just blindly ran it, and this was a fun message to see!

Besides the one question I asked inline, I think this looks good. I agree with Chris's comment that this could be cool if we handled deeper validation, but I'll defer to you and Chris on if that is reasonable.

potentialKey => !ALL_YAML_CONFIG_PROPERTIES.includes(potentialKey)
);
if (unrecognizedKeys.length > 0) {
const propertyWord = unrecognizedKeys.length === 1 ? 'property' : 'properties';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol English is hard.

Comment on lines 910 to 919
if (yaml.pages != null) {
const unusualPages = Object.keys(yaml.pages).filter(nameUrl => !/\.(md|xml)$/.test(nameUrl));
if (unusualPages.length > 0) {
const pageWord = unusualPages.length === 1 ? 'Page' : 'Pages';
logger.warn(
`${pageWord} not ending in .md or .xml found in configuration: ${unusualPages.join(', ')}`
);
}
}
}
Copy link
Collaborator

@jafeltra jafeltra May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One configuration file I had that I tried this on had a pages object like so:

pages:
  index.html:
    title: Home
    background.xml: 
      title: Background Info

I can't remember if this will cause issues down the line (and I haven't run the IG Publisher on this yet), but do you know if html files are allowed? Or if index.html is a special case?

@mint-thompson
Copy link
Collaborator Author

@cmoesel I think using a JSON schema for detecting potential mistakes would be a good strategy. There are enough subproperties to check that I don't want to do all the validation by hand, and the patternProperties keyword would let us enforce rules about extra properties. So from there, it's just a matter of taking the errors reported by the validator and turning them into useful warning messages.
@jafeltra I think you're right about .html files being allowed here. I'm about to test it out to be sure, but my guess is based on the fact that a configured page can specify #html for the generation property.

A JSON schema is defined that is approximately equivalent to the type
definition for the YAML configuration. It is less strict, since it is
not designed for full validation. Emit a warning for each property that
has schema validation errors. The message is different if the schema
validation error is due to an additional property, since that usually
represents a mistake in spelling or indentation.
Copy link
Member

@cmoesel cmoesel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool. I like how once the schema is defined, the validation is a lot simpler.

I've suggested a few changes for your consideration.

src/import/YAMLschema.ts Outdated Show resolved Hide resolved
src/import/YAMLschema.ts Outdated Show resolved Hide resolved
src/import/YAMLschema.ts Outdated Show resolved Hide resolved
src/import/YAMLschema.ts Outdated Show resolved Hide resolved
Provide unique message when additional properties are found on a path
that starts with "pages", since that property has patternProperties
defined.
Copy link
Collaborator

@jafeltra jafeltra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes a lot of sense and is really going to be very helpful! I just had a couple questions on the JSON schema. Let me know if you want to talk about any of them.

src/import/YAMLschema.json Outdated Show resolved Hide resolved
src/import/YAMLschema.json Outdated Show resolved Hide resolved
src/import/YAMLschema.json Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a general JSON schema question, but is it helpful and/or possible to mark some of the properties (or properties within definitions) are required? For example, the publisher TS type has name as required and url and email are optional. This might be going too far for SUSHI - at some point, I assume we can let the publisher do the validation, so I'm not sure if this is crossing that line, so I'm curious what you think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I hadn't thought about that as much. The original focus had been on finding properties appearing in unexpected places due to incorrect spelling or indentation. Those types of issues are found setting rules on what can't appear by using additionalProperties. For properties that are required by SUSHI, I feel like the existing logic in the various helper functions in importConfiguration.ts are handling cases where values are required for processing, and not just required by a type definition. In the case of publisher specifically, the name is expected to exist on the object when using this part of the configuration in parseContact. But since it's used to populate ContactDetail.name, which is optional, nothing bad actually happens if it's undefined? Which maybe means it shouldn't be actually required in the type definition? This is a little tricky to navigate. I think that if it is actually required, the check should not be in the schema, but in the parsing functions, just so that the usage of the schema keeps its original narrow purpose.

src/import/YAMLschema.json Show resolved Hide resolved
Copy link
Member

@cmoesel cmoesel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I confirmed that you addressed Julia and my comments -- and I also confirmed that the schema JSON is packaged up in the module tgz and works when globally installed.

@jafeltra said that as long as you addressed her feedback, she was good with us assuming her approval, so I think we can merge this whenever you want, @mint-thompson!

@mint-thompson mint-thompson merged commit a77e340 into master May 22, 2023
@mint-thompson mint-thompson deleted the cimpl-1084-configuration-indentation branch May 22, 2023 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider reporting a warning when top-level configuration properties are indented
3 participants