-
-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] Add model cache to speed up code generation #7250
Conversation
Merge upstream fixes
@spacether @wing328 @jimschubert |
@spacether, interesting observation: this PR is not supposed to change the code generation, but it does. I took a look at the generated python-experimental code. The generated code for the fake API is changed as follows: diff --git a/samples/openapi3/client/petstore/python-experimental/petstore_api/api/fake_api.py b/samples/openapi3/client/petstore/python-experimental/petstore_api/api/fake_api.py
index 40f7ef9a82..1fa17d289b 100644
--- a/samples/openapi3/client/petstore/python-experimental/petstore_api/api/fake_api.py
+++ b/samples/openapi3/client/petstore/python-experimental/petstore_api/api/fake_api.py
@@ -1549,7 +1549,7 @@ class FakeApi(object):
string (str): None. [optional]
binary (file_type): None. [optional]
date (date): None. [optional]
- date_time (datetime): None. [optional]
+ date_time (datetime): None. [optional] if omitted the server will use the default value of dateutil_parser('2010-02-01T10:20:10.11111+01:00')
password (str): None. [optional]
param_callback (str): None. [optional]
_return_http_data_only (bool): response data without head status
Perform git status This file and the specific change comes from the dateTime:
description: None
type: string
format: date-time
default: '2010-02-01T10:20:10.11111+01:00'
example: '2020-02-02T20:20:20.22222Z' The The other python diffs are markdown changes for the same default value. |
Yes, the new default is correct. My work in the PR that is blocked by this one showed that some of my old bad code in toDefault is setting schema values :( that's probably what is happening here. My guess is that it is caused by some bad mutation that python-experimental is doing. I am hitting this same issue in my PR also for this default value. |
Thanks for the quick reply. One difference is the |
The other diffs are in Haskell, and I don't know anything about this code generator. It's strange that the changes only occur in Haskell and python-experimental. My hunch is the caching enhancement uncovered a bug in the Haskell and python code generators, but I'll keep looking. There are also several weird changes in the generated -model/./apiResponse.ts
+model/apiResponse.ts |
private static Cache<String, String> underscoreWordsCache; | ||
|
||
// A cache of escaped words, used to optimize the performance of the escape() method. | ||
private static Cache<EscapedNameOptions, String> escapedWordsCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can do this. Word escaping is going to be very generator-specific, and this static cache will break functionality for any application where openapi-generator is "long-lived" (IDE plugins, our online generator, and other long-running services holding oag in memory).
I think we need to improve our model and the logic around it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. What if the cache lifecycle is tied to the lifecycle of the generator, similar to the other caches? Would this work for the cases you mention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking at the method again and I'm trying to think of scenarios where there would be conflicts across threads if all the parameters are the same, and I can't think of any. I think you're probably fine with your current implementation for memoizing the escape function. I thought I had noticed something when I commented yesterday that prevented this function from being pure, and I'm not seeing it today.
But because you asked about a tying to the generation, if we ever wanted to have per-run caching we could do something like the following (for something like a CodegenModel cache for $ref
!):
We could create a context object with unique ID as a hash key (an entity in DDD-land), and use this as the Cache key or as a component of the type defining the cache key.
For instance:
Id
package org.openapitools.codegen.context;
import java.util.Objects;
import java.util.UUID;
public final class Id {
private final UUID value;
public Id() {
this.value = UUID.randomUUID();
}
@Override
public int hashCode() {
return Objects.hash(getValue());
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Id id = (Id) o;
return getValue().equals(id.getValue());
}
public UUID getValue() {
return value;
}
}
InvocationContext
package org.openapitools.codegen.context;
import java.util.Objects;
import java.util.UUID;
public final class InvocationContext {
private final Id id;
public InvocationContext() {
this.id = new Id();
}
public UUID getIdentifier() {
return this.id.getValue();
}
@Override
public int hashCode() {
// InvocationContext is an entity and should only maintain id in hashcode.
return Objects.hash(id);
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
InvocationContext that = (InvocationContext) o;
// InvocationContext is an entity and should only maintain id in equals.
return id.equals(that.id);
}
}
And then update GlobalSettings to provide this per-run specific identifiable object:
public class GlobalSettings {
private static ThreadLocal<InvocationContext> context = new InheritableThreadLocal<InvocationContext>() {
@Override
protected InvocationContext initialValue() {
return new InvocationContext();
}
};
private static ThreadLocal<Properties> properties = new InheritableThreadLocal<Properties>() {
@Override
protected Properties initialValue() {
return (Properties) System.getProperties().clone();
}
};
public static String getProperty(String key, String defaultValue) {
return properties.get().getProperty(key, defaultValue);
}
public static String getProperty(String key) {
return properties.get().getProperty(key);
}
public static void setProperty(String key, String value) {
properties.get().setProperty(key, value);
}
public static void clearProperty(String key) {
properties.get().remove(key);
}
public static InvocationContext getContext() {
return context.get();
}
public static void reset() {
properties.remove();
context.remove();
}
}
Then, you'd just need to work out how you'd want to work with this in your cache. For your escape function, for example, you could change EscapedNameOptions
to accept a UUID which is added to equals and hashcode, and then construct it with the invocation context's identifier. So,
EscapedNameOptions ns = new EscapedNameOptions(
GlobalSettings.getContext().getIdentifier(),
name, replacementMap.keySet(), charactersToAllow, appendToReplacement);
in Thread1 would be different than the same in Thread2.
I could see us using this for a really simple ref lookup like this as the key and CodegenModel as the value:
package org.openapitools.codegen.context;
import java.util.Objects;
import java.util.UUID;
public final class ContextualRef {
private final UUID invocationId;
private final String ref;
public ContextualRef(UUID invocationId, String ref) {
if (ref == null) throw new IllegalArgumentException("ref can't be null");
this.invocationId = invocationId;
// normalize/validate ref?
this.ref = ref;
}
@Override
public int hashCode() {
return Objects.hash(invocationId, ref);
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
ContextualRef that = (ContextualRef) o;
return invocationId.equals(that.invocationId) &&
ref.equals(that.ref);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't for the life of me come up with an example of the concern I expressed previously. Let's roll with this. As always, thanks for the great contribution!
Thanks @sebastien-rosset this is very helpful! |
This is a performance enhancement. The code generation behavior does not change:
StringUtils.escape()
fromModel()
fromProperty()
I've noticed for a large OpenAPI doc, the code generation was taking more than 3 minutes and 47% of all the CPU processing time was spent invoking the
findFirst()
method in PythonClientExperimentalCodegen.java (as shown below). This PR creates a map from toModelName() to schema such that toModelName() does not have to be called many time repeatedly and the Schema map does not have to be traversed repeatedly.https://github.com/OpenAPITools/openapi-generator/blob/master/modules/openapi-generator/src/main/java/org/openapitools/codegen/languages/PythonClientExperimentalCodegen.java#L453
PR checklist
./bin/generate-samples.sh
to update all Petstore samples related to your fix. This is important, as CI jobs will verify all generator outputs of your HEAD commit as it would merge with master. These must match the expectations made by your contribution. You may regenerate an individual generator by passing the relevant config(s) as an argument to the script, for example./bin/generate-samples.sh bin/configs/java*
. For Windows users, please run the script in Git BASH.master