HDDS-10777. S3 gateway error when parsing xml in concurrent execution #6609
Thank you for the fix. Left a few comments.
I looked into the issue, and it does seem that concurrent requests can cause XML parsing to fail.
The error (FWK005...) appears to be thrown from the XMLParserConfiguration#parse implementation (e.g. XML11Configuration#parse) when it finds that the fParseInProgress flag is already set. The flag is set at the start of the method and cleared at the end of it (I'm not sure why a synchronized method isn't used instead).
XmlNamespaceFilter#parse calls XMLReader#parse, which eventually calls XMLParserConfiguration#parse. The unmarshaller instantiates the XMLReader once in its constructor and passes that same instance to setParent for every request. So the issue seems to be that the same XMLReader is used by concurrent requests.
Therefore, instead of instantiating JAXBContext and SAXParserFactory for every request, maybe we can instantiate only a new XMLReader in readFrom and see whether that resolves the issue?
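To make the suggestion concrete, here is a minimal sketch rather than the actual patch. The class name PerRequestXmlReaderSketch and the countElements method are hypothetical, a plain XMLFilterImpl stands in for XmlNamespaceFilter, and the JAXB unmarshalling step is left out so the example only needs the JDK. The factory is created once, and a fresh XMLReader is obtained for every parse, so no reader is ever shared between requests. The synchronized helper is my own assumption: the JAXP documentation does not guarantee that a SAXParserFactory is thread-safe.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical illustration class; not part of the Ozone code base.
final class PerRequestXmlReaderSketch {

  // Created once, like the factory built in the unmarshaller constructor.
  private static final SAXParserFactory FACTORY = newFactory();

  private static SAXParserFactory newFactory() {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
      return factory;
    } catch (ParserConfigurationException | SAXException ex) {
      throw new AssertionError("Can not instantiate SAXParserFactory", ex);
    }
  }

  // Assumption: the factory may not be thread-safe, so reader creation is
  // serialized. Parsing itself happens outside this lock.
  private static synchronized XMLReader newXmlReader()
      throws ParserConfigurationException, SAXException {
    return FACTORY.newSAXParser().getXMLReader();
  }

  // Stand-in for readFrom: counts the elements in the given XML document.
  static int countElements(String xml) {
    final int[] count = {0};
    // XMLFilterImpl plays the role of XmlNamespaceFilter here.
    XMLFilterImpl filter = new XMLFilterImpl() {
      @Override
      public void startElement(String uri, String localName, String qName,
          Attributes atts) throws SAXException {
        count[0]++;
        super.startElement(uri, localName, qName, atts);
      }
    };
    try {
      // A new XMLReader per call: concurrent callers never share a parser.
      filter.setParent(newXmlReader());
      filter.parse(new InputSource(new StringReader(xml)));
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      throw new IllegalStateException("XML parsing failed", ex);
    }
    return count[0];
  }
}
```

Calling PerRequestXmlReaderSketch.countElements from many threads at once should be safe under these assumptions, because each call parses with its own XMLReader and the only shared object, the factory, is touched under a lock.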
Another possible solution might be to use a ThreadLocal for the XMLReader. From my understanding, the S3G Jetty threading model is currently "thread-per-request", so IMO it should be fine.
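A rough sketch of the ThreadLocal alternative, under the same caveats: ThreadLocalXmlReaderSketch, reader and countElements are hypothetical names, and a DefaultHandler replaces the real JAXB handler chain. Each thread lazily builds its own XMLReader and reuses it for later parses on that thread, which assumes a request is handled entirely on one thread.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the Ozone code base.
final class ThreadLocalXmlReaderSketch {

  // One XMLReader per thread, created on first use by that thread.
  private static final ThreadLocal<XMLReader> READER =
      ThreadLocal.withInitial(() -> {
        try {
          SAXParserFactory factory = SAXParserFactory.newInstance();
          factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
          return factory.newSAXParser().getXMLReader();
        } catch (Exception ex) {
          throw new IllegalStateException("Can not instantiate XMLReader", ex);
        }
      });

  static XMLReader reader() {
    return READER.get();
  }

  // Parses with the calling thread's reader and counts the elements.
  static int countElements(String xml) {
    final int[] count = {0};
    XMLReader reader = reader();
    // The handler is replaced on every call, so no state leaks between parses.
    reader.setContentHandler(new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName, String qName,
          Attributes atts) {
        count[0]++;
      }
    });
    try {
      reader.parse(new InputSource(new StringReader(xml)));
    } catch (SAXException | IOException ex) {
      throw new IllegalStateException("XML parsing failed", ex);
    }
    return count[0];
  }
}
```

Compared with creating a reader per request, this avoids repeated parser construction, but it keeps one parser alive per worker thread and would stop being safe if a single request were ever parsed across several threads.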
JAXBContext context = JAXBContext.newInstance(MultiDeleteRequest.class);
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
We might not need to instantiate this for every request. Could you help check whether instantiating only XMLReader for each request is enough?
@ivandika3 Thanks for the review.
Yes, this modification also solves the problem. I've tested it; please see the new code.
try {
  context = JAXBContext.newInstance(CompleteMultipartUploadRequest.class);
  SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
  saxParserFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
  xmlReader = saxParserFactory.newSAXParser().getXMLReader();
} catch (Exception ex) {
  throw new AssertionError("Can not instantiate " +
      "CompleteMultipartUploadRequest parser", ex);
}
We might also need to change PutBucketAclRequestUnmarshaller, since it follows a similar pattern (although it is not used as often).
sure.
Thank you for the update. LGTM +1. Let's wait for the CI run.
@kerneltime Would you like to take a look?
Thanks @guohao-rosicky for the fix, @ivandika3 for the review.
(cherry picked from commit ff78dc8)
What changes were proposed in this pull request?
An exception occasionally occurs when an application uses the S3 gateway to perform Complete Multipart Upload or Delete Objects.
Analysis showed that it is caused by concurrent requests. See details: https://issues.apache.org/jira/browse/HDDS-10777
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10777
How was this patch tested?
Unit tests.
CI passed: https://github.com/guohao-rosicky/ozone/actions/runs/8890601157