[core-http] Update XML serialization to support Atom XML operations by ramya0820 · Pull Request #4916 · Azure/azure-sdk-for-js

ramya0820 · 2019-08-27T22:15:32Z

EDIT:

This PR is for changes in core-http required for supporting Atom XML serialization required of Service Bus (and other SDKs in future such as Event Hubs). Code was mostly moved over from older JS only azure-sb implementation and has been significantly updated to fit into TS best practices.
Most of the initial work has been moved to Service Bus #4922 as per latest discussion.

Also includes

refactoring of general XML utilities
bug fix for XML serialization in browser

AlexGhiondea · 2019-08-28T17:00:00Z

common/config/rush/pnpm-lock.yaml

  eslint-plugin-promise: 4.2.1
  events: 3.0.0
  express: 4.17.1
+  fast-xml-parser: 3.12.20


Is this a canonical implementation of an XML parser? Is this the default used in the community?

The dependencies to review would be in package.json file.
This is a temporary artifact that gets generated on local and will be rebased on master. (After which this line would go away since it's not being used in implementation)

About the dependency in itself, I tried using it but it doesn't allow for XML manipulation in memory. So, not sure if it's a good choice for an alternative just yet.

bterlson

High level:

Remove each util
Remove unused utils
Remove any types
Remove dependency on cryptojs in the browser (consult with Johnathan and Daniel on what they did with keyvault)
Remove isUndefined helper (x === undefined is better). In general the helpers seem less useful with modern JS so take a stab at removing as many as makes sense :)
Some specific comments were left, but I did not point out all the places where the feedback applies. Take these comments as general feedback.
Replace vars with let/const.

sdk/core/core-http/lib/util/utils.ts

sdk/core/core-http/lib/atomResourceSerializerBase.ts

sdk/core/core-http/lib/util/atomHandler.ts

sdk/core/core-http/lib/util/utils.ts

ramya0820 · 2019-08-29T22:27:25Z

Thanks @sadasant for helping with the WebCrypto stuff! Let me know if there's anything else you suggest around it. The changes we looked into are present in crypto.browser.ts file

sdk/core/core-http/lib/atomResourceSerializerBase.ts

sdk/core/core-http/lib/credentials/sasServiceClientCredentials.ts

sdk/core/core-http/lib/util/atomHandler.ts

sdk/core/core-http/lib/policies/atomSerializationPolicy.ts

sdk/core/core-http/lib/util/xml.ts

sdk/core/core-http/lib/util/utils.ts

sdk/core/core-http/lib/policies/atomSerializationPolicy.ts

sdk/core/core-http/lib/util/atomHandler.ts

sdk/core/core-http/lib/policies/msRestUserAgentPolicy.browser.ts

sdk/core/core-http/lib/util/xml.ts

…to sb-atom-p5

ramya0820 · 2019-10-15T22:22:39Z

PR updated to not house the atomSerializationPolicy or any Atom XML specific things per discussions yesterday.
In addition to refactoring, changes so far include few more bug fixes (with XML serialization in browser) and unit tests (#5548)
@ramya-rao-a @bterlson @chradek @daviwil

AlexGhiondea

The changes overall look good, but I have some concerns around throwing new exceptions when we didn't throw them before.

I left a couple of comments, please take a look.

AlexGhiondea · 2019-10-15T22:41:36Z

sdk/core/core-http/lib/util/utils.ts

+
+/**
+ * @ignore
+ * Helper utility to clean up unintended characters that get appended by OS.


My understanding is that the OS won't automatically append the BOM to a file as it doesn't know the content of the file. It is up to the application to add those bytes or not.

In our scenario, who is going to put the BOM into the stream that contains the XML doc?

This is the comment that was added by original authors in azure-sb

Based on https://www.w3.org/International/questions/qa-byte-order-mark.en, looks like this gets added commonly by editors and any app that's used to generate the text content.

I understand the scenario where applications put this header. What I am trying to understand is how is the service bus SDK receiving XML objects that contain this BOM.

I don't mind adding this in if there is a scenario in which we know the BOM is being returned by the service -- what I am asking is for is that scenario.

So there is no test scenario that was written earlier that surfaced this as being sent by the service.
As discussed earlier, our understanding is that the OS on which the code generating files (if any) would end up appending this.

So no, I haven't encountered a concrete scenario supporting this other than the fact that this check was added with a comment describing reason by authors in azure-sb

Notes from offline discussion yesterday afternoon for ref -
• Handling of unwanted BOM characters - This was being taken care of in azure-sb implementation and seems less understood by us overall.
https://en.wikipedia.org/wiki/Byte_order_mark
https://www.screenaware.com/en/blog/xml2js-sax-js-non-whitespace-before-first-tag
Based on research and testing things, looks like it's a general problem that users face and must have been added for a reason in azure-sb. The documentation and intent is pretty clear too.
In Node, while xml2js handles this internally for XML parsing, it doesn't do so for XML building.
In browser, both building and parsing require BOM handling.

AlexGhiondea · 2019-10-15T22:46:54Z

sdk/core/core-http/lib/util/utils.ts

+ */
+export function removeBOM(str: any) {
+  if (typeof str == "string") {
+    if (str.charCodeAt(0) === 0xfeff || str.charCodeAt(0) === 0xffef) {


I believe that this checks for the BOM header when the encoding is expected to be UTF16.
Would the correct check be for the following sequence of bytes EF BB BF ?

https://www.w3.org/International/questions/qa-byte-order-mark.en

Looks like this is for utf-8, utf-16 seems to have some other considerations added in a different section

Here is another resource -- there seems to be a difference between UTF8 and UTF16.

https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

The UTF-8 representation of the BOM is the (hexadecimal) byte sequence 0xEF,0xBB,0xBF

https://en.wikipedia.org/wiki/Byte_order_mark#UTF-16

In UTF-16, a BOM (U+FEFF) may be placed as the first character of a file or character stream to indicate the endianness (byte order) of all the 16-bit code units of the file or stream. If an attempt is made to read this stream with the wrong endianness, the bytes will be swapped, thus delivering the character U+FFFE, which is defined by Unicode as a "non character" that should never appear in the text.

Can you share more information about the section you mention?

In the earlier link at https://www.w3.org/International/questions/qa-byte-order-mark.en#bytheway
It says

If you use a UTF-16 encoding for your page (and we strongly recommend that you don't), there are some additional considerations.

So the directions mentioned in entirety of article seemed to refer to UTF-8

In the same document, they have this:

With a binary editor capable of displaying the hexadecimal byte values in the file, the UTF-8 signature displays as EF BB BF.

At this point, this is irrelevant as we don't have the BOM code anymore, but this is something to keep in mind for the SB changes.

I created 3 files using notepad and saved them with different encodings (UTF8, UTF16 big endian, UTF16 little endian).

Here is how they look like in a binary editor
UTF8

UTF16 Little Endian

UTF16 Big Endian

Since SB doesn't directly feed any files to input, it's not going to be necessary.
But when it comes to dealing with output sent by service, such as for deserializationPolicy and maybe others where the response content from service is being parsed directly, would be good to have these in core-http I think.
Especially cause xml2js handles it in Node, but for in browser we are manually parsing and this isn't being handled implicitly.
So it seems to do with general text stream handling things, and, wonder if Storage handles the BOM characters explicitly in the SDK.

About the actual characters to use, above is interesting - unclear why the utf-16 code was being added in azure-sb - it might be possible that this specific character was noted in output. (Or maybe it was before xml2js (or the service itself) was updated to implicitly handle the BOM characters)

sdk/core/core-http/lib/util/xml.browser.ts

AlexGhiondea · 2019-10-15T22:48:30Z

sdk/core/core-http/lib/util/xml.browser.ts

+export function parseXML(str: string, opts?: { includeRoot?: boolean }): Promise<any> {
  try {
+    if (str == undefined || str == "") {
+      throw new Error("Document is empty");


This is a new error case -- what used to happen when you passed and empty XML string before?

I would prefer that we match the existing behavior so that we don't break existing scenarios.

This is actually the same error that gets thrown by parser in Node - same for all errors in .browser.ts file

The error message is not what I was wondering. If another SDK was using this browser code before and they passed either undefined or and empty document, what would they get back?

I am asking because changing that behavior (and throwing in that case) could be a breaking change.

Based on usage in Storage SDK, this scenario doesn't seem to exist
It seems a very less likely possibility since the SDKs are all targeting the same Azure services. And for an agnostic thing such as empty/undefined document being disallowed in one SDK, it seems like the behavior should be same for others as well.

For instance, if parsing of undefined/empty documents is allowed in browser, it would be a bug if it doesn't work from the Node SDK?
Similarly, if it is not allowed in browser, then the tests and checks would have helped catch that type of bug early on.

But I think more investigation is needed on this, since maybe we should allow handling of empty/undefined documents if that is a valid scenario? Ideally, this discrepancy wouldn't be there if we are using the same parser for Node and browser, but for most part the behaviors would be expected to be same unless very explicitly required as otherwise I think..
cc @bterlson @daviwil

sdk/core/core-http/lib/util/xml.browser.ts

sdk/core/core-http/lib/util/xml.ts

sdk/core/core-http/lib/util/constants.ts

ramya-rao-a

I have 2 main concerns left

I would have the consumer of the library worry about bom and not have the bom handling here in this library
I would remove the new validation errors that are being thrown when in browser mode. These are made to match the errors coming from xml2js library when run in node which can change over time and we have no control over.

sdk/core/core-http/lib/util/xml.browser.ts

ramya0820 · 2019-10-16T02:11:08Z

@ramya-rao-a @AlexGhiondea
About the concern that browser tests would fail if xml2js changes their error messages, Node tests would fail as well since they are the same simple strings.

However, reduced the test coverage and removed BOM handling to unblock PR. We can continue discussing/learning more later..

sdk/core/core-http/lib/util/utils.ts

ramya-rao-a · 2019-10-16T04:08:40Z

sdk/core/core-http/lib/util/xml.ts

-  });
+/**
+ * Converts given XML string into JSON
+ * @param str String containg the XML content to be parsed into JSON


Suggested change

* @param str String containg the XML content to be parsed into JSON

* @param str String containing the XML content to be parsed into JSON

ramya-rao-a · 2019-10-16T04:09:11Z

sdk/core/core-http/test/xmlTests.ts

+      );
+    });
+
+    it.only("with element with child undefined element", async function() {


Suggested change

it.only("with element with child undefined element", async function() {

it("with element with child undefined element", async function() {

ramya-rao-a

2 more things

Remove the .only on one of the test cases
A spelling mistake in one of the jsdocs.

I have added suggested edits for both that you can use

ramya-rao-a · 2019-10-16T17:56:21Z

/azp run js - client - ci

azure-pipelines · 2019-10-16T17:56:32Z

Azure Pipelines successfully started running 1 pipeline(s).

…to sb-atom-p5

Implement Atom XML support in core-http

03f23cf

ramya0820 requested a review from daviwil as a code owner August 27, 2019 22:15

ramya0820 self-assigned this Aug 27, 2019

Ramya Raja added 2 commits August 27, 2019 16:06

Minor refactor / simplification of XML serialization code

dfa9f56

Add in pnpm-lock.yaml file

e9dbea0

ramya0820 requested review from KarishmaGhiya and mikeharder as code owners August 27, 2019 23:09

Fix export

fbf047e

ramya-rao-a requested review from bterlson and chradek and removed request for KarishmaGhiya and daviwil August 28, 2019 00:02

Browserified changes

1082642

AlexGhiondea reviewed Aug 28, 2019

View reviewed changes

New pnpm-lock.yaml file

b0db034

ramya0820 mentioned this pull request Aug 28, 2019

[Service Bus] [core-http] Implement ATOM based management API #4880

Closed

Minor refactoring and verify all tests pass in browser

d64198c

bterlson requested changes Aug 29, 2019

View reviewed changes

Streamline browserification, address comments on modernization

03156b3

ramya0820 requested a review from sadasant August 29, 2019 22:25

Some more modernization fixes

529909e

chradek reviewed Sep 3, 2019

View reviewed changes

Ramya Raja added 2 commits September 4, 2019 14:09

Address comments

da009a6

Add copyrights

54fea8f

bterlson requested changes Sep 4, 2019

View reviewed changes

sdk/core/core-http/lib/util/utils.ts Outdated Show resolved Hide resolved

sdk/core/core-http/lib/util/utils.ts Outdated Show resolved Hide resolved

sdk/core/core-http/lib/util/utils.ts Outdated Show resolved Hide resolved

Ramya Raja added 3 commits September 4, 2019 17:55

Address comments

6e634c3

Fix result parsing expectations

bddf3d2

Export ProxySettings

0d60708

chradek reviewed Sep 5, 2019

View reviewed changes

Ramya Raja added 5 commits October 14, 2019 11:03

Address comments

772f12b

Fix test

929e289

Move out atomSerializationPolicy

4746c32

Fixes for XML serialization, add tests

faa4809

Merge branch 'master' of https://github.com/Azure/azure-sdk-for-js in…

1718bfb

…to sb-atom-p5

ramya0820 changed the title ~~[core-http] Implement Atom XML support in core-http~~ [core-http] Update XML serialization to support Atom XML operations Oct 15, 2019

ramya0820 mentioned this pull request Oct 15, 2019

[core-http] Add missing unit test coverage for XML serializing utilities #5548

Closed

AlexGhiondea suggested changes Oct 15, 2019

View reviewed changes

ramya-rao-a reviewed Oct 15, 2019

View reviewed changes

sdk/core/core-http/lib/util/xml.ts Outdated Show resolved Hide resolved

ramya-rao-a reviewed Oct 15, 2019

View reviewed changes

sdk/core/core-http/lib/util/constants.ts Outdated Show resolved Hide resolved

Address comments

2ca993f

ramya-rao-a suggested changes Oct 16, 2019

View reviewed changes

ramya-rao-a reviewed Oct 16, 2019

View reviewed changes

sdk/core/core-http/lib/util/xml.browser.ts Show resolved Hide resolved

Remove test coverage and BOM handling

151ed1d

AlexGhiondea reviewed Oct 16, 2019

View reviewed changes

sdk/core/core-http/lib/util/utils.ts Show resolved Hide resolved

ramya-rao-a reviewed Oct 16, 2019

View reviewed changes

ramya-rao-a suggested changes Oct 16, 2019

View reviewed changes

Fix typo and tag

a426163

ramya-rao-a approved these changes Oct 16, 2019

View reviewed changes

Dummy commit

4e2054e

AlexGhiondea approved these changes Oct 16, 2019

View reviewed changes

Merge branch 'master' of https://github.com/Azure/azure-sdk-for-js in…

5dc1a95

…to sb-atom-p5

ramya0820 merged commit c410b5d into Azure:master Oct 16, 2019

ramya0820 mentioned this pull request Oct 31, 2019

Implement the ATOM based management plane #4508

Closed

2 tasks

jeremymeng mentioned this pull request Feb 14, 2020

[core-http] Fix utils isValidUuid() regression #7387

Merged

	* @param str String containg the XML content to be parsed into JSON
	* @param str String containing the XML content to be parsed into JSON

	it.only("with element with child undefined element", async function() {
	it("with element with child undefined element", async function() {

Conversation

ramya0820 commented Aug 27, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bterlson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ramya0820 commented Aug 29, 2019

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ramya0820 commented Oct 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AlexGhiondea left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ramya0820 Oct 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ramya0820 commented Aug 27, 2019 •

edited

Loading

ramya0820 commented Oct 15, 2019 •

edited

Loading

ramya0820 Oct 16, 2019 •

edited

Loading

ramya-rao-a left a comment •

edited

Loading