chore(performance#38): Performance improvements #1871

Merged: 2 commits, Jan 20, 2023

Member

@metcoder95 metcoder95 commented Jan 20, 2023

While working on Performance#38, I decided to use Undici as a starting point for experimenting with possible performance improvements, as Undici has a very clean implementation of the WHATWG algorithm for parsing a MIME type.

While experimenting, I noticed substantial performance improvements from changing small pieces, while trying to respect the spec as much as possible.

I was able to obtain a ~48% improvement compared to the current implementation.

As always, this PR is just a suggestion. As pointed out in the performance thread, this is not one of the biggest concerns in Undici, since it is only used while parsing data: strings 🙂

Feel free to close it if there's no value added.

Benchmarks:

Machine

Hardware Overview:
	Model Name: MacBook Pro
	Model Identifier: MacBookPro18,1
	Chip: Apple M1 Pro
	Total Number of Cores: 10 (8 performance and 2 efficiencies)
	Memory: 16 GB
	System Firmware Version: 8419.60.44
	OS Loader Version: 7459.141.1

Results

Code

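// Setup assumed for this snippet (not shown in the original comment):
// Benchmark.js provides `suite`, node:util provides util.MIMEType.
const Benchmark = require('benchmark')
const util = require('node:util')
const fastContentType = require('fast-content-type-parse')
// Assumption: run from the Undici repo root. parseMIMETypeOriginal is the
// pre-PR implementation; where it was kept is not shown in the PR.
const { parseMIMEType } = require('./lib/fetch/dataURL')

const suite = new Benchmark.Suite()
const str = 'application/json; charset=utf-8'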

suite
  .add('util#MIMEType', function () {
    new util.MIMEType(str)
  })
  .add('undici#parseMIMEType', function () {
    parseMIMEType(str)
  })
  .add('undici#parseMIMEType(original)', function () {
    parseMIMETypeOriginal(str)
  })
  .add('fast-content-type-parse#parse', function () {
    fastContentType.parse(str)
  })
  .add('fast-content-type-parse#safeParse', function () {
    fastContentType.safeParse(str)
  })
  .on('cycle', function (event) {
    console.log(String(event.target))
  })
  .on('complete', function () {
    console.log('Fastest is ' + this.filter('fastest').map('name'))
  })
  .run({ async: true })

Output

util#MIMEType x 1,330,919 ops/sec ±0.36% (92 runs sampled)
undici#parseMIMEType x 2,366,624 ops/sec ±0.22% (98 runs sampled)
undici#parseMIMEType(original) x 1,588,546 ops/sec ±0.67% (98 runs sampled)
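
The relative gain can be derived from the ops/sec figures reported above. This is a minimal sketch; `relativeGain` is a hypothetical helper introduced for illustration and is not part of Undici or Benchmark.js:

```javascript
// Ops/sec figures copied from the benchmark output above.
const results = {
  'util#MIMEType': 1330919,
  'undici#parseMIMEType': 2366624,
  'undici#parseMIMEType(original)': 1588546
}

// Hypothetical helper: percentage gain of `candidate` over `baseline`.
function relativeGain (candidate, baseline) {
  return (candidate / baseline - 1) * 100
}

const vsOriginal = relativeGain(
  results['undici#parseMIMEType'],
  results['undici#parseMIMEType(original)']
)
const vsUtil = relativeGain(
  results['undici#parseMIMEType'],
  results['util#MIMEType']
)

console.log(`vs original implementation: +${vsOriginal.toFixed(1)}%`)
console.log(`vs util.MIMEType: +${vsUtil.toFixed(1)}%`)
```

For this run the gain over the original implementation works out to just under 49%, in line with the figure quoted in the description.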

codecov-commenter commented Jan 20, 2023

Codecov Report

Base: 90.35% // Head: 90.33% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (d7cd095) compared to base (f61a902).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1871      +/-   ##
==========================================
- Coverage   90.35%   90.33%   -0.02%     
==========================================
  Files          70       70              
  Lines        6042     6044       +2     
==========================================
+ Hits         5459     5460       +1     
- Misses        583      584       +1     

Impacted Files          Coverage Δ
lib/fetch/dataURL.js    87.50% <100.00%> (+0.17%) ⬆️
lib/fetch/file.js       89.65% <0.00%> (-1.15%) ⬇️

@metcoder95 metcoder95 marked this pull request as ready for review January 20, 2023 12:25
- get essence () {
-   return `${this.type}/${this.subtype}`
- }
+ essence: `${type}/${subtype}`
Member Author

This was the main driver of the improvements. I tried to find something in the spec indicating that this should be a getter, but couldn't find anything. I might have overlooked it, though; @KhafraDev, can you confirm this approach is OK?

Member

@ronag ronag Jan 20, 2023

This won't work as it can be overwritten... it must be a getter
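
The difference can be sketched with two plain objects. The object shapes and the `tryOverwrite` helper below are illustrative assumptions, not Undici's actual code:

```javascript
// Plain data property: writable by default, so callers can replace it.
const asProperty = {
  type: 'application',
  subtype: 'json',
  essence: 'application/json'
}
asProperty.essence = 'text/html'

// Getter without a setter: the value is always derived from type/subtype.
const asGetter = {
  type: 'application',
  subtype: 'json',
  get essence () {
    return `${this.type}/${this.subtype}`
  }
}

// Hypothetical helper. Assigning to a getter-only property throws a
// TypeError in strict mode (in sloppy mode it is silently ignored).
function tryOverwrite (target) {
  'use strict'
  try {
    target.essence = 'text/html'
    return true
  } catch (err) {
    return false
  }
}

const overwritten = tryOverwrite(asGetter)
console.log(asProperty.essence) // the overwritten value
console.log(overwritten, asGetter.essence) // assignment rejected, essence intact
```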

Member Author

That was what I was missing, then. So that means the MIME type object can be altered after creation, right?

This comment was marked as outdated.

Member

Actually, there is more stuff here that should get getters?

Member

I might be wrong, but then it becomes a matter of consistency: if you change subtype, then essence needs to be updated automatically, which is why a getter is necessary either way.
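
A small sketch of that consistency problem, using two hypothetical factory functions (`withProperty` and `withGetter` are illustrative names, not Undici APIs):

```javascript
// essence precomputed once, at creation time.
function withProperty (type, subtype) {
  return { type, subtype, essence: `${type}/${subtype}` }
}

// essence derived from the current type/subtype on every read.
function withGetter (type, subtype) {
  return {
    type,
    subtype,
    get essence () {
      return `${this.type}/${this.subtype}`
    }
  }
}

const precomputed = withProperty('application', 'json')
const derived = withGetter('application', 'json')

// Mutate subtype on both objects after creation.
precomputed.subtype = 'xml'
derived.subtype = 'xml'

console.log(precomputed.essence) // stale: still reflects the old subtype
console.log(derived.essence) // follows the new subtype
```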

Member Author

> Actually, there is more stuff here that should get getters?

Do you mean the PR itself? Then yes. Otherwise, I might be missing something.

> I might be wrong, but then it becomes a matter of consistency: if you change subtype, then essence needs to be updated automatically, which is why a getter is necessary either way.

Yeah, once you said it must be a getter, that is what I started to see: most likely the issue is not the getter itself but the recomputation of the string on every get call.

Member Author

@metcoder95 metcoder95 Jan 20, 2023

It seems it is actually more about the string computation: if I replace it with the following class, which computes the essence on instantiation and again whenever type or subtype is set (a kind of caching), the results remain stable.

Class:

class MimeType {
  #type = ''
  #subtype = ''
  #essence = ''
  #parameters = new Map()

  constructor ({ type, subtype }) {
    this.#type = type
    this.#subtype = subtype
    // Compute the essence once, up front.
    this.#essence = `${type}/${subtype}`
  }

  set type (value) {
    this.#type = value
    // Refresh the cached essence whenever a component changes.
    this.#essence = `${this.#type}/${this.#subtype}`
  }

  get type () {
    return this.#type
  }

  set subtype (value) {
    this.#subtype = value
    this.#essence = `${this.#type}/${this.#subtype}`
  }

  get subtype () {
    return this.#subtype
  }

  // Reads return the cached string; nothing is rebuilt here.
  get essence () {
    return this.#essence
  }

  get parameters () {
    return this.#parameters
  }
}
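
One way to check that this kind of caching behaves as intended is to count how often the essence string is built. This is a sketch using a trimmed-down variant of the class; `CachedMimeType` and the `builds` counter are hypothetical and exist only for illustration:

```javascript
class CachedMimeType {
  #type
  #subtype
  #essence
  builds = 0 // illustration only: how many times the string was built

  constructor ({ type, subtype }) {
    this.#type = type
    this.#subtype = subtype
    this.#rebuild()
  }

  // Rebuild the cached essence from the current components.
  #rebuild () {
    this.builds++
    this.#essence = `${this.#type}/${this.#subtype}`
  }

  get type () { return this.#type }
  set type (value) { this.#type = value; this.#rebuild() }

  get subtype () { return this.#subtype }
  set subtype (value) { this.#subtype = value; this.#rebuild() }

  // Reads never rebuild the string.
  get essence () { return this.#essence }
}

const mime = new CachedMimeType({ type: 'application', subtype: 'json' })

// Many reads: the cached string is returned each time.
for (let i = 0; i < 1000; i++) mime.essence

// One write: the cache is refreshed exactly once.
mime.subtype = 'xml'

console.log(mime.essence, mime.builds)
```

The string is built once in the constructor and once per write, so the count stays at two no matter how many reads happen in between.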

Scenario

const str = 'application/json; charset=utf-8'

suite
  .add('util#MIMEType', function () {
    new util.MIMEType(str)
  })
  .add('undici#parseMIMEType', function () {
    parseMIMEType(str).essence
  })
  .add('undici#parseMIMEType(original)', function () {
    parseMIMETypeOriginal(str).essence
  })
  .add('fast-content-type-parse#parse', function () {
    fastContentType.parse(str)
  })
  .add('fast-content-type-parse#safeParse', function () {
    fastContentType.safeParse(str)
  })
  .on('cycle', function (event) {
    console.log(String(event.target))
  })
  .on('complete', function () {
    console.log('Fastest is ' + this.filter('fastest').map('name'))
  })
  .run({ async: true })

Member Author

Results:

util#MIMEType x 1,309,932 ops/sec ±0.48% (94 runs sampled)
undici#parseMIMEType x 2,331,911 ops/sec ±0.32% (96 runs sampled)
undici#parseMIMEType(original) x 1,501,260 ops/sec ±0.50% (99 runs sampled)

Member

H4ad commented Jan 20, 2023

I read the original function and these lines caught my attention:

undici/lib/fetch/dataURL.js

Lines 325 to 331 in d7cd095

// 2. Collect a sequence of code points that are not
// U+003B (;) from input, given position.
collectASequenceOfCodePoints(
  (char) => char !== ';',
  input,
  position
)

This is part of the spec, but neither its result nor the position variable seems to be used afterwards; could this be removed?
Edit: I didn't see that it is inside a while loop, so just ignore this.

Also in this part:

undici/lib/fetch/dataURL.js

Lines 360 to 365 in d7cd095

if (
  parameterName.length !== 0 &&
  HTTP_TOKEN_CODEPOINTS.test(parameterName) &&
  !HTTP_QUOTED_STRING_TOKENS.test(parameterValue) &&
  !mimeType.parameters.has(parameterName)
) {

I think it is worth moving the !mimeType.parameters.has(parameterName) check to right after parameterName.length !== 0. We might gain a little performance when the parameter is already defined, because I think the lookup is faster than running the regexes first.
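
The reordering idea can be sketched as follows. `TOKEN` is a simplified stand-in regex, not Undici's actual constant, and `isToken`, `shouldStore`, and `regexRuns` are hypothetical names. The parameter value check is omitted for brevity, and whether the reordering is measurably faster in Undici is not shown here:

```javascript
// Stand-in for an HTTP token check; not Undici's actual constant.
const TOKEN = /^[!#$%&'*+\-.^_`|~A-Za-z0-9]+$/

// Count regex executions so the effect of short-circuiting is visible.
let regexRuns = 0
function isToken (name) {
  regexRuns++
  return TOKEN.test(name)
}

// Cheap checks (length, Map lookup) come first, so `&&` short-circuits
// before the regex runs for empty or duplicate parameter names.
function shouldStore (parameters, name) {
  return name.length !== 0 &&
    !parameters.has(name) &&
    isToken(name)
}

const parameters = new Map()
for (const name of ['charset', 'charset', 'boundary', '']) {
  if (shouldStore(parameters, name)) parameters.set(name, 'x')
}

console.log([...parameters.keys()], regexRuns)
```

With four candidate names, the regex runs only for the two that are non-empty and not yet stored.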

Member

@mcollina mcollina left a comment

lgtm

@mcollina mcollina merged commit 1813408 into nodejs:main Jan 20, 2023
@metcoder95 metcoder95 deleted the chore/performance_38 branch January 22, 2023 20:42
anonrig pushed a commit to anonrig/undici that referenced this pull request Apr 4, 2023
metcoder95 added a commit to metcoder95/undici that referenced this pull request Jul 21, 2023
crysmags pushed a commit to crysmags/undici that referenced this pull request Feb 27, 2024