@Lukasa Lukasa commented Feb 20, 2023

Motivation:

To know when we next need to wake up, we keep track of what the next deadline will be. This works great, but in order to keep track of this UInt64 we save off an entire ScheduledTask. This object is quite wide (6 pointers wide), and two of those pointers require ARC traffic, so doing this saving produces unnecessary overhead.

Worse, saving this task plays poorly with task cancellation. If the saved task is cancelled, the saved copy has the effect of "retaining" that task until the next event loop tick. This is unlikely to produce catastrophic bugs in real programs, where the loop does tick, but it violates our tests, which rigorously assume that we will always drop a task when it is cancelled. In specific manufactured cases it's possible to produce leaks of non-trivial duration.

Modifications:

  • Wrote a weirdly complex test.
  • Moved the implementation of Task.readyIn to a method on NIODeadline.
  • Saved a NIODeadline instead of a ScheduledTask.

Result:

Minor performance improvement in the core event loop processing, minor correctness improvement.
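
For readers who want to see the shape of the problem, here is a minimal sketch of the idea described above. It does not use the real SwiftNIO types: `SketchDeadline`, `SketchTask`, `Payload`, and `payloadOutlivesCancellation` are all hypothetical stand-ins invented for illustration, and the real `ScheduledTask` is wider than the one shown here. The sketch assumes only that a task carries at least one strong reference and that the loop needs nothing but the wake-up time.

```swift
// Hypothetical stand-ins for illustration only; these are NOT the SwiftNIO types.

/// A point in time, stored as a single 64-bit nanosecond count.
struct SketchDeadline: Comparable {
    var uptimeNanoseconds: UInt64

    static func < (lhs: SketchDeadline, rhs: SketchDeadline) -> Bool {
        lhs.uptimeNanoseconds < rhs.uptimeNanoseconds
    }

    /// Nanoseconds from `now` until this deadline, or 0 if it has already passed.
    /// Lives on the deadline itself, so no task is needed to ask the question.
    func readyIn(_ now: SketchDeadline) -> UInt64 {
        self < now ? 0 : uptimeNanoseconds - now.uptimeNanoseconds
    }
}

/// Stands in for whatever reference-counted state a task holds on to.
final class Payload {}

/// A scheduled task: a deadline plus a strong reference.
struct SketchTask {
    var readyTime: SketchDeadline
    var payload: Payload
}

/// Returns true if the task's payload is still alive after the task's
/// original owner has dropped it (a stand-in for cancellation).
func payloadOutlivesCancellation(savingWholeTask: Bool) -> Bool {
    var savedTask: SketchTask? = nil
    var savedDeadline: SketchDeadline? = nil
    weak var observed: Payload? = nil

    do {
        let task = SketchTask(
            readyTime: SketchDeadline(uptimeNanoseconds: 100),
            payload: Payload()
        )
        observed = task.payload
        if savingWholeTask {
            savedTask = task               // copies the strong reference too
        } else {
            savedDeadline = task.readyTime // copies only the integer
        }
        // The original owner's copy of the task goes away at the end of this scope.
    }

    // Keep the saved values alive while we check the weak reference.
    return withExtendedLifetime((savedTask, savedDeadline)) { observed != nil }
}
```

In this sketch, saving the whole task keeps the payload alive for as long as the saved copy exists, while saving only the deadline lets the payload be released as soon as the original owner drops the task. That is the behaviour the cancellation tests expect.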

@Lukasa Lukasa added the 🔨 semver/patch No public API change. label Feb 20, 2023
@Lukasa Lukasa requested a review from weissi February 20, 2023 14:15

@FranzBusch FranzBusch left a comment

Nice one!

@Lukasa Lukasa force-pushed the cb-dont-lretain-tasks branch from 5e24b67 to bb0c1fd Compare February 20, 2023 14:47

Lukasa commented Feb 20, 2023

@swift-server-bot test perf please

@swift-server-bot

performance report

build id: 148

timestamp: Mon Feb 20 14:52:10 UTC 2023

results

name min max mean std
write_http_headers 0.04301601 0.04326456 0.0431198291 0.00011165041110582675
http_headers_canonical_form 0.098862882 0.100599702 0.09938796159999999 0.0005788471769603206
http_headers_canonical_form_trimming_whitespace 0.019593722 0.02031712 0.0197734496 0.00020640574024059202
http_headers_canonical_form_trimming_whitespace_from_short_string 0.017664914 0.018333524 0.0177900683 0.00020445069773531105
http_headers_canonical_form_trimming_whitespace_from_long_string 0.028581473 0.029228128 0.028795966299999996 0.00022899679657587089
bytebuffer_write_12MB_short_string_literals 0.13691893 0.143322155 0.1382172535 0.0018186200039018163
bytebuffer_write_12MB_short_calculated_strings 0.067575546 0.068302445 0.0679379742 0.0002515523043127738
bytebuffer_write_12MB_medium_string_literals 0.971425031 1.025360417 0.9877755913999999 0.014871078072153526
bytebuffer_write_12MB_medium_calculated_strings 0.089347922 0.091004072 0.0901016582 0.0005005889669634294
bytebuffer_write_12MB_large_calculated_strings 0.156543628 0.157729094 0.1571649737 0.0003220528270638061
bytebuffer_lots_of_rw 0.043519171 0.044155887 0.043676716 0.00023397339512146742
bytebuffer_write_http_response_ascii_only_as_string 0.029973135 0.030638404 0.030146124899999998 0.00018558215473503932
bytebuffer_write_http_response_ascii_only_as_staticstring 0.030927703 0.031554329 0.031148809300000004 0.0002144347976876523
bytebuffer_write_http_response_some_nonascii_as_string 0.030152013 0.030979976 0.0304728956 0.0002136577334951082
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.031083117 0.03142977 0.0312112003 0.00012277456874020993
no-net_http1_1k_reqs_1_conn 0.011403928 0.011821145 0.0115323732 0.00010978336596001726
http1_1k_reqs_1_conn 0.061171627 0.064352009 0.0624696338 0.0010629858981243776
http1_1k_reqs_100_conns 0.091345892 0.092189647 0.0916728706 0.0002952580473420942
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.07772768 0.078625595 0.0781935743 0.0002879489147432051
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.077968158 0.085176637 0.0790648884 0.002168126394311979
future_whenallsucceed_10k_deferred_off_loop 0.031010818 0.031780514 0.0313492105 0.00027290128049015696
future_whenallsucceed_10k_deferred_on_loop 0.013932578 0.014230317 0.014059228099999998 9.8534323291655e-05
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.038944733 0.040304241 0.0395150274 0.00043333290628984
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.039557494 0.040038559 0.039726609 0.0001700981885800747
future_whenallcomplete_10k_deferred_off_loop 0.02367289 0.024051371 0.0238650604 0.00012698805564409052
future_whenallcomplete_100k_deferred_on_loop 0.081116203 0.084878476 0.08208487140000001 0.0011810423817293484
future_reduce_10k_futures 0.016621322 0.01713397 0.0167263715 0.00014659941581075743
future_reduce_into_10k_futures 0.014420526 0.014529118 0.0144820043 3.830354844034997e-05
channel_pipeline_1m_events 0.09767967 0.097844292 0.0977594 7.07774190245564e-05
websocket_encode_50b_space_at_front_100k_frames_cow 0.050003248 0.050532173 0.050196977499999997 0.00020220060753027238
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.65542939 0.65809257 0.6560555603 0.0007674449023542724
websocket_encode_1kb_space_at_front_1m_frames_cow 0.529690501 0.538696765 0.530866993 0.0027586089619518034
websocket_encode_50b_no_space_at_front_100k_frames_cow 0.050215512 0.05070509 0.0503690647 0.00020900880186306355
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.053270012 0.053722889 0.053422973500000005 0.0001967082154287577
websocket_encode_50b_space_at_front_100k_frames 0.074206266 0.074934063 0.0746032743 0.0002604614093697928
websocket_encode_50b_space_at_front_10k_frames_masking 0.008868871 0.008897994 0.0088805536 7.404077363182853e-06
websocket_encode_1kb_space_at_front_10k_frames 0.01382756 0.014278332 0.0139041673 0.0001322775263712622
websocket_encode_50b_no_space_at_front_100k_frames 0.073898881 0.074589929 0.07429062859999999 0.0002755867261318022
websocket_encode_1kb_no_space_at_front_10k_frames 0.013274584 0.013722578 0.0133646718 0.00012751304571585322
websocket_decode_125b_10k_frames 0.013322309 0.013444505 0.013378162999999998 4.262583755579875e-05
websocket_decode_125b_with_a_masking_key_10k_frames 0.013754304 0.013917074 0.013851169899999998 5.104831724389462e-05
websocket_decode_64kb_10k_frames 0.013882372 0.014334675 0.014053178999999999 0.00014110889713818758
websocket_decode_64kb_with_a_masking_key_10k_frames 0.014291051 0.014760902 0.0144440079 0.0001254817410625941
websocket_decode_64kb_+1_10k_frames 0.013850253 0.014226916 0.0139224536 0.00011087523608312557
websocket_decode_64kb_+1_with_a_masking_key_10k_frames 0.014173885 0.014348292 0.0142640121 5.358472668680865e-05
circular_buffer_into_byte_buffer_1kb 0.032996383 0.03348418 0.033058993800000006 0.00015012510397650298
circular_buffer_into_byte_buffer_1mb 0.064712498 0.06515929 0.0648554285 0.00020262190892247835
byte_buffer_view_iterator_1mb 0.017565022 0.018085143 0.0176316884 0.00015989381562093758
byte_buffer_view_contains_12mb 0.053036391 0.053575005 0.053228230700000004 0.0002278070668468051
byte_to_message_decoder_decode_many_small 0.042196585 0.042723144 0.0423267618 0.00019805455279448095
generate_10k_random_request_keys 0.089745002 0.089989343 0.08987030700000001 8.722173637982029e-05
bytebuffer_rw_10_uint32s 0.037849426 0.051353355 0.0409083867 0.0037472256534760067
bytebuffer_multi_rw_10_uint32s 0.062915529 0.070882168 0.0662927074 0.0030228230844882697
lock_1_thread_10M_ops 0.151050734 0.15179721 0.1514131098 0.00021259232184514824
lock_2_threads_10M_ops 0.969808402 0.998634903 0.9835990839000001 0.009287069778005547
lock_4_threads_10M_ops 0.989817422 1.046309086 1.0226409172 0.016298119098115786
lock_8_threads_10M_ops 1.020275509 1.0468856 1.0317928702999999 0.007428873022312308
schedule_100k_tasks 0.067887343 0.111227669 0.077471234 0.013462421811769753
schedule_and_run_100k_tasks 0.443836452 0.449422002 0.44598710849999995 0.002116246480168668
execute_100k_tasks 0.21595416 0.216950355 0.2165444061 0.0003919299389591753
bytebufferview_copy_to_array_100k_times_1kb 0.011671469 0.011759767 0.011710221600000001 2.368462253680893e-05
circularbuffer_copy_to_array_10k_times_1kb 0.019864632 0.020338718 0.0199214604 0.0001470832732624613
deadline_now_1M_times 0.026613611 0.026879146 0.0266757744 7.837123357485493e-05
asyncwriter_single_writes_1M_times 0.342111361 0.342679775 0.3424488376 0.000211640465309136
asyncsequenceproducer_consume_1M_times 0.804860118 0.81470643 0.8080773359 0.0029518783811947964
udp_10k_writes 0.381714531 0.382587928 0.3819909552 0.00025780848280234384
udp_10k_vector_writes 0.211465311 0.212149634 0.21179081160000002 0.0002293948685758582
udp_10k_vector_reads 0.387020674 0.389205726 0.3882052406 0.0007219501426211499
udp_10k_vector_reads_and_writes 0.115082879 0.115498922 0.1153368191 0.00013826105976855004
tcp_100k_messages_throughput 0.807796465 0.842003085 0.8183118664 0.011004750428248765

comparison

name current previous winner diff
write_http_headers 0.04301601 0.043043657 current 0%
http_headers_canonical_form 0.098862882 0.098872574 current 0%
http_headers_canonical_form_trimming_whitespace 0.019593722 0.019596916 current 0%
http_headers_canonical_form_trimming_whitespace_from_short_string 0.017664914 0.017670313 current 0%
http_headers_canonical_form_trimming_whitespace_from_long_string 0.028581473 0.028586899 current 0%
bytebuffer_write_12MB_short_string_literals 0.13691893 0.137308731 current 0%
bytebuffer_write_12MB_short_calculated_strings 0.067575546 0.067931117 current 0%
bytebuffer_write_12MB_medium_string_literals 0.971425031 0.954816574 previous 1%
bytebuffer_write_12MB_medium_calculated_strings 0.089347922 0.090410963 current -1%
bytebuffer_write_12MB_large_calculated_strings 0.156543628 0.157615785 current 0%
bytebuffer_lots_of_rw 0.043519171 0.043120783 previous 0%
bytebuffer_write_http_response_ascii_only_as_string 0.029973135 0.030959015 current -3%
bytebuffer_write_http_response_ascii_only_as_staticstring 0.030927703 0.030947597 current 0%
bytebuffer_write_http_response_some_nonascii_as_string 0.030152013 0.030743087 current -1%
bytebuffer_write_http_response_some_nonascii_as_staticstring 0.031083117 0.031208141 current 0%
no-net_http1_1k_reqs_1_conn 0.011403928 0.011322212 previous 0%
http1_1k_reqs_1_conn 0.061171627 0.06075556 previous 0%
http1_1k_reqs_100_conns 0.091345892 0.091361931 current 0%
future_whenallsucceed_100k_immediately_succeeded_off_loop 0.07772768 0.079081084 current -1%
future_whenallsucceed_100k_immediately_succeeded_on_loop 0.077968158 0.079378646 current -1%
future_whenallsucceed_10k_deferred_off_loop 0.031010818 0.031504782 current -1%
future_whenallsucceed_10k_deferred_on_loop 0.013932578 0.014168836 current -1%
future_whenallcomplete_100k_immediately_succeeded_off_loop 0.038944733 0.03917651 current 0%
future_whenallcomplete_100k_immediately_succeeded_on_loop 0.039557494 0.039057258 previous 1%
future_whenallcomplete_10k_deferred_off_loop 0.02367289 0.023458832 previous 0%
future_whenallcomplete_100k_deferred_on_loop 0.081116203 0.079994293 previous 1%
future_reduce_10k_futures 0.016621322 0.016924897 current -1%
future_reduce_into_10k_futures 0.014420526 0.01417913 previous 1%
channel_pipeline_1m_events 0.09767967 0.097693675 current 0%
websocket_encode_50b_space_at_front_100k_frames_cow 0.050003248 0.053871766 current -7%
websocket_encode_50b_space_at_front_1m_frames_cow_masking 0.65542939 0.71443737 current -8%
websocket_encode_1kb_space_at_front_1m_frames_cow 0.529690501 0.572539974 current -7%
websocket_encode_50b_no_space_at_front_100k_frames_cow 0.050215512 0.054240574 current -7%
websocket_encode_1kb_no_space_at_front_100k_frames_cow 0.053270012 0.057215535 current -6%
websocket_encode_50b_space_at_front_100k_frames 0.074206266 0.07703005 current -3%
websocket_encode_50b_space_at_front_10k_frames_masking 0.008868871 0.009187349 current -3%
websocket_encode_1kb_space_at_front_10k_frames 0.01382756 0.014073205 current -1%
websocket_encode_50b_no_space_at_front_100k_frames 0.073898881 0.077059187 current -4%
websocket_encode_1kb_no_space_at_front_10k_frames 0.013274584 0.013516191 current -1%
websocket_decode_125b_10k_frames 0.013322309 0.012468072 previous 6%
websocket_decode_125b_with_a_masking_key_10k_frames 0.013754304 0.01281177 previous 7%
websocket_decode_64kb_10k_frames 0.013882372 0.012785168 previous 8%
websocket_decode_64kb_with_a_masking_key_10k_frames 0.014291051 0.013142572 previous 8%
websocket_decode_64kb_+1_10k_frames 0.013850253 0.012801753 previous 8%
websocket_decode_64kb_+1_with_a_masking_key_10k_frames 0.014173885 0.01311309 previous 8%
circular_buffer_into_byte_buffer_1kb 0.032996383 0.033043821 current 0%
circular_buffer_into_byte_buffer_1mb 0.064712498 0.064719466 current 0%
byte_buffer_view_iterator_1mb 0.017565022 0.017568471 current 0%
byte_buffer_view_contains_12mb 0.053036391 0.053028596 previous 0%
byte_to_message_decoder_decode_many_small 0.042196585 0.041932563 previous 0%
generate_10k_random_request_keys 0.089745002 0.089630789 previous 0%
bytebuffer_rw_10_uint32s 0.037849426 0.037718215 previous 0%
bytebuffer_multi_rw_10_uint32s 0.062915529 0.062322747 previous 0%
lock_1_thread_10M_ops 0.151050734 0.151504564 current 0%
lock_2_threads_10M_ops 0.969808402 0.833628249 previous 16%
lock_4_threads_10M_ops 0.989817422 0.857372609 previous 15%
lock_8_threads_10M_ops 1.020275509 0.871526224 previous 17%
schedule_100k_tasks 0.067887343 0.070351004 current -3%
schedule_and_run_100k_tasks 0.443836452 0.441815331 previous 0%
execute_100k_tasks 0.21595416 0.215057568 previous 0%
bytebufferview_copy_to_array_100k_times_1kb 0.011671469 0.011186757 previous 4%
circularbuffer_copy_to_array_10k_times_1kb 0.019864632 0.019768387 previous 0%
deadline_now_1M_times 0.026613611 0.026616368 current 0%
asyncwriter_single_writes_1M_times 0.342111361 0.341853038 previous 0%
asyncsequenceproducer_consume_1M_times 0.804860118 0.801967259 previous 0%
udp_10k_writes 0.381714531 0.381524006 previous 0%
udp_10k_vector_writes 0.211465311 0.21024111 previous 0%
udp_10k_vector_reads 0.387020674 0.387845743 current 0%
udp_10k_vector_reads_and_writes 0.115082879 0.115790392 current 0%
tcp_100k_messages_throughput 0.807796465 0.78205478 previous 3%

significant differences found

@Lukasa Lukasa enabled auto-merge (squash) February 20, 2023 14:55
@Lukasa Lukasa merged commit 9afaf80 into apple:main Feb 20, 2023
@Lukasa Lukasa deleted the cb-dont-lretain-tasks branch February 20, 2023 16:59