This repository has been archived by the owner on Oct 4, 2022. It is now read-only.

Statistics collection and thread safety fixes for AZ::IO::DedicatedCache #494

Closed


@dkondrashkin commented Jun 4, 2020

This commit addresses two problems we've encountered:

  1. Broken RAD telemetry plots (collected statistics) for AZ::IO::Streamer;
  2. Rare crashes of the game client.

The first issue shows up in the following visualization:

(image: RAD telemetry plot)

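The Streamer stat names appeared corrupted. This happened because the Statistic::CreateXXX(...) functions, which accept an AZStd::string_view as their first argument, were passed AZStd::string variables local to the calling method. A string_view does not own its characters, so a name held this way presumably dangles once the method returns and the string is destroyed. To fix this, we've introduced a separate cache for stat names.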
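
The lifetime problem and the cache-based fix can be illustrated with a minimal sketch. It is not the code from this pull request: `Stat`, `StatNameCache`, `Intern`, and `MakeStat` are hypothetical names, and `std::string` / `std::string_view` stand in for their AZStd counterparts. The sketch assumes the statistic stores only a view of its name, so the characters must outlive the statistic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical stand-in for a statistic that keeps only a view of its name.
struct Stat
{
    std::string_view name; // non-owning: someone else must keep the characters alive
    double value;
};

// Hypothetical name cache: owns each distinct name for as long as the cache lives.
class StatNameCache
{
public:
    // Returns a view into a string owned by the cache.
    std::string_view Intern(const std::string& name)
    {
        assert(!name.empty());
        // std::unordered_set is node-based, so references to stored
        // elements stay valid when the table rehashes.
        auto it = m_names.insert(name).first;
        return std::string_view(*it);
    }

    std::size_t Size() const { return m_names.size(); }

private:
    std::unordered_set<std::string> m_names;
};

// Builds the stat name in a local string, as the buggy code did, but hands
// the statistic a view into the cache instead of a view into the local.
Stat MakeStat(StatNameCache& cache, const std::string& owner,
              const std::string& metric, double value)
{
    std::string fullName = owner + "." + metric; // destroyed when this function returns
    return Stat{ cache.Intern(fullName), value };
}
```

Passing `fullName` directly as the view would leave `Stat::name` pointing at freed memory after `MakeStat` returns; routing it through the cache ties the name's lifetime to the cache instead. The cache must therefore outlive every statistic that refers to it.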

The second issue (a crash) was preceded by the following assertion:

<2020-05-29T18:22:20:529+03> (System) - Trace::Assert ...\Code\Framework\AzCore\AzCore/std/containers/vector.h(585): (14664) 'const class std::unique_ptr<class AZ::IO::BlockCache,struct std::default_delete<class AZ::IO::BlockCache> > &__cdecl AZStd::vector<class std::unique_ptr<class AZ::IO::BlockCache,struct std::default_delete<class AZ::IO::BlockCache> >,class AZStd::allocator>::operator [](unsigned __int64) const'
<2020-05-29T18:22:20:529+03> (System) - AZStd::vector<>::at - position is out of range
<2020-05-29T18:22:20:530+03> (System) - ------------------------------------------------
<2020-05-29T18:23:37:320+03> (System) - 00007FF7B9793A27 (game01Launcher) : AZ::IO::FullFileDecompressor::CollectStatistics
<2020-05-29T18:23:37:321+03> (System) - 00007FFC2188AEE9 (CryRenderD3D11) : AZ::IO::Device::CollectStatistics
<2020-05-29T18:23:37:321+03> (System) - 00007FFC21890D54 (CryRenderD3D11) : AZ::IO::Device::OnTick
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B968B787 (game01Launcher) : AZ::Internal::EBusContainer<AZ::TickEvents,AZ::TickEvents,0,2>::Dispatcher<AZ::EBus<AZ::TickEvents,AZ::TickEvents> >::Broadcast<void (__cdecl AZ::TickEvents::*)(float,AZ::ScriptTimePoint) __ptr64,float & __ptr64,AZ::Sc
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B96E57C7 (game01Launcher) : AZ::ComponentApplication::Tick
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9178292 (game01Launcher) : LumberyardLauncher::Run
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9179A33 (game01Launcher) : WinMain
<2020-05-29T18:23:37:321+03> (System) - 00007FF7B9AC2FCE (game01Launcher) : __scrt_common_main_seh
<2020-05-29T18:23:37:321+03> (System) - 00007FFCBE897974 (KERNEL32) : BaseThreadInitThunk
<2020-05-29T18:23:37:321+03> (System) - 00007FFCBFBCA271 (ntdll) : RtlUserThreadStart

This can happen because DedicatedCache::CollectStatistics() and DedicatedCache::DestroyDedicatedCache() can be called from different threads (the main thread and the streamer thread, respectively). Although the race occurs rarely, access to DedicatedCache's internal structures should be protected with synchronization primitives.
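
As a rough illustration of that kind of protection, the sketch below guards a vector of caches with a single mutex, so a reader never indexes into the vector while another thread is shrinking it. It is an assumption-laden stand-in rather than the actual change: `DedicatedCacheSketch`, `BlockCacheStub`, `Create`, `DestroyLast`, and `Count` are hypothetical, and `std::mutex` / `std::vector` replace whatever AZStd types the real class uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for AZ::IO::BlockCache; holds one number to aggregate.
struct BlockCacheStub
{
    double hitRate = 1.0;
};

// Hypothetical simplified DedicatedCache: every method that touches
// m_caches takes the same mutex first.
class DedicatedCacheSketch
{
public:
    // Called from the streamer thread in the scenario described above.
    void Create()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_caches.push_back(std::make_unique<BlockCacheStub>());
    }

    // Also called from the streamer thread; shrinks the vector.
    void DestroyLast()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_caches.empty())
        {
            m_caches.pop_back();
        }
    }

    // Called from the main thread; iterates while holding the lock, so the
    // vector cannot change size in the middle of the loop.
    double CollectStatistics() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        double total = 0.0;
        for (const auto& cache : m_caches)
        {
            assert(cache != nullptr);
            total += cache->hitRate;
        }
        return total;
    }

    std::size_t Count() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_caches.size();
    }

private:
    mutable std::mutex m_mutex; // mutable so const readers can lock it
    std::vector<std::unique_ptr<BlockCacheStub>> m_caches;
};
```

A single coarse lock is the simplest choice here. Statistics are collected once per tick and cache destruction is rare, so contention should be low, though that is an assumption about the workload rather than something measured in this pull request.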

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Added locks to methods operating on file caches
Added cache for file names used in statistics
@AMZN-nggieber

Thank you for the pull request!

@AMZN-alexpete

@dkondrashkin thank you for submitting this fix! We've integrated the change and it will be available in a future version of Lumberyard.
