[Regions] pCorProfilerInfo->GetObjectGeneration may fail when running under USE_REGIONS #55965

Closed
cshung opened this issue Jul 19, 2021 · 1 comment · Fixed by #57101

cshung commented Jul 19, 2021

This is a bug found when working on the filtering issue.

Problem:

The crux of the issue is that under USE_REGIONS, the GC may allocate at memory addresses that the profiler has not seen since the last GC. In particular, soh_try_fit may call get_new_region to acquire a new region, chain it to the end of gen0, and allocate the object there. At that point, the s_currentGenerationTable maintained by the profiler is unaware of the new addresses, so GetObjectGeneration fails for the newly allocated object during the ObjectAllocated callback.
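
To make the failure mode concrete, here is a minimal, self-contained sketch of the stale-snapshot problem. It is not the runtime's code: `Range`, `GenerationTable`, `Lookup`, and `DemonstrateStaleSnapshot` are hypothetical names, the addresses are made up, and the real layout of s_currentGenerationTable is not shown in this issue. The only assumption carried over is that the table is a snapshot of address ranges that is refreshed at GC time, while the allocator can add a region between GCs.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one generation table entry:
// a half-open address range [start, end) tagged with a generation.
struct Range {
    std::uintptr_t start;
    std::uintptr_t end;
    int generation;
};

// Hypothetical snapshot of the heap layout, assumed to be refreshed only at GC time.
struct GenerationTable {
    std::vector<Range> ranges;

    // Returns the generation that contains the address, or nullopt when no
    // known range covers it (the analogue of GetObjectGeneration failing).
    std::optional<int> Lookup(std::uintptr_t address) const {
        for (const Range& r : ranges) {
            if (address >= r.start && address < r.end) {
                return r.generation;
            }
        }
        return std::nullopt;
    }
};

// Returns true when a lookup for an object in a freshly acquired region fails
// against the stale snapshot and succeeds once the snapshot is refreshed.
bool DemonstrateStaleSnapshot() {
    // Snapshot taken at the end of the last GC: a single gen0 region.
    GenerationTable snapshot;
    snapshot.ranges.push_back({0x1000, 0x2000, 0});

    // The live heap starts out identical to the snapshot.
    std::vector<Range> liveHeap = snapshot.ranges;

    // Between GCs, the allocator chains a new region to the end of gen0
    // and places the next object inside it.
    liveHeap.push_back({0x8000, 0x9000, 0});
    const std::uintptr_t newObject = 0x8010;

    // The snapshot has never seen this address, so the lookup fails.
    const bool failedBefore = !snapshot.Lookup(newObject).has_value();

    // Refreshing the snapshot from the live heap makes the lookup succeed.
    snapshot.ranges = liveHeap;
    const bool succeededAfter = (snapshot.Lookup(newObject) == std::optional<int>(0));

    return failedBefore && succeededAfter;
}
```

Calling `DemonstrateStaleSnapshot()` from a test or a `main` function returns true under these assumptions. The sketch only illustrates why the lookup misses; it does not suggest how the runtime should keep the table current.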

Repro:
The profiler authored for the gcallocate test can be repurposed to reproduce this bug. Here is the modified code. It differs from the original in three ways:

  1. It changes the event mask flags so that SOH allocations are captured as well,
  2. it removes a println, and
  3. it asserts on failure.

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

#include "gcallocateprofiler.h"

GUID GCAllocateProfiler::GetClsid()
{
    // {55b9554d-6115-45a2-be1e-c80f7fa35369}
    GUID clsid = { 0x55b9554d, 0x6115, 0x45a2, { 0xbe, 0x1e, 0xc8, 0x0f, 0x7f, 0xa3, 0x53, 0x69 } };
    return clsid;
}

HRESULT GCAllocateProfiler::Initialize(IUnknown* pICorProfilerInfoUnk)
{
    Profiler::Initialize(pICorProfilerInfoUnk);

    HRESULT hr = S_OK;
    if (FAILED(hr = pCorProfilerInfo->SetEventMask2(COR_PRF_ENABLE_OBJECT_ALLOCATED | COR_PRF_MONITOR_OBJECT_ALLOCATED, COR_PRF_HIGH_BASIC_GC)))
    {
        printf("FAIL: ICorProfilerInfo::SetEventMask2() failed hr=0x%x\n", hr);
        return hr;
    }

    return S_OK;
}

HRESULT STDMETHODCALLTYPE GCAllocateProfiler::ObjectAllocated(ObjectID objectId, ClassID classId)
{
    COR_PRF_GC_GENERATION_RANGE gen;
    HRESULT hr = pCorProfilerInfo->GetObjectGeneration(objectId, &gen);
    if (FAILED(hr))
    {
        // Print before asserting so the HRESULT is visible when the assert fires.
        printf("GetObjectGeneration failed hr=0x%x\n", hr);
        fflush(stdout);
        assert(false);
        _failures++;
    }
    else if (gen.generation == COR_PRF_GC_LARGE_OBJECT_HEAP)
    {
        _gcLOHAllocations++;
    }
    else if (gen.generation == COR_PRF_GC_PINNED_OBJECT_HEAP)
    {
        _gcPOHAllocations++;
    }
    else
    {
        _failures++;
    }

    return S_OK;
}

HRESULT GCAllocateProfiler::Shutdown()
{
    Profiler::Shutdown();
    if (_gcPOHAllocations == 0)
    {
        printf("There is no POH allocations\n");
    }
    else if (_gcLOHAllocations == 0)
    {
        printf("There is no LOH allocations\n");
    }
    else if (_failures == 0)
    {
        printf("%d LOH objects allocated\n", (int)_gcLOHAllocations);
        printf("%d POH objects allocated\n", (int)_gcPOHAllocations);
        printf("PROFILER TEST PASSES\n");
    }
    fflush(stdout);

    return S_OK;
}

The bug can then be reproduced by setting these environment variables:

set CORECLR_ENABLE_PROFILING=1
set CORECLR_PROFILER={55b9554d-6115-45a2-be1e-c80f7fa35369}
set CORECLR_PROFILER_PATH=C:\dev\runtime\artifacts\tests\coreclr\windows.x64.Debug\profiler\gc\gcallocate\Profiler.dll

and running GCPerfSim with these parameters:

c:\dev\runtime\artifacts\tests\coreclr\Windows.x64.Debug\Tests\Core_Root\CoreRun.exe C:\dev\performance\artifacts\bin\GCPerfSim\release\netcoreapp5.0\GCPerfSim.dll -tc 6 -tagb 100.0 -tlgb 2.0 -lohar 0 -pohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohsr 100-4000 -lohsr 102400-204800 -pohsr 100-4000 -sohpi 10 -lohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time

ghost commented Jul 19, 2021

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: cshung
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jul 19, 2021
@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jul 20, 2021
@mangod9 mangod9 added this to the 7.0.0 milestone Jul 20, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 21, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 27, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 26, 2021