Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Native] - Implement some existing C++ methods in C# #1896

Merged
merged 5 commits into from
Oct 10, 2023

Conversation

Jklawreszuk
Copy link
Collaborator

@Jklawreszuk Jklawreszuk commented Oct 5, 2023

PR Details

Description

This PR focuses on migrating most of the C++ code in Stride.Native to C#.
My change also applies to Stride.Graphics, so with a few gimmicks it should be possible to build this library outside of Windows.

Related Issue

#1394

Types of changes

  • Docs change / refactoring / dependency upgrade

TODO

  • My change requires testing, so It could break existing code.

@Jklawreszuk Jklawreszuk changed the title [Native] - Implement some existing C++ methods in c# [Native] - Implement some existing C++ methods in C# Oct 5, 2023
@xen2
Copy link
Member

xen2 commented Oct 5, 2023

Thanks for the PR.
Before we merge this in, I would like to make sure we don't loose much performance. I remember it was in C++ for real perf reason (but at the time mono mobile was very slow so it might not be necessary anymore).

@Ethereal77
Copy link
Contributor

If I remember correctly, the 'FastTextRenderer' is only used for the debug text. Is this correct?
A long time since I've seen this code, but I think the regular spritebatch inherit from BatchBase, and is all C#.

@ly3027929699
Copy link

image
I test the NativeInvoke.xnGraphicsFastTextRendererGenerateVertices with native and cs metod based on .net8.0

@Eideren
Copy link
Collaborator

Eideren commented Oct 6, 2023

Hey @ly3027929699 thanks a bunch for testing this out for us, can you share the benchmark source as well ?

@ly3027929699
Copy link

here is the source
TestAOTUnitTests.zip
this is a .rar file.
change .zip to .rar

@IXLLEGACYIXL
Copy link
Collaborator

here is the source TestAOTUnitTests.zip this is a .rar file. change .zip to .rar

the folder is empty, could you make a repo instead?

@Eideren
Copy link
Collaborator

Eideren commented Oct 6, 2023

Nah, it works fine, make sure to change it to rar

@ly3027929699
Copy link

is your result similar to me? @Eideren

@ly3027929699
Copy link

https://github.com/ly3027929699/TestNativeVSCharp
here is repo of the source
@IXLLEGACYIXL

@Eideren
Copy link
Collaborator

Eideren commented Oct 6, 2023

Fixed the benchmark's usage of unsafe, results are still significantly better for c# though, even under dotnet6
TestAOTUnitTests.zip

Method num Mean
NativeMethod 100 30.00 us
CSharpMethod 100 21.51 us
NativeMethod 500 160.88 us
CSharpMethod 500 107.07 us
NativeMethod 1000 299.19 us
CSharpMethod 1000 213.93 us

Thanks a lot @ly3027929699 for taking the time to prepare this benchmark !

Comment on lines 256 to 258
public unsafe void GraphicsFastTextRendererGenerateVertices(RectangleF constantInfos, RectangleF renderInfos, string textPointer, ref int textLength, IntPtr vertexBufferPointer)
{
var vertexBuffer = (VertexPositionNormalTexture**)vertexBufferPointer;
Copy link
Collaborator

@Eideren Eideren Oct 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use VertexPositionNormalTexture** as the parameter type instead of IntPtr, it'll be clearer for the caller and you can move the type conversion outside the loop, it's likely already inlined by the jit but might as well.

Copy link
Collaborator

@froce froce Oct 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually incorrect as written (and crashes when enabling the Profiler for example). It should just be VertexPositionNormalTexture* as I see it.
I suggest something like
private int GraphicsFastTextRendererGenerateVertices(RectangleF constantInfos, RectangleF renderInfos, string text, Span<VertexPositionNormalTexture> vertexBuffer) or maybe
private int GraphicsFastTextRendererGenerateVertices(RectangleF constantInfos, RectangleF renderInfos, string text, ref VertexPositionNormalTexture vertexBuffer) and iterating with vertexBuffer = ref Unsafe.Add(ref vertexBuffer, 1) instead of pointer arithmetic if that's simpler/faster than Span. I see no reason to keep a C-like signature.

Comment on lines 590 to 593
protected override unsafe void UpdateBufferValuesFromElementInfo(ref ElementInfo elementInfo, IntPtr vertexPtr, IntPtr indexPtr, int vertexOffset)
{
var vertex = (VertexPositionColorTextureSwizzle*)vertexPtr;
fixed (SpriteDrawInfo* drawInfo = &elementInfo.DrawInfo)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with VertexPositionColorTextureSwizzle* if possible

@Eideren
Copy link
Collaborator

Eideren commented Oct 6, 2023

Here's the result of another benchmark following the suggestions @froce made

Method num Mean
NativeMethod 100 32.42 us
Span 100 19.18 us
SpansFor 100 24.21 us
Optimized 100 10.54 us
Ptr 100 18.92 us
NativeMethod 500 163.00 us
Span 500 98.99 us
SpansFor 500 124.67 us
Optimized 500 54.89 us
Ptr 500 96.45 us
NativeMethod 1000 331.56 us
Span 1000 196.77 us
SpansFor 1000 250.98 us
Optimized 1000 107.84 us
Ptr 1000 195.63 us

Where

  • Ptr is with the fixed VertexPositionNormalTexture* vertexBuffer signature.
  • Span is using a span instead of a pointer.
  • SpansFor is a for loop instead of the manually unrolled loop in source.
  • I wrote an Optimized version where loop-constants are pre-computed, making it almost two times faster than pointer:
public static unsafe void Optimized(RectangleF constantInfos, RectangleF renderInfos, string textPointer, ref int textLength, Span<VertexPositionNormalTexture> vertexBuffer)
{
    float fX = renderInfos.X / renderInfos.Width;
    float fY = renderInfos.Y / renderInfos.Height;
    float fW = constantInfos.X / renderInfos.Width;
    float fH = constantInfos.Y / renderInfos.Height;

    RectangleF destination = new(fX, fY, fW, fH);
    RectangleF source = new(0.0f, 0.0f, constantInfos.X, constantInfos.Y);

    // Copy the array length (since it may change during an iteration)
    int textCharCount = textLength;

    float scaledDestinationX;
    float scaledDestinationY = -(destination.Y * 2f - 1f);

    float invertedWidth = 1f / constantInfos.Width;
    float invertedHeight = 1f / constantInfos.Height;

    Span<(Vector2 Position, Vector2 TextureCoordinate)> baseData = stackalloc (Vector2, Vector2)[4]
    {
        ( new(-destination.Width, +destination.Height), new(0 * source.Width * invertedWidth, 0 * source.Height * invertedHeight) ),
        ( new(+destination.Width, +destination.Height), new(1 * source.Width * invertedWidth, 0 * source.Height * invertedHeight) ),
        ( new(-destination.Width, -destination.Height), new(0 * source.Width * invertedWidth, 1 * source.Height * invertedHeight) ),
        ( new(+destination.Width, -destination.Height), new(1 * source.Width * invertedWidth, 1 * source.Height * invertedHeight) ),
    };

    int j = 0;

    for (int i = 0; i < textCharCount; i++)
    {
        char currentChar = textPointer[i];

        if (currentChar == '\v')
        {
            // Tabulation
            destination.X += 8 * fX;
            --textLength;
            continue;
        }
        else if (currentChar >= 10 && currentChar <= 13) // '\n' '\v' '\f' '\r'
        {
            destination.X = fX;
            destination.Y += fH;
            scaledDestinationY = -(destination.Y * 2f - 1f);
            --textLength;
            continue;
        }
        else if (currentChar < 32 || currentChar > 126)
        {
            currentChar = ' ';
        }

        source.X = (currentChar % 32 * constantInfos.X) * invertedWidth;
        source.Y = (currentChar / 32 % 4 * constantInfos.Y) * invertedHeight;

        scaledDestinationX = destination.X * 2f - 1f;

        // 0
        vertexBuffer[j].Position.X = scaledDestinationX + baseData[0].Position.X;
        vertexBuffer[j].Position.Y = scaledDestinationY + baseData[0].Position.Y;
        vertexBuffer[j].TextureCoordinate.X = source.X + baseData[0].TextureCoordinate.X;
        vertexBuffer[j].TextureCoordinate.Y = source.Y + baseData[0].TextureCoordinate.Y;
        j++;

        // 1
        vertexBuffer[j].Position.X = scaledDestinationX + baseData[1].Position.X;
        vertexBuffer[j].Position.Y = scaledDestinationY + baseData[1].Position.Y;
        vertexBuffer[j].TextureCoordinate.X = source.X + baseData[1].TextureCoordinate.X;
        vertexBuffer[j].TextureCoordinate.Y = source.Y + baseData[1].TextureCoordinate.Y;
        j++;

        // 2
        vertexBuffer[j].Position.X = scaledDestinationX + baseData[2].Position.X;
        vertexBuffer[j].Position.Y = scaledDestinationY + baseData[2].Position.Y;
        vertexBuffer[j].TextureCoordinate.X = source.X + baseData[2].TextureCoordinate.X;
        vertexBuffer[j].TextureCoordinate.Y = source.Y + baseData[2].TextureCoordinate.Y;
        j++;

        // 3
        vertexBuffer[j].Position.X = scaledDestinationX + baseData[3].Position.X;
        vertexBuffer[j].Position.Y = scaledDestinationY + baseData[3].Position.Y;
        vertexBuffer[j].TextureCoordinate.X = source.X + baseData[3].TextureCoordinate.X;
        vertexBuffer[j].TextureCoordinate.Y = source.Y + baseData[3].TextureCoordinate.Y;
        j++;

        destination.X += destination.Width;
    }
}

TestAOTUnitTests.zip

@@ -209,14 +217,13 @@ public unsafe void End([NotNull] GraphicsContext graphicsContext)

//Draw the strings
var constantInfos = new RectangleF(GlyphWidth, GlyphHeight, DebugSpriteWidth, DebugSpriteHeight);
Span<VertexPositionNormalTexture> vertexPositionSpan = stackalloc VertexPositionNormalTexture[4* textInfo.Text.Length];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic expects mappedVertexBufferPointer to receive the data, you should create a span from that, so something like new Span<VertexPositionNormalTexture>(mappedVertexBufferPointer, amountOfItemsTheBufferContains);

Also, stackalloc should be used with care, I used it in the benchmark as it wasn't that important there, but if you don't know in advance how much bytes you're going to allocate (the string in this case might be very long), you may be filling the thread's stack resulting in a StackOverflowException see stackalloc

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, Thanks everyone for your help 😀. @Eideren Thanks, I've updated my branch now. When it comes to amountOfItemsTheBufferContains do you had in mind something like in line 214? (Unsafe.InitBlock..)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost, that's the amount of bytes, new Span<T>(void*, int)'s second argument is the amount of T or items in the collection, see here. So it should just be new((void*)mappedVertexBufferPointer, VertexBufferLength)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my 🤦 You're right. Updated now 😀

@Eideren Eideren merged commit d61d532 into stride3d:master Oct 10, 2023
@Eideren
Copy link
Collaborator

Eideren commented Oct 10, 2023

Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants