Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent Search Bug #393

Closed
whyfate opened this issue Jun 11, 2021 · 18 comments · Fixed by #407
Closed

Concurrent Search Bug #393

whyfate opened this issue Jun 11, 2021 · 18 comments · Fixed by #407
Labels

Comments

@whyfate
Copy link
Contributor

whyfate commented Jun 11, 2021

Describe the bug
Such as the title

To Reproduce
it's my test code

static void ConcurrentSearchTest()
{
    var client = new FhirClient("https://spark.incendi.no/fhir");
    // This request will return two resource: Encounter and Patient
    var hasResourceURL = "https://spark.incendi.no/fhir/Encounter?_id=2&_include=Encounter:subject";
    // This request has no resource returned
    var noResourceURL = "https://spark.incendi.no/fhir/Encounter?_id=22222222&_include=Encounter:subject";
    for (int i = 0; i < 500; i++)
    {
        var j = i;
        System.Threading.Tasks.Task.Run(async () =>
        {
            if (j % 2 == 0)
            {
                var bundle = (Bundle)await client.GetAsync(hasResourceURL);
                if (bundle.Entry.Count != 2)
                {
                    Console.WriteLine($"{j}---search has resource error");
                }
                else
                {
                    Console.WriteLine($"{j}---succeed");
                }
            }
            else
            {
                var bundle = (Bundle)await client.GetAsync(noResourceURL);
                if (bundle.Entry.Count > 0)
                {
                    Console.WriteLine($"{j}---search no resource error");
                }
                else
                {
                    Console.WriteLine($"{j}---success");
                }
            }
        });
    }
}

Expected behavior
all print {j}---success

Screenshots

image

@whyfate
Copy link
Contributor Author

whyfate commented Jun 11, 2021

I debugged the source code and found that the cause of the error was the concurrency problem in the SnapshotPaginationProvider

public class SnapshotPaginationProvider : ISnapshotPaginationProvider, ISnapshotPagination
{
private IFhirStore _fhirStore;
private readonly ITransfer _transfer;
private readonly ILocalhost _localhost;
private readonly ISnapshotPaginationCalculator _snapshotPaginationCalculator;
private Snapshot _snapshot;
public SnapshotPaginationProvider(IFhirStore fhirStore, ITransfer transfer, ILocalhost localhost, ISnapshotPaginationCalculator snapshotPaginationCalculator)
{
_fhirStore = fhirStore;
_transfer = transfer;
_localhost = localhost;
_snapshotPaginationCalculator = snapshotPaginationCalculator;
}
public ISnapshotPagination StartPagination(Snapshot snapshot)
{
_snapshot = snapshot;
return this;
}

the Call chain is:
AsyncFhirService -> PagingService -> SnapshotPaginationProvider
Although SnapshotPaginationProvider is injected Transient, because AsyncFhirService is a Singleton, SnapshotPaginationProvider is also a Singleton.
private async Task<FhirResponse> CreateSnapshotResponseAsync(Snapshot snapshot, int pageIndex = 0)
{
var pagingExtension = FindExtension<IPagingService>();
if (pagingExtension == null)
{
var bundle = new Bundle
{
Type = snapshot.Type,
Total = snapshot.Count
};
var resourceStorage = FindExtension<IResourceStorageService>();
bundle.Append(await resourceStorage.GetAsync(snapshot.Keys).ConfigureAwait(false));
return _responseFactory.GetFhirResponse(bundle);
}
else
{
var pagination = await pagingExtension.StartPaginationAsync(snapshot).ConfigureAwait(false);
var bundle = await pagination.GetPageAsync(pageIndex).ConfigureAwait(false);
return _responseFactory.GetFhirResponse(bundle);
}
}

At present, I have changed AsyncFhirService to Transient to solve the problem temporarily, but I don't think this is the best way.Maybe need to refactor FhirService.

@kennethmyhra
Copy link
Collaborator

kennethmyhra commented Jun 14, 2021

It was singleton in good old classic Spark (.NET FW), and probably therefore it is singleton in the .NET Core/Standard version of Spark. Actually I do not see a good reason for AsyncFhirService to be a singleton and I think we should consider making it transient instead.

@whyfate
Copy link
Contributor Author

whyfate commented Jun 15, 2021

But creating an AsyncFhirService is time consuming.

public AsyncFhirService(
IFhirServiceExtension[] extensions,
IFhirResponseFactory responseFactory, // TODO: can we remove this dependency?
ITransfer transfer,
ICompositeServiceListener serviceListener = null) // TODO: can we remove this dependency? - CCR
{
_responseFactory = responseFactory ?? throw new ArgumentNullException(nameof(responseFactory));
_transfer = transfer ?? throw new ArgumentNullException(nameof(transfer));
_serviceListener = serviceListener ?? throw new ArgumentNullException(nameof(serviceListener));
foreach (var serviceExtension in extensions)
{
AddExtension(serviceExtension);
}
}

This has a effect on Performance.

@kennethmyhra
Copy link
Collaborator

Yes, that is a concern. I have not measured the effect it would have on performance, but yes that would be the concern.

@one0410
Copy link

one0410 commented Jul 8, 2021

Hi,
I have the same problem.
If I query API at almost the same time, sometimes I got weird response different from what I expected.
So that I couldn't put this project into a production environment.

For example, I have 20 Device resource in DB.
If I query identifier 000000000001, I expect a result with same identifier.
Please see the following code, sometimes I got a random result.
Please help.

  testFhir(): void {
    for (let i = 0; i <20; i++) {
      this.http.get('https://esp32fhir.azurewebsites.net/fhir/Device?identifier=' + i.toString().padStart(12, '0'))
      .toPromise()
      .then((res: any) => {
        console.log(i, res.entry[0].resource);
      });
    }
  }

image

Andy

@one0410
Copy link

one0410 commented Jul 8, 2021

Hi @whyfate,

Could you tell me how you changed AsyncFhirService to Transient to solve the problem temporarily ?

Andy

@whyfate
Copy link
Contributor Author

whyfate commented Jul 8, 2021

@one0410 I modified the source code.

services.TryAddSingleton<IAsyncFhirService, AsyncFhirService>();

services.TryAddTransient<IAsyncFhirService, AsyncFhirService>();

@kennethmyhra
Copy link
Collaborator

@whyfate how did that affect the performance?

@one0410
Copy link

one0410 commented Jul 8, 2021

@whyfate You are right, the performance is terrible.
@kennethmyhra Is this what you want ?

singleton:
image

transient:
image

@kennethmyhra
Copy link
Collaborator

@one0410, absolutely not. I haven't yet had time to look into the root cause of this. If any of you have time to do that it would be much appreciated. Anyways, I'll put it into the current sprint for the next release

@kennethmyhra kennethmyhra added this to To do in Release v1.5.7 via automation Jul 8, 2021
@whyfate
Copy link
Contributor Author

whyfate commented Jul 8, 2021

I have an immature suggestion, which is to pass snapshot into the parameter of this method.

Bundle GetPage(int? index = null, Action<Entry> transformElement = null);

Bundle GetPage(Snapshot snapshot,int? index = null, Action<Entry> transformElement = null); 

Replacing SnapshotPaginationProvider internal variables _snapshot with parameters

private Snapshot _snapshot;
public SnapshotPaginationProvider(IFhirStore fhirStore, ITransfer transfer, ILocalhost localhost, ISnapshotPaginationCalculator snapshotPaginationCalculator)
{
_fhirStore = fhirStore;
_transfer = transfer;
_localhost = localhost;
_snapshotPaginationCalculator = snapshotPaginationCalculator;
}
public ISnapshotPagination StartPagination(Snapshot snapshot)
{
_snapshot = snapshot;
return this;
}
public Bundle GetPage(int? index = null, Action<Entry> transformElement = null)
{
if (_snapshot == null)
throw Error.NotFound("There is no paged snapshot");
if (!_snapshot.InRange(index ?? 0))
{
throw Error.NotFound(
"The specified index lies outside the range of available results ({0}) in snapshot {1}",
_snapshot.Keys.Count(), _snapshot.Id);
}
return CreateBundle(index);
}

But emmm...bad code warning!!!

@kennethmyhra
Copy link
Collaborator

I've done some research and can confirm that I am seeing the error as well. I tried putting a SemaphoreSlim around the code in AsyncFhirService.SearchAsync (as a hot fix), it helps, but it slows it down since it forces the requests to run synchronously one-by-one.

I need to try and reproduce this with FhirServiceService to see if we had this problem before we added AsyncFhirService

@whyfate
Copy link
Contributor Author

whyfate commented Jul 9, 2021

This problem is Snapshot design, which has existed for a long time, not async.

@kennethmyhra
Copy link
Collaborator

This is fixed in both STU3 and R4. If you are compiling from source, please verify by pulling. If you are using the NuGet packages I will create a new release today
STU3 - https://www.nuget.org/packages/Spark.Engine.STU3
R4 - https://www.nuget.org/packages/Spark.Engine.R4

Summarized I split ISnapshotPaginationProvider and ISnapshotPagination into two implementations making sure as @whyfate pointed out that Snapshot is not affected by the singleton behavior of ISnapshotPaginationProvider. There is a more lengthy explanation in the commit/PR

@kennethmyhra kennethmyhra removed their assignment Jul 9, 2021
@one0410
Copy link

one0410 commented Jul 10, 2021

Thank you very much. This does help me a lot. You are really amazing.

@whyfate
Copy link
Contributor Author

whyfate commented Jul 12, 2021

@kennethmyhra Hi,I saw your implementation,I don't think it's good to directly a new SnapshotPaginationService.
How about this?

services.AddTransient<ISnapshotPaginationProvider, SnapshotPaginationProvider>();
public class PagingService : IPagingService
{
    private readonly ISnapshotStore _snapshotstore;
    private readonly IServiceProvider _serviceProvider;

    public PagingService(ISnapshotStore snapshotstore, IServiceProvider serviceProvider)
    {
        _snapshotstore = snapshotstore;
        _serviceProvider = serviceProvider;
    }

    public ISnapshotPagination StartPagination(Snapshot snapshot)
    {
        return Task.Run(() => StartPaginationAsync(snapshot)).GetAwaiter().GetResult();
    }

    public ISnapshotPagination StartPagination(string snapshotkey)
    {
        return Task.Run(() => StartPaginationAsync(snapshotkey)).GetAwaiter().GetResult();
    }

    public async Task<ISnapshotPagination> StartPaginationAsync(Snapshot snapshot)
    {
        if (_snapshotstore != null)
        {
            await _snapshotstore.AddSnapshotAsync(snapshot).ConfigureAwait(false);
        }
        else
        {
            snapshot.Id = null;
        }

        var snapshotPaginationProvider = _serviceProvider.GetRequiredService<ISnapshotPaginationProvider>();
        return snapshotPaginationProvider.StartPagination(snapshot);
    }
    public async Task<ISnapshotPagination> StartPaginationAsync(string snapshotkey)
    {
        if (_snapshotstore == null)
        {
            throw new NotSupportedException("Stateful pagination is not currently supported.");
        }

        var snapshotPaginationProvider = _serviceProvider.GetRequiredService<ISnapshotPaginationProvider>();
        return snapshotPaginationProvider.StartPagination(await _snapshotstore.GetSnapshotAsync(snapshotkey).ConfigureAwait(false));
    }
}

@kennethmyhra
Copy link
Collaborator

kennethmyhra commented Jul 12, 2021

@whyfate the plan is to do this in the next version. While I agree with you - right now the only downside is that you can't make your own implementation of the interface ISnapshotPagination which was the interface touched in this iteration

I also see now that you are referring to some other interface (IPagingService) and implementation (PagingService) which was not touched in this iteration. The whole infrastructure around snapshot pagination seems messy and unfinished which I also noticed when I started touching this part and therefore did not put too much effort into the infrastructure part at this point

@whyfate
Copy link
Contributor Author

whyfate commented Jul 12, 2021

I see,thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants