Port corehost to QNX7 #33374

Open
guesshe opened this issue Mar 9, 2020 · 120 comments
Labels: area-Meta, question

@guesshe

guesshe commented Mar 9, 2020

Hi,

I am trying to port the entire runtime to qnx7 platform on x64 arch. I am able to build coreclr but it won't run unless I have dotnet executable built. Any suggestions on how to build corehost for qnx?

Dotnet-GitSync-Bot added the area-Infrastructure-coreclr and untriaged labels Mar 9, 2020
jkotas added the question and area-Host labels and removed the area-Infrastructure-coreclr label Mar 9, 2020
@jkotas
Member

jkotas commented Mar 9, 2020

Any suggestions on how to build corehost for qnx?

The same way as coreclr? It lives under https://github.com/dotnet/runtime/tree/master/src/installer/corehost

@guesshe
Author

guesshe commented Mar 9, 2020

How about the .nuget packages downloaded for specific RID? I used this repo https://github.com/dotnet/core-setup/tree/v2.2.8, when I tried on linux, it pulls down some .nuget files for linux platform, but I don't have these files for QNX to pull down.

@jkotas
Member

jkotas commented Mar 9, 2020

You may want to build it from the dotnet/runtime repo. dotnet/runtime has everything together, which avoids the issues with publishing and downloading packages between repos.

@guesshe
Author

guesshe commented Mar 9, 2020

@jkotas Oh. Thanks! Shall I start with all subprojects, or should coreclr and corehost be enough for me?

@jkotas
Member

jkotas commented Mar 9, 2020

You can start with src\coreclr, src\libraries\Native and corehost, and get the managed libraries from another Unix flavor.

@guesshe
Author

guesshe commented Mar 9, 2020

Thanks! By saying managed libraries, do you mean the .dll libraries?

@jkotas
Member

jkotas commented Mar 9, 2020

Right

@guesshe
Author

guesshe commented Mar 9, 2020

@jkotas I tried .NET Core 5.0.0-dev on Linux and it builds a dotnet binary under the artifacts directory, but when I tried to execute it, it gave me the error "A fatal error occurred. The folder [/home/<user_dir>/Github/runtime/artifacts/obj/linux-x64.Debug/cli/dotnet/host/fxr] does not exist". This is the same error I got when I tried the v2.2.8 version of corehost on Linux. If I download the CLI tar file and untar it, it has a host sub-directory. What did I miss? Is the built dotnet directly executable, or do I have to do some post-processing?

@jkotas
Member

jkotas commented Mar 9, 2020

obj is the directory for intermediate build files. It does not have the right directory layout.

Try the one under bin, e.g. artifacts/bin/testhost/netcoreapp5.0-linux-Debug-x64

@guesshe
Author

guesshe commented Mar 10, 2020

@jkotas Thanks! I will try it out and let you know the progress.

@guesshe
Author

guesshe commented Mar 11, 2020

@jkotas Can I publish my app to netcore sdk 5.0.0-dev? Or the other way around, can I build dotnet/runtime for sdk version 3? The following command builds a dotnet executable, but it lacks the host folder and can't run from there. It also doesn't build the artifacts/bin/testhost folder.
/home//Github/runtime/src/installer/corehost/build.sh Debug x64 -apphostver "2.1.802" -hostver "2.1.802" -fxrver "2.1.802" -policyver "2.1.802" -commithash "fc2e56c8e8d60180d9ca6ddff67076d779fd4a43"

@jkotas
Member

jkotas commented Mar 11, 2020

What typically works best for initial bring-ups like this is to publish a standalone app (e.g. using dotnet publish -r linux-x64) and then overwrite the binaries with what you have built.
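A minimal sketch of that overwrite step, assuming a self-contained publish directory (such as the output of dotnet publish -r linux-x64) and a separate directory holding the locally built native binaries. The function name and both directory arguments are hypothetical and not part of any dotnet tooling; the sketch only replaces files whose names already exist in the published app.

```python
import shutil
from pathlib import Path


def overlay_native_binaries(built_dir, publish_dir, suffixes=(".so",)):
    """Copy locally built binaries over same-named files in a published app.

    built_dir and publish_dir are hypothetical placeholders. Only files that
    already exist in publish_dir are replaced, so nothing new is added.
    Returns the sorted names of the files that were overwritten.
    """
    built_dir, publish_dir = Path(built_dir), Path(publish_dir)
    replaced = []
    for built in built_dir.iterdir():
        target = publish_dir / built.name
        # Replace only regular files with a matching suffix and a counterpart
        # in the published layout.
        if built.is_file() and built.suffix in suffixes and target.is_file():
            shutil.copy2(built, target)
            replaced.append(built.name)
    return sorted(replaced)
```

The returned list makes it easy to see which published binaries were left untouched and still come from the original platform.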

@guesshe
Author

guesshe commented Mar 11, 2020

@jkotas Thanks! I tried replacing the dotnet executable with my own built version of 5.0.0-dev and it seems to be working. So I think my next step is to build QNX versions of the following shared libraries and replace them, am I correct? Do I really need libuv.so and libe_sqlite3.so? They are under AspNet, not NetCore.
./shared/Microsoft.AspNetCore.All/2.2.8/libuv.so
./shared/Microsoft.AspNetCore.All/2.2.8/libe_sqlite3.so
./shared/Microsoft.NETCore.App/2.2.8/libhostpolicy.so
./shared/Microsoft.NETCore.App/2.2.8/System.Native.so
./shared/Microsoft.NETCore.App/2.2.8/libmscordbi.so
./shared/Microsoft.NETCore.App/2.2.8/libmscordaccore.so
./shared/Microsoft.NETCore.App/2.2.8/libcoreclr.so
./shared/Microsoft.NETCore.App/2.2.8/System.IO.Compression.Native.so
./shared/Microsoft.NETCore.App/2.2.8/System.Security.Cryptography.Native.OpenSsl.so
./shared/Microsoft.NETCore.App/2.2.8/libsos.so
./shared/Microsoft.NETCore.App/2.2.8/libcoreclrtraceptprovider.so
./shared/Microsoft.NETCore.App/2.2.8/libsosplugin.so
./shared/Microsoft.NETCore.App/2.2.8/System.Globalization.Native.so
./shared/Microsoft.NETCore.App/2.2.8/libclrjit.so
./shared/Microsoft.NETCore.App/2.2.8/System.Net.Http.Native.so
./shared/Microsoft.NETCore.App/2.2.8/libdbgshim.so
./shared/Microsoft.NETCore.App/2.2.8/System.Net.Security.Native.so
./host/fxr/2.2.8/libhostfxr.so

@jkotas
Member

jkotas commented Mar 11, 2020

Do I really need libuv.so and libe_sqlite3.so?

It depends on the ASP.NET Core you are planning to use, and how you plan to configure it.

@am11
Member

am11 commented Mar 11, 2020

libuv is not required for ASP.NET Core (it is an optional provider for KestrelHttpServer, primary backend is .NET's own managed sockets).
libe_sqlite3 (which comes from https://github.com/ericsink/SQLitePCL.raw) is required only when EntityFramework Core is used with SQLite provider.

@guesshe
Author

guesshe commented Mar 12, 2020

@am11 @jkotas Thanks!

@guesshe
Author

guesshe commented Mar 12, 2020

Any idea how this shared library is built? ./shared/Microsoft.NETCore.App/2.2.8/System.Net.Http.Native.so. I didn't find it after running src/libraries/Native/build-native.sh

@jkotas
Member

jkotas commented Mar 12, 2020

This library no longer exists in dotnet/runtime repo.

@guesshe
Author

guesshe commented Mar 12, 2020

@jkotas Thanks! I will work on the rest then.

@guesshe
Author

guesshe commented Mar 12, 2020

For the managed libraries (.dll), can I reuse 2.2.8 sdk version? Only replacing .so and .a libraries with my own built version.

@jkotas
Member

jkotas commented Mar 12, 2020

You are likely going to run into mismatches when combining 2.2.8 managed libraries with latest native binaries from dotnet/runtime

@guesshe
Author

guesshe commented Mar 13, 2020

I am able to build corehost but got an ELF error while executing it on a QNX device. I am debugging why it happened.

@guesshe
Author

guesshe commented Mar 13, 2020

@jkotas Is netcore 5 sdk available to try out?

@jkotas
Member

jkotas commented Mar 13, 2020

Yes, you can download the daily builds at https://github.com/dotnet/core-sdk#installers-and-binaries

@guesshe
Author

guesshe commented Mar 13, 2020

@jkotas Thanks!

@guesshe
Author

guesshe commented Apr 24, 2020

@janvorli @wfurt @jkotas With the help of our kernel developers, we managed to fix this crash and another stack issue. Now it proceeded to a point that looks very promising.

./corerun -c /lib hello_world_dotnet_core_qnx_netcore5_0.dll

coreclr_initialize failed - status: 0x80004005
By reading porting notes from @wfurt, I downloaded netcore 5 sdk 5.0 using snap and published to netcoreapp5.0 targetframework. However, I still got the same issue.
The commit I checked out from master is 62112b0.
Any suggestions? Is this because I am on master not on the preview branch?

@wfurt
Member

wfurt commented Apr 24, 2020

That maps to E_FAIL and there are many places where this can fail. You can try to set COREHOST_TRACE=1 and check if that provides any hints. (I assume you disabled r2r, right?)
I don't think the branch matters.

@guesshe
Author

guesshe commented Apr 24, 2020

@wfurt Thanks! What is r2rm? Does this failure mean the cruntime is passed?

@wfurt
Member

wfurt commented Apr 24, 2020

There was a typo. R2R -> Ready To Run. With crossgen, we may put in native bits to make startup faster. Because of that, you may not be able to simply copy assemblies targeted for another platform. It should work for the hello world app, but I'm wondering how you got the BCL assemblies.
Back then, I used COMPlus_ZapDisable=1 and COMPlus_ReadyToRun=0 when trying to use Linux assemblies on FreeBSD. @janvorli or @jkotas may know better if that is still applicable.

@guesshe
Author

guesshe commented Apr 24, 2020

@wfurt Is that an environment variable? I don't recall I set that. For BCL assemblies, I plan to upload the built tools and source code to target and build from there directly instead of cross-compiling.

@wfurt
Member

wfurt commented Apr 24, 2020

yes, environment. I'm not quite sure what you mean by the previous post. In order to build assemblies you need to have a working dotnet CLI, and the C# compiler is written (mostly) in C#.
corerun cannot function without System.Private.CoreLib.dll (and perhaps others), so the question is how did you get one?

@guesshe
Author

guesshe commented Apr 24, 2020 via email

@wfurt
Member

wfurt commented Apr 24, 2020

sure, ping me with details: tweinfurt at yahoo.
I don't think your test is valid. You can try the steps on Linux (or other supported platform)

@guesshe
Author

guesshe commented Apr 29, 2020

@wfurt @janvorli We are trying to debug this 0x80004005 error; the trace output follows. It looks like it failed to load System.Private.CoreLib.dll. The trace is trimmed and formatted to be easier to read. Is System.Private.CoreLib.dll mandatory in order to run an empty main function? My hello_world app only has one line: "static void Main(string[] args) {}".
Starting
corhost.cpp - CorRuntimeHostBase::Start
ceemain.cpp - EnsureEEStarted - g_fEEShutDown==0
ceemain.cpp - EEStartup - InitializeClrNotifications - status==0000000000
ceemain.cpp - EEStartup - InitializeJITNotificationTable - status==0000000000
ceemain.cpp - EEStartup - Initialize - status==0000000000
ceemain.cpp - EEStartupHelper - start
ceemain.cpp - EEConfig::Setup - start
ceemain.cpp - EEConfig::Setup - done
ceemain.cpp - InitializeStratupFlags - done
ceemain.cpp - PAL_SetShutdownCallback - done
ceemain.cpp - InitializeLogging - done
ceemain.cpp - EnsureRtlFunctions - done
ceemain.cpp - g_pConfig->sync - done
ceemain.cpp - InitializeSpinConstants - done
ceemain.cpp - InitializeStubManagers - done
ceemain.cpp - Stubs - done
ceemain.cpp - Inits - done
rcthread.cpp - DebuggerRCTthread started
m_thread!=NULL, hr==0000000000
rcthread.cpp - Thread created: hr==0000000000
rcthread.cpp - Done: hr==0000000000
ceemain.cpp - InitializeDebugger - done
ceemain.cpp - Profiling service - hr==0000000000 - done
ceemain.cpp - InitPreStubManager - done
ceemain.cpp - g_pGCHeap->Initialize - hr==0000000000 - done
ceemain.cpp - SystemDomain debugging - done
ceemain.cpp - MethodDesc::Init - start
ceemain.cpp - MethodDesc::Init - done
ceemain.cpp - SD Init - start
appdomain.cpp - Init - start
appdomain.cpp - LOG - done
appdomain.cpp - ZapDisable - done
appdomain.cpp - GetInternalSystemDirectory - hr==0x8007007a - done
appdomain.cpp - GetInternalSystemDirectory(buffer) - hr==0x8007007a - done
appdomain.cpp - LoadBaseSystemClasses - start
appdomain.cpp - LoadBaseSystemClasses - start
appdomain.cpp - ETWOnStartup - done
appdomain.cpp - OpenSystem - start
pefile.cpp - OpenSystem - start
pefile.cpp - DoOpenSystem - start
pefile.cpp - ETWOnStartup - start
pefile.cpp - ETWOnStartup - done
pefile.cpp - BindToSystem - start
appdomain.hpp - SystemDirectory is /
coreclrbindercommon.cpp - AssemblyBinder::BindToSystem - start
assemblybinder.cpp - GetAssembly - sCoreLib==/home/qnxuser/ - start
assemblybinder.cpp - AssemblyBinder::GetAssembly - start
Assembly path is /
coreassemblyspec.cpp - BinderAcquirePEimage - start
coreassemblyspec.cpp - OpenImage - start
coreassemblyspec.cpp - TryOpenFile - start
peimage.cpp - TryOpenFile - m_path==/home/qnxuser/System.Private.CoreLib.dll
coreassemblyspec.cpp - TryOpenFile - done - hr==0x80070002
AssemblyBinder::BindToSystem - done - hr==0x80070002
ceemain.cpp - CATCH - done
ceemain.cpp - if !FAILED - hr==0000000000 - done
ceemain.cpp - EEStartup - EEStartupHelper - status==0x80004005
ceemain.cpp:327 - g_EEStartupStatus==0x80004005
corhost.cpp - Done - hr==0x80004005
Start: 0x80004005
coreclr_initialize failed - status: 0x80004005
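A hand-instrumented trace like this one can be scanned mechanically for the steps that returned a failure code. The sketch below is only an illustration: it assumes the "name==value" format seen in the lines above, treats every value as hexadecimal (with or without a 0x prefix), and flags values whose top bit is set. The names FIELD and failing_steps are hypothetical.

```python
import re

# Matches "name==value" where value is hex, with or without a 0x prefix,
# e.g. "hr==0x80070002" or "status==0000000000". Paths such as
# "m_path==/home/..." do not match because "/" is not a hex digit.
FIELD = re.compile(r"(\w+)==((?:0x)?[0-9a-fA-F]+)\b")


def failing_steps(trace_lines):
    """Return (line, value) pairs for lines whose first field is a failure."""
    failures = []
    for line in trace_lines:
        match = FIELD.search(line)
        if match is None:
            continue
        value = int(match.group(2), 16)
        if value & 0x80000000:  # top (severity) bit set means failure
            failures.append((line.strip(), value))
    return failures
```

Applied to the trace above, this would pick out the GetInternalSystemDirectory lines (0x8007007a), the TryOpenFile and BindToSystem lines (0x80070002), and the final 0x80004005 lines.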

@janvorli
Member

The error 0x80070002 means "File not found". Is it possible that there is some access problem to the /home/qnxuser/System.Private.CoreLib.dll?

@janvorli
Member

Btw, error codes starting with 0x8007 represent Windows error codes. The lowest 16 bits of the code contain the Windows error code. These Windows error codes are described here: https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
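As a rough illustration of that decoding, the sketch below splits a 32-bit HRESULT into its severity bit, facility and 16-bit code. The helper names are hypothetical, and the name table is deliberately limited to the two Windows error codes that appear in the trace in this thread.

```python
FACILITY_WIN32 = 7  # HRESULTs of the form 0x8007xxxx wrap a Windows error code

# Only the codes seen in this thread; not a complete table.
WIN32_NAMES = {2: "ERROR_FILE_NOT_FOUND", 122: "ERROR_INSUFFICIENT_BUFFER"}


def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    failed = bool(hr & 0x80000000)   # bit 31: severity
    facility = (hr >> 16) & 0x7FF    # bits 16-26: facility
    code = hr & 0xFFFF               # bits 0-15: error code
    return failed, facility, code


def describe(hr):
    """Give a short description of an HRESULT, naming known Windows codes."""
    _, facility, code = decode_hresult(hr)
    if facility == FACILITY_WIN32:
        return WIN32_NAMES.get(code, f"Win32 error {code}")
    return f"facility {facility}, code 0x{code:04x}"
```

For example, describe(0x80070002) yields ERROR_FILE_NOT_FOUND, while 0x80004005 (E_FAIL) has facility 0 and therefore does not carry a Windows error code.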

@guesshe
Author

guesshe commented Apr 29, 2020

@janvorli I don't have this managed library built. I only have libcoreclr.so. Based on previous posts in this thread, I had a feeling I don't need managed libraries to test basic PAL functionalities. Following is quoted from previous posts.

"the first thing that was done was to pass all platform abstraction layer (PAL) tests, which exercise the CRT functions used by the runtime: https://github.com/dotnet/runtime/blob/59be94b69845ecfbd5a694483c2a4853e99cc64b/docs/workflow/testing/coreclr/unix-test-instructions.md#pal-tests

and then run a simple hello world app using corerun (a basic host that complies with the runtime): https://github.com/dotnet/runtime/blob/7d67d17a9f49ad5f365467fcd3bf0d25f2b9349a/docs/workflow/building/coreclr/linux-instructions.md

iff we get this far, then run the coreclr tests, see src/coreclr/build-test.sh"

I tried a Linux version of corerun and libcoreclr.so, it doesn't give me an error looking for System.Private.CoreLib.dll. Did I misunderstand something in the instructions above?

@janvorli
Member

The part that tests the PAL is the PAL test suite that you've run before. That's the only part of the testing that doesn't run managed code.
The corerun is a tool to run managed applications. So it requires System.Private.CoreLib.dll and other managed assemblies (depending on what your hello world managed app needs).
I assume the Linux version didn't fail because the System.Private.CoreLib.dll is present.

@guesshe
Author

guesshe commented Apr 29, 2020

@janvorli I don't recall I put the System.Private.CoreLib.dll in the same directory as libcoreclr.so, maybe it also searches for other locations? May I use a Linux-version of System.Private.CoreLib.dll to see if it works? If not, how can I build a QNX-version of System.Private.CoreLib.dll?

@janvorli
Member

Yes, you can use the Linux version, it should just work (provided it is built from exactly the same state of the source tree as the libcoreclr.so that you've built for QNX and it is the same build flavor - you cannot combine Release build of libcoreclr.so with Debug or Checked build of System.Private.CoreLib.dll and vice versa).

@guesshe
Author

guesshe commented Apr 29, 2020

@janvorli Thanks! I will give it a try. By the same state, do you mean it should come from the same commit? Or a similar one? What errors could it give if they are from different commits? I would prefer to actually build it for QNX but it doesn't seem to support cross-compiling. I might have to upload the source code to QNX directly and run the build from there.

@janvorli
Member

I mean the same commit. There are shared data structures between libcoreclr.so and System.Private.CoreLib.dll, so any change in the layout of those structures would break things. Trying to use commits close to each other might work, but it is not worth investigating the problems that may result.

@wfurt
Member

wfurt commented Apr 29, 2020

also debug/release needs to match, right? (at least it did in the past, when a release System.Private.CoreLib.dll did not work with a debug coreclr)

@guesshe
Author

guesshe commented Apr 29, 2020

@janvorli @wfurt Thanks! I will try it out and let you know the result.

@janvorli
Member

also debug/release needs to match, right?

Yes, I've mentioned that in a comment above.

@guesshe
Author

guesshe commented Apr 30, 2020

@janvorli It seems we still have an issue with the Linux version of System.Private.CoreLib.dll. Any idea what this error means? The new error is that the PE image file is not in native machine format.

@janvorli
Member

Can you please set the following env variables and try again? This should let the runtime load only the IL code from the System.Private.CoreLib.dll and not the already precompiled binary code that is likely causing the trouble.

COMPlus_ZapDisable=1
COMPlus_ReadyToRun=0
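One way to apply these two settings to a single run without changing the surrounding shell is to launch the host from a small wrapper. This is only a sketch: the helper names are hypothetical, and the command passed in (for example a corerun invocation) is whatever you were already running.

```python
import os
import subprocess


def il_only_env(base=None):
    """Return a copy of the environment with the two variables above set."""
    env = dict(os.environ if base is None else base)
    env["COMPlus_ZapDisable"] = "1"
    env["COMPlus_ReadyToRun"] = "0"
    return env


def run_with_il_only(command):
    """Run command (a list of strings) with those variables set.

    Returns the exit code of the child process.
    """
    return subprocess.run(command, env=il_only_env()).returncode
```

With the invocation quoted earlier in this thread, a call would look like run_with_il_only(["./corerun", "-c", "/lib", "hello_world_dotnet_core_qnx_netcore5_0.dll"]).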

@janvorli
Member

janvorli commented May 6, 2020

@guesshe, it was discovered that the COMPlus_ZapDisable handling was accidentally disabled for some time and fixed four days ago in #35741. I'm not sure what state of the repository you are using, but you'll likely need that fix to be able to load the System.Private.CoreLib.dll built on Linux. You can easily port that change to any state of the repository as it just removes an #ifdef around getting the option related to that env variable.

@guesshe
Author

guesshe commented May 6, 2020 via email

@am11
Member

am11 commented May 30, 2020

/x86_64/usr/bin/x86_64-pc-nto-qnx7.0.0-ld: ../../../pal/src/libcoreclrpal.a(context2.S.o): relocation R_X86_64_PC32 against symbol `CONTEXT_CaptureContext' can not be used when making a shared object; recompile with -fPIC

I was also getting this error when compiling coreclr's superpmi project with illumos sysroot on Ubuntu 18.04. I was using gcc v8.4.0 and binutils v2.25.1, both built for illumos target. The fix was to upgrade binutils to v2.33.1, without code modifications in coreclr. It was due to an upstream bug in binutils's assembler (as) or archiver (ar), which was fixed around v2.29-v2.30.

joperezr removed the untriaged label Jul 7, 2020
joperezr added this to the Future milestone Jul 7, 2020
@karthikshanmugam

@guesshe Can you please tell me if you get the corehost to work?
