Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AssimpImporter: import unrecognized materials + compatibility with 5.1.1 #116

Closed
wants to merge 48 commits into from

Conversation

pezcode
Copy link
Contributor

@pezcode pezcode commented Nov 25, 2021

👋 It is finally finished

This PR does a few things:

Raw material import

I took the raw material import from #91, fixed a few things and added tests. AssimpImporter now imports unrecognized material properties with their raw names and types. If a type is unsupported, it gets exported as an opaque buffer inside MaterialAttributeType::String. This default behaviour can be turned off with the ignoreUnrecognizedMaterialData option.
If this behaviour is always wanted, the forceRawMaterialData option has you covered. That last one is only really useful for tests I suppose.

There is some rudimentary documentation, but that could probably be extended a bit.

Compatibility with 5.1.1

As always, this new version of assimp broke a dozen things. It's kind of what happens when for two years, every PR gets insta-merged with no useful code review or tests, and obviously nobody uses this rickety library on latest master. Non-exhaustive list of disappointments:

  • FBX animations have incorrect timescales, needs to be detected by version and patched
  • Collada light import is broken in 5.1.0, but fixed in 5.1.1 a few days later
  • The Collada importer now uses mesh ids as the names, for no good reason. No other object type does.
  • if no importer reports that it can read a file, there is now X3DImporter to the rescue, that always reports that it can read a file. But that's not all, when you pass it any non- or invalid X3D file, it crashes!
  • the glTF importer used to import the last set of joint indices and weights, but they upgraded it to fail with a validation error for files with multiple sets of join data
  • the glTF importer now requires the global scene property, which is of course not mandatory in the glTF spec

I don't know if these are worth documenting. Let me know what you consider important. Cataloguing every single assimp bug in the docs makes me shudder.

Missing data

A few things were not previously imported, either because they were forgotten, or because they were added in a newer version of assimp or magnum. These things are now imported:

  • lights:
    • import names
    • ambient lights are now imported as LightData::Type::Ambient, not Point
  • cameras:
    • import names
    • import aspect ratio
    • import orthographic cameras, added in assimp 5.1.0
  • import scene name
    • added in assimp 5.0.0
    • there can only be one scene, and this currently only works in glTF

Comparing for equality used to be fine, since aiProcess_SortByPType made sure that only one flag was ever set. But assimp 5.1.0 now always sets a new aiPrimitiveType_NGONEncodingFlag for triangulated meshes, leading to an "unsupported aiPrimitiveType" error.
This needs minor changes to TinyGltfImporter test files, coming in the next commit. The last remaining failing test is an Assimp bug, fix follows in another commit.

We'll ignore 5.1.0 since there are only a few days in between and it's not worth special-casing the Collada light test because of this.
Assimp's glTF importer needs a default scene to load a file at all, and cameras need to be referenced by nodes to be imported
This got broken sometime after 5.1.0. Similarly to glTF mTicksPerSecond patching, this emits a warning if detected.
Largely taken from mosra#91

Anything unrecognized is imported with the raw Assimp name and type. Can be turned off with ignoreUnrecognizedMaterialData, or overridden to import everything raw with forceRawMaterialData. The latter implies ignoreUnrecognizedMaterialData=false.

For existing material tests we use ignoreUnrecognizedMaterialData, otherwise they'll break with every new Assimp version.

TODO: tests, docs
Not too sure about the bool because converting from a buffer kinda defies the definition of "raw"
Just a few things I stumbled upon while working on this
This seems to be needed in project scope, even if we inject enable_language(C) into the project that's being try_compile'd
@mosra mosra added this to the 2021.0a milestone Nov 25, 2021
@pezcode
Copy link
Contributor Author

pezcode commented Nov 25, 2021

I ran into one issue I hadn't considered, and one reason CI is failing:

The aiTextureType enums are used in a big switch to generate material attribute keys. This needs some kind of compile-time guarding to not use unknown values on older versions. I can do that easily with an #if for types added in 5.0.0, but for 5.1.0 it's already tricky, and this will be a maintenance nightmare for future versions adding more texture types.

There's a somewhat decent alternative:
Assimp 5.1.0 added const char *TextureTypeToString(enum aiTextureType) which means we can detect if that function is there and use that instead. The problem is that the naming is inconsistent, right now it's SNAKE_CASE, but TextureTypeToString produces CamelCase. Do we care about this? If yes, and we decide to patch the names up, which version is better? I think CamelCase fits better with built-in MaterialAttribute names.

Copy link
Owner

@mosra mosra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, as always 👍

There's one commit marked as [wip], is it still WIP or just a stale message?

src/MagnumPlugins/AssimpImporter/AssimpImporter.cpp Outdated Show resolved Hide resolved
the 5.0.1 code confirms that no importer sets attributes
starting uppercase, so an assert should be fine here.
Revisit if this breaks for someone. */
/** @todo $ conflicts with Magnum's material layer names */
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, I didn't fix this yet? 😅

src/MagnumPlugins/AssimpImporter/AssimpImporter.cpp Outdated Show resolved Hide resolved
src/MagnumPlugins/AssimpImporter/AssimpImporter.h Outdated Show resolved Hide resolved
@@ -70,6 +70,8 @@ endif()
# temporary CMakeList.txt created by try_compile:
# https://github.com/Kitware/CMake/blob/v3.4.0/Source/cmCoreTryCompile.cxx#L313
file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/checkAssimpVersionLanguageOverride.cmake "enable_language(C)")
# For some reason, this also needs C to be enabled in this scope
enable_language(C)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. Gotta admit I don't like this part at all :/ I'm wondering if I could expand the above static library workaround to not need to link to the assimp target ever, sidestepping the problem with a C99 feature enabled on it.

@mosra
Copy link
Owner

mosra commented Nov 25, 2021

compile-time guarding to not use unknown values on older versions

Would defining our own alternative enum directly in the plugin source work? This of course assumes the IDs don't change across Assimp versions. What I want to still have, however, is GCC/Clang notifying me if a new Assimp version adds more values to the enum... but according to my testing, this should work:

enum class AssimpTextureType { // alternative naming
    ...
    CLEARCOAT = 20,
    TRANSMISSION = 21,
    ...
};

switch(semantic) { // `semantic` is still an aiTextureType to make GCC/Clang check for unhandled values
    ...
    case aiTextureType(int(AssimpTextureType::CLEARCOAT)): // cast so it doesn't warn about mismatched types
        ...
}

I think CamelCase fits better with built-in MaterialAttribute names.

I'm a bit on the fence about this -- the UPPER_CASE is what people are used to from the Assimp API, so I guess the name should match what the aiTextureType has? Another reason is that if people see some weird unexpected texture name in there, they could then directly grep for that string in Assimp sources and get angry immediately. But if it would be translated to CamelCase then they can't.

And test that it works with the PBR texture types added in 5.0.0
@pezcode pezcode force-pushed the assimp-custom-material branch from 5eb6b21 to 63a48e7 Compare November 26, 2021 21:31
AI_MATKEY_TRANSPARENCYFACTOR only got added in 4.0.0. We're already testing other float attributes so we don't need to test this one, and there isn't really any good replacement that's universally supported
@pezcode
Copy link
Contributor Author

pezcode commented Nov 26, 2021

It compiles, now we enter the cursed bugs phase 😬

  • camera names in assimp 3.2 seem to use the name of the node that references the camera
  • "Containers::StringView: string expected to be smaller than 2^62 bytes, got 7957911638789586949". This happens on all Linux CIs with 3.2 except linux-nondeprecated. weird assimp hack, working around this now
  • "Trade::MaterialData: duplicate attribute NormalTextureCoordinates". This happens on all Mac CIs with 5.0.1. Extra-scary because I can't reproduce this locally with the same version. fixed with a paranoid update to this assert
  • Assimp 3.2 doesn't import layered materials, skipping
  • Assimp 3.2 doesn't import a few attributes tested for in the raw material tests, skipping

Remaining issues:

  • enable_language(C) required for try_compile
  • Assimp 5.1.0 binaries are not copied to bin/ unlike 5.0.1. Not sure if this is something we can/should fix on our end. , the importer docs already suggest compiling as static, which works

@mosra
Copy link
Owner

mosra commented Nov 27, 2021

Yay bugs! 🤩

Trade::MaterialData: duplicate attribute NormalTextureCoordinates

This one is funny! The code checking this is https://github.com/mosra/magnum/blob/f51110fe9847c63953aec382b90fb4ab24022133/src/Magnum/Trade/MaterialData.cpp#L233-L240 and I see two possibilities:

  • There actually is a duplicate but every STL except libc++ (libc++ is Clang's standard library and on the CIs it's just Mac and Emscripten, but Emscripten doesn't build Assimp) implements std::sort() in a way that doesn't end up comparing the two duplicate items ever. That sounds unlikely because it'd mean that if they are not equal but everything else is either before or after the two, it wouldn't get to know which of the two goes first. (Nevertheless, I was still paranoid about this check and so I have a test that verifies it indeed fires for all possible permutations of a list with one duplicate.) Nope, this wasn't it. (But now that I think of that, the assertion is done O(n log n) times while if it would be outside the std::sort() it'd be just O(n). TODO for me, I guess.)
  • There isn't a duplicate but somehow libc++ calls the comparator with two identical arguments (&a == &b). It's silly but I have no better explanation, and the assert doesn't expect this. Not sure what would be the easiest way to verify this scenario, maybe I should just commit a blind fix to the assert and then restart the mac jobs? :D -- YEP THAT WAS IT, shitty libc++ it is, your code is fine :]

7957911638789586949

Very funny also!! I stared into the abyss and the abyss stared back at me put this into a hex editor and this looked back at me:

image

Clearly a garbage in the upper 32 bits for some reason. I remember there's a y.png file in some of the tests, so I wonder how the hell did that get here 👀 ...

Actually, WHAT THE FUCK, it seems that aiString::length used to be a size_t until 5.0.0, and then it got changed to uint32_t BUT, at least from the peek onto the HORROR in assimp/assimp@69087ab (and assimp/assimp@686f265 before that), the string contents were apparently written right after the initial four bytes EVEN IF size_t was 8 bytes?!

I also see things in the CI log like property $tex.file.AMBIENT is too large with 7593752185079332877 bytes with the same problem (ambi in the upper 32 bits).

except linux-nondeprecated.

Sorry, the job is named misleadingly -- it disables deprecated APIs and also all asserts. So then the string view size assert doesn't fire because it doesn't exist. (I optimized for CI turnaround time and this seemed like an acceptable tradeoff.)

camera names in assimp 3.2 seem to use the name of the node that references the camera

There's this very amazing and stroke-inducingly funny way of how Assimp assigns cameras and lights to nodes. It just makes the name of both the same, which then you discover by going through all cameras and all nodes for every node. Very quadratic. It also says that "there can be multiple cameras with the same name, which means given node has all those cameras assigned" even though I was never able to reproduce that, so I guess this was "technically" why back then a camera inherited the name from its node and not the other way around, but then somebody complained that this sucks and so it got inverted?

Assimp 3.2 doesn't import layered materials
Assimp 3.2 doesn't import a few attributes tested for in the raw material tests

I guess some XFAIL / SKIP could do here? The Mac and Windows CIs test this thoroughly enough, so if the ancient Ubuntu won't test a few cases it's fine.

Assimp 5.1.0 binaries are not copied to bin/ unlike 5.0.1

Huh... you mean, when you have Assimp as a subproject? Or external? For external libraries I only copy DLLs of SDL and GLFW, nothing else. So if it's doing this, then it has to be some automagic out of my reach.

@pezcode
Copy link
Contributor Author

pezcode commented Nov 27, 2021

YEP THAT WAS IT, shitty libc++ it is, your code is fine :]

That's some impressive detective work 😱 Thanks for the quick fix!

the string contents were apparently written right after the initial four bytes EVEN IF size_t was 8 bytes?!

It gets even funnier, aiString::C_Str() returns the pointer correctly offset by 8, and only the material get/set functions do the 4-byte hack. Meaning we have to hack around this for all material strings, but only there and only on < 5.0.0.

Huh... you mean, when you have Assimp as a subproject? Or external? For external libraries I only copy DLLs of SDL and GLFW, nothing else. So if it's doing this, then it has to be some automagic out of my reach.

Sorry for not clarifying, yeah, I meant subprojects. I tracked it down to these commits:

assimp/assimp@be4fe13
assimp/assimp@11daed6

Assimp used to set CMAKE_*_OUTPUT_DIRECTORY if it was an in-source build, but now it unconditionally sets *_OUTPUT_DIRECTORY on the assimp target. This overrides magnum-plugin's cached CMAKE_RUNTIME_OUTPUT_DIRECTORY. Probably not much we can do there.

@mosra
Copy link
Owner

mosra commented Nov 27, 2021

only the material get/set functions do the 4-byte hack

❤️ 🔥 I'm speechless.

This overrides magnum-plugin's cached CMAKE_RUNTIME_OUTPUT_DIRECTORY.

This was never expected to have any impact on subprojects (which is why I'm doing explicit handling for GLFW and SDL DLLs).

Usually I build plugin dependencies as static, so the dynamic module is self-contained. That would "fix" this issue I think ... or do you need a dynamic build because the test uses Assimp APIs as well? Or ... is it possible the suggested CMake snippet in the docs (where it forces a static build) doesn't work anymore and you get a DLL?

Assimp 3.2 takes camera names from the node that they're attached to
My personal new favorite assimp bug: aiString::length is a size_t before version 5, but the material property functions dealing with strings decided to store them as a uin32_t directly followed by the string content. aiString::C_Str() however doesn't know about those hacks, so when you interpret the material property data as an aiString directly, this blows up in your face with parts of the string data being read as the length. The solution is to use the getter function, which leads to a copy. Alternatively I could have replicated the internal layout but it's enough if Assimp performs these atrocities. Any poor soul using prehistoric versions of this library has far more to worry about than an extra string copy.
And import them as a typeless buffer, since we have that functionality already. Currently untested.
Layered FBX materials are not supported, and skipping around indices in materialRawUnrecognized() would be too awkward. Should be enough to test this on 5.0.0 and up
@pezcode
Copy link
Contributor Author

pezcode commented Nov 27, 2021

This was never expected to have any impact on subprojects (which is why I'm doing explicit handling for GLFW and SDL DLLs).

Heh, good to know. I thought this was intended, because things just magically worked with 5.0.1 even without forcing BUILD_SHARED_LIBS off. But with that it works 👍

@pezcode
Copy link
Contributor Author

pezcode commented Nov 27, 2021

My brain slipped while writing that one commit message 👀 When you merge and rebase, can you change it to:
"AssimpImporter: work around" -> "AssimpImporter: work around material string property hack" 🙏

The new Linux build failures in KtxImporter seem to be caused by the recent changes to Utility::copy()

@mosra
Copy link
Owner

mosra commented Nov 27, 2021

Oh no, now cursed old Clang bugs 🙈

@mosra
Copy link
Owner

mosra commented Nov 27, 2021

Seems like the root cause is that Clang 3.8 treats C arrays as not trivially copyable for some reason, and the code in question uses a multi-dimensional array, so the static_assert happens on a type that's still an array and 💥

But it worked before (picking some other overload) so I'll just disable the new feature of copying to C arrays from mosra/corrade@dc173b4 for multi-dimensional arrays on Clang 3.8.

@pezcode
Copy link
Contributor Author

pezcode commented Nov 28, 2021

Homebrew apparently has bottles for 5.1.2 already, so we could probably upgrade to that on the Mac CI

@codecov
Copy link

codecov bot commented Nov 28, 2021

Codecov Report

Merging #116 (9366ae0) into master (f2c4e46) will increase coverage by 0.55%.
The diff coverage is 75.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #116      +/-   ##
==========================================
+ Coverage   94.86%   95.42%   +0.55%     
==========================================
  Files         115      115              
  Lines        9484    10510    +1026     
==========================================
+ Hits         8997    10029    +1032     
+ Misses        487      481       -6     
Impacted Files Coverage Δ
...rc/MagnumPlugins/AssimpImporter/AssimpImporter.cpp 89.83% <75.49%> (+7.96%) ⬆️
src/MagnumPlugins/KtxImporter/KtxImporter.cpp 90.69% <0.00%> (-5.80%) ⬇️
...ns/MiniExrImageConverter/MiniExrImageConverter.cpp 95.83% <0.00%> (-4.17%) ⬇️
...MagnumPlugins/DrMp3AudioImporter/DrMp3Importer.cpp 83.33% <0.00%> (-2.88%) ⬇️
...MagnumPlugins/Faad2AudioImporter/Faad2Importer.cpp 87.50% <0.00%> (-2.25%) ⬇️
...ugins/StbVorbisAudioImporter/StbVorbisImporter.cpp 80.00% <0.00%> (-2.06%) ⬇️
src/MagnumPlugins/IcoImporter/IcoImporter.cpp 95.83% <0.00%> (-1.79%) ⬇️
...gnumPlugins/DrFlacAudioImporter/DrFlacImporter.cpp 96.42% <0.00%> (-1.76%) ⬇️
src/MagnumPlugins/JpegImporter/JpegImporter.cpp 94.44% <0.00%> (-1.56%) ⬇️
...MagnumPlugins/DrWavAudioImporter/DrWavImporter.cpp 96.34% <0.00%> (-1.19%) ⬇️
... and 29 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2c4e46...9366ae0. Read the comment docs.

@mosra
Copy link
Owner

mosra commented Nov 28, 2021

Amazing, didn't expect Assimp to not have any leaks in the newly explored areas. Ah, I'm excluding Assimp leaks on the CI, of course.

Homebrew apparently has bottles for 5.1.2 already

Unfortunately that would mean increasing macOS build times by 10 minutes (because yes of course, brew update is that slow). I'm thinking that I could build 5.1.2 for Windows instead.

@mosra mosra modified the milestone: 2021.0a Nov 28, 2021
@mosra mosra mentioned this pull request Nov 28, 2021
19 tasks
@mosra
Copy link
Owner

mosra commented Nov 28, 2021

Merged as d676a6b...5329b00. I squashed some commits together to always have code+docs+tests in a single commit, but other than that it's mostly following your original sequence.

I tried to figure out some alternative to enable_language(C) but since it needs to be there for CMake < 3.6 anyway, it'd only make the maintenance nightmare bigger. I'll revisit that when I drop support for old CMakes, hopefully after the next release.

The Windows CI argh, well, Windows CI without MinGW builds & tests against 5.1.2 now as of 3fbc61e, so I guess we're covered. MinGW 5.1.2 failed to link, I don't have patience to look into that. The remaining TODO for me is introducing MaterialAttributeType::Buffer, the magnum-sceneconverter --info-materials output looks kinda strange with all the �s :)

Thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants