[Feature Request] ONNX support #772
Real-ESRGAN is too large; it's too difficult to run in real time on current computers. |
The main bottleneck is memory size: a 2K game with 2x upscaling costs about 16 GB of memory. The speed could be real-time on a 3060 (512k model) only if memory were unlimited. 😂 IMO it could work on an iGPU, though the 780M is still not good enough; maybe Qualcomm's Elite X, but that's another story... |
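For a rough sense of where a figure like 16 GB could come from, here is a back-of-envelope estimate (assuming, purely for illustration, 64-channel fp32 feature maps held at the 2x output resolution of a 1440p source):

$$5120 \times 2880 \times 64 \times 4 \ \text{bytes} \approx 3.8 \ \text{GB per feature map},$$

so a handful of simultaneously live activations of that size is already in the neighborhood of 16 GB.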
Some models can indeed run inference in real time, such as mpv-upscale-2x_animejanai. I plan to add support for ONNX in the future, but there is still a lot of uncertainty. |
The best ESR model I've ever tried, not only in size but also in output (real + anime). |
The SuperUltraCompact model isn't much larger than the Anime4K UL model (around 2x, I guess), so it's kinda possible to port it to HLSL format.
While porting to HLSL does indeed offer higher efficiency, the cost is also substantial unless there's an automated approach. I'm inclined to adopt ONNX Runtime, enabling us to seamlessly integrate any ONNX model with ease. |
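For anyone curious what the ONNX Runtime route could look like, here is a minimal C++ sketch of loading an arbitrary ONNX model with the DirectML execution provider. This is not Magpie's actual code; the model path and the "input"/"output" tensor names are assumptions that depend on how the model was exported.

```cpp
// Minimal sketch: run an arbitrary ONNX model via ONNX Runtime + DirectML.
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>
#include <array>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "onnx-sketch");
    Ort::SessionOptions opts;
    // DirectML requires memory pattern off and sequential execution.
    opts.DisableMemPattern();
    opts.SetExecutionMode(ORT_SEQUENTIAL);
    // Attach the DirectML execution provider on device 0.
    Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_DML(opts, 0));

    // Hypothetical model path; any 2x upscaling model exported to ONNX.
    Ort::Session session(env, L"2x_model.onnx", opts);

    // One NCHW fp32 frame, e.g. 1080p RGB (shape depends on the model).
    std::array<int64_t, 4> shape{1, 3, 1080, 1920};
    std::vector<float> pixels(1 * 3 * 1080 * 1920);
    Ort::MemoryInfo mem =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<float>(
        mem, pixels.data(), pixels.size(), shape.data(), shape.size());

    // Tensor names are assumptions; real names come from the exported graph.
    const char* inputNames[] = {"input"};
    const char* outputNames[] = {"output"};
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               inputNames, &input, 1, outputNames, 1);
    return 0;
}
```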
I personally think this is a great idea, as AnimeJaNai does offer much better graphics sometimes. I would personally donate 20 USD if this happens. Magpie is getting better every day. Love this thing so much.
|
I ported AnimeJaNai V3 SuperUltraCompact and 2x-DigitalFlim to Magpie effects, if anyone wants to try.
|
Great job! It appears that AnimeJaNai is well suited to scenes from old anime, as it doesn't produce sharp lines like Anime4K does. However, a significant issue is that it sacrifices many details. DigitalFlim is sharper than AnimeJaNai, but it also suffers from severe detail loss. In terms of performance, they are roughly 20-25 times slower than Lanczos. |
Nothing happened after I put both files in the effects folder (I even rebooted the system). As an experiment I also put in fakehdr.hlsl, and it works... I don't know if I made any mistakes (version 10.05). |
You have to use a newer version: https://github.com/Blinue/Magpie/actions/runs/7911000525
|
Thank you for your great work and help! Anyway, I still don't know how to download the build from GitHub Actions, so let me keep that surprise until the next release. 😁
As for that, I think it's a common problem with ESR models, owing to both the structure (even large models can't keep much detail) and the training datasets (animations?).
|
Thank you. After signing in again I can download it. |
Can you port the SD model of AnimeJaNai, which is more aggressive in its detail reconstruction? A UC model for those of us with more computing power would also be great.
|
@spiwar Do you have a link for it? I didn't find it on their GitHub. |
For detail restoration... 2x-Futsuu-Anime, but it's 4M... I think it's a game for the 4090. |
animejanai.zip |
Same issue: 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv. Can you port the V3 sharp models? They are in the AnimeJaNai Discord beta releases. |
You can find it in the full 1.1 GB release, but I've included it here for convenience. |
The room for performance optimization is very limited, because the bottleneck is floating-point operations. @kato-megumi I found that 16-bit floating-point numbers (min16float) are more efficient, with about a 10% performance improvement on my side. But this is still not enough to make UC usable. Further performance improvement can only be achieved by using platform-specific APIs, such as TensorRT. |
Finding data to enhance the SUC model might be the better way forward... |
@Blinue Sorry, can you elaborate? I thought using
These haven't been adapted to the new rendering system yet; please wait patiently for #643 to be completed. |
OK, understood.
Upscaling 1440p (or anything bigger than 1080p) with TensorRT results in a black screen. |
The reason is that the TensorRT engine in Magpie is built to handle at most 1080p input. It could technically support bigger inputs, but consumer-grade graphics cards may struggle with real-time inference at those sizes. |
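Some background on why this is a hard cap rather than graceful degradation: a TensorRT engine is compiled against an optimization profile, and inputs outside the profile's [kMIN, kMAX] range are rejected at runtime. Below is a rough C++ sketch of what a 1080p-capped profile could look like; the tensor name "input" and the exact dimension choices are illustrative assumptions, not Magpie's actual build code.

```cpp
// Sketch: a TensorRT optimization profile whose upper bound is 1080p input.
#include <NvInfer.h>

void addProfile(nvinfer1::IBuilder& builder,
                nvinfer1::IBuilderConfig& config) {
    nvinfer1::IOptimizationProfile* profile =
        builder.createOptimizationProfile();
    // Dynamic NCHW input; the tensor name depends on the exported model.
    profile->setDimensions("input", nvinfer1::OptProfileSelector::kMIN,
                           nvinfer1::Dims4{1, 3, 8, 8});
    profile->setDimensions("input", nvinfer1::OptProfileSelector::kOPT,
                           nvinfer1::Dims4{1, 3, 720, 1280});
    // Hard upper bound: a 1440p frame exceeds this and inference fails.
    profile->setDimensions("input", nvinfer1::OptProfileSelector::kMAX,
                           nvinfer1::Dims4{1, 3, 1080, 1920});
    config.addOptimizationProfile(profile);
}
```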
Just tested: very good performance upscaling from 1080p -> 4K with the included model (AnimeJaNai V3 UltraCompact). Using the SuperUltraCompact model gets me to 60 fps in the same scenario; huge improvements all around. This might already be in the works, but I think it's a good idea to show a pop-up saying that the engine is being built when using TensorRT. Since the build happens in the background, users might think nothing is happening when in fact the engine is being built. |
Could you tell me how you monitor your FPS? The built-in monitor doesn't work at present, and this version is not compatible with RivaTuner, which I guess also can't get the right FPS data because of the new rendering system. |
Turn on developer mode by editing config.json. |
RTSS works for me. You might have to add Magpie as a separate application in the RTSS whitelist. To monitor your FPS with RTSS, have an animation play or any moving scene, and don't move your mouse.
|
Thanks! |
At the very least, this should be available as an option. |
I plan to enable the TensorRT backend to support inputs of any size in the future. This means that users will have to rebuild the engine multiple times to scale larger windows. |
Does this mean the engine would need to be rebuilt every time the window size changes, or would it need to be built just once for each different window size? |
Since building the engine is quite time-consuming, it's crucial to minimize the frequency of rebuilds. The implementation details have not been decided yet; please be patient. |
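Since the implementation was undecided at this point in the thread, here is one purely illustrative way to keep rebuilds to once per window size: serialize each built engine to disk, keyed by input resolution, and reload it on later runs. The path scheme and the buildEngineFor helper are hypothetical.

```cpp
// Sketch: cache serialized engines per resolution to avoid repeated rebuilds.
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical path scheme, e.g. "cache/engine_1920x1080.trt".
static std::string engineCachePath(int width, int height) {
    return "cache/engine_" + std::to_string(width) + "x" +
           std::to_string(height) + ".trt";
}

// Stand-in for the slow TensorRT build step (elided in this sketch).
static std::vector<char> buildEngineFor(int width, int height) {
    (void)width; (void)height;
    return {};
}

// Returns the serialized engine bytes, building and caching only on a miss,
// so each distinct window size pays the build cost once.
std::vector<char> loadOrBuildEngine(int width, int height) {
    const std::string path = engineCachePath(width, height);
    if (FILE* f = std::fopen(path.c_str(), "rb")) {
        std::fseek(f, 0, SEEK_END);
        std::vector<char> blob(static_cast<size_t>(std::ftell(f)));
        std::fseek(f, 0, SEEK_SET);
        blob.resize(std::fread(blob.data(), 1, blob.size(), f));
        std::fclose(f);
        return blob;  // cache hit: reuse without rebuilding
    }
    std::vector<char> blob = buildEngineFor(width, height);
    if (FILE* f = std::fopen(path.c_str(), "wb")) {
        std::fwrite(blob.data(), 1, blob.size(), f);
        std::fclose(f);
    }
    return blob;
}
```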
Does the ONNX version not support integrated graphics? The screen goes black on the AMD R6-6600H CPU... |
Kindly note that only the DirectML backend is supported on non-NVIDIA graphics cards. Can you provide the logs to help diagnose the problem? |
|
It’s likely due to OOM; I suspect the integrated graphics card doesn’t have sufficient resources to perform the inference. Can you share the ONNX file you’re using? |
I just used 2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx. |
For effective inference on the UC model, a minimum of a 3060 GPU is essential. While it might be feasible to run much smaller models on integrated graphics cards, it doesn’t make much sense to do so: for tiny models, HLSL is significantly faster. |
If you only want to verify that ONNX works, I suggest using only a "tiny" model on an iGPU (i.e. #847 (comment)). |
I conducted some tests again and found that it can scale certain windows correctly (Notepad, Calculator, Windows Terminal), but the screen may go black for certain other windows (Explorer, some games), so it may not be an error caused by ONNX, but rather a bug in window capture. |
I believe this is related to the window size. Scaling larger windows requires more VRAM, leading to OOM. |
I have tested it: for the software I mentioned earlier that can be scaled, no matter how I resize the window, it scales normally. For the software that cannot be scaled, no matter how I resize it, it shows a black screen. |
@HIllya51 Could you create an issue for this problem? |
Thank you so much! I love this tool. |
In the onnx-preview1 version, only Graphics Capture is functional; the other capture methods have not been adapted yet. It's merely a technical preview, so stay tuned for future updates. |
The recently released AI PCs (258V, HX 370) now have NPUs whose performance approaches the iGPU: reportedly 45 TOPS of compute, while the iGPU is around 60. Maybe NPU support could be added; NPUs have the advantage of low power consumption.
For compact structures (model size 256k~4M), that could be a runtime effect based on DirectML.
Am I being too greedy? 😂