Nothing is possible without our Lord and Saviour Jesus Christ. But everything is possible with Him. I was morally dying, addicted to pornography and video games. But He intervened and gave me a new life: Reinforcement Learning to play with and a University to support me.
This repository was created to support the 2024 draft paper. It unifies and simplifies Symphony-1.0, Symphony-2.0 (2.1) and Symphony-3.0 (Draft) into a single version, Symphony-Saya-UTD-5 (a model-free deterministic algorithm).
Some ideas were dropped, and those that proved their worth were solidified:
⚙ No multi-agent setup / no big ensemble of Critics / model-free / off-policy
-
Temporal (Immediate) Advantage ✅ (though with UTD-5, batch size 128 → 768)
-
Fading Replay Buffer ✅
-
Rectified Learnable Sine Wave Activation Function ✅
-
Rectified Huber Symmetric and Asymmetric Loss Functions ✅
-
Seamless Actor-Critic updates ✅
-
Silent Dropouts ✅
-
"movement is life" concept ❌ -
reduced objective to learn Bellman's sum of dumped reward's variance ❌ -
improve reward variance through immediate Advantage ❌
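
The "UTD-5" and batch-size figures in the list above can be pictured with a small off-policy loop. This is only a sketch: it assumes "UTD" means update-to-data ratio (gradient updates per environment step), and `ReplayBuffer`, `train` and `update_fn` are hypothetical names, not the repository's code. The buffer here is a plain uniform one, not the Fading Replay Buffer.

```python
import random
from collections import deque

UTD_RATIO = 5     # assumed meaning of "UTD-5": five updates per environment step
BATCH_SIZE = 768  # batch size quoted in the list above


class ReplayBuffer:
    """Plain uniform replay buffer (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        # Copy to a list so sampling does not index into the deque repeatedly.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def train(num_env_steps, update_fn, utd_ratio=UTD_RATIO, batch_size=BATCH_SIZE):
    """Collect one transition per step, then run `utd_ratio` updates on it."""
    buffer = ReplayBuffer(capacity=100_000)
    updates = 0
    for step in range(num_env_steps):
        # Placeholder for a real (state, action, reward, next_state, done) tuple.
        buffer.add((step, random.random()))
        if len(buffer) < batch_size:
            continue  # wait until one full batch is available
        for _ in range(utd_ratio):
            update_fn(buffer.sample(batch_size))
            updates += 1
    return updates


batch_sizes = []
total_updates = train(num_env_steps=800, update_fn=lambda b: batch_sizes.append(len(b)))
```

In this sketch, every environment step after warm-up triggers five updates, each on a freshly sampled batch of 768 transitions.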
Some modules were converted from PyTorch `nn.Module` to PyTorch `jit.ScriptModule`.
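
For readers unfamiliar with TorchScript, here is a minimal sketch of one common way to obtain a `ScriptModule` from an `nn.Module`, namely `torch.jit.script`. `TinyCritic` and its dimensions are hypothetical and only illustrate the mechanism. The repository may do the conversion differently, for example by subclassing `torch.jit.ScriptModule` directly.

```python
import torch
import torch.nn as nn


class TinyCritic(nn.Module):
    """Hypothetical two-layer critic, used only to illustrate scripting."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Concatenate state and action along the feature dimension.
        return self.net(torch.cat([state, action], dim=-1))


eager = TinyCritic(state_dim=4, action_dim=2)
# torch.jit.script compiles the module and returns a ScriptModule
# that shares parameters with the eager one.
scripted = torch.jit.script(eager)

state = torch.randn(8, 4)
action = torch.randn(8, 2)
same = torch.allclose(eager(state, action), scripted(state, action))
```

The scripted module is called exactly like the eager one, so it can replace it in a training loop without touching the calling code.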