Things implemented:

- Normal Taxi Gym environment, solved with Q-learning
- Vanilla DQNs
- DQNs with variants
  - Double DQNs
  - Dueling DQNs
- Policy-based models
  - Adaptive Noise Scaling
  - Hill Climbing
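
For readers new to the first item, the tabular Q-learning update can be sketched as below. This is a minimal illustration, not the code from this repository: `corridor_step` is a hypothetical five-cell toy environment standing in for the Taxi environment so that the example runs without Gym installed, and the hyperparameters (`alpha`, `gamma`, `epsilon`, episode count) are assumed values chosen only for the demo.

```python
import random


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step and return the updated Q(s, a)."""
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def corridor_step(state, action, n_states=5):
    """Hypothetical toy environment (a stand-in for Taxi, not the Gym API).

    Action 0 moves left, action 1 moves right. Reaching the last cell ends
    the episode with reward +1; every other step costs -0.01.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, n_states=5, n_actions=2, epsilon=0.1, seed=0):
    """Train an epsilon-greedy Q-learning agent on the toy corridor."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                # Exploit, breaking ties at random so the agent does not stall.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )
            next_state, reward, done = corridor_step(state, action, n_states)
            q_update(q, state, action, reward, next_state, done)
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    # Greedy action per non-terminal state (1 means "move right").
    print([row.index(max(row)) for row in q_table[:-1]])
```

The same `q_update` rule applies to the Taxi environment once `corridor_step` is replaced by the environment's own step function and the table is sized to its state and action counts.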