
v0.7.0

Released by @muupan on 28 Jun 09:47

Important enhancements

  • New agents: Rainbow (#374) and TD3 (#453).
  • PPO now supports recurrent models via a stateless recurrent model interface (#431).
  • Batch DDPG training (#416) and asynchronous time-based evaluation (#420) are now supported.

Important bugfixes

  • Fixed a bug where some examples passed the same random seed to every env via env.seed (a seeding sketch follows this list).
  • Fixed a bug that broke batch training with n-step returns and/or recurrent models.
  • Fixed examples/ale/train_dqn_ale.py, which used LinearDecayEpsilonGreedy even when NoisyNet was enabled.
  • Fixed examples/ale/train_dqn_ale.py, which ignored the value specified by --noisy-net-sigma.
  • Fixed chainerrl.links.to_factorized_noisy, which did not work correctly with chainerrl.links.Sequence.
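For illustration, a minimal sketch of the per-env seeding pattern the first fix restores. The `make_env` helper, the env name, and `base_seed` are hypothetical, not ChainerRL API:

```python
import gym

# Sketch (assumed names): give each env in a batch a distinct seed derived
# from its index, instead of every env sharing the same seed.
def make_env(idx, base_seed=0):
    env = gym.make('CartPole-v0')
    env.seed(base_seed + idx)  # distinct seed per env
    return env

envs = [make_env(idx) for idx in range(4)]
```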

Important destructive changes

  • chainerrl.experiments.train_agent_async now requires eval_n_steps (number of timesteps per evaluation phase) and eval_n_episodes (number of episodes per evaluation phase) to be specified explicitly; exactly one of them must be None (a call sketch follows this list).
  • examples/ale/dqn_phi.py is removed.
  • chainerrl.initializers.LeCunNormal is removed. Use chainer.initializers.LeCunNormal instead.
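A minimal sketch of a call under the new train_agent_async requirement, assuming `agent` and `make_env` are defined elsewhere; the numeric values and the remaining keyword arguments are illustrative:

```python
from chainerrl import experiments

# Exactly one of eval_n_steps / eval_n_episodes is a number; the other must
# be None. Here evaluation is episode-based: 10 episodes per phase.
experiments.train_agent_async(
    outdir='results',
    processes=8,
    make_env=make_env,    # placeholder env factory
    agent=agent,          # placeholder agent
    steps=10 ** 6,
    eval_interval=10 ** 5,
    eval_n_steps=None,    # time-based evaluation disabled
    eval_n_episodes=10,   # episode-based evaluation enabled
)
```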

All updates

Enhancement

  • Rainbow (#374)
  • Make copy_param support scalar parameters (#410)
  • Enables batch DDPG agents to be trained. (#416)
  • Enables asynchronous time-based evaluations of agents. (#420)
  • Removes obsolete dqn_phi file (#424)
  • Add Branched and use it to simplify train_ppo_batch_gym.py (#427; a usage sketch follows this list)
  • Remove LeCunNormal since Chainer has it from v3 (#428)
  • Precompute log probability in PPO (#430)
  • Recurrent PPO with a stateless recurrent model interface (#431)
  • Replace Variable.data with Variable.array (again) (#434)
  • Make IQN work with tuple observations (#435)
  • Add VectorStackFrame to reduce memory usage in train_dqn_batch_ale.py (#443)
  • DDPG example that reproduces the TD3 paper (#452)
  • TD3 agent (#453)
  • Update requirements.txt and setup.py for gym (#461)
  • Support gym>=0.12.2 by no longer using underscore methods in gym wrappers (#462)
  • Add warning about numpy 1.16.0 (#476)
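To illustrate the new Branched link from #427, a hedged sketch of two heads sharing one input; the layer sizes and head roles are arbitrary placeholders:

```python
import numpy as np
import chainer.links as L
from chainerrl.links import Branched

# Branched calls each child link on the same input and returns the results
# as a tuple, which keeps shared-input heads in a single callable.
heads = Branched(
    L.Linear(64, 4),  # e.g. a policy head
    L.Linear(64, 1),  # e.g. a value head
)
x = np.zeros((1, 64), dtype=np.float32)
policy_out, value_out = heads(x)
```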

Documentation

  • Link to abstract pages on ArXiv (#409)
  • Fixes typo (#412)
  • Fixes file path in grasping example README (#422)
  • Add links to references (#425)
  • Fixes minor grammar mistake in A3C ALE example (#432)
  • Add explanation of examples/atari (#437)
  • Link to chainer/chainer, not pfnet/chainer (#439)
  • Link to chainer/chainer(rl), not pfnet/chainer(rl) (#440)
  • Fixes and adds docstring for FCStateQFunctionWithDiscreteAction (#441)
  • Fixes a typo in train_agent_batch Documentation. (#444)
  • Adds Rainbow to main README (#447)
  • Fixes Docstring in IQN (#451)
  • Improves Rainbow README (#458)
  • Adds missing doc for eval_performance (#459)
  • Adds IQN Results to readme (#469)
  • Adds IQN to the documentation. (#470)
  • Adds reference to mujoco folder in the examples README (#474)
  • Fixes incorrect comment. (#490)

Examples

  • Rainbow (#374)
  • Create an IQN example aimed at reproducing the original paper and its evaluation protocol. (#408)
  • Benchmarks DQN example (#414)
  • Enables batch DDPG agents to be trained. (#416)
  • Fixes scores for Demon Attack (#418)
  • Set observation_space of kuka env correctly (#421)
  • Fixes error in setting explorer in DQN ALE example. (#423)
  • Add Branched and use it to simplify train_ppo_batch_gym.py (#427)
  • A3C Example for reproducing paper results. (#433)
  • PPO example that reproduces the "Deep Reinforcement Learning that Matters" paper (#448)
  • DDPG example that reproduces the TD3 paper (#452)
  • TD3 agent (#453)
  • Apply noisy_net_sigma parameter (#465; an invocation sketch follows this list)
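For instance, with #465 the DQN ALE example now honors the sigma flag; a hedged invocation sketch (other arguments such as game selection are omitted, and 0.5 is an arbitrary value):

```
# Assumed invocation: --noisy-net-sigma enables NoisyNet in this example and
# sets the sigma scale that is actually applied after #465.
python examples/ale/train_dqn_ale.py --noisy-net-sigma 0.5
```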

Testing

  • Use Python 3.6 in Travis CI (#411)
  • Increase tolerance of TestGaussianDistribution.test_entropy since sometimes it failed (#438)
  • Make FrameStack follow original spaces (#445)
  • Split test_examples.sh (#472)
  • Fix Travis error (#492)
  • Use Python 3.6 for ipynb (#493)

Bugfixes

  • Bugfix (#360, thanks @corochann!)
  • Fixes error in setting explorer in DQN ALE example. (#423)
  • Make sure the agent sees when episodes end (#429)
  • Pass env_id to replay buffer methods to correctly support batch training (#442)
  • Add VectorStackFrame to reduce memory usage in train_dqn_batch_ale.py (#443)
  • Fix a bug of unintentionally using same process indices (#455)
  • Make cv2 dependency optional (#456)
  • Fix ScaledFloatFrame.observation_space (#460)
  • Apply noisy_net_sigma parameter (#465)
  • Match EpisodicReplayBuffer.sample with ReplayBuffer.sample (#485)
  • Make to_factorized_noisy work with sequential links (#489; a usage sketch follows this list)
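To illustrate the last fix, a hedged sketch of applying to_factorized_noisy to a Sequence model; the layer sizes and the sigma_scale value are arbitrary placeholders:

```python
import chainer.functions as F
import chainer.links as L
from chainerrl.links import Sequence, to_factorized_noisy

# After #489, to_factorized_noisy also handles Sequence models, replacing
# each Linear layer in place with a factorized-noisy variant.
q_func = Sequence(
    L.Linear(4, 64),
    F.relu,
    L.Linear(64, 2),
)
to_factorized_noisy(q_func, sigma_scale=0.5)
```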