Training a Deep Q-Learning neural network, based on the ConvNetJS demo, to use sonar range sensors and RatSLAM goals.
- ConvNetJS - demo
- ROSLibJS
- RatSLAM fork (extended ROS integration)
Got busy and distracted. It works well enough for direct goal seeking, and that may be enough to train an agent that makes pretty maps in RatSLAM (provided it does not stray too far before turning back). Have run some decent experiments with ReinforceJS. Finding goals on the other side of walls and traps will require a different implementation, namely Actor-critic and/or Actor-mimic style architectures, to get around these obstacles (when a goal can be seen on the other side of a trap).
npm install
bower install
- Teleop.
- Integrate IMU/tilt/odom feedback.
- Catkin-ise.
- Define custom ROS messages.
- LTM/STM with long-term sets of "important" experiences.
- Save/load DQN experience sets.
roslaunch kulbu_base sim.launch world:=rat1
roslaunch kulbu_slam rat.launch use_rat_odom:=false topic_odom:=/kulbu/odometry/filtered
rosrun turtlebot_teleop turtlebot_teleop_key /turtlebot_teleop/cmd_vel:=/kulbu/diff_drive_controller/cmd_vel
roslaunch rosbridge_server rosbridge_websocket.launch # ROSLibJS
node src/main.js
node src/main.js --noise # Generate noise on extra sensors.
node src/ratsim.js # Simulate RatSLAM goals for training.
rqt_plot /dqn/reward:epsilon
rqt_plot /dqn/avg_reward:avg_loss
rostopic pub -1 /dqn/status std_msgs/String -- '"{\"learning\": true, \"moving\": true, \"sensors\": false}"' # TODO: Custom message format.
rostopic pub -1 /dqn/save std_msgs/String -- 'file' # Save DQN as JSON.
rostopic pub -1 /dqn/load std_msgs/String -- 'file' # Load DQN from JSON.
rostopic pub -1 /dqn/set_age std_msgs/String -- '"100000"' # FIXME: Datatype.
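The `/dqn/status` and `/dqn/set_age` commands above both carry their payload as text inside a `std_msgs/String`, so the receiving side has to decode it. A minimal sketch of that decoding, assuming the flags are exactly `learning`, `moving` and `sensors` as in the example command; `parseStatus`, `parseAge` and `STATUS_FLAGS` are hypothetical names for illustration, not the project's actual code:

```javascript
// Hypothetical helpers: names and validation rules are assumptions.
const STATUS_FLAGS = ['learning', 'moving', 'sensors'];

// Decode the JSON text from the `data` field of a /dqn/status message.
// Flags missing from the message keep their value from `current`.
function parseStatus(data, current = {}) {
  const parsed = JSON.parse(data);
  if (parsed === null || typeof parsed !== 'object') {
    throw new TypeError('status payload must be a JSON object');
  }
  const next = Object.assign({}, current);
  for (const flag of STATUS_FLAGS) {
    if (flag in parsed) {
      if (typeof parsed[flag] !== 'boolean') {
        throw new TypeError(`status flag "${flag}" must be a boolean`);
      }
      next[flag] = parsed[flag];
    }
  }
  return next;
}

// Decode the age sent as a string on /dqn/set_age into a non-negative integer.
function parseAge(data) {
  const age = Number.parseInt(data, 10);
  if (!Number.isInteger(age) || age < 0) {
    throw new RangeError(`invalid age: ${data}`);
  }
  return age;
}
```

With ROSLibJS, these would typically be called on `message.data` inside a topic subscription callback. Validating here keeps a malformed message from silently changing the agent's state until custom message types replace the string payloads.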
- Reverse goal order and tweak for use on exploration tasks.
- Discard experiences with many links.
- Quality metric for LV. Don't link low quality experiences.
- Reject closures with vastly different magnetic reading?
- Implement multi Experience Maps RatSLAM on Humanoids
- Further test Dropout uncertainty.
- Implement in Caffe fork or Theano if not Torch