Home
The simulator is based on a grid world with discrete time steps. The grid world can be either hexagonal or quadrilateral. At the beginning of each time step, the following activities take place:
Drivers who finished their previous order become available again.
Drivers arrive at the locations they were dispatched to under the allocation policy.
New orders are generated.
Drivers' online/offline statuses are updated.
Unserved orders from the previous time step are removed.
----> state information is collected and the simulator interacts with the agent <----
The allocation policy delivered by the agent is applied.
Orders are assigned to drivers and the reward is computed.
To be more specific, we can imagine that each time step denotes a 10-minute interval. One episode in the simulator contains 144 time steps, which simulates one day of activity.
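The ordering of the activities and the 10-minute step length can be sketched as below. This is only an illustration, not the simulator's own code: the phase names are labels paraphrased from the list above, `step_to_clock` and `run_episode` are hypothetical helpers, and the sketch assumes that time step 0 starts at midnight, which the text does not state.

```python
STEP_MINUTES = 10        # from the text: one time step is a 10-minute interval
STEPS_PER_EPISODE = 144  # from the text: one episode covers one day

# Labels paraphrasing the activity list; they are not simulator function names.
PRE_AGENT = [
    "finish_orders",
    "arrive_at_dispatched_location",
    "generate_orders",
    "update_online_offline_status",
    "remove_unserved_orders",
]
POST_AGENT = [
    "apply_allocation_policy",
    "assign_orders_and_compute_reward",
]


def step_to_clock(step):
    """Return the 'HH:MM' start of a time step (assumes step 0 is midnight)."""
    if not 0 <= step < STEPS_PER_EPISODE:
        raise ValueError("step must be in [0, 144)")
    minutes = step * STEP_MINUTES
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def run_episode():
    """Log (clock, phase) pairs for one episode, in the order listed above."""
    log = []
    for step in range(STEPS_PER_EPISODE):
        clock = step_to_clock(step)
        for phase in PRE_AGENT:
            log.append((clock, phase))
        # State is collected and handed to the agent between the two groups.
        log.append((clock, "agent_interaction"))
        for phase in POST_AGENT:
            log.append((clock, phase))
    return log


log = run_episode()
```

Each time step contributes eight log entries here, so the agent always sees the state after unserved orders are removed and before the allocation policy is applied.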
- mapped_matrix_int: The id of each valid grid in the grid world. -100 indicates an invalid grid.
- idle_driver_dist_time: A list of length 144.
idle_driver_dist_time[0] = [10000, 200]
indicates the mean and std of the total number of idle drivers across the entire simulator at time step 0.
- idle_driver_location_mat: A 2d list of size 144 x number of grids.
idle_driver_location_mat[0][grid_matrix_id] = 100
indicates that the mean number of idle drivers at grid grid_matrix_id at time step 0 is 100.
- L_max The maximum number of time steps an order can span, and also the longest distance an order can cover. If L_max = 9, the longest order duration is 1.5 hours.
- M, N Size of matrix mapped_matrix_int
- n_side Indicates whether the grid world is hexagonal (n_side = 6) or quadrilateral (n_side = 4).
- probability The probability of each order in real_orders being sampled.
- real_orders A matrix of size number of orders x 5.
real_orders[0] = [270. 9. 143. 2. 13.65]
indicates start grid_id = 270, end grid_id = 9, order generation time = 143, order duration = 2, order price = 13.65.
- onoff_driver_location_mat A 3d array of size 144 x number of grids x 2.
onoff_driver_location_mat[0][1] = [-0.625 2.92350389]
indicates the mean and std of the change in the number of drivers caused by online/offline status updates at time step 0 and valid grid 1. env.target_node_ids[1] maps this valid grid 1 to its grid_id.
- order_num_dist: A list of length 144. Each element is a dictionary.
order_num_dist[0][22] = [mean, std]
indicates the mean and std of the number of orders on grid_matrix_id 22 at time step 0.
- order_time_dist: A list of length L_max. The probability distribution of order duration (order end time - order start time). If L_max = 9, the longest order duration is 1.5 hours.
order_time_dist = [0.5, 0.4, 0.1]
indicates (with L_max = 3) that 50% of orders last 10 minutes, 40% last 20 minutes, and 10% last 30 minutes (3 time steps).
- order_price_dist: A 2d list of size L_max x 2.
order_price_dist[0] = [10, 2]
indicates the mean and std of the price of orders that last 10 minutes.
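To make the relationship between order_time_dist and order_price_dist concrete, here is a minimal sketch of drawing a synthetic order's duration and price. It is not the simulator's implementation: `sample_order` is a hypothetical helper, the price is assumed to be normally distributed and clipped at zero, and every order_price_dist row after the first is invented for illustration.

```python
import random

STEP_MINUTES = 10  # from the text: one time step is a 10-minute interval


def sample_order(order_time_dist, order_price_dist, rng):
    """Draw (duration_in_steps, price) for one synthetic order.

    Assumption: price ~ Normal(mean, std), clipped at zero.
    """
    # order_time_dist[i] is the probability of a duration of i + 1 time steps.
    durations = list(range(1, len(order_time_dist) + 1))
    duration = rng.choices(durations, weights=order_time_dist, k=1)[0]
    # order_price_dist[i] holds [mean, std] for orders lasting i + 1 steps.
    mean, std = order_price_dist[duration - 1]
    price = max(0.0, rng.gauss(mean, std))
    return duration, price


order_time_dist = [0.5, 0.4, 0.1]  # example from the text (L_max = 3)
order_price_dist = [
    [10, 2],  # from the text: orders lasting 10 minutes
    [18, 3],  # hypothetical values for 20-minute orders
    [25, 4],  # hypothetical values for 30-minute orders
]

rng = random.Random(0)  # seeded so the draw is repeatable
orders = [sample_order(order_time_dist, order_price_dist, rng) for _ in range(1000)]
longest_minutes = max(d for d, _ in orders) * STEP_MINUTES
```

Because both lists are indexed by duration minus one, they need the same length, L_max.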
NOTICE: grid_matrix_id is the index of a grid in the matrix mapped_matrix_int. grid_id is the id of a valid grid, which you set yourself. The two can be the same.
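The distinction between the two ids can be illustrated with a small lookup table. This sketch assumes that grid_matrix_id is the row-major flat index into mapped_matrix_int and that each valid cell stores its grid_id; `build_grid_maps` is a hypothetical helper, and the 2 x 3 matrix is made up.

```python
INVALID = -100  # from the text: -100 marks an invalid grid


def build_grid_maps(mapped_matrix_int):
    """Return (matrix_to_grid, grid_to_matrix) dictionaries for valid grids.

    Assumption: grid_matrix_id = row * N + column (row-major flat index).
    """
    n_cols = len(mapped_matrix_int[0])
    matrix_to_grid = {}
    for row_idx, row in enumerate(mapped_matrix_int):
        for col_idx, grid_id in enumerate(row):
            if grid_id != INVALID:
                matrix_to_grid[row_idx * n_cols + col_idx] = grid_id
    # Invert the mapping so grid_id can be translated back to a matrix index.
    grid_to_matrix = {g: m for m, g in matrix_to_grid.items()}
    return matrix_to_grid, grid_to_matrix


# Made-up M = 2, N = 3 grid world with two invalid cells.
mapped_matrix_int = [
    [0, 1, -100],
    [-100, 2, 3],
]
matrix_to_grid, grid_to_matrix = build_grid_maps(mapped_matrix_int)
```

In this example the ids coincide for the first two cells and diverge afterwards: grid_matrix_id 4 corresponds to grid_id 2.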