forked from LasseMay/test_pandoc_ci
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
real system perturbation leaderboard text
- Loading branch information
fwiebe
committed
Sep 4, 2024
1 parent
cd0866a
commit 1cc638f
Showing
6 changed files
with
446 additions
and
0 deletions.
There are no files selected for viewing
Binary file added
BIN
+8.04 MB
data/acrobot/real_system_perturbed/ilqr_tvlqr_lqr/experiment09/video.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+8.9 MB
data/pendubot/real_system_perturbed/ilqr_tvlqr_lqr/experiment09/video.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
## Rules | ||
|
||
The real system leaderboard compares the performance of different control | ||
methods on the real hardware. The task for the controller is to swingup and balance | ||
the acrobot and keep the end-effector above the threshold line. | ||
|
||
<div align="center"> | ||
<img width="400" src="https://raw.githubusercontent.com/dfki-ric-underactuated-lab/real_ai_gym_leaderboard/main/data/acrobot/real_system/ilqr_tvlqr_lqr/experiment03/video.gif"> | ||
<img width="400" src="https://raw.githubusercontent.com/dfki-ric-underactuated-lab/real_ai_gym_leaderboard/main/data/acrobot/real_system/MC-PILCO/experiment08/video.gif"> | ||
<figcaption>Videos from left to right: TVLQR, MC-PILCO</figcaption> | ||
</div> | ||
|
||
The model parameters identified by us with a least squares optimization of the | ||
acrobot are: | ||
|
||
- Gravity: $g = 9.81 \, \frac{\text{m}}{\text{s}^2}$ | ||
- First Link mass: $m_1 = 0.523 \, \text{kg}$ | ||
- Second Link mass: $m_2 = 0.625 \, \text{kg}$ | ||
- First actuator length: $l_1 = 0.2 \, \text{m}$ | ||
- Second actuator length: $l_2 = 0.3 \, \text{m}$ | ||
- First link center of mass: $r_1 = 0.2 \, \text{m}$ | ||
- Second link center of mass: $r_2: 0.255 \, \text{m}$ | ||
- First link inertia: $I_1 = 0.031 \, \text{kg m}^2$ | ||
- Second link inertia: $I_2 = 0.050 \, \text{kg m}^2$ | ||
- First actuator damping friction: $b_1 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- Second actuator damping friction: $b_2 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- First actuator coulomb friction: $c_{f1}: 0.16 \, \text{N m}$ | ||
- Second actuator coulomb friction: $c_{f2}: 0.12 \, \text{N m}$ | ||
- Motor Inertia: $I_r = 6.28e-05 \, \text{kg m}^2$ | ||
- Gear ratio: $g_r: 6.0$ | ||
- First Motor Torque limit $\tau_{max1} = 0.0 \, \text{N m}$ | ||
- Second Motor Torque limit $\tau_{max2} = 6.0 \, \text{N m}$ | ||
|
||
More information about the dynamic model of the double pendulum can be found | ||
here: [Double Pendulum | ||
Dynamics](https://dfki-ric-underactuated-lab.github.io/double_pendulum/dynamics.html). | ||
For a urdf file with this model see here: [URDF](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/data/system_identification/identified_parameters/design_C.1/model_1.0). | ||
|
||
The $0.5\,\text{Nm}$ torque limit on the passive joint can be used to compensate the | ||
friction of the motor. | ||
|
||
The actuators can be controlled with arbitrary control frequency of up to $500\, \text{Hz}$ | ||
and the experiment takes $10\,\text{s}$. | ||
The initial acrobot configuration | ||
is $x_0 = (0, 0, 0, 0)$ (hanging down) and the goal is the unstable | ||
fixpoint at the upright configuration $x_g = (\pi, 0, 0, 0)$. | ||
The upright position is considered to be reached when the end-effector is above | ||
the threshold line at $h=0.45 \, \text{m}$ (origin at the mounting point). | ||
|
||
## Scores | ||
|
||
For the evaluation multiple criteria are evaluated and weighted to calculate an | ||
overall score (Real AI Score). The criteria are: | ||
|
||
- **Swingup Success** $c_{success}$: Whether the swingup was successful, i.e. if the | ||
end-effector is above the threshold line in the end of the simulation. | ||
- **Swingup time** $c_{time}$: The time it takes for the acrobot to reach the goal region | ||
above the threhhold line and *stay there*. If the end-effector enters the | ||
goal region but falls below the line before the simulation time is over the | ||
swingup is not considered successful! The swingup time is the time when the | ||
end-effector enters the goal region and does not leave the region until the end. | ||
- **Energy** $c_{energy}$: The mechanical energy used during the execution. | ||
- **Torque Cost** $c_{\tau, cost}$: A quadratic cost on the used torques ( $c_{\tau, cost} = \sum u^TRu$, with R | ||
= 1). | ||
- **Torque Smoothness** $c_{\tau, smooth}$: The standard deviation of the changes in the torque | ||
signal. | ||
- **Velocity Cost** $c_{vel, cost}$: A quadratic cost on the joint velocities that were reached | ||
during the execution ( $c_{vel} = \dot{q}^T Q \dot{q}$, with Q = identity) | ||
|
||
These criteria are used to calculate the overall Real AI Score with the formula | ||
|
||
$$ | ||
\begin{equation} | ||
S = c_{success} \left( 1 - | ||
\sum_{i \in \{ 'time', 'energy', '\tau, cost', '\tau, smooth', 'vel, cost' \}} | ||
\tanh \left(\frac{c_{i}}{n_{i}}\right)\right) | ||
\end{equation} | ||
$$ | ||
|
||
The weights and normalizations are: | ||
|
||
| Criterion | normalization $n$ | | ||
| ------------------|-------------------| | ||
| Swingup Time | 10.0 | | ||
| Energy | 60.0 | | ||
| Torque Cost | 20 | | ||
| Torque Smoothness | 1.0 | | ||
| Velocity Cost | 400.0 | | ||
|
||
The listed number for swingup time, energy, etc. in the leaderboard are the | ||
scores for from the best attempt. The 'Best RealAIScore' is the score of that | ||
attempt. The 'Average RealAIScore' is the average score over 10 attempts. | ||
Unsuccessful swingups (where the end effector is not above the threshhold line | ||
in the end of the experiment (i.e. after $10\,\text{s}$)) have a score of 0. | ||
|
||
## Participating | ||
|
||
If you want to participate in this leaderboard with your own controller have a | ||
look at the [leaderboard | ||
explanation](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/leaderboard/real_hardware/acrobot) | ||
in the double pendulum repository. We recommend submitting the controller | ||
first to the acrobot [simulation | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/acrobot_simulation_performance_leaderboard.html) | ||
and the [robustness | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/acrobot_simulation_robustness_leaderboard.html). | ||
Experiments with the real hardware can be conducted remotely. Please contact | ||
[email protected], [email protected] or [email protected] for details | ||
and scheduling. The leaderboard is automatically | ||
periodically updated based on the recorded data which is uploaded to this leaderboard repository. |
114 changes: 114 additions & 0 deletions
114
leaderboards/acrobot_real_system_perturbation_leaderboard_v2.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
## Rules | ||
|
||
The real system leaderboard compares the performance of different control | ||
methods on the real hardware. The task for the controller is to swingup and balance | ||
the acrobot and keep the end-effector above the threshold line. | ||
|
||
During the execution external perturbations in the form of Gaußian torque peaks | ||
have been applied by the motors at both links. In total there were 4 | ||
perturbations, 2 at each joint, with a standard deviation between 0.05s and | ||
0.1s and an amplitude between 0.5Nm and 0.75Nm. The perturbations are randomly | ||
generated and different in all 10 trials. | ||
|
||
<div align="center"> | ||
<img width="400" src="https://raw.githubusercontent.com/dfki-ric-underactuated-lab/real_ai_gym_leaderboard/main/data/acrobot/real_system_perturbed/ilqr_tvlqr_lqr/experiment09/video.gif"> | ||
<figcaption>Videos from left to right: TVLQR</figcaption> | ||
</div> | ||
|
||
The model parameters identified by us with a least squares optimization of the | ||
acrobot are: | ||
|
||
- Gravity: $g = 9.81 \, \frac{\text{m}}{\text{s}^2}$ | ||
- First Link mass: $m_1 = 0.523 \, \text{kg}$ | ||
- Second Link mass: $m_2 = 0.625 \, \text{kg}$ | ||
- First actuator length: $l_1 = 0.2 \, \text{m}$ | ||
- Second actuator length: $l_2 = 0.3 \, \text{m}$ | ||
- First link center of mass: $r_1 = 0.2 \, \text{m}$ | ||
- Second link center of mass: $r_2: 0.255 \, \text{m}$ | ||
- First link inertia: $I_1 = 0.031 \, \text{kg m}^2$ | ||
- Second link inertia: $I_2 = 0.050 \, \text{kg m}^2$ | ||
- First actuator damping friction: $b_1 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- Second actuator damping friction: $b_2 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- First actuator coulomb friction: $c_{f1}: 0.16 \, \text{N m}$ | ||
- Second actuator coulomb friction: $c_{f2}: 0.12 \, \text{N m}$ | ||
- Motor Inertia: $I_r = 6.28e-05 \, \text{kg m}^2$ | ||
- Gear ratio: $g_r: 6.0$ | ||
- First Motor Torque limit $\tau_{max1} = 0.0 \, \text{N m}$ | ||
- Second Motor Torque limit $\tau_{max2} = 6.0 \, \text{N m}$ | ||
|
||
More information about the dynamic model of the double pendulum can be found | ||
here: [Double Pendulum | ||
Dynamics](https://dfki-ric-underactuated-lab.github.io/double_pendulum/dynamics.html). | ||
For a urdf file with this model see here: [URDF](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/data/system_identification/identified_parameters/design_C.1/model_1.0). | ||
|
||
The $0.5\,\text{Nm}$ torque limit on the passive joint can be used to compensate the | ||
friction of the motor. | ||
|
||
The actuators can be controlled with arbitrary control frequency of up to $500\, \text{Hz}$ | ||
and the experiment takes $10\,\text{s}$. | ||
The initial acrobot configuration | ||
is $x_0 = (0, 0, 0, 0)$ (hanging down) and the goal is the unstable | ||
fixpoint at the upright configuration $x_g = (\pi, 0, 0, 0)$. | ||
The upright position is considered to be reached when the end-effector is above | ||
the threshold line at $h=0.45 \, \text{m}$ (origin at the mounting point). | ||
|
||
## Scores | ||
|
||
For the evaluation multiple criteria are evaluated and weighted to calculate an | ||
overall score (Real AI Score). The criteria are: | ||
|
||
- **Swingup Success** $c_{success}$: Whether the swingup was successful, i.e. if the | ||
end-effector is above the threshold line in the end of the simulation. | ||
- **Swingup time** $c_{time}$: The time it takes for the acrobot to reach the goal region | ||
above the threhhold line and *stay there*. If the end-effector enters the | ||
goal region but falls below the line before the simulation time is over the | ||
swingup is not considered successful! The swingup time is the time when the | ||
end-effector enters the goal region and does not leave the region until the end. | ||
- **Energy** $c_{energy}$: The mechanical energy used during the execution. | ||
- **Torque Cost** $c_{\tau, cost}$: A quadratic cost on the used torques ( $c_{\tau, cost} = \sum u^TRu$, with R | ||
= 1). | ||
- **Torque Smoothness** $c_{\tau, smooth}$: The standard deviation of the changes in the torque | ||
signal. | ||
- **Velocity Cost** $c_{vel, cost}$: A quadratic cost on the joint velocities that were reached | ||
during the execution ( $c_{vel} = \dot{q}^T Q \dot{q}$, with Q = identity) | ||
|
||
These criteria are used to calculate the overall Real AI Score with the formula | ||
|
||
$$ | ||
\begin{equation} | ||
S = c_{success} \left( 1 - | ||
\sum_{i \in \{ 'time', 'energy', '\tau, cost', '\tau, smooth', 'vel, cost' \}} | ||
\tanh \left(\frac{c_{i}}{n_{i}}\right)\right) | ||
\end{equation} | ||
$$ | ||
|
||
The weights and normalizations are: | ||
|
||
| Criterion | normalization $n$ | | ||
| ------------------|-------------------| | ||
| Swingup Time | 20.0 | | ||
| Energy | 60.0 | | ||
| Torque Cost | 20 | | ||
| Torque Smoothness | 1.0 | | ||
| Velocity Cost | 400.0 | | ||
|
||
The listed number for swingup time, energy, etc. in the leaderboard are the | ||
scores for from the best attempt. The 'Best RealAIScore' is the score of that | ||
attempt. The 'Average RealAIScore' is the average score over 10 attempts. | ||
Unsuccessful swingups (where the end effector is not above the threshhold line | ||
in the end of the experiment (i.e. after $10\,\text{s}$)) have a score of 0. | ||
|
||
## Participating | ||
|
||
If you want to participate in this leaderboard with your own controller have a | ||
look at the [leaderboard | ||
explanation](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/leaderboard/real_hardware/acrobot) | ||
in the double pendulum repository. We recommend submitting the controller | ||
first to the acrobot [simulation | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/acrobot_simulation_performance_leaderboard.html) | ||
and the [robustness | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/acrobot_simulation_robustness_leaderboard.html). | ||
Experiments with the real hardware can be conducted remotely. Please contact | ||
[email protected], [email protected] or [email protected] for details | ||
and scheduling. The leaderboard is automatically | ||
periodically updated based on the recorded data which is uploaded to this leaderboard repository. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
## Rules | ||
|
||
The real system leaderboard compares the performance of different control | ||
methods on the real hardware. The task for the controller is to swingup and balance | ||
the pendubot and keep the end-effector above the threshold line. | ||
|
||
<div align="center"> | ||
<img width="400" src="https://raw.githubusercontent.com/dfki-ric-underactuated-lab/real_ai_gym_leaderboard/main/data/pendubot/real_system/ilqr_tvlqr_lqr/experiment03/video.gif"> | ||
<img width="400" src="https://raw.githubusercontent.com/dfki-ric-underactuated-lab/real_ai_gym_leaderboard/main/data/pendubot/real_system/MC-PILCO/experiment08/video.gif"> | ||
<figcaption>Videos from left to right: TVLQR, MC-PILCO</figcaption> | ||
</div> | ||
|
||
The model parameters identified by us with a least squares optimization of the | ||
pendubot are: | ||
|
||
- Gravity: $g = 9.81 \, \frac{\text{m}}{\text{s}^2}$ | ||
- First Link mass: $m_1 = 0.523 \, \text{kg}$ | ||
- Second Link mass: $m_2 = 0.625 \, \text{kg}$ | ||
- First actuator length: $l_1 = 0.2 \, \text{m}$ | ||
- Second actuator length: $l_2 = 0.3 \, \text{m}$ | ||
- First link center of mass: $r_1 = 0.2 \, \text{m}$ | ||
- Second link center of mass: $r_2: 0.255 \, \text{m}$ | ||
- First link inertia: $I_1 = 0.031 \, \text{kg m}^2$ | ||
- Second link inertia: $I_2 = 0.050 \, \text{kg m}^2$ | ||
- First actuator damping friction: $b_1 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- Second actuator damping friction: $b_2 = 0.001 \, \frac{\text{kg m}}{s}$ | ||
- First actuator coulomb friction: $c_{f1}: 0.16 \, \text{N m}$ | ||
- Second actuator coulomb friction: $c_{f2}: 0.12 \, \text{N m}$ | ||
- Motor Inertia: $I_r = 6.28e-05 \, \text{kg m}^2$ | ||
- Gear ratio: $g_r: 6.0$ | ||
- First Motor Torque limit $\tau_{max1} = 0.0 \, \text{N m}$ | ||
- Second Motor Torque limit $\tau_{max2} = 6.0 \, \text{N m}$ | ||
|
||
More information about the dynamic model of the double pendulum can be found | ||
here: [Double Pendulum | ||
Dynamics](https://dfki-ric-underactuated-lab.github.io/double_pendulum/dynamics.html). | ||
For a urdf file with this model see here: [URDF](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/data/system_identification/identified_parameters/design_C.1/model_1.0). | ||
|
||
The $0.5\,\text{Nm}$ torque limit on the passive joint can be used to compensate the | ||
friction of the motor. | ||
|
||
The actuators can be controlled with arbitrary control frequency of up to $500\, \text{Hz}$ | ||
and the experiment takes $10\,\text{s}$. | ||
The initial pendubot configuration | ||
is $x_0 = (0, 0, 0, 0)$ (hanging down) and the goal is the unstable | ||
fixpoint at the upright configuration $x_g = (\pi, 0, 0, 0)$. | ||
The upright position is considered to be reached when the end-effector is above | ||
the threshold line at $h=0.45 \, \text{m}$ (origin at the mounting point). | ||
|
||
## Scores | ||
|
||
For the evaluation multiple criteria are evaluated and weighted to calculate an | ||
overall score (Real AI Score). The criteria are: | ||
|
||
- **Swingup Success** $c_{success}$: Whether the swingup was successful, i.e. if the | ||
end-effector is above the threshold line in the end of the simulation. | ||
- **Swingup time** $c_{time}$: The time it takes for the pendubot to reach the goal region | ||
above the threhhold line and *stay there*. If the end-effector enters the | ||
goal region but falls below the line before the simulation time is over the | ||
swingup is not considered successful! The swingup time is the time when the | ||
end-effector enters the goal region and does not leave the region until the end. | ||
- **Energy** $c_{energy}$: The mechanical energy used during the execution. | ||
- **Torque Cost** $c_{\tau, cost}$: A quadratic cost on the used torques ( $c_{\tau, cost} = \sum u^TRu$, with R | ||
= 1). | ||
- **Torque Smoothness** $c_{\tau, smooth}$: The standard deviation of the changes in the torque | ||
signal. | ||
- **Velocity Cost** $c_{vel, cost}$: A quadratic cost on the joint velocities that were reached | ||
during the execution ( $c_{vel} = \dot{q}^T Q \dot{q}$, with Q = identity) | ||
|
||
These criteria are used to calculate the overall Real AI Score with the formula | ||
|
||
$$ | ||
\begin{equation} | ||
S = c_{success} \left( 1 - | ||
\sum_{i \in \{ 'time', 'energy', '\tau, cost', '\tau, smooth', 'vel, cost' \}} | ||
\tanh \left(\frac{c_{i}}{n_{i}}\right)\right) | ||
\end{equation} | ||
$$ | ||
|
||
The weights and normalizations are: | ||
|
||
| Criterion | normalization $n$ | | ||
| ------------------|-------------------| | ||
| Swingup Time | 10.0 | | ||
| Energy | 60.0 | | ||
| Torque Cost | 20 | | ||
| Torque Smoothness | 1.0 | | ||
| Velocity Cost | 400.0 | | ||
|
||
The listed number for swingup time, energy, etc. in the leaderboard are the | ||
scores for from the best attempt. The 'Best RealAIScore' is the score of that | ||
attempt. The 'Average RealAIScore' is the average score over 10 attempts. | ||
Unsuccessful swingups (where the end effector is not above the threshhold line | ||
in the end of the experiment (i.e. after $10\,\text{s}$)) have a score of 0. | ||
|
||
## Participating | ||
|
||
If you want to participate in this leaderboard with your own controller have a | ||
look at the [leaderboard | ||
explanation](https://github.com/dfki-ric-underactuated-lab/double_pendulum/tree/main/leaderboard/real_hardware/pendubot) | ||
in the double pendulum repository. We recommend submitting the controller | ||
first to the pendubot [simulation | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/pendubot_simulation_performance_leaderboard.html) | ||
and the [robustness | ||
leaderboard](https://dfki-ric-underactuated-lab.github.io/real_ai_gym_leaderboard/pendubot_simulation_robustness_leaderboard.html). | ||
Experiments with the real hardware can be conducted remotely. Please contact | ||
[email protected], [email protected] or [email protected] for details | ||
and scheduling. The leaderboard is automatically | ||
periodically updated based on the recorded data which is uploaded to this leaderboard repository. |
Oops, something went wrong.