<!DOCTYPE html>
<html>
<head>
<title>Ch. 18 - System Identification</title>
<meta name="Ch. 18 - System Identification" content="text/html; charset=utf-8;" />
<link rel="canonical" href="http://underactuated.mit.edu/sysid.html" />
<script src="https://hypothes.is/embed.js" async></script>
<script type="text/javascript" src="chapters.js"></script>
<script type="text/javascript" src="htmlbook/book.js"></script>
<script src="htmlbook/mathjax-config.js" defer></script>
<script type="text/javascript" id="MathJax-script" defer
src="htmlbook/MathJax/es5/tex-chtml.js">
</script>
<script>window.MathJax || document.write('<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js" defer><\/script>')</script>
<link rel="stylesheet" href="htmlbook/highlight/styles/default.css">
<script src="htmlbook/highlight/highlight.pack.js"></script> <!-- http://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#language-names-and-aliases -->
<script>hljs.initHighlightingOnLoad();</script>
<link rel="stylesheet" type="text/css" href="htmlbook/book.css" />
</head>
<body onload="loadChapter('underactuated');">
<div data-type="titlepage">
<header>
<h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a></h1>
<p data-type="subtitle">Algorithms for Walking, Running, Swimming, Flying, and Manipulation</p>
<p style="font-size: 18px;"><a href="http://people.csail.mit.edu/russt/">Russ Tedrake</a></p>
<p style="font-size: 14px; text-align: right;">
© Russ Tedrake, 2023<br/>
Last modified <span id="last_modified"></span>.<br/>
<script>
var d = new Date(document.lastModified);
document.getElementById("last_modified").innerHTML = d.getFullYear() + "-" + (d.getMonth()+1) + "-" + d.getDate();</script>
<a href="misc.html">How to cite these notes, use annotations, and give feedback.</a><br/>
</p>
</header>
</div>
<p><b>Note:</b> These are working notes used for <a
href="https://underactuated.csail.mit.edu/Spring2023/">a course being taught
at MIT</a>. They will be updated throughout the Spring 2023 semester. <a
href="https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg">Lecture videos are available on YouTube</a>.</p>
<table style="width:100%;"><tr style="width:100%">
<td style="width:33%;text-align:left;"><a class="previous_chapter" href=contact.html>Previous Chapter</a></td>
<td style="width:33%;text-align:center;"><a href=index.html>Table of contents</a></td>
<td style="width:33%;text-align:right;"><a class="next_chapter" href=state_estimation.html>Next Chapter</a></td>
</tr></table>
<script type="text/javascript">document.write(notebook_header('sysid'))
</script>
<!-- EVERYTHING ABOVE THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->
<chapter style="counter-reset: chapter 17"><h1>System Identification</h1>
<p>My primary focus in these notes has been to build algorithms that design
or analyze a control system <i>given a model</i> of the plant. In fact, we
have in some places gone to great lengths to understand the structure in our
models (the structure of the manipulator equations in particular) and tried
to write algorithms which exploit that structure.</p>
<p>Our ambitions for our robots have grown over recent years to where it
makes sense to question this assumption. If we want to program a robot to
fold laundry, spread peanut butter on toast, or make a salad, then we should
absolutely not assume that we are simply given a model (and the ability to
estimate the state of that model). This has led many researchers to focus on
the "model-free" approaches to optimal control that are popular in
reinforcement learning. But I worry that the purely model-free approach is
"throwing the baby out with the bathwater". We have fantastic tools for
making long-term decisions given a model; the model-free approaches are
correspondingly much much weaker.</p>
<p>So in this chapter I would like to cover the problem of learning a model.
This is far from a new problem. The field of "system identification" is as
old as controls itself, but new results from machine learning have added
significantly to our algorithms and our analyses, especially in the
high-dimensional and finite-sample regimes. But well before the recent
machine learning boom, system identification as a field had a very strong
foundation with thorough statistical understanding of the basic algorithms,
at least in the asymptotic regime (the limit where the amount of data goes to
infinity). Machine learning theory has brought a wealth of new results in
the online optimization and finite-sample regimes. My goal for this chapter
is to establish these foundations, and to provide some pointers so that you
can learn more about this rich topic.</p>
<section><h1>Problem formulation</h1>
<subsection><h1>Equation error vs simulation error</h1>
<figure><img width="80%" src="data/sysid.svg"/></figure>
<p>Our problem formulation inevitably begins with the data. In practice,
if we have access to a physical system, instrumented using digital
electronics, then we have a system in which we can apply input commands,
$\bu_n$, at some discrete rate, and measure the outputs, $\by_n$ of the
system at some discrete rate. We normally assume these rates are fixed,
and often attempt to fit a state-space model of the form \begin{equation}
\bx[n+1] = f_\balpha(\bx[n], \bu[n]), \qquad \by[n] =
g_\balpha(\bx[n],\bu[n]), \label{eq:ss_model}\end{equation} where I have
used $\balpha$ again here to indicate a vector of parameters. In this
setting, a natural formulation is to minimize a least-squares estimation
objective: $$\min_{\alpha,\bx[0]} \sum_{n=0}^{N-1} \| \by[n] - \by_n
\|^2_2, \qquad \subjto \, (\ref{eq:ss_model}).$$ I have written purely
deterministic models to start, but in general we expect both the state
evolution and the measurements to have randomness. Sometimes, as I
have written, we fit a deterministic model to the data and rely on our
least-squares objective to capture the errors; more generally we will look
at fitting stochastic models to the data.</p>
<p>We often separate the identification procedure into two parts, where we
first estimate the state $\hat{\bx}_n$ given the input-output data $\bu_n,
\by_n$, and then focus on estimating the state-evolution dynamics in a
second step. The dynamics estimation algorithms fall into two main
categories: <ul><li><i>Equation error</i> minimizes only the one-step
prediction error: $$\min_{\alpha} \sum_{n=0}^{N-2} \| f_\balpha(\hat\bx_n,
\bu_n) - \hat{\bx}_{n+1} \|^2_2.$$ </li><li><i>Simulation error</i>
captures the long-term prediction error: $$\min_{\alpha} \sum_{n=1}^{N-1}
\| \bx[n] - \hat{\bx}_n \|^2_2, \qquad \subjto \quad \bx[n+1] =
f_\balpha(\bx[n], \bu_n),\, \bx[0] = \hat\bx_0,$$ </li></ul> The
equation-error formulations often result in much more tractable
optimization problems, but unfortunately we will see that optimizing the
one-step error can still result in arbitrarily large simulation errors.
Therefore, we generally consider the simulation error to be the true
objective we hope to optimize, and the equation error only as a potentially
useful surrogate.</p>
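<p>To make the distinction concrete, here is a minimal sketch (not from the text) comparing the two objectives on a hypothetical scalar linear model: the equation-error fit is an ordinary least-squares problem, while evaluating the simulation error requires rolling the fitted model forward from the initial state.</p>

```python
import numpy as np

# Hypothetical scalar system x[n+1] = a*x[n] + b*u[n] + process noise.
rng = np.random.default_rng(0)
a_true, b_true = 0.95, 0.5
N = 200
u = rng.standard_normal(N)
x = np.zeros(N + 1)
for n in range(N):
    x[n + 1] = a_true * x[n] + b_true * u[n] + 0.01 * rng.standard_normal()

# Equation error: one-step prediction error, linear in the parameters.
W = np.column_stack([x[:N], u])
a_hat, b_hat = np.linalg.lstsq(W, x[1:], rcond=None)[0]

# Simulation error: roll the fitted model forward and compare trajectories.
x_sim = np.zeros(N + 1)
x_sim[0] = x[0]
for n in range(N):
    x_sim[n + 1] = a_hat * x_sim[n] + b_hat * u[n]

eq_err = np.sum((W @ np.array([a_hat, b_hat]) - x[1:]) ** 2)
sim_err = np.sum((x_sim[1:] - x[1:]) ** 2)
print(eq_err, sim_err)
```

<p>Because the process noise accumulates over the rollout, the simulation error typically exceeds the one-step equation error even when the parameter estimates are accurate.</p>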
<p>Within this framework of model learning, we will face all of the
standard questions from supervised learning about sample efficiency and
generalization.</p>
<!--
<subsection><h1>Preprocessing your data</h1>
<p>Many of the techniques below can benefit from some basic data
preprocessing techniques. The <a
href="https://www.mathworks.com/help/ident/ug/ways-to-prepare-data-for-system-identification.html">MATLAB
System Identification Toolbox documentation</a> has some useful advice
(with emphasis on the fitting of linear models). </p>
</subsection>
-->
</subsection>
<subsection><h1>Online optimization</h1>
<p>Theoretical machine learning has brought a number of fresh
perspectives to this old problem of system identification. Perhaps most
relevant is the perspective of online optimization
<elib>Rakhlin14+Hazan16+Rakhlin22</elib>. In addition to changing the
formulation to considering streaming online data (as opposed to offline,
batch analysis), this field has (re)popularized the notion of an old idea
from decision theory: using <a
href="https://en.wikipedia.org/wiki/Regret_(decision_theory)"><i>regret</i></a>
as the loss function for identification.</p>
<p>In the regret formulation, we acknowledge that the class of
parameterized models that we are searching over can likely not perfectly
represent the input-output data. Rather than score ourselves directly on
the prediction error, we can score ourselves relative to the predictions
that would be made by best parameters in our model class. For
equation-error formulations, the regret $R$ at step $N$ would take the
form $$R[N] = \sum_{n=0}^{N-2} \| f_\balpha(\hat\bx_n, \bu_n) -
\hat{\bx}_{n+1} \|^2_2 - \min_{\alpha^*} \sum_{n=0}^{N-2} \|
f_{\alpha^*}(\hat\bx_n, \bu_n) - \hat{\bx}_{n+1} \|^2_2.$$ The goal is to
produce an online learning algorithm to select $\alpha$ at each step in
order to drive the regret to zero. This turns out to be an important
distinction, as we will see below.</p>
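<p>As a toy illustration (a hypothetical scalar example, not from the text), we can measure the regret of a simple "follow-the-leader" online least-squares learner against the best fixed parameter in hindsight:</p>

```python
import numpy as np

# Hypothetical one-parameter model: predict y[n] = alpha * x[n].
rng = np.random.default_rng(1)
alpha_true = 0.8
N = 500
x = rng.standard_normal(N)                          # regressors
y = alpha_true * x + 0.1 * rng.standard_normal(N)   # targets

online_loss = 0.0
sxx, sxy = 1e-6, 0.0           # running sums for the recursive estimate
for n in range(N):
    alpha_n = sxy / sxx        # least-squares estimate from data seen so far
    online_loss += (alpha_n * x[n] - y[n]) ** 2
    sxx += x[n] ** 2
    sxy += x[n] * y[n]

alpha_star = (x @ y) / (x @ x)  # best fixed parameter in hindsight
best_loss = np.sum((alpha_star * x - y) ** 2)
regret = online_loss - best_loss
print(regret / N)               # average regret shrinks as N grows
```

<p>Driving the average regret $R[N]/N$ to zero does not require the model class to contain the true system; it only asks that we eventually predict as well as the best model in the class.</p>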
</subsection>
<subsection><h1>Learning models for control</h1>
<p>So far we've described the problem purely as capturing
the input-output behavior of the system. If our goal is to use this model
for control, then that might not be quite the right objective for a
number of reasons.</p>
<p>First off, predicting the observations might be <i>more</i> than we
need for making optimal decisions; <elib>Zhang20a</elib> tells that story
nicely. Imagine you are trying to swing-up and balance a cart-pole
system, where the only sensor is a camera. The background images are
irrelevant; we know that only the state of the cart-pole should matter.
If there was a movie playing on a screen in the background, you don't
want to spend all of the representational power of your model predicting
the next frame of the movie, at the cost of not predicting as well the
next state of the cart-pole. These challenges have led to a nice active
area of research on learning <i>task-relevant</i> models/representations;
I'll devote a section to it below after we cover the basics.</p>
<p>There are other considerations, as well. To be useful for control,
we'll need the state of our model to be observable (from online
observations) and controllable. There may also be trade-offs in model
complexity. If we learn just a linear model of the dynamics then we have
very powerful control design tools available, but we might not capture
the rich dynamics of the world. If we learn too complex of a model, then
our online planning and control techniques might become a bottleneck.
<elib>Watter15</elib> makes that point nicely. At the time of this
writing, I think it's broadly fair to say that deep learning models are
able to describe incredibly rich dynamics, but that our control tools for
making decisions with these models are still painfully weak.</p>
</subsection>
</section>
<section id="lumped"><h1>Parameter Identification for Mechanical Systems</h1>
<p>My primary focus throughout these notes is on (underactuated) mechanical
systems, but when it comes to identification there is an important
distinction to make. For some mechanical systems we know the structure of
the model, including number of state variables and the topology of the
kinematic tree. Legged robots like Spot or Atlas are good examples here --
the dynamics are certainly nontrivial, but the general form of the
equations are known. In this case, the task of identification is really
the task of estimating the parameters in a structured model. That is the
subject of this section.</p>
<p>The examples of folding laundry or making a salad fall into a different
category. In those examples, I might not even know a priori the number of
state variables needed to provide a reasonable description of the behavior.
That will force a more general examination of the identification
problem, which we will explore in the remaining sections.</p>
<p>Let's start with the problem of identifying a canonical underactuated
mechanical system, like an Acrobot, Cart-Pole or Quadrotor, where we know
the structure of the equations, but just need to fit the parameters. We
will further assume that we have the ability to directly observe all of the
state variables, albeit with noisy measurements (e.g. from joint sensors
and/or inertial measurement units). The stronger <a
href="state_estimation.html">state estimation algorithms</a> that we
will discuss soon assume a model, so we typically do not use them directly
here.</p>
<p>Consider taking a minute to review the <a
href="multibody.html#double_pendulum">example of deriving the manipulator
equations for the double pendulum</a> before we continue.</p>
<subsection><h1>Kinematic parameters and calibration</h1>
<p>We can separate the parameters in the multibody equations again into
kinematic parameters and dynamic parameters. The kinematic parameters,
like link lengths, describe the coordinate transformation from one joint
to another joint in the kinematic tree. It is certainly possible to
write an optimization procedure to calibrate these parameters; you can
find a fairly thorough discussion in e.g. Chapter 11 of
<elib>Khalil04</elib>. But I guess I'm generally of the opinion that if
you don't have accurate estimates of your link lengths, then you should
probably invest in a tape measure before you invest in nonlinear
optimization.</p>
<p>One notable exception to this is calibration with respect to joint
offsets. This one can be a real nuisance in practice. Joint sensors can
slip, and some robots even use relative rotary encoders, and rely on
driving the joint to some known hard joint limit each time the robot is
powered on in order to obtain the offset. I've worked on one humanoid
robot that had a quite elaborate and painful kinematic calibration
procedure which involve fitting additional hardware over the joints to
ensure they were in a known location and then running a script. Having a
an expensive and/or unreliable calibration procedure can put a damper on
any robotics project. For underactuated systems, in particular, it can
have a dramatic effect on performance.</p>
<example><h1>Acrobot balancing with calibration error</h1>
<p>Small kinematic calibration errors can lead to large steady-state
errors when attempting to stabilize a system like the Acrobot. I've put together a simple notebook to show the effect here:</p>
<script>document.write(notebook_link('sysid'))</script>
<todo>Make a plot of steady-state error as a function of offset error.</todo>
<p>Our tools from robust / stochastic control are well-suited to
identifying (and bounding / minimizing) these sensitivities, at least
for the linearized model we use in LQR.</p>
</example>
<p>The general approach to estimating joint offsets from data is to write
the equations with the joint offset as a parameter, e.g. for the <a
href="multibody.html#double_pendulum">double pendulum</a> we would write
the forward kinematics as: $${\bf p}_1 =l_1\begin{bmatrix} \sin(\theta_1
+ \bar\theta_1) \\ - \cos(\theta_1 + \bar\theta_1) \end{bmatrix},\quad
{\bf p}_2 = {\bf p}_1 + l_2\begin{bmatrix} \sin(\theta_1 +
\bar\theta_1 + \theta_2 + \bar\theta_2) \\ - \cos(\theta_1 + \bar\theta_1
+ \theta_2 + \bar\theta_2) \end{bmatrix}.$$ We try to obtain independent
measurements of the end-effector position (e.g. from motion capture, from
perhaps some robot-mounted cameras, or from some mechanical calibration
rig) with their corresponding joint measurements, to obtain data points
of the form $ \langle {\bf p}_2, \theta_1, \theta_2 \rangle$. Then we can
solve a small nonlinear optimization problem to estimate the joint
offsets to minimize a least-squares residual.</p>
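<p>A minimal sketch of that final step, assuming unit link lengths and hypothetical offset values, using <code>scipy.optimize.least_squares</code> on noiseless synthetic measurements:</p>

```python
import numpy as np
from scipy.optimize import least_squares

l1, l2 = 1.0, 1.0                         # assumed (measured) link lengths
offsets_true = np.array([0.05, -0.03])    # hypothetical unknown joint offsets

def p2(theta1, theta2, off):
    # Forward kinematics of the double pendulum with joint offsets.
    a1 = theta1 + off[0]
    a2 = a1 + theta2 + off[1]
    return np.array([l1 * np.sin(a1) + l2 * np.sin(a2),
                     -l1 * np.cos(a1) - l2 * np.cos(a2)])

# Synthetic calibration data: <p2, theta1, theta2> triples.
rng = np.random.default_rng(2)
thetas = rng.uniform(-1, 1, size=(20, 2))
meas = np.array([p2(t1, t2, offsets_true) for t1, t2 in thetas])

def residual(off):
    pred = np.array([p2(t1, t2, off) for t1, t2 in thetas])
    return (pred - meas).ravel()

sol = least_squares(residual, x0=np.zeros(2))
print(sol.x)   # recovers offsets_true
```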
<p>If independent measurements of the kinematics are not available, it is
possible to estimate the offsets along with the dynamic parameters, using
the trigonometric identities, e.g. $s_{\theta + \bar\theta} = s_\theta
c_\bar\theta + c_\theta s_\bar\theta,$ and then including the
$s_\bar\theta, c_\bar\theta$ terms (separately) in the "lumped
parameters" we discuss below.</p>
</subsection>
<subsection><h1>Least-squares formulation (of the inverse dynamics).</h1>
<p>Now let's think about estimating the dynamic parameters of a
multibody system. We've been writing the manipulator equations in the
form: \begin{equation}\bM({\bq})\ddot{\bq} + \bC(\bq,\dot{\bq})\dot\bq =
\btau_g(\bq) + \bB\bu + \text{friction, etc.}\end{equation} Each of the
terms in this equation can depend on the parameters $\balpha$ that we're
trying to estimate. But the parameters enter the multibody equations in
a particular structured way: the equations are
<i>affine in the <b>lumped parameters</b></i>. More precisely, the
manipulator equations above can be factored into the form $${\bf
W}(\bq,\dot{\bq}, \ddot{\bq}, \bu) \balpha_l(\balpha) +
\bw_0(\bq,\dot{\bq}, \ddot{\bq}, \bu) = 0,$$ where $\balpha_l$ are the
"lumped parameters". We sometimes refer to ${\bf W}$ as the "data matrix".</p>
<example><h1>Lumped parameters for the simple pendulum</h1>
<p>The now familiar equations of motion for the simple pendulum are
$$ml^2 \ddot\theta + b \dot\theta + mgl\sin\theta = \tau.$$ For
parameter estimation, we will factor this into $$\begin{bmatrix}
\ddot\theta & \dot\theta & \sin\theta \end{bmatrix} \begin{bmatrix}
ml^2 \\ b \\ mgl \end{bmatrix} - \tau = 0.$$ The terms $ml^2$, $b$,
and $mgl$ together form the "lumped parameters".</p>
</example>
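<p>A minimal sketch of the corresponding least-squares fit, with hypothetical parameter values and synthetic (noiseless) trajectory data; each sample contributes one row of the data matrix:</p>

```python
import numpy as np

# Hypothetical pendulum parameters; the true lumped parameters are
# [m*l^2, b, m*g*l] = [0.25, 0.1, 4.905].
m, l, b, g = 1.0, 0.5, 0.1, 9.81
alpha_true = np.array([m * l**2, b, m * g * l])

dt, N = 1e-3, 5000
theta, thetadot = 0.1, 0.0
rows, taus = [], []
for n in range(N):
    tau = np.sin(0.01 * n)     # a simple exciting input
    thetaddot = (tau - b * thetadot - m * g * l * np.sin(theta)) / (m * l**2)
    rows.append([thetaddot, thetadot, np.sin(theta)])  # one row of W
    taus.append(tau)
    thetadot += dt * thetaddot  # semi-implicit Euler integration
    theta += dt * thetadot

W = np.array(rows)
alpha_hat = np.linalg.lstsq(W, np.array(taus), rcond=None)[0]
print(alpha_hat)   # ≈ [0.25, 0.1, 4.905]
```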
<p>It is worth taking a moment to reflect on this factorization. First
of all, it does represent a somewhat special statement about the
multibody equations: the nonlinearities enter only in a particular way.
For instance, if I had terms in the equations of the form, $\sin(m
\theta)$, then I would <i>not</i> be able to produce an affine
decomposition separating $m$ from $\theta$. Fortunately, that doesn't
happen in our mechanical systems <elib>Khalil04</elib>. Furthermore,
this structure is particular to the <i>inverse dynamics</i>, as we have
written here. If you were to write the forward dynamics, multiplying by
$\bM^{-1}$ in order to solve for $\ddot{\bq}$, then once again you would
destroy this affine structure.</p>
<p>This is super interesting! It is tempting to think about parameter
estimation for general dynamical systems in our standard state-space
form: $\bx[n+1] = f_\balpha(\bx[n], \bu[n]).$ But for multibody systems,
it seems that this would be the wrong thing to do, as it destroys this
beautiful affine structure.</p>
<example><h1>Multibody parameters in <drake></drake></h1>
<p>Very few robotics simulators have any way for you to access the
parameters of the dynamics. In Drake, we explicitly declare all of the
parameters of a multibody system in a separate data structure to make
them available, and we can leverage Drake's symbolic engine to extract
and manipulate the equations with respect to those variables.
</p>
<p>As a simple example, I've loaded the cart-pole system model from
URDF, created a symbolic version of the <code>MultibodyPlant</code>,
and populated the <code>Context</code> with symbolic variables for the
quantities of interest. Then I can evaluate the (inverse) dynamics in
order to obtain my equations.</p>
<script>document.write(notebook_link('sysid'))</script>
<p>The output looks like: <pre><code>Symbolic dynamics:
(0.10000000000000001 * v(0) - u(0) + (pow(v(1), 2) * mp * l * sin(q(1))) + (vd(0) * mc) + (vd(0) * mp) - (vd(1) * mp * l * cos(q(1))))
(0.10000000000000001 * v(1) - (vd(0) * mp * l * cos(q(1))) + (vd(1) * mp * pow(l, 2)) + 9.8100000000000005 * (mp * l * sin(q(1))))
</code></pre></p>
<p>Go ahead and compare these with the <a
href="acrobot.html#cart_pole">cart-pole equations</a> that we derived
by hand.</p>
<p>Drake offers a method <a
href="https://drake.mit.edu/doxygen_cxx/namespacedrake_1_1symbolic.html#ae8c85e424b3109ed84a5bb309238bc3c"><code>DecomposeLumpedParameters</code></a>
that will take this expression and factor it into the affine expression
above. For this cart-pole example, it extracts the lumped parameters
$[ m_c + m_p, m_p l, m_p l^2 ].$</p>
</example>
<p>The existence of the lumped-parameter decomposition reveals that the
<i>equation error</i> for lumped-parameter estimation, with the error
taken in the torque coordinates, can be solved using least squares. As
such, we can leverage all of the strong results and variants from linear
estimation. For instance, we can add terms to regularize the estimate
(e.g. to stay close to an initial guess), and we can write efficient
recursive estimators for optimal online estimation of the parameters
using recursive least-squares. My favorite recursive least-squares
algorithm uses incremental QR factorization<elib>Kaess08</elib>.</p>
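<p>As a sketch of online estimation, here is the standard recursive least-squares update (not the incremental-QR variant of <elib>Kaess08</elib>) applied to a hypothetical two-parameter problem with synthetic rows of the data matrix:</p>

```python
import numpy as np

class RecursiveLeastSquares:
    """Textbook recursive least-squares with a large prior covariance."""

    def __init__(self, num_params, prior_covariance=1e6):
        self.alpha = np.zeros(num_params)
        self.P = prior_covariance * np.eye(num_params)

    def update(self, w_row, target):
        # w_row: one row of the data matrix W; target: measured torque term.
        w = np.asarray(w_row, dtype=float)
        k = self.P @ w / (1.0 + w @ self.P @ w)       # gain vector
        self.alpha += k * (target - w @ self.alpha)   # innovation update
        self.P -= np.outer(k, w @ self.P)             # covariance update

# Stream synthetic rows with noisy targets; hypothetical true parameters.
rng = np.random.default_rng(6)
true_alpha = np.array([0.25, 4.905])
rls = RecursiveLeastSquares(2)
for _ in range(1000):
    w = rng.standard_normal(2)
    rls.update(w, w @ true_alpha + 0.01 * rng.standard_normal())
print(rls.alpha)   # converges to true_alpha
```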
<p>Importantly, because we have reduced this to a least-squares problem,
we can also understand when it will <i>not</i> work. In particular, it
is quite possible that some parameters cannot be estimated from any
amount of joint data taken on the robot. As a simple example, consider a
robotic arm bolted to a table; the inertial parameters of the first link
of the robot will not be identifiable from any amount of joint data. Even
on the second link, only the inertia relative to the first joint axis
will be identifiable; the inertial parameters corresponding to the other
dimensions will not. In our least-squares formulation, this is quite
easy to understand: we simply check the (column) rank of the data matrix,
${\bf W}$. In particular, we can extract the <b>identifiable lumped
parameters</b> by using, e.g., $\bR_1\balpha_l$ from the QR factorization:
$${\bf W} = \begin{bmatrix} \bQ_1 & \bQ_2 \end{bmatrix} \begin{bmatrix}
\bR_1 \\ 0 \end{bmatrix}, \quad \Rightarrow \quad {\bf W}\balpha_l =
\bQ_1 (\bR_1 \balpha_l).$$</p>
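<p>A small numerical sketch of this rank check, using a synthetic data matrix whose third column is never independently excited:</p>

```python
import numpy as np

# Synthetic data matrix with a rank deficiency: the third lumped
# parameter only ever appears in a fixed combination with the first.
rng = np.random.default_rng(4)
W = rng.standard_normal((100, 3))
W[:, 2] = 2.0 * W[:, 0]

rank = np.linalg.matrix_rank(W)   # 2, not 3: one direction unidentifiable

Q, R = np.linalg.qr(W)
# Rows of R with non-negligible diagonal entries define the identifiable
# combinations R1 @ alpha_l of the lumped parameters.
R1 = R[np.abs(np.diag(R)) > 1e-9, :]
print(rank, R1.shape)   # 2 identifiable combinations of 3 lumped parameters
```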
<example><h1>Parameter estimation for the Cart-Pole</h1>
<p>Having extracted the lumped parameters from the URDF file above, we
can now take this to fruition. I've kept the example simple: I've
simulated the cart-pole for a few seconds running just simple sine wave
trajectories at the input, then constructed the data matrix and
performed the least squares fit.</p>
<script>document.write(notebook_link('sysid'))</script>
<p>The output looks like this: <pre><code>Estimated Lumped Parameters:
(mc + mp). URDF: 11.0, Estimated: 10.905425349337081
(mp * l). URDF: 0.5, Estimated: 0.5945797067753813
(mp * pow(l, 2)). URDF: 0.25, Estimated: 0.302915745122919</code></pre>
Note that we could have easily made the fit more accurate with more
data (or more carefully selected data).</p>
</example>
<p>Should we be happy with only estimating the (identifiable) lumped
parameters? Isn't it the true original parameters that we are after? The
linear algebra decomposition of the data matrix (assuming we apply it to
a sufficiently rich set of data), is actually revealing something
fundamental for us about our system dynamics. Rather than feel
disappointed that we cannot accurately estimate some of the parameters,
we should embrace that <i>we don't need to estimate those parameters</i>
for any of the dynamic reasoning we will do about our equations
(simulation, verification, control design, etc). The identifiable lumped
parameters are precisely the subset of the lumped parameters that we
need.</p>
<p>For practical reasons, it might be convenient to take your estimates
of the lumped parameters, and try to back out the original parameters
(for instance, if you want to write them back out into a URDF file). For
this, I would recommend a final post-processing step that e.g. finds the
parameters $\hat{\balpha}$ that are as close as possible (e.g. in the
least-squares sense) to your original guess for the parameters, subject
to the nonlinear constraint that $\bR_1 \balpha_l(\hat{\balpha})$ matches
the estimated identifiable lumped parameters.
</p>
<p>There are still a few subtleties worth considering, such as how we
parameterize the inertial matrices. Direct estimation of the naive
parameterization, the six entries of a symmetric 3x3 matrix, can lead to
non-physical inertial matrices. <elib>Wensing17a</elib>
describes a parameter estimation formulation that includes a convex
formulation of the physicality constraints between these parameters.</p>
</subsection>
<subsection><h1>Identification using energy instead of inverse
dynamics.</h1>
<p>In addition to leveraging tools from linear algebra, there are a
number of other refinements to the basic recipe that leverage our
understanding of mechanics. One important example is the "energy
formulations" of parameter estimation <elib>Gautier97</elib>.</p>
<p>We have already observed that evaluating the equation error in torque
space (inverse dynamics) is likely better than evaluating it in state
space (forward dynamics), because we can factorize the inverse
dynamics. But this property is not exclusive to inverse dynamics. The
total energy of the system (kinetic + potential) is also affine in the
lumped parameters. We can use the relation
$$\dot{E}(\bq,\dot\bq,\ddot\bq) = \dot\bq^T (\bB\bu + \text{ friction,
...}).$$</p>
<p>Why might we prefer to work in energy coordinates rather than torque?
The differences are apparent in the detailed numerics. In the torque
formulation, we find ourselves using $\ddot\bq$ directly. Conventional
wisdom holds that joint sensors can reliably report joint positions and
velocities, but that joint accelerations, often obtained by numerically
differentiating <i>twice</i>, tend to be much more noisy. In some cases,
it might be numerically better to apply finite differences to the total
energy of the system rather than to the individual joints †.
<sidenote>† Sometimes these methods are written as the numerical
integration of the power input, rather than the differentiation of the
total energy, but the numerics should be no different.</sidenote>
<elib>Gautier96</elib> provides a study of various filtering formulations
which leads to a recommendation in <elib>Gautier97</elib>
that the energy formulation tends to be numerically superior.</p>
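<p>For the simple pendulum, a sketch of the energy formulation (with hypothetical parameter values) looks like this: each time step contributes one row that is affine in the lumped parameters $[ml^2, mgl, b]$, and only positions and velocities appear — no accelerations:</p>

```python
import numpy as np

# Hypothetical pendulum; energy balance over one step:
#   Delta(KE) + Delta(PE) + b * (integral of thetadot^2 dt)
#     = integral of thetadot * tau dt,
# which is affine in the lumped parameters [m*l^2, m*g*l, b].
m, l, b, g = 1.0, 0.5, 0.1, 9.81
dt, N = 1e-3, 5000
theta, thetadot = 0.1, 0.0
th, thd, tau = [theta], [thetadot], []
for n in range(N):
    u = np.sin(0.01 * n)
    tau.append(u)
    thetaddot = (u - b * thetadot - m * g * l * np.sin(theta)) / (m * l**2)
    thetadot += dt * thetaddot
    theta += dt * thetadot
    th.append(theta)
    thd.append(thetadot)
th, thd, tau = np.array(th), np.array(thd), np.array(tau)

W = np.column_stack([
    0.5 * (thd[1:] ** 2 - thd[:-1] ** 2),        # KE change, times m*l^2
    np.cos(th[:-1]) - np.cos(th[1:]),            # PE change, times m*g*l
    0.5 * dt * (thd[:-1] ** 2 + thd[1:] ** 2),   # dissipation, times b
])
rhs = 0.5 * dt * (thd[:-1] + thd[1:]) * tau      # input power (trapezoidal)
alpha_hat = np.linalg.lstsq(W, rhs, rcond=None)[0]
print(alpha_hat)
```

<p>The trapezoidal approximations of the power integrals introduce a small discretization bias, but the estimates land close to the true lumped parameters without ever differentiating twice.</p>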
</subsection>
<subsection><h1>Residual physics models with linear function
approximators</h1>
<p>The term "residual physics" has become quite popular recently (e.g.
<elib>Zeng20</elib>) as people are looking for ways to combine the
modeling power of deep neural networks with our best tools from
mechanics. But actually this idea is quite old, and there is a natural
class of residual models that fit very nicely into our least-squares
lumped-parameter estimation. Specifically, we can consider models of the
form: \begin{equation}\bM({\bq})\ddot{\bq} + \bC(\bq,\dot{\bq})\dot\bq =
\btau_g(\bq) + \balpha_r {\bPhi}(\bq,\dot\bq) + \bB\bu + \text{friction,
etc.},\end{equation} with $\balpha_r$ the additional parameters of the
residual and $\bPhi$ a set of (fixed) nonlinear basis functions. The
hope is that these residual models can capture any "slop terms" in the
dynamics that are predictable, but which we did not include in our
original parametric model. Nonlinear friction and aerodynamic drag are
commonly cited examples.</p>
<p>Common choices for these basis functions, $\bPhi(\bq,\dot\bq)$,
for use with the manipulator equations include radial basis
functions<elib>Sanner91</elib> or wavelets
<elib>Sanner98</elib>. Although allowing the basis functions to depend
on $\ddot\bq$ or $\bu$ would be fine for the parameter estimation, we
tend to restrict them to $\bq$ and $\dot\bq$ to maintain some of the
other nice properties of the manipulator equations (e.g.
control-affine).</p>
<p>Due to the maturity of least-squares estimation, it is also possible
to use least-squares to determine a subset of basis functions
that effectively describe the dynamics of your system. For example, in
<elib>Hoburg09a</elib>, we applied least-squares to a wide range of
physically-inspired basis functions in order to make a better ODE
approximation of the post-stall aerodynamics during perching, and
ultimately discarded all but the small number of basis functions that
best described the data. Nowadays, we could apply algorithms like <a href="https://en.wikipedia.org/wiki/Lasso_(statistics)">LASSO</a>,
which performs least-squares regression with $\ell_1$-regularization;
<elib>Brunton16a</elib> uses an alternative based on
sequentially thresholded least-squares.</p>
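As a concrete sketch of the sequentially-thresholded approach, the toy problem below fits a residual torque against five hypothetical basis functions of $(q, \dot{q})$ and prunes the ones the data does not support. The basis set, the data-generating model, and the threshold are illustrative assumptions, not a recipe.

```python
import numpy as np

def threshold_lstsq(Phi, y, threshold=0.1, iters=10):
    """Sequentially thresholded least squares: repeatedly solve the
    least-squares problem, zero out coefficients smaller than `threshold`,
    and re-fit on the surviving basis functions."""
    coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    for _ in range(iters):
        small = np.abs(coeffs) < threshold
        coeffs[small] = 0.0
        big = ~small
        if not big.any():
            break
        coeffs[big], *_ = np.linalg.lstsq(Phi[:, big], y, rcond=None)
    return coeffs

# Toy data: the "true" residual uses only two of the five candidate bases.
rng = np.random.default_rng(0)
q = rng.uniform(-1, 1, size=200)
qd = rng.uniform(-1, 1, size=200)
Phi = np.column_stack([q, qd, q**2, q * qd, qd**2])  # candidate basis functions
tau = 2.0 * q**2 - 0.5 * qd + 0.001 * rng.standard_normal(200)
alpha = threshold_lstsq(Phi, tau, threshold=0.1)
```

The fit recovers the two active coefficients and drives the spurious ones exactly to zero.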
</subsection>
<subsection id="mbp_experiment_design"><h1>Experiment design as a trajectory optimization</h1>
<p>One key assumption for any claims about our parameter estimation
algorithms recovering the true identifiable lumped parameters is that the
data set was sufficiently rich; that the trajectories were
"parametrically exciting". Basically we need to assume that the
trajectories produced motion so that the data matrix, ${\bf W}$ contains
information about all of the identifiable lumped parameters. Thanks to
our linear parameterization, we can evaluate this via numerical linear
algebra on ${\bf W}$.
</p>
<p>Moreover, if we have an opportunity to change the trajectories that
the robot executes when collecting the data for parameter estimation,
then we can design trajectories which try to maximally excite the
parameters, and produce a numerically-well conditioned least squares
problem. One natural choice is to minimize the <a
href="https://en.wikipedia.org/wiki/Condition_number">condition
number</a> of the data matrix, ${\bf W}$. The condition number of a
matrix is the ratio of the largest to smallest singular values
$\frac{\sigma_{max}({\bf W})}{\sigma_{min}({\bf W})}$. The condition
number is always at least one, by definition, and lower values
correspond to better conditioning. The condition number of the data matrix is a
nonlinear function of the data taken over the entire trajectory, but it
can still be optimized in a nonlinear trajectory optimization
(<elib>Khalil04</elib>, § 12.3.4).</p>
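To see numerically why richer excitation helps, here is a toy sketch; the columns below merely stand in for the lumped-parameter regressors in ${\bf W}$ (they are not the real manipulator-equation regressors), and the excitations are illustrative assumptions.

```python
import numpy as np

# A tiny single-frequency excitation makes the acceleration column nearly
# collinear with the sin(q) column (small-angle regime), while a larger
# multi-frequency excitation breaks that degeneracy.
t = np.linspace(0, 10, 500)

def data_matrix(q):
    qd = np.gradient(q, t)    # numerical velocities
    qdd = np.gradient(qd, t)  # numerical accelerations
    return np.column_stack([qdd, qd, np.sin(q)])

W_poor = data_matrix(0.01 * np.sin(t))                   # small, single frequency
W_rich = data_matrix(np.sin(t) + 0.5 * np.sin(3.1 * t))  # larger, two frequencies

print(np.linalg.cond(W_poor))  # poorly conditioned
print(np.linalg.cond(W_rich))  # much better conditioned
```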
</subsection>
<subsection><h1>Online estimation and adaptive control</h1>
<p>The field of adaptive control is a huge and rich literature; many
books have been written on the topic (e.g. <elib>Åström13</elib>). Allow
me to make a few quick references to that literature here.</p>
<p>I have mostly discussed parameter estimation so far as an offline
procedure, but one important approach to adaptive control is to perform
online estimation of the parameters as the controller executes. Since
our estimation objective can be linear in the (lumped) parameters, this
often amounts to the recursive least-squares estimation. To properly
analyze a system like this, we can think of the system as having an
augmented state space, $\bar\bx = \begin{bmatrix} \bx \\ \balpha
\end{bmatrix},$ and study the closed-loop dynamics of the state and
parameter evolution jointly. Some of the strongest results in adaptive
control for robotic manipulators are confined to fully-actuated
manipulators, but for instance <elib>Moore14</elib> gives a nice example
of analyzing an adaptive controller for underactuated systems using many
of the tools that we have been developing in these notes.</p>
<p>As I said, adaptive control is a rich subject. One of the biggest
lessons from that field, however, is that one may not need to achieve
convergence to the true (lumped) parameters in order to achieve a task.
Many of the classic results in adaptive control make guarantees about
task execution but explicitly do <i>not</i> require/guarantee convergence
in the parameters.</p>
</subsection>
<subsection><h1>Identification with contact</h1>
<p>Can we apply these same techniques to e.g. walking robots that are
making and breaking contact with the environment?</p>
<p>There is certainly a version of this problem that works immediately:
if we know the contact Jacobians and have measurements of the contact
forces, then we can add these terms directly into the manipulator
equations and continue with the least-squares estimation of the lumped
parameters, even including frictional parameters.</p>
<p>One can also study cases where the contact forces are not measured
directly. For instance, <elib>Fazeli17a</elib> studies the extreme case
of identifiability of the inertial parameters of a passive object with
and without explicit contact force measurements.</p>
<todo>Flesh this out a bit more... (maybe move it to the hybrid sysid subsection?)</todo>
</subsection>
</section>
<section><h1>Identifying (time-domain) linear dynamical systems</h1>
<p>If multibody parameter estimation forms the first relevant pillar of
established work in system identification, then identification of linear
systems forms the second. Linear models have been a primary focus of
system identification research since the field was established, and have
witnessed a resurgence in popularity during just the last few years as new
results from machine learning have contributed new bounds and convergence
results, especially in the finite-sample regime (e.g.
<elib>Hardt16+Hazan18+Oymak19+Simchowitz19</elib>).</p>
<p>A significant portion of the linear system identification literature
(e.g. <elib>Ljung99</elib>) is focused on identifying linear models in the
frequency domain. Indeed, transfer-function realizations provide important
insights and avoid some of the foibles that we'll need to address with
state-space realizations. However, I will focus my attention in this
chapter on time-domain descriptions of linear dynamical systems; some of
the lessons here are easier to generalize to nonlinear dynamics (plus we
unfortunately haven't built the foundations for the frequency domain
techniques yet in these notes). </p>
<subsection><h1>From state observations</h1>
<p>Let's start our treatment with the easy case: fitting a linear model
from direct (potentially noisy) measurements of the state. Fitting a
discrete-time model, $\bx[n+1] = \bA\bx[n] + \bB\bu[n] + \bw[n]$, to
sampled data $(\bu_n,\bx_n = \bx[n]+\bv[n])$ using the
<i>equation-error</i> objective is just another <i>linear</i>
least-squares problem. Typically, we form some data matrices and write
our least-squares problem as: \begin{gather*} {\bf X}' \approx
\begin{bmatrix} \bA & \bB \end{bmatrix} \begin{bmatrix} {\bf X} \\ {\bf
U} \end{bmatrix}, \qquad \text{where} \\ {\bf X} = \begin{bmatrix} \mid &
\mid & & \mid \\ \bx_0 & \bx_1 & \cdots & \bx_{N-2} \\ \mid & \mid & &
\mid \end{bmatrix}, \quad {\bf X}' = \begin{bmatrix} \mid & \mid & & \mid
\\ \bx_1 & \bx_2 & \cdots & \bx_{N-1} \\ \mid & \mid & & \mid
\end{bmatrix}, \quad {\bf U} = \begin{bmatrix} \mid & \mid & & \mid \\
\bu_0 & \bu_1 & \cdots & \bu_{N-2} \\ \mid & \mid & & \mid
\end{bmatrix}.\end{gather*} By the virtues of linear least squares, this
estimator is unbiased with respect to uncorrelated process noise,
$\bw[n]$. Be aware, though, that the measurement noise, $\bv[n]$, enters
the regressors as well as the targets, so it can bias the estimates (the
classic "errors-in-variables" problem).</p>
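A minimal sketch of this estimator, on a hypothetical two-state discrete-time system with known ground truth (the $\bA$, $\bB$ below are toy assumptions, and the data here is noise-free):

```python
import numpy as np

# Simulate a known system, stack the data matrices as above, and recover
# [A B] in one least-squares call.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])

N = 500
X = np.zeros((2, N))
U = rng.standard_normal((1, N))
for n in range(N - 1):
    X[:, n + 1] = A_true @ X[:, n] + B_true @ U[:, n]

# Columns 0..N-2 of X (stacked with U) are the regressors; columns 1..N-1
# of X are the targets.
W = np.vstack([X[:, :-1], U[:, :-1]])
AB, *_ = np.linalg.lstsq(W.T, X[:, 1:].T, rcond=None)
A_hat, B_hat = AB.T[:, :2], AB.T[:, 2:]
```

With a persistently exciting random input and no noise, the recovery is exact up to numerical precision.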
<example><h1>Cart-pole balancing with an identified model</h1>
<p>I've provided a notebook demonstrating what a practical application
of linear identification might look like for the cart-pole system, in a
series of steps. First, I've designed an LQR balancing controller, but
using the <i>wrong</i> parameters for the cart-pole (I changed the
masses, etc). This LQR controller is enough to keep the cart-pole from
falling down, but it doesn't converge nicely to the upright. I wanted
to ask the question, can I use data generated from this experiment to
identify a better linear model around the fixed-point, and improve my
LQR controller?</p>
<p>Interestingly, the simple answer is "no". If you only collect the
input and state data from running this controller, you will see that
the least-squares problem that we formulate above is rank-deficient.
The estimated $\bA$ and $\bB$ matrices, denoted $\hat{\bA}$ and
$\hat{\bB}$ describe the data, but do not reveal the true
linearization. And if you design a controller based on these
estimates, you will be disappointed!</p>
<todo>insert some plots?</todo>
<p>Fortunately, we could have seen this problem by checking the rank of
the least-squares data matrix. Collecting more data of the same kind
won't fix the problem. Instead, to generate a richer dataset, I've added a small
additional signal to the input: $u(t) = \pi_{lqr}(\bx(t)) + 0.1\sin(t).$
That makes all of the difference.</p>
<script>document.write(notebook_link('sysid'))</script>
<p>I hope you try the code. The basic algorithm for estimation is
disarmingly simple, but there are a lot of details to get right to
actually make it work.</p>
</example>
<subsubsection><h1>Model-based Iterative Learning Control (ILC)</h1>
<todo>Local time-varying linear model along a trajectory + iLQR. Bregmann ADMM</todo>
<example><h1>The Hydrodynamic Cart-Pole</h1>
<p>One of my favorite examples of model-based ILC was a series of
experiments in which we explored the dynamics of a "hydrodynamic
cart-pole" system. Think of it as a cross between the classic
cart-pole system and a fighter jet (perhaps a little closer to the
cart-pole)!</p>
<p>Here we've replaced the pole with an airfoil (hydrofoil), turned
the entire system on its side, and dropped it into a water tunnel.
Rather than swing-up and balance the pole against gravity, the task
is to balance the foil in its unstable configuration against the
hydrodynamic forces. These forces are the result of unsteady
fluid-body interactions; unlike the classic cart-pole system, this
time we do not have a tractable parameterized ODE model for the
system. It's a perfect problem for system identification and ILC.</p>
<figure><img width="25%" src="data/hydro_cartpole_downright.png">
<img width="25%" src="data/hydro_cartpole_dynamic.png"> <img
width="25%" src="data/hydro_cartpole_upright.png">
<figcaption>A cartoon of the hydrodynamic cart-pole system. The cart
is actuated horizontally, the foil pivots around a passive joint,
and the fluid is flowing in the direction of the arrows. (The
entire system is horizontal, so there is no effect from gravity.)
The aerodynamic center of the airfoil is somewhere in the middle of
the foil; because the pin joint is at the tail, the passive system
will "weather vane" to the stable "downward" equilibrium (left).
Balancing corresponds to stabilizing the unstable "upward"
equilibrium (right). The fluid-body dynamics during the transition
(center) are unsteady and very nonlinear.</figcaption>
</figure>
<p>In a series of experiments, first we attempted to stabilize the
system using an LQR controller derived with an approximate model
(using <a href="trajopt.html#perching">flat-plate theory</a>). This
controller didn't perform well, but was just good enough to collect
relevant data in the vicinity of the unstable fixed point. Then we
fit a linear model, recomputed the LQR controller using the model,
and got notably better performance.</p>
<p>To investigate more aggressive maneuvers we considered making a
rapid step change in the desired position of the cart (like a fighter
jet rapidly changing altitude). Using only the time-invariant LQR
balancing controller with a shifted set point, we naturally observed
a very slow step-response. Using trajectory optimization on the
time-invariant linear model used in balancing, we could do much
better. But we achieved considerably better performance by
iteratively fitting a time-varying linear model in the neighborhood
of this trajectory and performing model-based ILC.</p>
<figure><img width="60%"
src="data/hydro_cartpole_step_comparison.png"><figcaption>Comparison
of the step response using three different controllers: the balancing
LQR controller (blue), LQR with a feed-forward term obtained from
trajectory optimization with the LTI model (red), and a controller
obtained via ILC with a time-varying linear model and iLQR
(green).</figcaption></figure>
<p>These experiments were quite beautiful and very visual. They went
on from here to consider the effect of stabilizing against incoming
vortex disturbances using real-time perception of the oncoming fluid.
If you're at all interested, I would encourage you to check out John
Roberts' thesis<elib>Roberts12</elib> and/or even the <a
href="https://www.youtube.com/watch?v=8ISAIRIRiSs">video of his
thesis defense</a>.
</p>
<todo>Consider adding a video or two here? John just shared his
thesis slides with me in dropbox, and I found some of the original
videos in movies/RobotLocomotion/WaterTunnel. But the videos have
big black borders, so could really use some cleanup.</todo>
</example>
<todo>How reasonable (locally) are LQG models through contact?</todo>
</subsubsection>
<subsubsection><h1>Compression using the dominant eigenmodes</h1>
<todo>connect to acrobot ch modal analysis?</todo>
<p>For high-dimensional state or input vectors, one can use the singular
value decomposition (SVD) to solve this least-squares problem using
only the dominant eigenvalues (and corresponding eigenvectors) of $\bA$;
this is the so-called "Dynamic Mode Decomposition" (DMD) <elib
part="Section 7.2">Brunton19</elib>. There have been many empirical success
stories of using a small number of dominant modes to produce an
excellent fit to the data (this is especially relevant in problems like
fluid dynamics where the state vector $\bx$ might be an entire image
corresponding to a fluid flow). In DMD, we would write the linear
dynamics in the coordinates of these eigenmodes (which can always be
projected back to the full coordinates).</p>
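Here is a minimal sketch of "exact DMD" on snapshot data that secretly evolves on a two-dimensional linear subspace; the dimensions, the hidden rotation-with-decay dynamics, and the random lifting are all toy assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, r = 50, 200, 2  # ambient dimension, snapshots, number of retained modes

# Hidden 2-dimensional rotation-with-decay dynamics, lifted to n dimensions.
theta, rho = 0.3, 0.95
A2 = rho * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
z = np.zeros((2, N))
z[:, 0] = [1.0, 1.0]
for k in range(N - 1):
    z[:, k + 1] = A2 @ z[:, k]
P = rng.standard_normal((n, 2))
snapshots = P @ z  # n-dimensional observations

X, Xp = snapshots[:, :-1], snapshots[:, 1:]
U, s, Vh = np.linalg.svd(X, full_matrices=False)
Ur, Sr, Vr = U[:, :r], s[:r], Vh[:r, :]
# Reduced r x r dynamics in the coordinates of the dominant modes:
A_tilde = Ur.T @ Xp @ Vr.T @ np.diag(1.0 / Sr)
dmd_eigs = np.linalg.eigvals(A_tilde)  # recovers the eigenvalues of A2
```

The two retained modes reproduce the hidden eigenvalues $\rho e^{\pm i\theta}$ even though the raw state is 50-dimensional.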
</subsubsection>
<subsubsection><h1>Linear dynamics in a nonlinear basis</h1>
<p>A potentially useful generalization that we can consider here is
identifying dynamics that evolve linearly in a coordinate system
defined by some (potentially high-dimensional) basis vectors
$\bphi(\bx).$ In particular, we might consider dynamics of the form
$\dot\bphi = \pd{\bphi}{\bx} \dot\bx = \bA \bphi(\bx).$ Much ado has
been made of this particular form, due to the connection to Koopman
operator theory that we will discuss briefly below. For our purposes,
we also need a control input, so we might consider a form like $\dot\bphi
= \bA \bphi(\bx) + \bB \bu.$</p>
<todo>example with the pendulum here?</todo>
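A discrete-time sketch of this idea (the continuous-time version above is analogous): the toy system below is the classic example that is <i>exactly</i> linear in the lifted coordinates $\bphi(\bx) = [x_1, x_2, x_1^2]$, an assumption chosen purely for illustration.

```python
import numpy as np

lam, mu, c = 0.9, 0.5, 1.0

def step(x):
    # x1' = lam*x1;  x2' = mu*x2 + c*x1^2  (nonlinear in x, linear in phi)
    return np.array([lam * x[0], mu * x[1] + c * x[0] ** 2])

def phi(x):
    return np.array([x[0], x[1], x[0] ** 2])

# Sample states, lift them, and fit phi[n+1] = A_lift phi[n] by least squares.
rng = np.random.default_rng(2)
xs = rng.uniform(-1, 1, size=(100, 2))
Phi = np.array([phi(x) for x in xs])
Phi_next = np.array([phi(step(x)) for x in xs])
A_lift, *_ = np.linalg.lstsq(Phi, Phi_next, rcond=None)
A_lift = A_lift.T  # so that phi_next = A_lift @ phi
```

Because the dynamics are exactly linear in this lift, least squares recovers the lifted matrix (with entries $\lambda$, $\mu$, $c$, and $\lambda^2$) essentially exactly.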
<p>Once again, thanks to the maturity of least-squares, with this
approach it is possible to include a long list of candidate basis functions,
then use sparse least-squares and/or the modal decomposition to
effectively find the important subset.</p>
<p>Note that multibody parameter estimation described above is not
this, although it is closely related. The least-squares
lumped-parameter estimation for the manipulator equations uncovered
dynamics that were still <i>nonlinear</i> in the state variables.</p>
</subsubsection>
</subsection>
<subsection><h1>From input-output data (the state-realization problem)</h1>
<p>In the more general setting, we would like to estimate a model of the
form \begin{gather*} \bx[n+1] = \bA \bx[n] + \bB \bu[n] + \bw[n]\\ \by[n]
= \bC \bx[n] + {\bf D} \bu[n] + \bv[n]. \end{gather*} Once again, we
will apply least-squares estimation, but combine this with the famous
"Ho-Kalman" algorithm (also known as the "Eigen System Realization" (ERA)
algorithm) <elib>Ho66</elib>. My favorite presentation of this algorithm
is <elib>Oymak19</elib>.</p>
<todo>cite VanOverschee96?</todo>
<p>First, observe that \begin{align*} \by[0] =& \bC\bx[0] + {\bf D}\bu[0]
+ \bv[0], \\ \by[1] =& \bC(\bA\bx[0] + \bB\bu[0] + \bw[0]) + {\bf
D}\bu[1] + \bv[1], \\ \by[n] =& \bC\bA^n\bx[0] + {\bf D}\bu[n] + \bv[n] +
\sum_{k=0}^{n-1}\bC\bA^{n-k-1}(\bB\bu[k] + \bw[k]). \end{align*} For the
purposes of identification, let's write $\by[n]$ as a function of the most
recent $N+1$ inputs (for $n \ge N$): \begin{align*}\by[n] =&
\begin{bmatrix} \bC\bA^{N-1}\bB & \bC\bA^{N-2}\bB & \cdots & \bC\bB &
{\bf D}\end{bmatrix} \begin{bmatrix} \bu[n-N] \\ \bu[n-N+1] \\ \vdots \\
\bu[n] \end{bmatrix} + {\bf \delta}[n] \\ =& {\bf G}[n]\bar\bu[n] + {\bf
\delta}[n] \end{align*} where ${\bf \delta}[n]$ captures the remaining
terms from initial conditions, noise and control inputs before the
(sliding) window. ${\bf G}[n] = \begin{bmatrix} \bC\bA^{N-1}\bB &
\bC\bA^{N-2}\bB & \cdots & \bC\bB & {\bf D}\end{bmatrix},$ and
$\bar\bu[n]$ represents the concatenated $\bu[n]$'s from time $n-N$ up to
$n$. Importantly, for zero-mean i.i.d. inputs, ${\bf \delta}[n]$ is
uncorrelated with $\bar\bu[n]$: $\forall k \ge n - N$ we have $E\left[\bu_k {\bf \delta}_n
\right] = 0.$ This is sometimes known as Neyman orthogonality, and it
implies that we can estimate ${\bf G}$ using simple least-squares
$\hat{\bf G} = \argmin_{\bf G} \sum_{n \ge N} \| \by_n - {\bf G}\bar\bu_n
\|^2.$
<elib>Oymak19</elib> gives bounds on the norm of the estimation error as
a function of the number of samples and the variance of the noise.</p>
<p>How should we pick the window size $N$? By Neyman orthogonality, we
know that our estimates will be unbiased for any choice of $N \ge 0$. But
if we choose $N$ too small, then the term $\delta[n]$ will be large,
leading to a potentially large variance in our estimate. For stable
systems, the $\delta$ terms will get smaller as we increase $N$. In
practice, we choose $N$ based on the characteristic settling time in the
data (roughly until the impulse response becomes sufficiently small).</p>
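A sketch of this sliding-window regression, on a toy (assumed) single-input single-output system whose modes decay quickly, with a window length chosen comfortably past its settling time:

```python
import numpy as np

# "True" system, used only to generate data; in practice y comes from the plant.
rng = np.random.default_rng(4)
A = np.array([[0.5, 0.3], [0.0, 0.4]])
B = np.array([1.0, 0.5])
C = np.array([1.0, 1.0])
D = 0.2

T = 2000
u = rng.standard_normal(T)  # i.i.d. excitation
y = np.zeros(T)
x = np.zeros(2)
for n in range(T):
    y[n] = C @ x + D * u[n]
    x = A @ x + B * u[n]

Nw = 15  # window length N, well past the settling time of this fast system
# Row for time n holds [u[n-Nw], ..., u[n]]; the target is y[n].
rows = np.array([u[n - Nw:n + 1] for n in range(Nw, T)])
G_hat, *_ = np.linalg.lstsq(rows, y[Nw:], rcond=None)
# G_hat approximates [C A^{Nw-1} B, ..., C A B, C B, D].
```

The trailing entries of $\hat{\bf G}$ (the low-order Markov parameters and $D$) come out accurately; the leading entries are tiny, as they should be for a fast-settling system.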
<p>If you've studied linear systems, ${\bf G}$ will look familiar; it is
precisely the (multi-input, multi-output) matrix impulse response, also
known as the "Markov parameters". In fact, estimating $\hat{\bf
G}$ may even be sufficient for control design, as it is closely-related to
the parameterization used in disturbance-based feedback for
partially-observable systems
<elib>Sadraddini20+Simchowitz20</elib>. But the Ho-Kalman algorithm can
be used to extract good estimates $\hat\bA, \hat\bB, \hat\bC, \hat{\bf
D}$ with state-dimension $\dim(\bx) = n_x$ from $\hat{\bf G},$ assuming
that the true system is observable and controllable with order at least
$n_x$ and the data matrices we form below are sufficiently
rich<elib>Oymak19</elib>.</p>
<p>It is important to realize that many system matrices, $\bA, \bB, \bC,$
can describe the same input-output data. In particular, for any
invertible matrix (aka <a
href="https://en.wikipedia.org/wiki/Matrix_similarity">similarity
transform</a>), ${\bf T}$, the system matrices $\bA, \bB, \bC$ and $\bA',
\bB', \bC'$ with, $$\bA' = {\bf T}^{-1} \bA {\bf T},\quad \bB' = {\bf
T}^{-1} \bB,\quad \bC' = \bC{\bf T},$$ describe the same input-output
behavior. The Ho-Kalman algorithm returns a <i>balanced realization</i>.
A balanced realization is one in which we have effectively applied a
similarity transform, ${\bf T},$ which makes the controllability and
observability Gramians equal and diagonal<elib part="Ch.
9">Brunton19</elib> and which orders the states in terms of diminishing
effect on the input/output behavior. This ordering is relevant for
determining the system order and for model reduction.</p>
<p>Note that $\hat{\bf D}$ is the last block in $\hat{\bf G}$, so it is
extracted trivially. The Ho-Kalman algorithm tells us how to extract
$\hat\bA, \hat\bB, \hat\bC$, with another application of the SVD on
suitably constructed data matrices (see e.g. <elib
part="§5.1">Oymak19</elib>, <elib part="§10.5">Juang01</elib>,
or <elib part="§9.3">Brunton19</elib>).</p>
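Here is a sketch of that extraction step, starting (for clarity) from <i>exact</i> Markov parameters of an assumed true system; in practice the $G_k$ would be the least-squares estimates $\hat{\bf G}$, and the recovered realization matches the truth only up to a similarity transform.

```python
import numpy as np

# An assumed "true" system with nx = 2 states (SISO for simplicity).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
nx = 2

# Markov parameters G_k = C A^k B, k = 0, 1, ...
G = [C @ np.linalg.matrix_power(A, k) @ B for k in range(10)]

# Block Hankel matrices: H[i, j] = G_{i+j}, and a one-step-shifted copy.
p = q = 5
H = np.block([[G[i + j] for j in range(q)] for i in range(p)])
H_shift = np.block([[G[i + j + 1] for j in range(q)] for i in range(p)])

# Rank-nx factorization H = O @ Ctrb via the SVD (a balanced factorization).
U, s, Vh = np.linalg.svd(H)
O = U[:, :nx] * np.sqrt(s[:nx])                  # observability factor
Ctrb = np.sqrt(s[:nx])[:, None] * Vh[:nx, :]     # controllability factor
A_hat = np.linalg.pinv(O) @ H_shift @ np.linalg.pinv(Ctrb)
B_hat = Ctrb[:, :1]   # first block column (nu = 1)
C_hat = O[:1, :]      # first block row (ny = 1)
```

The recovered realization has the true eigenvalues and reproduces the Markov parameters, even though $\hat\bA$, $\hat\bB$, $\hat\bC$ differ from $\bA$, $\bB$, $\bC$ by a similarity transform.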
<todo>Does using the Kalman filter states allow me to use colored input excitation?</todo>
<example><h1>Ho-Kalman identification of the cart-pole from keypoints</h1>
<p>Let's repeat the cart-pole example. But this time, instead of
getting observations of the joint positions and velocities directly, we
will consider the problem of identifying the dynamics from a camera
rendering the scene. Let us proceed thoughtfully.</p>
<p>The output of a standard camera is an RGB image, consisting of 3 x
width x height real values. We could certainly flatten these into a
vector and use this as the outputs/observations, $y(t)$. But I don't
recommend it. This "pixel-space" is not a nice space. For example,
you could easily find a pixel that has the color of the cart in one
frame, but after an incremental change in the position of the cart, now
(discontinuously) takes on the color of the background. Deep learning
has given us fantastic new tools for transforming raw RGB images into
better "feature spaces", that will be robust enough to deploy on real
systems but can make our modeling efforts much more tractable. My
group has made heavy use of "keypoint networks" <elib>Manuelli19</elib>
and self-supervised "dense correspondences"
<elib>Florence18a+Florence20+Manuelli20a</elib> to convert RGB outputs
into a more consumable form.</p>
<figure>
<iframe scrolling="no" style="border:none;" seamless="seamless" src="data/cartpole_balancing_keypoints.html" height="300" width="100%"></iframe>
</figure>
<p>The existence of these tools makes it reasonable to assume that we
have observations representing the 2D positions of some number of "key
points" that are rigidly fixed to the cart and to the pole. The
location of any keypoint $K$ on a pole, for instance, is given by the <a
href="acrobot.html#cart_pole">cart-pole kinematics</a>: $${}^{W}{\bf
p}^{K} = \begin{bmatrix} x \\ 0 \end{bmatrix} + \begin{bmatrix}
\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
{}^{P}{\bf p}^{K},$$ where $x$ is the position of the cart, and
$\theta$ is the angle of the pole. I've used <a
href="http://manipulation.csail.mit.edu/pick.html#monogram">Monogram
notation</a> to denote that ${}^{P}{\bf p}^{K}$ is the Cartesian
position of the point $K$ in the pole's frame, $P$, and ${}^{W}{\bf
p}^{K}$ is that same position in the camera/world frame, $W$. While it
is probably unreasonable to linearize the RGB outputs as a function of
the cart-pole positions, $x$ and $\theta$, it <i>is</i> reasonable to
approximate the keypoint positions by linearizing this equation for
small angles.</p>
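A quick numerical check of that claim, using the kinematics above with a hypothetical keypoint half-way up the pole (the keypoint location and test angles are illustrative assumptions):

```python
import numpy as np

def keypoint_world(x, theta, p_PK):
    # Exact keypoint kinematics: [x; 0] + R(theta) @ {}^P p^K.
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return np.array([x, 0.0]) + R @ p_PK

def keypoint_world_linearized(x, theta, p_PK):
    # Small-angle approximation: cos(theta) ~= 1, sin(theta) ~= theta,
    # which is linear in (x, theta).
    px, py = p_PK
    return np.array([x + px - theta * py, theta * px + py])

p_PK = np.array([0.0, 0.5])  # hypothetical keypoint on the pole
exact = keypoint_world(0.1, 0.05, p_PK)
approx = keypoint_world_linearized(0.1, 0.05, p_PK)
```

For small $\theta$ the two agree to third order in the angle, which is what makes a linear output model in the keypoint coordinates reasonable near the upright.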
<p>You'll see in the example that we recover the impulse response quite
nicely.</p>
<figure><img width="70%"
src="data/keypoint_impulse_response.svg"></figure>
<p>The big question is: does Ho-Kalman discover the 4-dimensional
state vector that we know and like for the cart-pole? Does it find
something different? We would typically choose the order by looking at
the magnitude of the singular values of the Hankel data matrix. In
this example, in the case of no noise, you see clearly that 2 states
describe most of the behavior, and we have diminishing returns after 4 or 5:</p>
<figure><img width="70%"
src="data/keypoint_hankel_singular_values.svg"></figure>
<p>As an exercise, try adding process and/or measurement noise to the
system that generated the data, and see how they affect this
result.</p>
<script>document.write(notebook_link('sysid'))</script>
</example>
<todo>identify observer/Kalman filter markov parameters (aka OKID) from
Juang93, also in Juang95 section 10.7, Brunton19, VanOverschee96, etc.
Inspiration is similar to Simchowitz and Boczar; we'd expect the Kalman
rates to be faster than the generic fits if the Gaussian iid assumption
is valid. Looks like there are more papers/tutorials at https://www.dartmouth.edu/~mqphan/sidhlights.html</todo>
</subsection>
<subsection><h1>Adding stability constraints</h1>
<p>Details coming soon. See, for instance <elib>Umenberger18</elib>.</p>
<todo>possibly: https://arxiv.org/abs/1204.0590</todo>
</subsection>
<subsection><h1>Autoregressive models</h1>
<p>Another important class of linear models predict the output directly
from a history of recent inputs and outputs: \begin{align*} \by[n+1]
=&\bA_0 \by[n] + \bA_1 \by[n-1] + ... + \bA_k \by[n-k] \\ & + \bB_0\bu[n]
+ \bB_1 \bu[n-1] + ... + \bB_k \bu[n-k]\end{align*} These are the
so-called "AutoRegressive models with eXogenous input (ARX)" models. The
coefficients of ARX models can be fit directly from input-output data
using linear least squares.</p>
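A minimal sketch of that fit, for a scalar ARX model with a one-step history ($k = 1$); the "true" coefficients below are toy assumptions used to generate noise-free data.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 300
u = rng.standard_normal(N)
y = np.zeros(N)

# "True" ARX system used to generate the data:
a0, a1, b0, b1 = 0.6, 0.2, 0.5, -0.1
for n in range(1, N - 1):
    y[n + 1] = a0 * y[n] + a1 * y[n - 1] + b0 * u[n] + b1 * u[n - 1]

# Regressor rows [y[n], y[n-1], u[n], u[n-1]]; target y[n+1].
rows = np.column_stack([y[1:-1], y[:-2], u[1:-1], u[:-2]])
target = y[2:]
coeffs, *_ = np.linalg.lstsq(rows, target, rcond=None)
```

With noise-free data and a persistently exciting input, least squares recovers the four coefficients exactly.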
<p>It is certainly possible to think of these models as state-space
models, where we collect the finite history of $\by$ and $\bu$ (except
$\bu[n]$) into our state vector. But this is not necessarily a very
efficient representation; the Ho-Kalman algorithm might be able to find a
much smaller state representation. This is particularly important for
partially-observable systems that might require a very long history,
$k$.</p>