# MGL Manual
###### \[in package MGL\]
## MGL ASDF System
- Version: 0.1.0
- Description: MGL is a machine learning library for backpropagation
neural networks, Boltzmann machines, Gaussian processes and more.
- Licence: MIT, see COPYING.
- Author: Gábor Melis <[email protected]>
- Mailto: [[email protected]](mailto:[email protected])
- Homepage: [http://melisgl.github.io/mgl](http://melisgl.github.io/mgl)
- Bug tracker: [https://github.com/melisgl/mgl/issues](https://github.com/melisgl/mgl/issues)
- Source control: [GIT](https://github.com/melisgl/mgl.git)
## Introduction
### Overview
MGL is a Common Lisp machine learning library by [Gábor
Melis](http://quotenil.com) with some parts originally contributed
by Ravenpack International. It mainly concentrates on various forms
of neural networks (Boltzmann machines, feed-forward and recurrent
backprop nets). Most of MGL is built on top of MGL-MAT, so it has
BLAS and CUDA support.
In general, the focus is on power and performance, not on ease of
use. Perhaps one day there will be a cookie-cutter interface with
restricted functionality if a reasonable compromise is found between
power and utility.
### Links
Here is the [official repository](https://github.com/melisgl/mgl)
and the [HTML
documentation](http://melisgl.github.io/mgl-pax-world/mgl-manual.html)
for the latest version.
### Dependencies
MGL used to rely on [LLA](https://github.com/tpapp/lla) to
interface to BLAS and LAPACK. That's mostly history by now, but
configuration of foreign libraries is still done via LLA. See the
README in LLA on how to set things up. Note that these days OpenBLAS
is easier to set up and just as fast as ATLAS.
[CL-CUDA](https://github.com/takagi/cl-cuda) and
[MGL-MAT](https://github.com/melisgl/mgl-mat) are the two main
dependencies and also the ones not yet in quicklisp, so just drop
them into `quicklisp/local-projects/`. If there is no suitable GPU
on the system or the CUDA SDK is not installed, MGL will simply
fall back on using BLAS and Lisp code. Wrapping code in
MGL-MAT:WITH-CUDA\* is basically all that's needed to run on the GPU,
and with MGL-MAT:CUDA-AVAILABLE-P one can check whether the GPU is
really being used.
### Code Organization
MGL consists of several packages dedicated to different tasks.
For example, package `MGL-RESAMPLE` is about
MGL-RESAMPLE::@MGL-RESAMPLE and `MGL-GD` is about MGL-GD::@MGL-GD
and so on. On one hand, having many packages makes it easier to
cleanly separate the API from the implementation and to drill into a
specific task. At other times, they can be a hassle, so the MGL
package itself reexports every external symbol found in all the
other packages that make up MGL and MGL-MAT (see
MGL-MAT::@MAT-MANUAL) on which it heavily relies.
One exception to this rule is the bundled, but independent
MGL-GNUPLOT library.
The built-in tests can be run with:

    (ASDF:OOS 'ASDF:TEST-OP '#:MGL)

Note that most of the tests are rather stochastic and can fail once
in a while.
### Glossary
Ultimately machine learning is about creating **models** of some
domain. The observations in the modelled domain are called
**instances** (also known as examples or samples). Sets of instances
are called **datasets**. Datasets are used when fitting a model or
when making **predictions**. Sometimes the word predictions is too
specific, and the results obtained from applying a model to some
instances are simply called **results**.
## Datasets
###### \[in package MGL-DATASET\]
An instance can often be any kind of object of the user's choice.
It is typically represented by a set of numbers which is called a
feature vector or by a structure holding the feature vector, the
label, etc. A dataset is a SEQUENCE of such instances or a
@MGL-SAMPLER object that produces instances.
- [function] MAP-DATASET FN DATASET
Call FN with each instance in DATASET. This is basically equivalent
to iterating over the elements of a sequence or a sampler (see
@MGL-SAMPLER).
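As a minimal illustration, MAP-DATASET over a plain list simply
visits each element in order:

```common-lisp
(map-dataset #'prin1 '(0 1 2 3))
.. 0123
```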
- [function] MAP-DATASETS FN DATASETS &KEY (IMPUTE NIL IMPUTEP)
Call FN with a list of instances, one from each dataset in
DATASETS. Return nothing. If IMPUTE is specified then iterate until
the largest dataset is consumed imputing IMPUTE for missing values.
If IMPUTE is not specified then iterate until the smallest dataset
runs out.
```common-lisp
(map-datasets #'prin1 '((0 1 2) (:a :b)))
.. (0 :A)(1 :B)
(map-datasets #'prin1 '((0 1 2) (:a :b)) :impute nil)
.. (0 :A)(1 :B)(2 NIL)
```
It is of course allowed to mix sequences with samplers:
```common-lisp
(map-datasets #'prin1
              (list '(0 1 2)
                    (make-sequence-sampler '(:a :b) :max-n-samples 2)))
.. (0 :A)(1 :B)
```
### Samplers
Some algorithms do not need random access to the entire dataset and
can work with a stream of observations. Samplers are simple generators
providing two functions: SAMPLE and FINISHEDP.
- [generic-function] SAMPLE SAMPLER
If SAMPLER has not run out of data (see FINISHEDP),
SAMPLE returns an object that represents a sample from the world to
be experienced or, in other words, simply something that can be used
as input for training or prediction. It is not allowed to call
SAMPLE if SAMPLER is FINISHEDP.
- [generic-function] FINISHEDP SAMPLER
See if SAMPLER has run out of examples.
- [function] LIST-SAMPLES SAMPLER MAX-SIZE
Return a list of at most MAX-SIZE samples, or fewer if SAMPLER runs
out first.
- [function] MAKE-SEQUENCE-SAMPLER SEQ &KEY MAX-N-SAMPLES
Create a sampler that returns elements of SEQ in their original
order. If MAX-N-SAMPLES is non-nil, then at most MAX-N-SAMPLES are
sampled.
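For example, draining such a sampler with LIST-SAMPLES:

```common-lisp
(list-samples (make-sequence-sampler '(:a :b :c) :max-n-samples 2)
              10)
=> (:A :B)
```

Only two samples are returned because the sampler finishes after
MAX-N-SAMPLES, before the requested 10.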
- [function] MAKE-RANDOM-SAMPLER SEQ &KEY MAX-N-SAMPLES (REORDER #'MGL-RESAMPLE:SHUFFLE)
Create a sampler that returns elements of SEQ in random order. If
MAX-N-SAMPLES is non-nil, then at most MAX-N-SAMPLES are sampled.
The first pass is over a shuffled copy of SEQ, and this copy is
reshuffled whenever the sampler reaches the end of it. Shuffling is
performed by calling the REORDER function.
- [variable] *INFINITELY-EMPTY-DATASET* #\<FUNCTION-SAMPLER "infinitely empty" \>
This is the default dataset for MGL-OPT:MINIMIZE. It's an infinite
stream of NILs.
#### Function Sampler
- [class] FUNCTION-SAMPLER
A sampler with a function in its GENERATOR that
produces a stream of samples which may or may not be finite
depending on MAX-N-SAMPLES. FINISHEDP returns T iff MAX-N-SAMPLES is
non-nil, and it's not greater than the number of samples
generated (N-SAMPLES).
```common-lisp
(list-samples (make-instance 'function-sampler
                             :generator (lambda ()
                                          (random 10))
                             :max-n-samples 5)
              10)
=> (3 5 2 3 3)
```
- [reader] GENERATOR FUNCTION-SAMPLER (:GENERATOR)
A generator function of no arguments that returns
the next sample.
- [accessor] MAX-N-SAMPLES FUNCTION-SAMPLER (:MAX-N-SAMPLES = NIL)
- [reader] NAME FUNCTION-SAMPLER (:NAME = NIL)
An arbitrary object naming the sampler. Only used
for printing the sampler object.
- [reader] N-SAMPLES FUNCTION-SAMPLER (:N-SAMPLES = 0)
## Resampling
###### \[in package MGL-RESAMPLE\]
The focus of this package is on resampling methods such as
cross-validation and bagging which can be used for model evaluation,
model selection, and also as a simple form of ensembling. Data
partitioning and sampling functions are also provided because they
tend to be used together with resampling.
### Partitions
The following functions partition a dataset (currently only
SEQUENCEs are supported) into a number of partitions. For each
element in the original dataset there is exactly one partition that
contains it.
- [function] FRACTURE FRACTIONS SEQ &KEY WEIGHT
Partition SEQ into a number of subsequences. FRACTIONS is either a
positive integer or a list of non-negative real numbers. WEIGHT is
NIL or a function that returns a non-negative real number when
called with an element from SEQ. If FRACTIONS is a positive integer
then return a list of that many subsequences with equal sum of
weights bar rounding errors, else partition SEQ into subsequences,
where the sum of weights of subsequence I is proportional to element
I of FRACTIONS. If WEIGHT is NIL, then every element is assumed to
have the same weight.
To split into 5 sequences:
```common-lisp
(fracture 5 '(0 1 2 3 4 5 6 7 8 9))
=> ((0 1) (2 3) (4 5) (6 7) (8 9))
```
To split into two sequences whose lengths are proportional to 2 and
3:
```common-lisp
(fracture '(2 3) '(0 1 2 3 4 5 6 7 8 9))
=> ((0 1 2 3) (4 5 6 7 8 9))
```
- [function] STRATIFY SEQ &KEY (KEY #'IDENTITY) (TEST #'EQL)
Return the list of strata of SEQ. SEQ is a sequence of elements for
which the function KEY returns the class they belong to. Such
classes are opaque objects compared for equality with TEST. A
stratum is a sequence of elements with the same (under TEST) KEY.
```common-lisp
(stratify '(0 1 2 3 4 5 6 7 8 9) :key #'evenp)
=> ((0 2 4 6 8) (1 3 5 7 9))
```
- [function] FRACTURE-STRATIFIED FRACTIONS SEQ &KEY (KEY #'IDENTITY) (TEST #'EQL) WEIGHT
Similar to FRACTURE, but also makes sure that keys are evenly
distributed among the partitions (see STRATIFY). It can be useful
for classification tasks to partition the data set while keeping the
distribution of classes the same.
Note that the sets returned are not in random order. In fact, they
are sorted internally by KEY.
For example, to make two splits with approximately the same number
of even and odd numbers:
```common-lisp
(fracture-stratified 2 '(0 1 2 3 4 5 6 7 8 9) :key #'evenp)
=> ((0 2 1 3) (4 6 8 5 7 9))
```
### Cross-validation
- [function] CROSS-VALIDATE DATA FN &KEY (N-FOLDS 5) (FOLDS (ALEXANDRIA:IOTA N-FOLDS)) (SPLIT-FN #'SPLIT-FOLD/MOD) PASS-FOLD
Map FN over the FOLDS of DATA split with SPLIT-FN and collect the
results in a list. The simplest demonstration is:
```common-lisp
(cross-validate '(0 1 2 3 4)
                (lambda (test training)
                  (list test training))
                :n-folds 5)
=> (((0) (1 2 3 4))
    ((1) (0 2 3 4))
    ((2) (0 1 3 4))
    ((3) (0 1 2 4))
    ((4) (0 1 2 3)))
```
Of course, in practice one would typically train a model and return
the trained model and/or its score on TEST. Also, sometimes one may
want to do only some of the folds and remember which ones they were:
```common-lisp
(cross-validate '(0 1 2 3 4)
                (lambda (fold test training)
                  (list :fold fold test training))
                :folds '(2 3)
                :pass-fold t)
=> ((:fold 2 (2) (0 1 3 4))
    (:fold 3 (3) (0 1 2 4)))
```
Finally, the way the data is split can be customized. By default
SPLIT-FOLD/MOD is called with the arguments DATA, the fold (from
among FOLDS) and N-FOLDS. SPLIT-FOLD/MOD returns two values which
are then passed on to FN. One can use SPLIT-FOLD/CONT or
SPLIT-STRATIFIED or any other function that works with these
arguments. The only real constraint is that FN has to take as many
arguments (plus the fold argument if PASS-FOLD) as SPLIT-FN
returns.
- [function] SPLIT-FOLD/MOD SEQ FOLD N-FOLDS
Partition SEQ into two sequences: one with elements of SEQ with
indices whose remainder is FOLD when divided with N-FOLDS, and a
second one with the rest. The second one is the larger set. The
order of elements remains stable. This function is suitable as the
SPLIT-FN argument of CROSS-VALIDATE.
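For example, with 3 folds, fold 0 collects the elements whose index
is divisible by 3:

```common-lisp
(split-fold/mod '(0 1 2 3 4 5) 0 3)
=> (0 3)
=> (1 2 4 5)
```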
- [function] SPLIT-FOLD/CONT SEQ FOLD N-FOLDS
Imagine dividing SEQ into N-FOLDS subsequences of the same
size (bar rounding). Return the subsequence of index FOLD as the
first value and all the other subsequences concatenated into one
as the second value. The order of elements remains stable. This
function is suitable as the SPLIT-FN argument of CROSS-VALIDATE.
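For example, fold 1 of 3 is the middle third of the sequence:

```common-lisp
(split-fold/cont '(0 1 2 3 4 5) 1 3)
=> (2 3)
=> (0 1 4 5)
```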
- [function] SPLIT-STRATIFIED SEQ FOLD N-FOLDS &KEY (KEY #'IDENTITY) (TEST #'EQL) WEIGHT
Split SEQ into N-FOLDS partitions (as in FRACTURE-STRATIFIED).
Return the partition of index FOLD as the first value, and the
concatenation of the rest as the second value. This function is
suitable as the SPLIT-FN argument of CROSS-VALIDATE (most likely
as a closure with KEY, TEST, WEIGHT bound).
### Bagging
- [function] BAG SEQ FN &KEY (RATIO 1) N WEIGHT (REPLACEMENT T) KEY (TEST #'EQL) (RANDOM-STATE \*RANDOM-STATE\*)
Sample from SEQ with SAMPLE-FROM (passing RATIO, WEIGHT,
REPLACEMENT), or SAMPLE-STRATIFIED if KEY is not NIL. Call FN with
the sample. If N is NIL, then keep repeating this until FN performs a
non-local exit. Else N must be a non-negative integer: N iterations
are performed, and the primary values returned by FN are collected
into a list and returned. See SAMPLE-FROM and SAMPLE-STRATIFIED for
examples.
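A minimal sketch of bagging without a model: using #'IDENTITY as FN
simply collects the bootstrap samples themselves. The output is
non-deterministic, so no transcript is shown.

```common-lisp
;;; Two bootstrap samples (RATIO 1, with replacement by default), each
;;; about the size of the input; elements may repeat within a sample.
(bag '(0 1 2 3 4 5) #'identity :n 2)
```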
- [function] SAMPLE-FROM RATIO SEQ &KEY WEIGHT REPLACEMENT (RANDOM-STATE \*RANDOM-STATE\*)
Return a sequence constructed by sampling with or without
REPLACEMENT from SEQ. The sum of weights in the result sequence will
approximately be the sum of weights of SEQ times RATIO. If WEIGHT is
NIL then elements are assumed to have equal weights, else WEIGHT
should return a non-negative real number when called with an element
of SEQ.
To randomly select half of the elements:
```common-lisp
(sample-from 1/2 '(0 1 2 3 4 5))
=> (5 3 2)
```
To randomly select some elements such that the sum of their weights
constitute about half of the sum of weights across the whole
sequence:
```common-lisp
(sample-from 1/2 '(0 1 2 3 4 5 6 7 8 9) :weight #'identity)
=> ;; sums to 28 that's near 45/2
(9 4 1 6 8)
```
To sample with replacement (that is, allowing the element to be
sampled multiple times):
```common-lisp
(sample-from 1 '(0 1 2 3 4 5) :replacement t)
=> (1 1 5 1 4 4)
```
- [function] SAMPLE-STRATIFIED RATIO SEQ &KEY WEIGHT REPLACEMENT (KEY #'IDENTITY) (TEST #'EQL) (RANDOM-STATE \*RANDOM-STATE\*)
Like SAMPLE-FROM but makes sure that the weighted proportion of
classes in the result is approximately the same as the proportion in
SEQ. See STRATIFY for the description of KEY and TEST.
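For example, sampling half of a sequence stratified by parity keeps
the proportion of even and odd numbers roughly intact (the output is
non-deterministic, so no transcript is shown):

```common-lisp
;;; Roughly half of the evens and half of the odds are selected.
(sample-stratified 1/2 '(0 1 2 3 4 5 6 7 8 9) :key #'evenp)
```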
### CV Bagging
- [function] BAG-CV DATA FN &KEY N (N-FOLDS 5) (FOLDS (ALEXANDRIA:IOTA N-FOLDS)) (SPLIT-FN #'SPLIT-FOLD/MOD) PASS-FOLD (RANDOM-STATE \*RANDOM-STATE\*)
Perform cross-validation on different shuffles of DATA N times and
collect the results. Since CROSS-VALIDATE collects the return values
of FN, the return value of this function is a list of lists of FN
results. If N is NIL, don't collect anything just keep doing
repeated CVs until FN performs a non-local exit.
The following example simply collects the test and training sets for
2-fold CV repeated 3 times with shuffled data:
```common-lisp
;;; This is non-deterministic.
(bag-cv '(0 1 2 3 4) #'list :n 3 :n-folds 2)
=> ((((2 3 4) (1 0))
     ((1 0) (2 3 4)))
    (((2 1 0) (4 3))
     ((4 3) (2 1 0)))
    (((1 0 3) (2 4))
     ((2 4) (1 0 3))))
```
CV bagging is useful when a single CV is not producing stable
results. As an ensemble method, CV bagging has the advantage over
bagging that each example will occur the same number of times and
after the first CV is complete there is a complete but less reliable
estimate for each example which gets refined by further CVs.
### Miscellaneous Operations
- [function] SPREAD-STRATA SEQ &KEY (KEY #'IDENTITY) (TEST #'EQL)
Return a sequence that's a reordering of SEQ such that elements
belonging to different strata (under KEY and TEST, see STRATIFY) are
distributed evenly. The order of elements belonging to the same
stratum is unchanged.
For example, to make sure that even and odd numbers are distributed
evenly:
```common-lisp
(spread-strata '(0 2 4 6 8 1 3 5 7 9) :key #'evenp)
=> (0 1 2 3 4 5 6 7 8 9)
```
Same thing with unbalanced classes:
```common-lisp
(spread-strata (vector 0 2 3 5 6 1 4)
               :key (lambda (x)
                      (if (member x '(1 4))
                          t
                          nil)))
=> #(0 1 2 3 4 5 6)
```
- [function] ZIP-EVENLY SEQS &KEY RESULT-TYPE
Make a single sequence out of the sequences in SEQS so that in the
returned sequence indices of elements belonging to the same source
sequence are spread evenly across the whole range. The result is a
list if RESULT-TYPE is LIST; it's a vector if RESULT-TYPE is VECTOR.
If RESULT-TYPE is NIL, then it's determined by the type of the first
sequence in SEQS.
```common-lisp
(zip-evenly '((0 2 4) (1 3)))
=> (0 1 2 3 4)
```
## Core
###### \[in package MGL-CORE\]
### Persistence
- [function] LOAD-STATE FILENAME OBJECT
Load weights of OBJECT from FILENAME. Return OBJECT.
- [function] SAVE-STATE FILENAME OBJECT &KEY (IF-EXISTS :ERROR) (ENSURE T)
Save weights of OBJECT to FILENAME. If ENSURE, then
ENSURE-DIRECTORIES-EXIST is called on FILENAME. IF-EXISTS is passed
on to OPEN. Return OBJECT.
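A typical round trip might look like the following sketch, where
MY-BPN and MY-BPN-2 stand for hypothetical, already constructed
models of identical structure and the filename is illustrative:

```common-lisp
;;; Save the learnt parameters of one model and restore them into
;;; another model built with the same structure.
(save-state "model.weights" my-bpn :if-exists :supersede)
(load-state "model.weights" my-bpn-2)
```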
- [function] READ-STATE OBJECT STREAM
Read the weights of OBJECT from the bivalent STREAM, where weights
mean the learnt parameters. There is currently no sanity checking of
the data, which will most certainly change in the future together
with the serialization format. Return OBJECT.
- [function] WRITE-STATE OBJECT STREAM
Write the weights of OBJECT to the bivalent STREAM. Return OBJECT.
- [generic-function] READ-STATE* OBJECT STREAM CONTEXT
This is the extension point for READ-STATE. It is
guaranteed that primary READ-STATE\* methods will be called only once
for each OBJECT (under EQ). CONTEXT is an opaque object and must be
passed on to any recursive READ-STATE\* calls.
- [generic-function] WRITE-STATE* OBJECT STREAM CONTEXT
This is the extension point for WRITE-STATE. It is
guaranteed that primary WRITE-STATE\* methods will be called only
once for each OBJECT (under EQ). CONTEXT is an opaque object and must
be passed on to any recursive WRITE-STATE\* calls.
### Batch Processing
Processing instances one by one during training or prediction can
be slow. The models that support batch processing for greater
efficiency are said to be *striped*.
Typically, during or after creating a model, one sets MAX-N-STRIPES
on it to a positive integer. When a batch of instances is to be fed to
the model it is first broken into subbatches of length that's at
most MAX-N-STRIPES. For each subbatch, SET-INPUT (FIXDOC) is called
and a before method takes care of setting N-STRIPES to the actual
number of instances in the subbatch. When MAX-N-STRIPES is set
internal data structures may be resized which is an expensive
operation. Setting N-STRIPES is a comparatively cheap operation,
often implemented as matrix reshaping.
Note that for models made of different parts (for example, an
MGL-BP:BPN consists of MGL-BP:LUMPs), setting these
values affects the constituent parts, but one should never change
the number of stripes of the parts directly because that would lead to
an internal inconsistency in the model.
- [generic-function] MAX-N-STRIPES OBJECT
The number of stripes with which the OBJECT is
capable of dealing simultaneously.
- [generic-function] SET-MAX-N-STRIPES MAX-N-STRIPES OBJECT
Allocate the necessary stuff to allow for
MAX-N-STRIPES number of stripes to be worked with simultaneously in
OBJECT. This is called when MAX-N-STRIPES is SETF'ed.
- [generic-function] N-STRIPES OBJECT
The number of stripes currently present in OBJECT.
This is at most MAX-N-STRIPES.
- [generic-function] SET-N-STRIPES N-STRIPES OBJECT
Set the number of stripes (out of MAX-N-STRIPES)
that are in use in OBJECT. This is called when N-STRIPES is
SETF'ed.
- [macro] WITH-STRIPES SPECS &BODY BODY
Bind start and optionally end indices belonging to stripes in
striped objects.
    (WITH-STRIPES ((STRIPE1 OBJECT1 START1 END1)
                   (STRIPE2 OBJECT2 START2)
                   ...)
      ...)
This is how one's supposed to find the index range corresponding to
the Nth input in an input lump of a bpn:
    (with-stripes ((n input-lump start end))
      (loop for i upfrom start below end
            do (setf (mref (nodes input-lump) i) 0d0)))
Note how the input lump is striped, but the matrix into which we are
indexing (NODES) is not known to WITH-STRIPES. In fact, for lumps
the same stripe indices work with NODES and MGL-BP:DERIVATIVES.
- [generic-function] STRIPE-START STRIPE OBJECT
Return the start index of STRIPE in some array or
matrix of OBJECT.
- [generic-function] STRIPE-END STRIPE OBJECT
Return the end index (exclusive) of STRIPE in some
array or matrix of OBJECT.
- [generic-function] SET-INPUT INSTANCES MODEL
Set INSTANCES as inputs in MODEL. INSTANCES is
always a SEQUENCE of instances even for models not capable of batch
operation. It sets N-STRIPES to (LENGTH INSTANCES) in a :BEFORE
method.
- [function] MAP-BATCHES-FOR-MODEL FN DATASET MODEL
Call FN with batches of instances from DATASET suitable for MODEL.
The number of instances in a batch is MAX-N-STRIPES of MODEL or less
if there are no more instances left.
- [macro] DO-BATCHES-FOR-MODEL (BATCH (DATASET MODEL)) &BODY BODY
Convenience macro over MAP-BATCHES-FOR-MODEL.
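As a sketch, feeding a dataset to a model batch by batch, where
INSTANCES is a sequence of instances and MY-MODEL a hypothetical
striped model (SET-INPUT prepares each batch as described above):

```common-lisp
;;; BATCH is bound to successive subsequences of at most
;;; MAX-N-STRIPES instances.
(do-batches-for-model (batch (instances my-model))
  (set-input batch my-model))
```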
### Executors
- [generic-function] MAP-OVER-EXECUTORS FN INSTANCES PROTOTYPE-EXECUTOR
Divide INSTANCES between executors that perform the
same function as PROTOTYPE-EXECUTOR and call FN with the instances
and the executor for which the instances are.
Some objects conflate function and call: the forward pass of a
MGL-BP:BPN computes output from inputs so it is like a
function but it also doubles as a function call in the sense that
the bpn (function) object changes state during the computation of
the output. Hence not even the forward pass of a bpn is thread safe.
There is also the restriction that all inputs must be of the same
size.
For example, if we have a function that builds a bpn for an input of
a certain size, then we can create a factory that creates bpns for a
particular call. The factory probably wants to keep the weights the
same though. In @MGL-PARAMETERIZED-EXECUTOR-CACHE,
MAKE-EXECUTOR-WITH-PARAMETERS is this factory.
Parallelization of execution is another possibility that
MAP-OVER-EXECUTORS allows, but there is no prebuilt solution for it
yet.
The default implementation simply calls FN with INSTANCES and
PROTOTYPE-EXECUTOR.
- [macro] DO-EXECUTORS (INSTANCES OBJECT) &BODY BODY
Convenience macro on top of MAP-OVER-EXECUTORS.
#### Parameterized Executor Cache
- [class] PARAMETERIZED-EXECUTOR-CACHE-MIXIN
Mix this into a model, implement
INSTANCE-TO-EXECUTOR-PARAMETERS and MAKE-EXECUTOR-WITH-PARAMETERS,
and DO-EXECUTORS will be able to build executors suitable for
different instances. The canonical example is using a BPN to compute
the means and covariances of a Gaussian process. Since each
instance is made of a variable number of observations, the size of
the input is not constant, thus we have a bpn (an executor) for each
input dimension (the parameters).
- [generic-function] MAKE-EXECUTOR-WITH-PARAMETERS PARAMETERS CACHE
Create a new executor for PARAMETERS. CACHE is a
PARAMETERIZED-EXECUTOR-CACHE-MIXIN. In the BPN Gaussian process
example, PARAMETERS would be a list of input dimensions.
- [generic-function] INSTANCE-TO-EXECUTOR-PARAMETERS INSTANCE CACHE
Return the parameters for an executor able to
handle INSTANCE. Called by MAP-OVER-EXECUTORS on CACHE (that's a
PARAMETERIZED-EXECUTOR-CACHE-MIXIN). The returned parameters are
keys in an EQUAL parameters->executor hash table.
## Monitoring
###### \[in package MGL-CORE\]
When training or applying a model, one often wants to track various
statistics. For example, in the case of training a neural network
with cross-entropy loss, these statistics could be the average
cross-entropy loss itself, classification accuracy, or even the
entire confusion matrix and sparsity levels in hidden layers. Also,
there is the question of what to do with the measured values (log
and forget, add to some counter or a list).
So there may be several phases of operation that we want to keep an
eye on. Let's call these **events**. There can also be many fairly
independent things to do in response to an event. Let's call these
**monitors**. Some monitors are a composition of two operations: one
that extracts some measurements and another that aggregates those
measurements. Let's call these two **measurers** and **counters**,
respectively.
For example, consider training a backpropagation neural network. We
want to look at the state of the network just after the backward
pass. MGL-BP:BP-LEARNER has a MONITORS event hook corresponding to the moment after
backpropagating the gradients. Suppose we are interested in how the
training cost evolves:
    (push (make-instance 'monitor
                         :measurer (lambda (instances bpn)
                                     (declare (ignore instances))
                                     (mgl-bp:cost bpn))
                         :counter (make-instance 'basic-counter))
          (monitors learner))
During training, this monitor will track the cost of training
examples behind the scenes. If we want to print and reset this
monitor periodically we can put another monitor on
MGL-OPT:ITERATIVE-OPTIMIZER's MGL-OPT:ON-N-INSTANCES-CHANGED
accessor:
    (push (lambda (optimizer gradient-source n-instances)
            (declare (ignore optimizer))
            (when (zerop (mod n-instances 1000))
              (format t "n-instances: ~S~%" n-instances)
              (dolist (monitor (monitors gradient-source))
                (when (counter monitor)
                  (format t "~A~%" (counter monitor))
                  (reset-counter (counter monitor))))))
          (mgl-opt:on-n-instances-changed optimizer))
Note that the monitor we push can be anything as long as
APPLY-MONITOR is implemented on it with the appropriate signature.
Also note that the ZEROP + MOD logic is fragile, so you will likely
want to use MGL-OPT:MONITOR-OPTIMIZATION-PERIODICALLY instead of
doing the above.
So that's the general idea. Concrete events are documented where
they are signalled. Often there are task specific utilities that
create a reasonable set of default monitors (see
@MGL-CLASSIFICATION-MONITOR).
- [function] APPLY-MONITORS MONITORS &REST ARGUMENTS
Call APPLY-MONITOR on each monitor in MONITORS and ARGUMENTS. This
is how an event is fired.
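Since a plain function counts as a monitor (APPLY-MONITOR on a
function reduces to CL:APPLY), firing an event can be as simple as:

```common-lisp
(apply-monitors (list (lambda (a b)
                        (format t "got ~S and ~S~%" a b)))
                1 2)
.. got 1 and 2
```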
- [generic-function] APPLY-MONITOR MONITOR &REST ARGUMENTS
Apply MONITOR to ARGUMENTS. This sounds fairly
generic, because it is. MONITOR can be anything, even a simple
function or symbol, in which case this is just CL:APPLY. See
@MGL-MONITOR for more.
- [generic-function] COUNTER MONITOR
Return an object representing the state of MONITOR
or NIL, if it doesn't have any (say because it's a simple logging
function). Most monitors have counters into which they accumulate
results until they are printed and reset. See @MGL-COUNTER for
more.
- [function] MONITOR-MODEL-RESULTS FN DATASET MODEL MONITORS
Call FN with batches of instances from DATASET until it runs
out (as in DO-BATCHES-FOR-MODEL). FN is supposed to apply MODEL to
the batch and return some kind of result (for neural networks, the
result is the model state itself). Apply MONITORS to each batch and
the result returned by FN for that batch. Finally, return the list
of counters of MONITORS.
The purpose of this function is to collect various results and
statistics (such as error measures) efficiently by applying the
model only once, leaving extraction of quantities of interest from
the model's results to MONITORS.
See the model specific versions of this function such as
MGL-BP:MONITOR-BPN-RESULTS.
- [generic-function] MONITORS OBJECT
Return monitors associated with OBJECT. See the
various specialized methods of this generic function for more
documentation.
### Monitors
- [class] MONITOR
A monitor that has another monitor called MEASURER
embedded in it. When this monitor is applied, it applies the
measurer and passes the returned values to ADD-TO-COUNTER called on
its COUNTER slot. One may further specialize APPLY-MONITOR to change
that.
This class is useful when the same event monitor is applied
repeatedly over a period and its results must be aggregated such as
when training statistics are being tracked or when predictions are
being made. Note that the monitor must be compatible with the event
it handles. That is, the embedded MEASURER must be prepared to take
the arguments that are documented to come with the event.
- [reader] MEASURER MONITOR (:MEASURER)
This must be a monitor itself which only means
that APPLY-MONITOR is defined on it (but see @MGL-MONITORING). The
returned values are aggregated by COUNTER. See
@MGL-MEASURER for a library of measurers.
- [reader] COUNTER MONITOR (:COUNTER)
The COUNTER of a monitor carries out the
aggregation of results returned by MEASURER. See @MGL-COUNTER
for a library of counters.
### Measurers
MEASURER is a part of MONITOR objects, an embedded monitor that
computes a specific quantity (e.g. classification accuracy) from the
arguments of the event it is applied to (e.g. the model results).
Measurers are often implemented by combining some kind of model
specific extractor with a generic measurer function.
All generic measurer functions return their results as multiple
values matching the arguments of ADD-TO-COUNTER for a counter of a
certain type (see @MGL-COUNTER) so as to make them easily used in a
MONITOR:
(multiple-value-call #'add-to-counter <some-counter>
<call-to-some-measurer>)
The counter class compatible with the measurer this way is noted for
each function.
For a list of measurer functions see @MGL-CLASSIFICATION-MEASURER.
### Counters
- [generic-function] ADD-TO-COUNTER COUNTER &REST ARGS
Add ARGS to COUNTER in some way. See specialized
methods for type specific documentation. The kind of arguments to be
supported is what the measurer functions (see @MGL-MEASURER)
intended to be paired with the counter return as multiple values.
- [generic-function] COUNTER-VALUES COUNTER
Return any number of values representing the state
of COUNTER. See specialized methods for type specific
documentation.
- [generic-function] COUNTER-RAW-VALUES COUNTER
Return any number of values representing the state
of COUNTER in such a way that passing the returned values as
arguments to ADD-TO-COUNTER on a fresh instance of the same type
recreates the original state.
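The round-trip contract can be illustrated with a hedged sketch using
BASIC-COUNTER (the exact raw values are an implementation detail; only
the recreate-the-state property matters):

```cl-transcript
(let ((counter (make-instance 'basic-counter)))
  (add-to-counter counter 6.5 3)
  ;; Feed the raw values to a fresh counter of the same type; the
  ;; copy ends up in the same state as the original.
  (let ((copy (make-instance 'basic-counter)))
    (multiple-value-call #'add-to-counter copy
      (counter-raw-values counter))
    (counter-values copy)))
```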
- [generic-function] RESET-COUNTER COUNTER
Restore state of COUNTER to what it was just after
creation.
#### Attributes
- [class] ATTRIBUTED
This is a utility class that all counters subclass.
The ATTRIBUTES plist can hold basically anything. Currently the
attributes are only used when printing and they can be specified by
the user. The monitor maker functions such as those in
@MGL-CLASSIFICATION-MONITOR also add attributes of their own to the
counters they create.
With the :PREPEND-ATTRIBUTES initarg one can easily add new
attributes without clobbering those in the :INITFORM, (:TYPE
"rmse") in this case.
(princ (make-instance 'rmse-counter
:prepend-attributes '(:event "pred."
:dataset "test")))
;; pred. test rmse: 0.000e+0 (0)
=> #<RMSE-COUNTER pred. test rmse: 0.000e+0 (0)>
- [accessor] ATTRIBUTES ATTRIBUTED (:ATTRIBUTES = NIL)
A plist of attribute keys and values.
- [method] NAME (ATTRIBUTED ATTRIBUTED)
Return a string assembled from the values of the ATTRIBUTES of
ATTRIBUTED. If there are multiple entries with the same key, then
they are printed close together.
Values may be padded according to an enclosing
WITH-PADDED-ATTRIBUTE-PRINTING.
- [macro] WITH-PADDED-ATTRIBUTE-PRINTING (ATTRIBUTEDS) &BODY BODY
Note the width of values for each attribute key which is the number
of characters in the value's PRINC-TO-STRING'ed representation. In
BODY, if attributes with the same key are printed they are forced
to be at least this wide. This allows for nice, table-like output:
(let ((attributeds
(list (make-instance 'basic-counter
:attributes '(:a 1 :b 23 :c 456))
(make-instance 'basic-counter
:attributes '(:a 123 :b 45 :c 6)))))
(with-padded-attribute-printing (attributeds)
(map nil (lambda (attributed)
(format t "~A~%" attributed))
attributeds)))
;; 1 23 456: 0.000e+0 (0)
;; 123 45 6 : 0.000e+0 (0)
- [function] LOG-PADDED ATTRIBUTEDS
Log (see LOG-MSG) ATTRIBUTEDS non-escaped (as in PRINC or ~A) with
the output being as table-like as possible.
#### Counter classes
In addition to the really basic ones here, also see
@MGL-CLASSIFICATION-COUNTER.
- [class] BASIC-COUNTER ATTRIBUTED
A simple counter whose ADD-TO-COUNTER takes two
additional parameters: increments to the internal sums called the
NUMERATOR and DENOMINATOR. COUNTER-VALUES returns two
values:
- NUMERATOR divided by DENOMINATOR (or 0 if DENOMINATOR is 0) and
- DENOMINATOR
Here is an example that computes the mean of 5 things received in two
batches:
(let ((counter (make-instance 'basic-counter)))
(add-to-counter counter 6.5 3)
(add-to-counter counter 3.5 2)
counter)
=> #<BASIC-COUNTER 2.00000e+0 (5)>
- [class] RMSE-COUNTER BASIC-COUNTER
A BASIC-COUNTER whose numerator accumulates
the square of some statistics. It has the attribute :TYPE "rmse".
COUNTER-VALUES returns the square root of what BASIC-COUNTER's
COUNTER-VALUES would return.
(let ((counter (make-instance 'rmse-counter)))
(add-to-counter counter (+ (* 3 3) (* 4 4)) 2)
counter)
=> #<RMSE-COUNTER rmse: 3.53553e+0 (2)>
- [class] CONCAT-COUNTER ATTRIBUTED
A counter that simply concatenates
sequences.
```cl-transcript
(let ((counter (make-instance 'concat-counter)))
(add-to-counter counter '(1 2 3) #(4 5))
(add-to-counter counter '(6 7))
(counter-values counter))
=> (1 2 3 4 5 6 7)
```
- [reader] CONCATENATION-TYPE CONCAT-COUNTER (:CONCATENATION-TYPE = 'LIST)
A type designator suitable as the RESULT-TYPE
argument to CONCATENATE.
## Classification
###### \[in package MGL-CORE\]
To be able to measure classification related quantities, we need to
define what the label of an instance is. Customization is possible
by implementing a method for a specific type of instance, but these
functions only ever appear as defaults that can be overridden.
- [generic-function] LABEL-INDEX INSTANCE
Return the label of INSTANCE as a non-negative
integer.
- [generic-function] LABEL-INDEX-DISTRIBUTION INSTANCE
Return a one dimensional array of probabilities
representing the distribution of labels. The probability of the
label with LABEL-INDEX `I` is the element at index `I` of the returned
array.
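As a sketch of how these generic functions could be specialized for a
custom instance type (the DIGIT-IMAGE class and its accessor below are
made up for illustration and are not part of MGL):

```cl-transcript
;; Hypothetical instance type storing its own label.
(defclass digit-image ()
  ((label :initarg :label :reader digit-image-label)))

(defmethod label-index ((instance digit-image))
  ;; The label as a non-negative integer.
  (digit-image-label instance))

(defmethod label-index-distribution ((instance digit-image))
  ;; A one-hot distribution over the 10 digit classes.
  (let ((distribution (make-array 10 :initial-element 0d0)))
    (setf (aref distribution (digit-image-label instance)) 1d0)
    distribution))
```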
The following two functions are basically the same as the previous
two, but in batch mode: they return a sequence of label indices or
distributions. These are called on results produced by models.
Implement these for a model and the monitor maker functions below
will automatically work. See FIXDOC: for bpn and boltzmann.
- [generic-function] LABEL-INDICES RESULTS
Return a sequence of label indices for RESULTS
produced by some model for a batch of instances. This is akin to
LABEL-INDEX.
- [generic-function] LABEL-INDEX-DISTRIBUTIONS RESULTS
Return a sequence of label index distributions for
RESULTS produced by some model for a batch of instances. This is
akin to LABEL-INDEX-DISTRIBUTION.
### Classification Monitors