<?xml version="1.0" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<book lang="en" id="hacking-geekos">
<bookinfo>
<title>Hacking GeekOS</title>
<authorgroup>
<author>
<firstname>David</firstname>
<othername>H.</othername>
<surname>Hovemeyer</surname>
<affiliation>University of Maryland</affiliation>
</author>
<author>
<firstname>Jeffrey</firstname>
<othername>K.</othername>
<surname>Hollingsworth</surname>
<affiliation>University of Maryland</affiliation>
</author>
<author>
<firstname>Iulian</firstname>
<surname>Neamtiu</surname>
<affiliation>University of Maryland</affiliation>
</author>
</authorgroup>
<copyright>
<year>2003</year>
<year>2004</year>
<year>2005</year>
<holder>David H. Hovemeyer, Jeffrey K. Hollingsworth, and Iulian Neamtiu</holder>
</copyright>
<legalnotice>
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License.
To view a copy of this license, visit
<ulink url="http://creativecommons.org/licenses/by-nc-sa/1.0/">http://creativecommons.org/licenses/by-nc-sa/1.0/</ulink>
or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
</legalnotice>
</bookinfo>
<!--
<ulink url="mailto:[email protected]"><[email protected]></ulink>
<ulink url="mailto:[email protected]"><[email protected]></ulink>
-->
<!--
**********************************************************************
Introduction.
**********************************************************************
-->
<chapter id="intro">
<title>Introduction</title>
<para>
GeekOS is an educational operating system kernel.
<!--
Many educational operating
systems exist: some of the best known are
<itemizedlist>
<listitem>
<para><ulink url="http://www.cs.vu.nl/~ast/minix.html">Minix</ulink>,
</para>
</listitem>
<listitem>
<para>
<ulink url="http://www.cs.washington.edu/homes/tom/nachos/">Nachos</ulink>,
</para>
</listitem>
<listitem>
<para>
<ulink url="http://www.eecs.harvard.edu/~syrah/os161/">OS/161</ulink>, and
</para>
</listitem>
<listitem>
<para>
<ulink url="http://www.tik.ee.ethz.ch/~topsy/">Topsy</ulink>.
</para>
</listitem>
</itemizedlist>
-->
GeekOS tries to combine realism and simplicity. It is a realistic
system because it targets a real hardware platform - the x86 PC.
It strives for simplicity in that it contains the bare minimum functionality
necessary to provide the services of a modern operating system,
such as virtual memory, a filesystem, and interprocess communication.
<important>
<para>
This document and the GeekOS distribution are works in progress.
At the time of writing (March 3, 2004) this document is about 80% complete,
and the GeekOS code is about 95% complete. We will be filling in
the remaining text and code in the near future. In the meantime,
if you have any questions about this manual or about GeekOS itself,
please send email to <email>[email protected]</email>.
</para>
</important>
</para>
<para>
This document has two purposes. The first purpose is to give an overview of
the GeekOS kernel, and to cover the topics needed to read, understand,
and modify the kernel source code. The second purpose is to present a
series of projects in which you can build important new functionality on
top of the GeekOS kernel. These projects are suitable for use in a
senior level undergraduate course, or for self-study.
</para>
<sect1 id="audience">
<title>Intended Audience</title>
<para>
This document is for anyone interested in gaining hands-on experience
in operating system kernel programming. Most operating system textbooks
focus on high level theory and concepts. This document is intended to
bridge the gap between those concepts and actual, working kernel code.
We will try to give you all of the information and background you need
to start hacking. In this way, this document complements your operating
system textbook.
</para>
</sect1>
<sect1 id="background">
<title>Required Background</title>
<para>
Before you start hacking on GeekOS, we assume that you have the
following skills and knowledge:
<itemizedlist>
<listitem>
<para>Basic understanding of what an operating system kernel does</para>
</listitem>
<listitem>
<para>A strong understanding of the C programming language</para>
</listitem>
<listitem>
<para>Experience programming at the system call level in
an operating system such as Linux or Windows</para>
</listitem>
<listitem>
<para>Experience programming using threads, such as pthreads or
Java threads</para>
</listitem>
<listitem>
<para>Some knowledge of computer architecture and organization</para>
</listitem>
<listitem>
<para>Familiarity with assembly language for some CPU architecture,
and a willingness to learn x86 (a.k.a. Intel IA32) assembly language</para>
</listitem>
</itemizedlist>
</para>
</sect1>
</chapter>
<!--
**********************************************************************
Kernel Hacking 101
**********************************************************************
-->
<chapter id="hacking101">
<title>Kernel Hacking 101</title>
<para>
GeekOS is an operating system <emphasis>kernel</emphasis>. In many
respects, it is simply a C program. It has functions, threads,
a memory allocator, and so forth. However, unlike a C program executing
as a <emphasis>user mode</emphasis> process of a host operating system
such as Linux or Windows, a kernel operates in <emphasis>kernel mode</emphasis>.
A program executing in kernel mode has total control over the computer's
CPU, memory, and hardware devices.
</para>
<para>
Writing code to execute in kernel mode presents a few challenges
that you will need to be aware of. This chapter presents some
important tips and techniques that you will need to use as you
modify the GeekOS kernel.
</para>
<sect1>
<title>Kernel Mode Restrictions</title>
<para>
The runtime environment of the GeekOS kernel has several important
restrictions you will need to be aware of.
</para>
<sect2 id="nolibrary">
<title>Limited Set of Library Functions</title>
<para>
Because the operating system is the lowest level of software in the
computer, all functionality used by the kernel must be implemented
within the kernel. This is different from user programs,
which are generally linked against a set of standard libraries containing
often-used functions.
</para>
<para>
The only standard C library functions available in GeekOS are a subset of
the string functions (<literal>strcpy()</literal>, <literal>memcpy()</literal>,
etc.) and the <literal>snprintf()</literal> function. Prototypes for
these functions are defined in the header <filename>&lt;geekos/string.h&gt;</filename>.
</para>
<para>
In addition to the standard C functions, the GeekOS kernel also contains
functions which are similar to C library functions. The <literal>Print()</literal>
function (prototype in <filename>&lt;geekos/screen.h&gt;</filename>)
supports a subset of the functionality of the standard C <literal>printf()</literal> function.
The <literal>Malloc()</literal> and <literal>Free()</literal> functions
(prototypes in <filename>&lt;geekos/malloc.h&gt;</filename>)
are equivalent to the <literal>malloc()</literal> and <literal>free()</literal>
functions.
</para>
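<para>
As a rough illustration of working within this limited library, the
following sketch formats a short status message into a fixed-size buffer
using only <literal>snprintf()</literal>. It is written as a host-side
program so that it can be compiled and tried outside the kernel: the
<literal>Format_Status()</literal> helper and the <literal>MSG_MAX</literal>
constant are hypothetical names invented for this example, and the sketch
assumes that <literal>snprintf()</literal> follows the C99 return value
convention. Inside GeekOS you would include the GeekOS string header
rather than the host headers.
</para>
<programlisting><![CDATA[
```c
#include <stdio.h>    /* host-side stand-in for the kernel's string header */
#include <string.h>

#define MSG_MAX 32    /* hypothetical buffer size for this example */

/*
 * Format "name: value" into a caller-supplied buffer.
 * Returns 1 if the whole message fit, 0 if it was truncated.
 * Assumes the C99 convention: snprintf() returns the length the
 * complete message would have had, not counting the terminator.
 */
int Format_Status(char *buf, size_t size, const char *name, int value)
{
    int needed = snprintf(buf, size, "%s: %d", name, value);
    return needed >= 0 && (size_t) needed < size;
}
```
]]></programlisting>
<para>
Checking the return value matters here because a kernel has no larger
buffer to fall back on automatically; the caller decides what to do
with a truncated message.
</para>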
</sect2>
<sect2 id="limitedstack">
<title>Limited Stack</title>
<para>
Each thread in the GeekOS kernel has a 4K stack. If a thread overflows
its stack, the kernel will generally crash. You should therefore
be very careful to conserve stack space:
<itemizedlist>
<listitem>
<para>Do not allocate large data structures on the stack.
Use the kernel heap allocator (<literal>Malloc()</literal> and
<literal>Free()</literal>) instead.</para>
</listitem>
<listitem>
<para>Do not use recursion. Avoid deep call stacks.</para>
</listitem>
</itemizedlist>
</para>
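<para>
The two guidelines above can be sketched as follows. This is a host-side
approximation, not kernel code: <literal>Malloc()</literal> and
<literal>Free()</literal> are mapped onto the host's
<literal>malloc()</literal> and <literal>free()</literal> so that the
example compiles outside GeekOS, and <literal>Checksum_Scratch()</literal>,
<literal>Count_Nodes()</literal>, and the buffer size constants are
hypothetical names chosen for illustration.
</para>
<programlisting><![CDATA[
```c
#include <stdlib.h>
#include <string.h>

/* Host-side stand-ins for the GeekOS heap allocator (assumption). */
#define Malloc(size) malloc(size)
#define Free(ptr)    free(ptr)

#define SECTOR_SIZE 512   /* hypothetical buffer unit */
#define NUM_SECTORS 16    /* 16 * 512 = 8192 bytes: twice a 4K stack */

/* Sum the bytes of a large scratch buffer without placing it on the stack. */
unsigned long Checksum_Scratch(unsigned char fill)
{
    size_t len = (size_t) SECTOR_SIZE * NUM_SECTORS;
    unsigned char *buf = Malloc(len);   /* NOT: unsigned char buf[8192]; */
    unsigned long sum = 0;
    size_t i;

    if (buf == 0)
        return 0;                       /* allocation failed */

    memset(buf, fill, len);
    for (i = 0; i < len; ++i)
        sum += buf[i];

    Free(buf);
    return sum;
}

struct Node {
    struct Node *next;
};

/*
 * Count list nodes with a loop. A recursive version would consume
 * one stack frame per node, which is risky on a small stack.
 */
unsigned Count_Nodes(const struct Node *head)
{
    unsigned count = 0;

    while (head != 0) {
        ++count;
        head = head->next;
    }
    return count;
}
```
]]></programlisting>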
</sect2>
<sect2 id="nomemprot">
<title>Limited Memory Protection</title>
<para>
When the kernel starts executing there is no memory protection;
every memory access made by the kernel is to real (physical) memory.
Dereferences of null pointers are not trapped. For this reason, kernel
code is more vulnerable to stray pointer references than user code.
You will need to check your code very carefully to make sure there are no
memory errors, because they are hard to debug.
</para>
<para>
When you add virtual memory to GeekOS (<xref linkend="vmproject"/>),
the kernel will gain a greater degree of protection against memory errors,
but you will still need to be careful.
</para>
</sect2>
<sect2 id="asyncint">
<title>Asynchronous Interrupts</title>
<para>
Many hardware devices use <emphasis>interrupts</emphasis> to notify
the CPU of important events: the expiration of a timer, the completion
of an I/O request, etc. An interrupt is an immediate
asynchronous transfer of control to an <emphasis>interrupt handler</emphasis>.
When the handler completes, control returns to the point where execution was interrupted,
and the original code resumes. Note that interrupt handlers may cause
a <emphasis>thread context switch</emphasis>, meaning that other threads
may execute before control returns to the interrupted thread.
</para>
<para>
The practical implications of interrupts are described in <xref linkend="intsandthreads"/>.
You will want to read this section carefully.
</para>
</sect2>
</sect1>
<sect1 id="practices">
<title>Recommended Kernel Hacking Practices</title>
<para>
If you observe the precautions
described in this document, you will find that programming in kernel
mode is fairly straightforward. However, to make your kernel hacking
experience more pleasant, we strongly encourage you to adopt
the following habits.
</para>
<sect2 id="assertions">
<title>Use Assertions</title>
<para>
The header <filename>&lt;geekos/kassert.h&gt;</filename> defines the
<literal>KASSERT()</literal> macro, which takes a boolean expression. If the
expression evaluates to false, the macro prints a message and
halts the kernel. Here is an example:
<programlisting>
void My_Function(struct Thread_Queue *queue)
{
    KASSERT(!Interrupts_Enabled()); /* Interrupts must be disabled */
    KASSERT(queue != 0);            /* queue must not be null */
    ...
}
</programlisting>
As much as possible, you should rigorously use assertions to check
function preconditions, postconditions, and data structure invariants.
Assertions have two important benefits. First, a failed assertion
immediately and precisely pinpoints a bug in the code, often
before the kernel state becomes seriously corrupted. Second, assertions
help document the code.
</para>
</sect2>
<sect2 id="printstatements">
<title>Use Print Statements</title>
<para>
The <filename>&lt;geekos/screen.h&gt;</filename> header defines the
<literal>Print()</literal> function, which supports much of the functionality
of the standard C <literal>printf()</literal> function. Print statements
are the most useful general technique for debugging kernel code.
As with any debugging, you should adopt the strategy of forming a hypothesis
and gathering evidence to support or refute the hypothesis.
</para>
</sect2>
<sect2 id="testearly">
<title>Test Early and Often</title>
<para>
To a much greater extent than for user programs, kernel code needs to be
developed and tested in small pieces. Whenever you reach a stable
point in your kernel development, you should commit your work to version
control. We strongly recommend that you use <ulink url="http://www.cvshome.org/">CVS</ulink>
to store your code.
</para>
</sect2>
</sect1>
</chapter>
<!--
**********************************************************************
Overview of GeekOS
**********************************************************************
-->
<chapter id="overview">
<title>Overview of GeekOS</title>
<para>
This chapter presents a high level overview of the GeekOS kernel and
the subsystems you will use when you add new functionality to GeekOS.
Once you have read this chapter, you can refer to <xref linkend="apiref"/>
for detailed descriptions of functions in the GeekOS kernel.
</para>
<sect1 id="memory">
<title>Memory</title>
<para>
The GeekOS kernel manages all memory in the system. Two types of memory
can be allocated.
</para>
<sect2 id="pageallocator">
<title>Page Allocator</title>
<para>
All of the memory in the system is divided into chunks called <emphasis>pages</emphasis>.
In the x86 architecture, the page size is 4K. A page is a unit of memory
that can be part of a virtual address space; this characteristic of
pages will come into play when you add virtual memory to GeekOS
(<xref linkend="vmproject"/>). For now, you can just think of pages
as small fixed size chunks of memory. Pages are allocated and freed using the
<literal>Alloc_Page()</literal> and <literal>Free_Page()</literal>
functions in the <filename>&lt;geekos/mem.h&gt;</filename> header file.
</para>
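<para>
Because pages have a fixed size of 4K, code that works with pages often
needs to round addresses to page boundaries or to work out how many
pages a given number of bytes occupies. The following sketch shows that
arithmetic. The names <literal>PAGE_SIZE_BYTES</literal>,
<literal>Page_Round_Down()</literal>, <literal>Page_Round_Up()</literal>,
and <literal>Pages_Needed()</literal> are made up for this example;
check the memory header for whatever helpers GeekOS actually provides.
</para>
<programlisting><![CDATA[
```c
#define PAGE_SIZE_BYTES 4096UL   /* 4K pages, as on the x86 */

/* Address of the start of the page containing addr. */
unsigned long Page_Round_Down(unsigned long addr)
{
    return addr & ~(PAGE_SIZE_BYTES - 1);
}

/* Smallest page-aligned address that is >= addr. */
unsigned long Page_Round_Up(unsigned long addr)
{
    return (addr + PAGE_SIZE_BYTES - 1) & ~(PAGE_SIZE_BYTES - 1);
}

/* Number of whole pages required to hold numBytes bytes. */
unsigned long Pages_Needed(unsigned long numBytes)
{
    return (numBytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```
]]></programlisting>
<para>
The masking trick works only because the page size is a power of two:
subtracting one yields a mask of the low 12 bits, which are the offset
of an address within its page.
</para>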
</sect2>
<sect2 id="heapallocator">
<title>Heap Allocator</title>
<para>
The heap allocator provides allocation of arbitrary-sized chunks of memory.
The <literal>Malloc()</literal> function allocates a chunk of memory,
and the <literal>Free()</literal> function releases a chunk of memory.
The prototypes for these functions are in the <filename>&lt;geekos/malloc.h&gt;</filename>
header file.
</para>
</sect2>
</sect1>
<sect1 id="intsandthreads">
<title>Interrupts and Threads</title>
<para>
<emphasis>Interrupts</emphasis> and <emphasis>threads</emphasis> are the
mechanisms used by GeekOS to divide CPU resources between the various tasks
the operating system performs. Understanding how interrupts and
threads interact is crucial to being able to add new functionality to
the kernel.
</para>
<sect2 id="interrupts">
<title>Interrupts</title>
<para>
<emphasis>Interrupts</emphasis> are used to inform the CPU of the
occurrence of an important event. The important characteristic
of an interrupt is that it causes control to transfer immediately
to an <emphasis>interrupt handler</emphasis>. Interrupt handlers are
simply C functions.
</para>
<para>
There are several kinds of interrupts:
<itemizedlist>
<listitem>
<para>
<emphasis>Exceptions</emphasis> indicate that the currently executing
thread performed an illegal action. They are a form of <emphasis>synchronous</emphasis>
interrupt, because they happen in a predictable manner. Examples include execution of
an invalid instruction and attempts to divide by zero. Exceptions
generally kill the thread which raised them, because there is no way
to recover from the exception.
</para>
</listitem>
<listitem>
<para>
<emphasis>Faults</emphasis>, like exceptions, are also synchronous. Unlike
exceptions, they are generally recoverable; the kernel can do some work
to remove the condition that caused the fault, and then allow the faulting
thread to continue executing. An example is a <emphasis>page fault</emphasis>,
which indicates that a page containing a referenced memory location is
not currently mapped into the address space. If the kernel can locate the
page and map it back into the address space, the faulting thread can continue
executing. You will learn more about page faults in <xref linkend="vmproject"/>.
</para>
</listitem>
<listitem>
<para>
<emphasis>Hardware interrupts</emphasis> are used by external hardware
devices to notify the CPU of an event. These interrupts are
<emphasis>asynchronous</emphasis>, because they are unpredictable.
In other words, an asynchronous interrupt can happen at any time.
Sometimes the kernel is in a state where it cannot immediately handle
an asynchronous interrupt. In this case, the kernel can temporarily
<emphasis>disable interrupts</emphasis> until it is ready to handle
them again. An example of a hardware interrupt is the
<emphasis>timer interrupt</emphasis>.
</para>
</listitem>
<listitem>
<para>
<emphasis>Software interrupts</emphasis> are used by <emphasis>user mode</emphasis>
processes to signal that they require attention from the kernel. The
only kind of software interrupt used in GeekOS is the
<emphasis>system call interrupt</emphasis>, which is used by processes
to request a service from the kernel. For example, system calls are
used by processes to open files, perform input and output, spawn new processes,
etc.
</para>
</listitem>
</itemizedlist>
</para>
<para>
When an interrupt handler has completed, it returns control to the
thread which was interrupted, at the exact instruction where the interrupt
occurred. For the most part, the original thread
resumes as if the interrupt never happened.
</para>
<para>
The occurrence of an interrupt can cause a <emphasis>thread context switch</emphasis>.
This fact has important consequences for code that modifies shared
kernel data structures, and will be described in detail in
<xref linkend="interactions"/>.
</para>
</sect2>
<sect2 id="threads">
<title>Threads</title>
<para>
Threads allow multiple tasks to share the CPU. In GeekOS, each thread
is represented by a <literal>Kernel_Thread</literal> object,
defined in <filename>&lt;geekos/kthread.h&gt;</filename>.
</para>
<para>
Threads are selected for execution by the <emphasis>scheduler</emphasis>.
At any given time, a single thread is executing. This is the
<emphasis>current thread</emphasis>, and a pointer to its <literal>Kernel_Thread</literal>
object is available in the <literal>g_currentThread</literal> global variable.
Threads that are ready to run, but not currently running, are placed
in the <emphasis>run queue</emphasis>. Threads that are waiting for
a specific event to occur are placed on a <emphasis>wait queue</emphasis>.
Both the run queue and wait queues are instances of the
<literal>Thread_Queue</literal> data structure, which is simply a linked
list of <literal>Kernel_Thread</literal> objects.
</para>
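<para>
To make the idea of run queues and wait queues as linked lists more
concrete, here is a deliberately simplified model. It is not the GeekOS
implementation: <literal>Model_Thread</literal>,
<literal>Model_Queue</literal>, and the three functions are hypothetical
stand-ins, and the real <literal>Kernel_Thread</literal> and
<literal>Thread_Queue</literal> types carry more state than is shown here.
</para>
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Simplified stand-in for a thread object (assumption for this sketch). */
struct Model_Thread {
    int id;
    struct Model_Thread *next;
};

/* Simplified stand-in for a thread queue: a singly linked FIFO list. */
struct Model_Queue {
    struct Model_Thread *head;
    struct Model_Thread *tail;
};

/* Append a thread to the back of a queue. */
void Model_Enqueue(struct Model_Queue *q, struct Model_Thread *t)
{
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

/* Remove and return the thread at the front of a queue, or NULL if empty. */
struct Model_Thread *Model_Dequeue(struct Model_Queue *q)
{
    struct Model_Thread *t = q->head;

    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
        t->next = NULL;
    }
    return t;
}

/* Make one waiting thread runnable: move it from a wait queue to the run queue. */
struct Model_Thread *Model_Wake_One(struct Model_Queue *waitQueue,
                                    struct Model_Queue *runQueue)
{
    struct Model_Thread *t = Model_Dequeue(waitQueue);

    if (t != NULL)
        Model_Enqueue(runQueue, t);
    return t;
}
```
]]></programlisting>
<para>
In the real kernel, operations like these modify scheduler data
structures and so must run with interrupts disabled; the model leaves
that out to keep the list manipulation visible.
</para>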
<para>
Some threads execute entirely in kernel mode. These threads are called
<emphasis>system threads</emphasis>. System threads are used to perform
demand-driven tasks within the kernel. For example, a system thread called
the <emphasis>reaper thread</emphasis> is used to free the resources of
threads which have exited. The floppy and IDE disk drives each use a
system thread to wait for I/O requests, perform them, and communicate
the results back to the requesting threads.
</para>
<para>
In contrast to system threads, <emphasis>processes</emphasis> spend most of their
time executing in <emphasis>user mode</emphasis>. Processes should be
familiar to you already; when you run an ordinary program in an operating
system like Linux or Windows, the system creates a process to execute
the program. Each process consists of a <emphasis>memory space</emphasis>
reserved for the exclusive use of the running program, as well as other
resources like files and semaphores.
</para>
<para>
In GeekOS, a process is simply a <literal>Kernel_Thread</literal> which has
a special data structure attached to it. This data structure is the
<literal>User_Context</literal>. It contains all of the memory and other
resources allocated to the process. Because processes are just ordinary
threads which have the capability of executing in user mode, they are
sometimes referred to in GeekOS as <emphasis>user threads</emphasis>.
</para>
<para>
Processes start out executing in user mode. However, interrupts
occurring while the process is executing in user mode cause the
process to switch back into kernel mode. When the interrupt handler
returns, the process resumes executing in user mode.
</para>
</sect2>
<sect2 id="threadsync">
<title>Thread Synchronization</title>
<para>
GeekOS provides a high level mechanism to synchronize threads:
<emphasis>mutexes</emphasis> and <emphasis>condition variables</emphasis>.
(The mutex and condition variable implementation in GeekOS is modeled
on the pthreads API, so if you have done any pthreads programming,
this section should seem very familiar.) Mutexes and condition
variables are defined in the <filename>&lt;geekos/synch.h&gt;</filename>
header file.
<important>
<para>
Mutexes and condition variables may only be used to synchronize threads.
It is not legal to access a mutex or condition variable from a
handler for an asynchronous interrupt.
</para>
</important>
</para>
<para>
<emphasis>Mutexes</emphasis> are used to guard <emphasis>critical sections</emphasis>.
A mutex ensures MUTual EXclusion within a critical section guarded by the
mutex; only one thread is allowed to hold a mutex at any given time.
If a thread tries to acquire a mutex that is already held by another
thread, it is suspended until the mutex is available.
</para>
<para>
Here is an example of a function that atomically adds a node to a list:
<programlisting>
#include &lt;geekos/synch.h&gt;

struct Mutex lock;
struct Node_List nodeList;

void Add_Node(struct Node *node) {
    Mutex_Lock(&amp;lock);
    Add_To_Back_Of_Node_List(&amp;nodeList, node);
    Mutex_Unlock(&amp;lock);
}
</programlisting>
</para>
<para>
<emphasis>Condition variables</emphasis> represent a condition that threads
can wait for. Each condition variable is associated with a mutex,
which must be held while accessing the condition variable and when inspecting
or modifying the program state associated with the condition.
</para>
<para>
Here is an elaboration of the earlier example that allows threads
to wait for a node to become available in the node list:
<programlisting>
#include &lt;geekos/synch.h&gt;

struct Mutex lock;
struct Condition nodeAvail;
struct Node_List nodeList;

void Add_Node(struct Node *node) {
    Mutex_Lock(&amp;lock);
    Add_To_Back_Of_Node_List(&amp;nodeList, node);
    Cond_Broadcast(&amp;nodeAvail);
    Mutex_Unlock(&amp;lock);
}

struct Node *Wait_For_Node(void) {
    struct Node *node;

    Mutex_Lock(&amp;lock);
    while (Is_Node_List_Empty(&amp;nodeList)) {
        /* Wait for another thread to call Add_Node() */
        Cond_Wait(&amp;nodeAvail, &amp;lock);
    }
    node = Remove_From_Front_Of_Node_List(&amp;nodeList);
    Mutex_Unlock(&amp;lock);

    return node;
}
</programlisting>
</para>
</sect2>
<sect2 id="interactions">
<title>Interactions between Interrupts and Threads</title>
<para>
The GeekOS kernel is <emphasis>preemptible</emphasis>. This means that,
in general, a <emphasis>thread context switch</emphasis> can occur at any time.
Choosing which thread to execute at a preemption point is the job
of the <emphasis>scheduler</emphasis>. In general, the scheduler will
choose the thread which has the highest <emphasis>priority</emphasis>
and is ready to execute.
</para>
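<para>
The scheduling rule just described, picking the ready thread with the
highest priority, can be sketched as a scan over a linked run queue.
This is only an illustration of the rule and not the GeekOS scheduler:
<literal>Sched_Thread</literal> and
<literal>Pick_Highest_Priority()</literal> are hypothetical names, and
the sketch assumes that a larger number means a higher priority and that
ties go to the thread nearest the front of the queue.
</para>
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Minimal stand-in for a runnable thread (assumption for this sketch). */
struct Sched_Thread {
    int priority;                 /* larger value = higher priority (assumed) */
    struct Sched_Thread *next;    /* next thread in the run queue */
};

/*
 * Return the highest-priority thread in the run queue, or NULL if the
 * queue is empty. On a tie, the thread closest to the front wins,
 * because only a strictly greater priority replaces the current best.
 */
struct Sched_Thread *Pick_Highest_Priority(struct Sched_Thread *runQueueHead)
{
    struct Sched_Thread *best = runQueueHead;
    struct Sched_Thread *t;

    for (t = runQueueHead; t != NULL; t = t->next) {
        if (t->priority > best->priority)
            best = t;
    }
    return best;
}
```
]]></programlisting>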
<para>
The main cause of asynchronous (involuntary) thread switches is the
timer interrupt, which the kernel uses to ensure that no single
thread can completely monopolize the CPU. However, other hardware interrupts
(such as the floppy disk interrupt) can also cause asynchronous thread switches.
Threads often need to modify data structures shared by other threads
and/or interrupt handler functions. If a thread switch were to occur
in the middle of an operation modifying a shared data structure, the data
structure could be left in an inconsistent state, leading to a
kernel crash or other unpredictable behavior.
</para>
<para>
Fortunately, it is easy to temporarily disable preemption by
<emphasis>disabling interrupts</emphasis>. This is done by calling
the <literal>Disable_Interrupts()</literal> function
(prototype in <filename>&lt;geekos/int.h&gt;</filename>).
After this function is called, the processor ignores
all external hardware interrupts. While interrupts are disabled,
the current thread is guaranteed to retain control of the CPU:
no other threads or interrupt handlers will execute.
When the thread is ready to re-enable preemption, it can call
<literal>Enable_Interrupts()</literal>.
</para>
<para>
There are a variety of situations when interrupts should be disabled.
Generally, disabling interrupts can be used to make any sequence
of instructions <emphasis>atomic</emphasis>; this means that the
entire sequence of instructions is guaranteed to complete as a unit,
without interruption.
</para>
<para>
The most important specific situation when interrupts should be disabled
is when a scheduler data structure is modified. A typical example is
putting the current thread on a <emphasis>wait queue</emphasis>. Here is an
example:
<programlisting>
/* Wait for an event */
Disable_Interrupts();
while (!eventHasOccurred) {
    Wait(&amp;waitQueue);
}
Enable_Interrupts();
</programlisting>
In this example, a thread is waiting for the occurrence of an asynchronous
event. Until the event occurs, it will suspend itself by waiting on
a wait queue. When the event occurs, the interrupt handler for the event
will set the <literal>eventHasOccurred</literal> flag and move the
thread from the wait queue to the run queue.
</para>
<para>
Consider what could happen if interrupts were <emphasis>not</emphasis>
disabled in the example above. First, the interrupt handler for the event
could run between the time <literal>eventHasOccurred</literal> is checked
and when the calling thread puts itself on the wait queue. This might
result in the thread waiting forever, even though the event it is waiting
for has already occurred. A second possibility is that a thread switch
might occur while the thread is adding itself to the wait queue.
The handler for the interrupt causing the thread switch will place the
current thread on the run queue, <emphasis>while the wait queue is in
a modified (inconsistent) state</emphasis>. Any code accessing the
wait queue (such as an interrupt handler or another thread)
might cause the system to crash.
</para>
<para>
Fortunately, (almost) all functions in GeekOS that need to be called with
interrupts disabled contain an assertion that will inform you immediately
if they are called with interrupts enabled. You will see many
places in the code that look like this:
<programlisting>
KASSERT(!Interrupts_Enabled());
</programlisting>
These statements indicate that interrupts must be disabled in order
for the code immediately following to execute.
Knowing when interrupts need to be disabled is a bit tricky
at first, but you will soon get the hang of it.
</para>
<para>
One final caveat: regions of code which explicitly disable interrupts
cannot be nested. Another way to put this is that it is illegal to
call <literal>Disable_Interrupts()</literal> when interrupts are
already disabled, and it is illegal to call <literal>Enable_Interrupts()</literal>
when interrupts are already enabled. Consider the following code:
<programlisting>
void f(void) {
    Disable_Interrupts();
    g();
    ... modify shared data structures ...
    Enable_Interrupts();
}

void g(void) {
    Disable_Interrupts();
    ...
    Enable_Interrupts();
}
</programlisting>
If this code were allowed to execute, it would contain a bug: the function
<literal>g()</literal> would return with interrupts enabled, so a context switch
in function <literal>f()</literal> could interrupt a modification to
a shared data structure, leaving it in an inconsistent state.
Fortunately, there is a simple way to write code that will selectively
disable interrupts if needed:
<programlisting>
bool iflag;
iflag = Begin_Int_Atomic();
... interrupts are disabled ...
End_Int_Atomic(iflag);
</programlisting>
Sections of code that use <literal>Begin_Int_Atomic()</literal> and
<literal>End_Int_Atomic()</literal> may be nested safely.
</para>
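<para>
To see why this save-and-restore pattern nests safely, consider the
following host-side simulation. It is a sketch of the idea only, under
the assumption that the begin function returns the previous interrupt
state and the end function restores it; the real GeekOS functions
manipulate the CPU's interrupt flag, whereas this model uses an ordinary
variable. All of the <literal>Sim_</literal> names are hypothetical.
</para>
<programlisting><![CDATA[
```c
#include <stdbool.h>

/* Simulated interrupt-enable state; on real hardware this is a CPU flag. */
static bool s_interruptsEnabled = true;

bool Sim_Interrupts_Enabled(void)
{
    return s_interruptsEnabled;
}

/* Disable interrupts and report whether they were enabled on entry. */
bool Sim_Begin_Int_Atomic(void)
{
    bool wasEnabled = s_interruptsEnabled;

    s_interruptsEnabled = false;
    return wasEnabled;
}

/* Restore the state saved by the matching Sim_Begin_Int_Atomic() call. */
void Sim_End_Int_Atomic(bool wasEnabled)
{
    s_interruptsEnabled = wasEnabled;
}

/*
 * Inner helper with its own atomic region. Returns the interrupt state
 * that its caller observes once the helper has finished.
 */
bool Sim_Inner(void)
{
    bool iflag = Sim_Begin_Int_Atomic();

    /* ... interrupts are disabled here ... */
    Sim_End_Int_Atomic(iflag);
    return Sim_Interrupts_Enabled();
}

/*
 * Outer function that calls the helper from inside its own atomic
 * region. Returns true if interrupts were still disabled after the
 * helper returned, which is the property that makes nesting safe.
 */
bool Sim_Outer(void)
{
    bool stillDisabled;
    bool iflag = Sim_Begin_Int_Atomic();

    stillDisabled = !Sim_Inner();
    Sim_End_Int_Atomic(iflag);
    return stillDisabled;
}
```
]]></programlisting>
<para>
The inner region restores the state it found (disabled) rather than
unconditionally enabling interrupts, so the outer region's modifications
remain protected until the outer region itself ends.
</para>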
</sect2>
</sect1>
<sect1 id="devices">
<title>Devices</title>
<para>
The GeekOS kernel contains <emphasis>device drivers</emphasis> for
several important hardware devices.
</para>
<sect2 id="screen">
<title>Text Screen</title>
<para>
The text screen provides support for displaying text. The screen
driver in GeekOS emulates a subset of VT100 and ANSI escape codes
for cursor movement and setting character attributes. Text screen
services are provided in the <filename>&lt;geekos/screen.h&gt;</filename>
header file.
</para>
<para>
The main function you will use in sending output to the screen is
the <literal>Print()</literal> function, which supports a subset of the
functionality of the standard C <literal>printf()</literal> function.
Low level screen output is provided by the <literal>Put_Char()</literal>
and <literal>Put_Buf()</literal> functions, which write a single character
and a sequence of characters to the screen, respectively.
</para>
</sect2>
<sect2 id="keyboard">
<title>Keyboard</title>
<para>
The keyboard device driver provides a high level interface to the
keyboard. It installs an interrupt handler for keyboard events,
and translates the low level key scan codes to higher level codes
that contain ASCII character codes for pressed keys, as well as
modifier information (whether shift, control, and/or alt are pressed).
The header for the keyboard services is <filename>&lt;geekos/keyboard.h&gt;</filename>.
</para>
<para>
Threads can wait for a key event by calling the <literal>Wait_For_Key()</literal>
function. Key codes are returned using the <literal>Keycode</literal> datatype,
which is a 16 bit unsigned integer. The low 10 bits of the keycode indicate
which physical key was pressed or released. Several flag bits are used:
<itemizedlist>
<listitem>
<para>
<literal>KEY_SPECIAL_FLAG</literal> is set for keys that do not
have an ASCII representation. For example, function keys and
cursor keys fall into this category. If this flag is <emphasis>not</emphasis>
set, then the low 8 bits of the key code contain the ASCII code.
</para>
</listitem>
<listitem>
<para>
<literal>KEY_KEYPAD_FLAG</literal> is set for keys on the numeric keypad.
</para>
</listitem>
<listitem>
<para>
<literal>KEY_SHIFT_FLAG</literal> is set if one of the shift keys is
currently pressed. For alphabetic keys, this will cause the ASCII code
to be upper case.
</para>
</listitem>
<listitem>
<para>
<literal>KEY_CTRL_FLAG</literal> and <literal>KEY_ALT_FLAG</literal>
are set to indicate that the control and alt keys are pressed,
respectively.
</para>
</listitem>
<listitem>
<para>
<literal>KEY_RELEASE_FLAG</literal> is set for key release events.
You will probably want to ignore key events that have this flag set.
</para>
</listitem>
</itemizedlist>
</para>
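<para>
The flag tests described above can be sketched as follows. The
<literal>EX_</literal> names and their bit positions are hypothetical,
chosen only so that the example compiles on its own. Real code should use
the definitions from <filename>&lt;geekos/keyboard.h&gt;</filename>.
<programlisting><![CDATA[
#include <stdint.h>

typedef uint16_t Ex_Keycode;    /* 16 bit unsigned key code, as described above */

/* Hypothetical bit assignments, for illustration only. */
#define EX_KEY_SPECIAL_FLAG 0x0100u
#define EX_KEY_KEYPAD_FLAG  0x0200u
#define EX_KEY_SHIFT_FLAG   0x1000u
#define EX_KEY_ALT_FLAG     0x2000u
#define EX_KEY_CTRL_FLAG    0x4000u
#define EX_KEY_RELEASE_FLAG 0x8000u

/*
 * Return the ASCII code carried by a key event, or -1 if the event is a
 * key release or a special key with no ASCII representation.
 */
int Ex_Keycode_To_Ascii(Ex_Keycode code) {
    if (code & EX_KEY_RELEASE_FLAG) {
        return -1;              /* ignore key release events */
    }
    if (code & EX_KEY_SPECIAL_FLAG) {
        return -1;              /* function keys, cursor keys, ... */
    }
    return code & 0xFF;         /* low 8 bits hold the ASCII code */
}

/* Return 1 if the control key was pressed when the event occurred, else 0. */
int Ex_Ctrl_Pressed(Ex_Keycode code) {
    return (code & EX_KEY_CTRL_FLAG) != 0;
}
]]></programlisting>
A thread that echoes typed characters would call a helper like
<literal>Ex_Keycode_To_Ascii()</literal> on each key code it receives and
skip any event for which the result is negative.
</para>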
</sect2>
<sect2 id="timer">
<title>System Timer</title>
<para>
The system timer is used to provide a periodic timeslice interrupt.
After the current thread has been interrupted a set number of times
by the timer interrupt, the scheduler is invoked to choose a new thread.
You will generally not need to use any timer services directly.
However, it is worth noting that the system timer is the mechanism used
to ensure that all threads have a chance to execute by preventing any
single thread from monopolizing the CPU.
</para>
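<para>
The timeslice mechanism can be pictured with a small model. This is only a
sketch: the names <literal>Ex_Thread</literal>,
<literal>Ex_Timer_Tick()</literal>, <literal>Ex_Start_Timeslice()</literal>,
and <literal>exNeedReschedule</literal>, as well as the quantum of 4 ticks,
are assumptions made for the example and are not taken from the GeekOS
source.
<programlisting><![CDATA[
#include <stdbool.h>

#define EX_QUANTUM 4            /* hypothetical timeslice length, in ticks */

struct Ex_Thread {
    int numTicks;               /* timer ticks received in this timeslice */
};

bool exNeedReschedule = false;  /* set when the scheduler should run */

/* Called once per timer interrupt on behalf of the running thread. */
void Ex_Timer_Tick(struct Ex_Thread *current) {
    ++current->numTicks;
    if (current->numTicks >= EX_QUANTUM) {
        exNeedReschedule = true;    /* the thread has used up its timeslice */
    }
}

/* Called when the scheduler selects a thread to run next. */
void Ex_Start_Timeslice(struct Ex_Thread *next) {
    next->numTicks = 0;
    exNeedReschedule = false;
}
]]></programlisting>
Because the tick counter is reset whenever a thread is selected to run, no
thread in this model can run for more than <literal>EX_QUANTUM</literal>
consecutive ticks before the scheduler is asked to choose again.
</para>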
</sect2>
<sect2 id="blockdevs">
<title>Block Devices: Floppy and IDE Disks</title>
</sect2>
<para>
<emphasis>Block devices</emphasis> are the abstraction used to represent
storage devices such as disks. They are called block devices because they
are organized as a sequence of fixed size blocks called <emphasis>sectors</emphasis>.
Block device services are defined in the <filename>&lt;geekos/blockdev.h&gt;</filename>
header file.
</para>
<para>
Although real block devices often have varying sector sizes, GeekOS makes
the simplifying assumption that all block devices have a fixed sector size
of 512 bytes. This is the value defined by the <literal>SECTOR_SIZE</literal>
macro.
</para>
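<para>
A fixed sector size makes it simple to convert between byte counts and
sector numbers. The helpers below are a sketch of that arithmetic. The
function names are hypothetical, and <literal>SECTOR_SIZE</literal> is
redefined locally so that the example stands alone.
<programlisting><![CDATA[
#define SECTOR_SIZE 512     /* fixed sector size stated above */

/* Index of the sector that contains the given byte offset. */
unsigned long Ex_Offset_To_Sector(unsigned long byteOffset) {
    return byteOffset / SECTOR_SIZE;
}

/* Number of whole sectors needed to hold numBytes bytes (rounds up). */
unsigned long Ex_Sectors_Needed(unsigned long numBytes) {
    return (numBytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}
]]></programlisting>
For example, byte offset 1300 lies in sector 2 (the third sector), and a
513 byte file occupies two sectors.
</para>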
<para>
GeekOS supports two kinds of block devices: <emphasis>floppy drives</emphasis>
and <emphasis>IDE disk drives</emphasis>. A naming scheme is used to
identify particular drives attached to the system:
<itemizedlist>
<listitem><para><literal>fd0</literal>: the first floppy drive</para></listitem>
<listitem><para><literal>ide0</literal>: the first IDE disk drive</para></listitem>
<listitem><para><literal>ide1</literal>: the second IDE disk drive</para></listitem>
</itemizedlist>
A particular instance of a block device is represented in the kernel by
the <literal>Block_Device</literal> data structure. You can retrieve a
block device object by passing its name to the <literal>Open_Block_Device()</literal>
function. Once you have the block device object, you can read and write
particular sectors using the <literal>Block_Read()</literal> and
<literal>Block_Write()</literal> functions.
</para>
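<para>
To make the sector-at-a-time access pattern concrete, here is a small
in-memory model of a block device. It is not the GeekOS
<literal>Block_Device</literal> interface: the type
<literal>Ex_Ram_Disk</literal>, the functions
<literal>Ex_Ram_Read()</literal> and <literal>Ex_Ram_Write()</literal>,
and the device size of 8 sectors are all hypothetical, and the signatures
of the real functions should be taken from
<filename>&lt;geekos/blockdev.h&gt;</filename>.
<programlisting><![CDATA[
#include <string.h>

#define SECTOR_SIZE    512
#define EX_NUM_SECTORS 8        /* hypothetical device size */

/* In-memory stand-in for a block device. */
struct Ex_Ram_Disk {
    unsigned char data[EX_NUM_SECTORS][SECTOR_SIZE];
};

/* Copy one whole sector into buf. Returns 0 on success, -1 on a bad sector. */
int Ex_Ram_Read(struct Ex_Ram_Disk *dev, int sector, void *buf) {
    if (sector < 0 || sector >= EX_NUM_SECTORS) {
        return -1;
    }
    memcpy(buf, dev->data[sector], SECTOR_SIZE);
    return 0;
}

/* Copy one whole sector from buf. Returns 0 on success, -1 on a bad sector. */
int Ex_Ram_Write(struct Ex_Ram_Disk *dev, int sector, const void *buf) {
    if (sector < 0 || sector >= EX_NUM_SECTORS) {
        return -1;
    }
    memcpy(dev->data[sector], buf, SECTOR_SIZE);
    return 0;
}
]]></programlisting>
The point of the model is that every transfer moves exactly one sector, so
callers always supply a buffer of <literal>SECTOR_SIZE</literal> bytes and
address the device by sector number rather than by byte offset.
</para>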
<para>
Block devices are used as the underlying storage for <emphasis>filesystems</emphasis>.
A filesystem builds higher level abstractions (files and directories)
on top of the raw block storage offered by block devices. In
<xref linkend="fsproject"/>, you will implement a filesystem using
an underlying block device.
</para>
</sect1>
</chapter>
<!--
**********************************************************************
Overview of the Projects
**********************************************************************
-->
<chapter id="projectoverview">
<title>Overview of the Projects</title>
<para>
This chapter gives a brief overview of the projects in which you will add
important new functionality to the GeekOS kernel. It also discusses
all of the requirements for compiling and running GeekOS.
</para>
<sect1 id="projectsbrief">
<title>Project Descriptions</title>
<para>
There are a total of seven projects. For the most part, each project builds
on the one before it, so you will need to do them in order.
The projects were originally developed as part of a senior
level undergraduate operating systems course, so
working through the full sequence will require a significant amount of
effort. Some projects will be more
difficult than others; in particular, the virtual memory project
(<xref linkend="vmproject"/>) and the filesystem project (<xref linkend="fsproject"/>)
require you to write a fairly large amount of code (several hundred to
a thousand lines of code for each project). The good news is that when
you complete the last project, GeekOS will be a functional operating
system, capable of running multiple processes with full memory protection.
</para>
<para>
Project 0 (<xref linkend="introproject"/>) serves as an introduction to
modifying, building, and running GeekOS. You will add a kernel
thread to read keys from the keyboard and echo them to the screen.
</para>
<para>
For Project 1 (<xref linkend="elfparsingproject"/>), you will become
familiar with the structure of an executable file. You are provided
with code for loading and running executable files, but you must
first learn the ELF file format, then write code to
parse the provided file and pass it to the loader.
</para>
<para>
In Project 2 (<xref linkend="usermodeproject"/>), you will add
support for <emphasis>user mode processes</emphasis>. Rather than using
virtual memory to provide separate user address spaces, this project
uses <emphasis>segmentation</emphasis>, which is simpler to understand.
</para>
<para>
Project 3 (<xref linkend="schedulingproject"/>) improves the GeekOS scheduler
and adds support for semaphores to coordinate multiple processes.
</para>
<para>
Project 4 (<xref linkend="vmproject"/>) replaces the segmentation based
user memory protection added in Project 2 with paged virtual memory.
Pages of memory can be stored on disk in order to free up RAM when
the demand for memory exceeds the amount available.
</para>
<para>
Project 5 (<xref linkend="fsproject"/>) adds a hierarchical read/write
filesystem to GeekOS.
</para>
<para>
Project 6 (<xref linkend="ipcproject"/>) adds access control lists
(ACLs) to the filesystem, and adds interprocess communication using
anonymous half-duplex pipes. Upon the completion of this project, GeekOS will
resemble a very simple version of Unix.
</para>
</sect1>
<sect1 id="requiredsoftware">
<title>Required Software</title>
<para>
Compiling GeekOS requires a number of tools. The good news is that if you