Commit ad79057

Add bunch of random stuff
1 parent 9e575ac commit ad79057

4 files changed, +234 -0 lines changed

README.md

+220
@@ -0,0 +1,220 @@
# Tiny ELF loader

## Introduction

This is a failed submission for IOCCC 2013.

This is a tiny dynamic linker/loader for ELF. It loads programs
built on Linux, and it runs on Linux, Mac OSX, Cygwin, and possibly on
other OSes, thanks to its Linux/glibc emulation layer. This means you
can run Linux programs on other OSes. It works only on x86. In this
respect, this program is similar to WINE.

## Usage

Note that all commands below assume you have all the files of my
submission in the current directory.

### Usage for Linux and Mac OSX

    $ make
    $ ./elf bin/hello
    $ ./elf bin/i386-tcc-32  # help is shown

To compile something with TinyCC (http://tinycc.org/), which is based
on a former IOCCC winning entry (http://www0.us.ioccc.org/2001/bellard.c),
you need to set up your environment with mkenv.sh. This script downloads
two Debian packages and extracts them to set up a "linux" directory, which
contains include files and object files for TinyCC. This script
requires curl, ar, tar, and perl.

    $ ./mkenv.sh
    $ ./elf bin/i386-tcc-32 -E ./hello.c
    $ ./elf bin/i386-tcc-32 ./hello.c -o hello-tcc
    $ ./elf ./hello-tcc

You can compile more complex programs with the TCC loaded by this ELF
loader. For example, let's compile the source code of TCC itself.

    $ curl -L -O http://download.savannah.gnu.org/releases/tinycc/tcc-0.9.26.tar.bz2  # or wget
    $ tar -xvjf tcc-0.9.26.tar.bz2
    $ cd tcc-0.9.26
    $ ./configure
    $ cd ..
    $ ./elf bin/i386-tcc-32 -o i386-tcc-32-tcc tcc-0.9.26/tcc.c -DONE_SOURCE -DTCC_TARGET_I386 -DCONFIG_SYSROOT='"linux"' -DCONFIG_TCCDIR='"linux/tcc"' -g -O2 -m32 -lm -ldl
    $ ./elf ./i386-tcc-32-tcc

Of course, you can load the TCC built by the original TCC and use it
to build TCC once more.

    $ ./elf ./i386-tcc-32-tcc -o i386-tcc-32-tcc-tcc tcc-0.9.26/tcc.c -DONE_SOURCE -DTCC_TARGET_I386 -DCONFIG_SYSROOT='"linux"' -DCONFIG_TCCDIR='"linux/tcc"' -g -O2 -m32 -lm -ldl
    $ ./elf ./i386-tcc-32-tcc-tcc  # this still works

### Usage for Cygwin

See also the usage for Linux and Mac OSX above.

Unfortunately, Cygwin does not support MAP_FIXED for 4k boundaries, so
we need to use special Linux binaries whose segments are aligned to
64k boundaries.

    (cygwin) $ tar -xvzf for_cygwin.tgz
    (cygwin) $ make
    (cygwin) $ ./elf bin/hello-aligned
    (cygwin) $ ./elf bin/i386-tcc-32-aligned  # help is shown

You can build Linux binaries with i386-tcc-32-aligned, but you cannot
run the output on Cygwin because it is not aligned properly. However,
you can run the output on Linux.

    (cygwin) $ ./mkenv.sh
    (cygwin) $ ./elf bin/i386-tcc-32-aligned -E ./hello.c
    (cygwin) $ ./elf bin/i386-tcc-32-aligned ./hello.c -o hello-tcc-win
    (cygwin) $ ./elf ./hello-tcc-win  # mmap fails
    (linux) $ ./hello-tcc-win  # works

You can reproduce the -aligned binaries by using align.lds.

    (linux) $ gcc -m32 hello.c -Wl,-Talign.lds -o hello-aligned

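To illustrate the alignment constraint described in this section, here is a minimal sketch in C. The helper `fixed_mapping_ok` and the sample addresses are hypothetical and are not part of elf.c. The sketch assumes that a host accepts a fixed mapping only when both the page-rounded segment address and the matching file offset are multiples of its mapping granularity.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from elf.c.
 * Rounds the segment address down to its 4k page, then reports whether
 * that address and the matching file offset are both multiples of `gran`
 * (a power of two such as 4096 or 65536). */
static int fixed_mapping_ok(uint32_t vaddr, uint32_t offset, uint32_t gran) {
    uint32_t in_page = vaddr & 4095u;      /* position of vaddr inside its 4k page */
    uint32_t base = vaddr - in_page;       /* address that would be passed to mmap */
    uint32_t file_off = offset - in_page;  /* file offset that goes with that address */
    return (base & (gran - 1)) == 0 && (file_off & (gran - 1)) == 0;
}
```

Under these assumptions, a segment at the sample address 0x08048000 passes with a granularity of 4096 but fails with 65536, which is the situation the -aligned binaries are meant to avoid.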
### Chain load

You can load this loader itself.

    $ ./elf bin/elf-linux bin/hello
    $ ./elf bin/elf-linux bin/i386-tcc-32

For Cygwin, please use hello-aligned and i386-tcc-32-aligned instead.

Of course, on Linux and Mac, you can still run programs built by a TCC
that is chain-loaded by this loader, which is itself loaded by this
loader.

    $ ./elf bin/elf-linux bin/i386-tcc-32 ./hello.c -o hello-tcc
    $ ./elf bin/elf-linux ./hello-tcc

You can reproduce elf-linux with

    (linux) $ gcc -m32 -g -Wall -W elf.o -rdynamic -ldl -Wl,-Ttext-segment=0x3000000 -Wl,-Talign.lds -o elf-linux

As you can see, the start address of elf-linux was adjusted for Linux,
and the alignment of elf-linux was adjusted for Cygwin.

Note that you cannot load elf-linux twice, because the address layout
of elf-linux is fixed.

    $ ./elf bin/elf-linux bin/elf-linux  # fails

### Add Linux only APIs
107+
108+
This loader cannot run arbitrary Linux binaries on other OSes mainly
109+
because its Linux emulation layer lacks a lot of functions. However,
110+
you can easily add such functions. For example, see the following
111+
C code:
112+
113+
#include <stdio.h>
114+
#include <string.h>
115+
int main() {
116+
char buf[] = "hello";
117+
memfrob(buf, 5);
118+
puts(buf);
119+
return 0;
120+
}
121+
122+
This code uses memfrob, which is a glibc-only function, and this will
123+
not work on Mac or Cygwin.
124+
125+
$ ./elf ./memfrob # linux only
126+
127+
However, by providing the implementation of memfrob in elf.c, you can
128+
run this program on Mac or Cygwin. Please add the following code at
129+
the bottom of elf.c:
130+
131+
void* memfrob(void* v, size_t n) {
132+
char* p = (char*)v;
133+
while (n--) {
134+
*p++ ^= 42;
135+
}
136+
return v;
137+
}
138+
139+
$ make
140+
$ ./elf ./memfrob # now it works on everywhere!
141+
142+
## Obfuscation techniques
143+
144+
### ASCII arts
145+
146+
The code itself provides some ideas about what code does. The first
147+
three letters, 'E', 'L', and 'F', are just some preprocessor
148+
directives and some data. The face of elf is Linux emulation layer.
149+
150+
Then, the next box which has four cells explains how ELF objects look
151+
like. An ELF object always starts with an ELF header. The code around
152+
the first cell actually parses the ELF headers. Notice the string in
153+
the cell ("ELF Header") is used as a part of the error message.
154+
155+
$ ./elf hello.c # not ELF Header
156+
157+
Then, multiple program headers follow. You see the following for-loop
158+
at the top of the second cell.
159+
160+
for(K=E+=13;K<E+E[-2]%65536*8;K+=8){
161+
162+
This is the loop which handles program headers. Then, next line starts
163+
with
164+
165+
if(*K==1)
166+
167+
The code in this if-clause handles PT_LOAD (==1).
168+
169+
At the top of the 4th cell, you will see
170+
171+
if(*K==2)
172+
173+
The code after this handles PT_DYNAMIC (==2).
174+
175+
### Compactness
176+
177+
Another notable characteristic of this code is its
178+
compactness. elf-tiny.c is the compressed version of this program,
179+
which has no error checks. elf-tiny.c has only less than 1000
180+
bytes. I'd claim this is the tiniest ELF loader in the world, but it
181+
just works on multiple OSes:
182+
183+
$ make elf-tiny
184+
$ ./elf-tiny ./hello
185+
186+
To achieve this extreme compactness, a number of techniques are
187+
used. One good example is
188+
189+
1[(I*)O]
190+
191+
at the top of the 4th cell. This <index>[<array>] style is well known
192+
obfuscation technique, but this code uses this style because this is
193+
shorter than
194+
195+
((I*)O)[1]
196+
197+
or
198+
199+
*((I*)O+1)
200+
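The three spellings are interchangeable because C defines `a[b]` as `*(a+b)`, and addition is commutative. A small sketch to check this, using hypothetical helper names and plain `int` in place of the `I` macro:

```c
/* All three helpers read the second int stored at o.
 * The character counts refer to the spellings with the I macro. */
static int second_int_short(char *o) { return 1[(int *)o]; }      /* 1[(I*)O]: 8 characters */
static int second_int_index(char *o) { return ((int *)o)[1]; }    /* ((I*)O)[1]: 10 characters */
static int second_int_deref(char *o) { return *((int *)o + 1); }  /* *((I*)O+1): 10 characters */
```

Given a buffer that holds two ints, all three helpers should return the second one.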
Another example is magic numbers like 7417633*159. This is 0x464c457f,
which is the ELF magic ("\x7fELF") read as a little-endian 32-bit
integer.

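The arithmetic can be checked with a short sketch. The function name `elf_magic_le` is hypothetical; it assembles the four magic bytes in little-endian order so the result can be compared with the product.

```c
#include <stdint.h>

/* Builds the 32-bit value a little-endian machine sees when it reads
 * the bytes 0x7f 'E' 'L' 'F' as a single integer. */
static uint32_t elf_magic_le(void) {
    const unsigned char m[4] = {0x7f, 'E', 'L', 'F'};
    return (uint32_t)m[0]
         | (uint32_t)m[1] << 8
         | (uint32_t)m[2] << 16
         | (uint32_t)m[3] << 24;
}
```

Both `elf_magic_le()` and `7417633u * 159u` evaluate to 0x464c457f (1179403647 in decimal).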
Finally, the following code snippet is one of my favorites in this
program:

    O=strstr(T,H=*((char**)D[6]+M/256*4)+D[5]),G=O?U[(O-T)/6]:Y(0,H)

This obtains the address of a symbol from its name. Do you see how it
works? Why is strstr for T necessary?

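One way to see what the strstr call contributes is the sketch below. It copies the string T from elf-tiny.c and uses a hypothetical helper, `override_slot`, to show how a symbol name can map to a slot in the table U. It illustrates the indexing trick only and is not the loader's actual code path.

```c
#include <string.h>

/* Copied from elf-tiny.c: names the loader supplies itself, separated by '!'. */
static const char T[] = "stdout!dlsym!lseek!stderr!mmap!__libc_start_main";

/* Hypothetical helper: returns the slot in the override table (U in
 * elf-tiny.c) for `name`, or -1 when the name does not occur in T. */
static int override_slot(const char *name) {
    const char *hit = strstr(T, name);
    return hit ? (int)((hit - T) / 6) : -1;
}
```

The names in T begin at offsets 0, 7, 13, 19, 26, and 31, so dividing by 6 yields 0 through 5, matching the six entries of U in elf-tiny.c. A name that is not found in T, such as "puts", falls through to the dlsym wrapper Y in the original expression. Note that strstr matches substrings, so this sketch, like the original expression, relies on the looked-up names not being fragments of T.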
## Philosophy

This entry focuses on a fairly overlooked tool, the dynamic linking
loader. I wanted to show how compact the code of a dynamic loader can
be by implementing the loader in less than 1000 bytes. Another goal of
this entry was to show how useful dynamic loaders can be, by
allowing users to run Linux binaries on another OS.

One more thing this entry demonstrates is the portability of x86.

bin/elf-linux

71.8 KB
Binary file not shown.

elf-tiny.c

+9
@@ -0,0 +1,9 @@
#define _GNU_SOURCE
#include<dlfcn.h>
#include<stdio.h>
#include<string.h>
#include<sys/mman.h>
#include<unistd.h>
#define I int
#define A return
I*E,i,k,F,*K,L,J,M,*V,*G,D[99],Q,N,exit();char*O,t,*H,**P,T[]="stdout!dlsym!lseek!stderr!mmap!__libc_start_main";I Z(I(*m)()){exit(m(Q,P,0));}I*__rawmemchr(I*s,I c){A memchr(s,c,1<<30);}I X(I f,I o,I w){A lseek(f,o,w);}I*Y(I*h,char*p){A dlsym(h?h:RTLD_DEFAULT,p);}I*W(void*a,size_t l,I p,I f,I d,I o){A mmap(a,l,p,f&31|(f&32?MAP_ANON:0),d,o);}void*U[]={0,Y,X,0,W,Z};I main(I S,char**R){*U=&stdout;U[3]=&stderr;Q=S-1;P=R+1;F=fileno(fopen(*P,"r"));E=W(0,N=4095,7,S=MAP_PRIVATE,F,0);for(K=E+=13;K<E+E[-2]%65536*8;K+=8){O=(char*)K[2];if(*K==1)O-=M=K[2]&N,W(O,J=(K[5]+M+N)&~N,7,S|MAP_FIXED,F,K[1]-M),L=K[4]+M,memset(O+L,0,J-L);if(*K==2){for(;*O;O+=8)D[*O&63]=1[(I*)O];for(i=-D[2];i<D[18];i+=8)t=M=1[V=(I*)((i<0?D[23]+D[2]:D[17])+i)],V=(I*)*V,O=strstr(T,H=*((char**)D[6]+M/256*4)+D[5]),G=O?U[(O-T)/6]:Y(0,H),*V=t-5?(t<7)**V+t%2*(I)G:*G;}}((I(*)())E[-7])();}

hello.c

+5
@@ -0,0 +1,5 @@
#include <stdio.h>
int main() {
  puts("Hello, world!");
  return 0;
}
