https://github.com/cklin/mdm
The Middleman Project (mdm) aims to create utility programs that unleash the power of multi-processor and multi-core computer systems. It does so by helping you parallelize your shell scripts, Makefiles, or any other program that invokes external programs.
To run mdm, you need a modern (2.6+) Linux system, GNU screen, and the ncurses library. It should be easy to port to other Unix systems by writing new /proc parsers and fixing any library incompatibilities.
To build mdm, simply run make at the top level. This project is simple enough that there is no need for autoconf and automake. To install, use make install as follows:
$ make install PREFIX=/install/directory/prefix
Without the PREFIX override, make install installs mdm to /usr/local.
The philosophy behind mdm is that users should benefit from their multi-core systems without making drastic changes to their shell scripts. With mdm, you annotate your scripts to specify which commands might benefit from parallelization, and then you run the scripts under the supervision of the mdm system. At runtime, the mdm system dynamically discovers parallelization opportunities and runs the annotated commands in parallel as appropriate.
Suppose you use the following shell script (encode.sh) for encoding your music library. It works, but it leaves your quad-core computer mostly idle because it processes only one file at a time.
#!/bin/bash
for i in */*.wav
do echo $i
ffmpeg -i "$i" "${i%%.wav}.mp3"
done
You can parallelize this shell script in three easy steps.
- Find commands that you think are suitable for parallel execution, and annotate them with mdm-run. Here is the modified encode.sh:
#!/bin/bash
for i in */*.wav
do echo $i
mdm-run ffmpeg -i "$i" "${i%%.wav}.mp3"
done
- Specify the I/O behavior of your parallel commands in an iospec file. You know that ffmpeg reads from the argument of its -i option and writes to its non-option argument, so this is what you write in your iospec file:
ffmpeg R-i W
You can skip this step if you are certain the parallel command cannot interfere with any other command in the script.
- Run the script under mdm.screen as follows:
$ mdm.screen -c iospec encode.sh
You should see a monitoring program (mdm-top) displaying the execution status of your parallel commands, and the encoding process should (hopefully) complete a lot sooner because you are giving all processing cores a good workout!
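If the same script also has to run on machines where mdm is not installed, one option is a small wrapper that falls back to running the command directly. The following is only a sketch: the run function is a name made up for this example, and the fallback is not a feature of mdm itself.

```bash
#!/bin/bash
# Hypothetical helper (not part of mdm): use mdm-run when it is on PATH,
# otherwise run the command directly, one at a time.
run() {
    if command -v mdm-run >/dev/null 2>&1; then
        mdm-run "$@"
    else
        "$@"
    fi
}

shopt -s nullglob   # skip the loop entirely when no .wav files match
for i in */*.wav
do echo "$i"
   run ffmpeg -i "$i" "${i%%.wav}.mp3"
done
```

The iospec file is unchanged, because mdm-run still receives the ffmpeg command line as its arguments.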
The mdm-run command runs executable programs asynchronously. Therefore, there are a few cases where you should not annotate a command with mdm-run:
- The command is a shell built-in,
- You need to know the exit status of the command, or
- You perform I/O redirection on the command.
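The exit-status restriction can be illustrated with plain bash. In the sketch below, the shell's own & operator stands in for asynchronous submission; it is an analogy, not a demonstration of mdm-run, and the async_status function is a made-up name.

```bash
#!/bin/bash
# Analogy only: '&' stands in for asynchronous submission; this is not mdm-run.
async_status() {
    "$@" &              # launch in the background and return immediately
    echo $?             # status of the launch, not of the command itself
    wait "$!" || true   # reap the job so nothing is left running
}

async_status false      # prints 0 even though 'false' fails
```

A script that branches on the result of a command therefore needs that command to run synchronously, without the annotation.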
The I/O specification file (iospec) specifies the I/O behavior of programs. The mdm system uses these specifications to decide whether it is okay to run two annotated commands at the same time. Each line of the file describes a program. Here are a few examples:
ffmpeg R-i W
rm W
cc W-o 0-c Rbusy R
date Wbusy
In plain English:
- ffmpeg reads from the argument of its -i option and writes to all its non-option arguments,
- rm writes to all its non-option arguments,
- cc writes to the argument of its -o option, has a -c option that takes no argument, reads from the (abstract) file "busy", and reads from its non-option arguments, and
- date writes to the (abstract) file "busy".
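To make the reading of these lines concrete, here is a toy bash interpreter that applies one iospec line to one command line and prints the resources the command would read (R) and write (W). It is a sketch, not mdm's parser: the classify function is hypothetical, and its token rules (a letter followed by an option name, by a resource name, or by nothing) are inferred only from the four example lines above.

```bash
#!/bin/bash
# Toy interpreter for a single iospec line; NOT mdm's real parser.
# Usage: classify "SPEC LINE" program [args...]
# Prints one "R name" or "W name" line per resource the command touches.
classify() {
    local spec=$1 prog=$2
    shift 2
    local -a fields
    read -r -a fields <<<"$spec"
    [ "${fields[0]}" = "$prog" ] || return 1   # spec is for another program

    local f default="" mode
    for f in "${fields[@]:1}"; do
        case ${f:1} in
            "") default=${f:0:1} ;;            # bare R or W: non-option arguments
            -*) ;;                             # option rules are looked up below
            *)  echo "${f:0:1} ${f:1}" ;;      # Rbusy, Wbusy: abstract file
        esac
    done

    while [ $# -gt 0 ]; do
        case $1 in
            -*)
                mode=""
                for f in "${fields[@]:1}"; do
                    [ "${f:1}" = "$1" ] && mode=${f:0:1}
                done
                # R-x / W-x consume the next word; 0-x and unknown options do not.
                if [ "$mode" = R ] || [ "$mode" = W ]; then
                    shift
                    echo "$mode $1"
                fi
                ;;
            *)  [ -n "$default" ] && echo "$default $1" ;;
        esac
        shift
    done
    return 0
}

classify "cc W-o 0-c Rbusy R" cc -c foo.c -o foo.o
```

For the cc example, the sketch prints "R busy", "R foo.c", and "W foo.o", one per line. Treating options that the spec does not mention as taking no argument is an assumption of this sketch.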
Adding the abstract file "busy" to the iospec ensures that mdm will never schedule a date command to run while any cc command is still running (and vice versa).
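The behavior described above is consistent with the usual reader/writer rule: two commands interfere when one of them writes a resource that the other reads or writes. The bash sketch below encodes that rule as an assumption; this document does not spell out mdm's exact scheduling logic, and the conflicts function is a hypothetical name.

```bash
#!/bin/bash
# Reader/writer conflict check; an assumed rule, not mdm's actual code.
# Usage: conflicts ACCESSES_A ACCESSES_B
# Each argument is a newline-separated list of "R name" / "W name" lines.
# Returns 0 if some resource is written by one command and touched by the other.
conflicts() {
    local ma ra mb rb
    while read -r ma ra; do
        while read -r mb rb; do
            if [ "$ra" = "$rb" ] && { [ "$ma" = W ] || [ "$mb" = W ]; }; then
                return 0
            fi
        done <<<"$2"
    done <<<"$1"
    return 1
}

cc_a=$'R busy\nR a.c\nW a.o'
cc_b=$'R busy\nR b.c\nW b.o'
date_cmd='W busy'

conflicts "$cc_a" "$date_cmd" && echo "cc and date must not overlap"
conflicts "$cc_a" "$cc_b" || echo "two cc commands may run together"
```

Under this rule, any number of cc commands can share "busy" as readers, while a single date command, as the only writer, excludes them all.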
Beware that the iospec format is subject to change in the future.
The mdm-sync command is just like mdm-run, except that it does not submit the command for parallel execution. Use mdm-sync to annotate a command when you don't want it to run in parallel, but you think it might interfere with a command annotated by mdm-run.
Please report bugs through the GitHub mdm issue tracker.