...making Linux just a little more fun!
When was the last time you compiled a Linux kernel?
Yesterday? Last week? Five minutes ago?
On a 486?
I don't remember either.
Remember how long it took?
I remember that. Too long. Too damn long.
Now why would I want to compile the latest kernel on a 486?
Ordinarily, I wouldn't. But with the tragic death of my main computer I was forced to move my computing needs to an old 486 someone had given me. I had been using this one as an NTP time server for my home network. Suffice it to say, what was on the NTP server wasn't the latest and the greatest. The other computer on the network wasn't much of an improvement over the 486. (A foundling laptop with a minuscule hard drive.)
Well, I was screwed because I needed my Emacs. So I pulled the drive from the dead computer and hooked it up to the 486.
It worked flawlessly, which is a testament to the quality and efficiency of the Linux kernel and GNU software. I didn't really know what to expect regarding response and the general feel of the environment, but in console mode I noticed no real difference. The X Window System even worked fine, albeit slowly at start-up. Now, there was no way the GIMP or Mozilla was going to run with any kind of usability, but I could use Emacs and lynx or dillo without too many problems.
But did I really want to sit through something that was going to take a few hours at least? Not really. I guess I could have washed dishes, mowed the lawn or watched TV but, hey, TV sucks. I'd rather watch a kernel compile.
Enter the award-winning distcc, a distributed compiler front end for gcc, written by Martin Pool.
Distcc consists of two binary programs: distccd and distcc.
distccd runs as a daemon and handles network traffic. By passing pre-processed source code files across a network to other computers with an installed compiler, you effectively have two or more compilations going at once.
distcc is a front end to gcc and g++. You specify distcc as the compiler in place of gcc and it transparently handles all the magic that is going on. distcc can be used for all compile jobs whether you need the networking capabilities or not, i.e., you can compile one file or thousands; it's up to you.
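To make "passing pre-processed source code" concrete, here is a rough sketch of the same three stages done by hand on a single machine with plain gcc. It is an illustration only, not what distcc literally runs: the file names and the scratch directory are made up for the example, and distcc takes care of shipping the middle step over the network for you.

```shell
# Illustration only: the stages of a build, split the way distcc splits them.
# hello.c and the scratch directory are hypothetical names for this sketch.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1; then
    # Step 1 (local): run only the preprocessor. Headers are needed here.
    gcc -E "$workdir/hello.c" -o "$workdir/hello.i"
    # Step 2 (the part a remote machine can do): compile the self-contained
    # .i file into an object file. No headers are consulted at this point.
    gcc -c "$workdir/hello.i" -o "$workdir/hello.o"
    # Step 3 (local): link the object file into an executable and run it.
    gcc "$workdir/hello.o" -o "$workdir/hello"
    "$workdir/hello"
else
    echo "gcc not found; skipping the demonstration" >&2
fi
```

Because step 2 only ever sees the pre-processed file, the machine doing that step needs a compiler but none of the project's headers or libraries.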
The easiest way to demonstrate distcc's abilities is to use it to compile itself as an example of distributed compilation.
I'll show how to compile distcc and give my time for the initial compilation, then recompile using distcc in place of gcc.
Minimum Requirements:
Two compatible networked computers designated as a server and a client.
The server:
This machine should have a complete C/C++ development environment installed. You'll also need any other ancillary development packages (readline, ncurses, gtk+, whatever) that your particular bit of software needs for compiling. distcc itself requires nothing special.
Note: Building distcc also produces a couple of other programs: distccmon-text and distccmon-gnome. These are monitor programs that show you what's happening during a distcc compile session. The *-gnome version needs GTK at a minimum, but if you don't have it installed, don't worry.
The client:
This machine only needs the compilers installed. You do
not need libc, ncurses, kernel headers or the infinite array of
libraries things seem to need nowadays to compile.
The distcc source code is available from the distcc web site.
Building distcc, the first run:
Standard Operating Procedure:
$ tar -jxvf distcc*    (use the j flag, not z, with tar; distcc is bzip2ed)
$ cd distcc*/    (the trailing slash makes the glob match the directory, not the tarball)
$ ./configure
$ time make    (don't forget the time command)
distcc is small and doesn't require much time to build.
Here's the time from that aforementioned 486DX:
Without distcc
real 13m45.185s
user 12m4.320s
sys 1m7.120s
It took longer to run the configure script than it did to compile.
Install the binaries:
$ make install    (as root, if /usr/local isn't writable by your user)
distcc and distccd should now be in /usr/local/bin.
For the client machine:
Transfer a copy of distccd to /usr/local/bin or your binary repository of choice.
Now to use distcc to recompile distcc.
Make sure you are in the distcc source directory:
$ make clean
This will clean out all the crud leftover from the first compile. You won't need to run configure again.
We need to spend a couple of minutes setting up for distcc.
1. Run the distccd daemon on both computers.
$ distccd --daemon
It'll bitch about no distcc user. Ignore the warning.
You can check to see that it's actually running via "$ ps -ax | grep distccd" to assuage your concerns.
2. Set the DISTCC_HOSTS environment variable:
You can use IP addresses or, if your /etc/hosts file is set up properly, the hostnames of the computers.
I have two computers at the moment:
mothra on 192.168.1.2
ghidra on 192.168.1.3 (This one's a rescued 120MHz laptop. It would
be my main computer but it doesn't have the drive space I need.)
Set the variable (sh syntax, adjust for your shell):
$ export DISTCC_HOSTS="mothra ghidra"
or
$ export DISTCC_HOSTS="192.168.1.2 192.168.1.3"
Either way it doesn't matter.
NOTE: Names or addresses are space delimited.
Recompile the code:
$ time make -j4 CC=distcc
Explaining the command line:
time: should be obvious.
make -j4: the -j flag tells make how many commands (jobs) it may run at once. Read the info manual for more specific information. Trust me, just use -j4 for now.
CC=distcc: Overrides the compiler chosen at configure time. This way you can do a regular configure, with gcc defined in the makefile, and switch compilers only on the make command line. distcc is nice about not forcing complicated procedures to use it.
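If you're wondering where the 4 comes from: a rule of thumb I've seen in the distcc documentation is to set -j to roughly twice the number of CPUs available across your hosts. The sketch below applies that rule to the host list. The suggest_jobs helper is a made-up name for this example, and it assumes one CPU per host, which holds for my two machines but may not for yours.

```shell
# Hypothetical helper: suggest a make -j value from the host list.
# Assumes one CPU per host and the "about twice the CPU count" rule of thumb.
suggest_jobs() {
    # Word-splitting on whitespace mirrors how DISTCC_HOSTS is delimited.
    set -- $DISTCC_HOSTS
    echo $(( $# * 2 ))
}

DISTCC_HOSTS="mothra ghidra"
echo "make -j$(suggest_jobs) CC=distcc"
```

Treat the result as a starting point and adjust it by timing a few builds.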
distcc compiled with distcc
real 6m38.089s
user 2m42.200s
sys 0m29.520s
Cut the time in half! You can't complain about that.
The following shows times for some of my favorite programs compiled with and without distcc, utilizing the two-node setup described above.
Remember, the runs without distcc were compiled on the 486 alone.
Dillo Web Browser
      Without Distcc   With Distcc
real  52m14.120s       22m31.975s
user  47m24.820s       5m12.630s
sys   3m29.220s        1m23.930s
The BASH Shell
      Without Distcc   With Distcc
real  75m25.306s       18m22.613s
user  69m2.110s        3m27.950s
sys   5m8.030s         0m58.980s
This was the most amazing for me. This is 1/4 of the non-distcc compilation time!
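To put numbers on those claims, here is a small sketch that converts the "real" times reported above into seconds and divides one by the other. The to_seconds and speedup helpers are names invented for this example, and the sketch assumes times in the XmY.YYYs form that the time command printed for me.

```shell
# Convert a time-style value such as 13m45.185s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second run was than the first.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

echo "distcc: $(speedup 13m45.185s 6m38.089s)x"
echo "dillo:  $(speedup 52m14.120s 22m31.975s)x"
echo "bash:   $(speedup 75m25.306s 18m22.613s)x"
```

On the figures above this works out to roughly 2.1x for distcc, 2.3x for Dillo and 4.1x for BASH in wall-clock time.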
distcc is flexible. You can use it as a one-shot compiler or set up your build environment to use it for all compiles.
You can define the available compiler hosts in a $HOME/.distcc/hosts file.
You can force distcc to prefer one machine over another by the order in which you list them in .distcc/hosts or the DISTCC_HOSTS environment variable.
For example, rather than having my poor little 486 desktop grind down to an almost unusable state as gcc takes over the system, I set DISTCC_HOSTS='ghidra' and all the compilation is shipped to the faster laptop.
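If you'd rather keep that preference in the hosts file than in your environment, something along these lines should do it. This is a sketch under a couple of assumptions: write_hosts_file is a helper name I made up, the file uses the same space-delimited syntax as DISTCC_HOSTS, and the demonstration writes to a scratch directory so it won't clobber a real configuration. Point it at "$HOME/.distcc" once you're happy with it.

```shell
# Hypothetical helper: write a hosts file, most-preferred host first.
# $1 is the directory to write into (normally "$HOME/.distcc").
write_hosts_file() {
    dir=$1
    shift
    mkdir -p "$dir"
    # Same whitespace-separated syntax as DISTCC_HOSTS.
    echo "$*" > "$dir/hosts"
}

# Demonstrate against a scratch directory so no real configuration is touched.
scratch=$(mktemp -d)
write_hosts_file "$scratch/.distcc" ghidra mothra
cat "$scratch/.distcc/hosts"
```

As far as I can tell, a DISTCC_HOSTS setting in the environment takes priority over the file, so unset it if the file seems to be ignored.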
More documentation is at the distcc web site.
Oh, yeah - that kernel compile. How long did it take? I don't know. I said screw it, I'll just stick with the stock kernel from my Slackware install. Even with distcc it would take forever.
Maybe I'll bite the bullet at some point - but I think I'll just save up for that dual processor Athlon system I've been coveting.
V. L. Simpson, after being unceremoniously (and rather rudely)
informed that GNU Emacs is not an operating system, has been
re-adjusted to a happy, regular life after many protracted
sessions with 'the doctor'.
A webpage is available here.