Goldsmith
Thursday, July 15th, 2010
When hiring or being hired, if they fuck you during the application they will fuck you on the job.
Some more readable examples:
http://www.cs.ucl.ac.uk/staff/c.clack/phd.html
http://www.ccs.neu.edu/home/shivers/diss-advice.html
http://www.phys.unsw.edu.au/~jw/thesis.html
The irony is that so many pieces of advice on writing are so bloody unreadable.
Science is the search for fact.
Art is the search for meaning.
Engineering is the application of both.
Discuss.
I’m still fiddling around with benchmarks.
So, if you ever want to run Whetstone on Ubuntu you will need to:
1. Download the code. I got it from here.
2. Compile it thus:
gcc whets.c -o whets -O2 -fomit-frame-pointer -ffast-math -fforce-addr -fforce-mem -lm -DUNIX
3. Run it:
./whets
4. Wait about a minute and a half as the results roll in, then answer some questions to produce a result file.
This is what they look like on my aging 2nd desktop machine:
##############################################
Whetstone Single Precision Benchmark in C/C++
Date 2/1/2008 (DD/MM/YYYY)
Model Brand-x Pentium 3
CPU Pentium III (Coppermine)
Clock MHz 800 Mhz
Cache L1 16k L2 256k
H/W options
OS Ubuntu 6.06
Compiler gcc
Options -O2 -fomit-frame-pointer -ffast-math -fforce-addr -fforce-mem -lm -DUNIX
Loop content                      Result     MFLOPS       MOPS   Seconds
N1 floating point   -1.12441420555114746    241.012                0.561
N2 floating point   -1.12241148948669434    199.674                4.739
N3 if then else      1.00000000000000000              1151.431     0.633
N4 fixed point      12.00000000000000000              3837.803     0.578
N5 sin,cos etc.      0.49907428026199341                20.709    28.288
N6 floating point    0.99999988079071045    118.127               32.151
N7 assignments       3.00000000000000000               121.284    10.728
N8 exp,sqrt etc.     0.75095528364181519                11.484    22.809
MWIPS                                                  700.690   100.487
Results to load to spreadsheet   MWIPS  Mflops1  Mflops2  Mflops3  Cosmops  Expmops  Fixpmops    Ifmops   Eqmops
Results to load to spreadsheet 700.690  241.012  199.674  118.127   20.709   11.484  3837.803  1151.431  121.284
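As a quick sanity check on a result file, the per-loop times should add up to the total shown on the MWIPS line. A minimal sketch, with the eight Seconds values copied by hand from the run above (the function name sum_seconds is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: sum the Seconds values of the eight loops (N1..N8)
# from the run shown above and print the total to three decimal places.
sum_seconds() {
    printf '%s\n' 0.561 4.739 0.633 0.578 28.288 32.151 10.728 22.809 |
        awk '{ total += $1 } END { printf "%.3f\n", total }'
}

sum_seconds
```

The printed total should match the 100.487 seconds reported next to MWIPS.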
The following instructions follow on from my previous article concerning a cluster of Ubuntu machines and BOINC.
Like BOINC, Condor uses a client-server architecture, so I have instructions for a master machine and also for a bunch of slaves.
Master Machine
Download condor-6.8.7-linux-x86-rhel.tar.gz and uncompress it.
cd /data/condor-6.8.7
I have opted to make two directories, condor_root and condor_local:
mkdir condor_root
mkdir condor_local
Now run Condor’s configuration program and then set the CONDOR_CONFIG environment variable to point to the configuration file it generates (you might want to make it more permanent than that):
sudo ./condor_configure --install-dir=/data/condor-6.8.7/condor_root/ --type=manager,submit --local-dir=/data/condor-6.8.7/condor_local/ --owner=bcg --install=/data/condor-6.8.7/release.tar
export CONDOR_CONFIG=/data/condor-6.8.7/condor_root/etc/condor_config
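One way to make the variable more permanent is to append the export line to a shell startup file. The sketch below assumes a bash login shell and ~/.bashrc as the target; the function name persist_condor_config is made up for illustration and is not part of Condor.

```shell
#!/bin/sh
# Hypothetical helper: append the CONDOR_CONFIG export to a startup file,
# but only if that exact line is not already present.
persist_condor_config() {
    rc_file="$1"
    line='export CONDOR_CONFIG=/data/condor-6.8.7/condor_root/etc/condor_config'
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

Calling persist_condor_config "$HOME/.bashrc" more than once is harmless, since the line is only added the first time.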
Edit condor_root/etc/condor_config:
Set RELEASE_DIR to /data/condor-6.8.7/condor_root/
Set HOSTALLOW_WRITE to *
Set HOSTALLOW_ADMINISTRATOR = $(FULL_HOSTNAME)
Start Condor:
sudo condor_root/sbin/condor_master
Stop Condor:
sudo condor_root/sbin/condor_off -master
Check that it’s running:
ps -ef | egrep condor_
bcg@rhdl-a2:/data/condor-6.8.7$ ps -ef | egrep condor_
bcg 24421 1 0 12:25 ? 00:00:00 condor_root/sbin/condor_master
bcg 24422 24421 0 12:25 ? 00:00:00 condor_collector -f
bcg 24423 24421 0 12:25 ? 00:00:00 condor_negotiator -f
bcg 24424 24421 0 12:25 ? 00:00:00 condor_schedd -f
bcg 24425 24421 7 12:25 ? 00:00:07 condor_startd -f
bcg 24475 5431 0 12:27 pts/0 00:00:00 grep -E condor_
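If you would rather not eyeball the listing every time, a small filter can report which daemons are missing. This is only a sketch: the list of expected daemons is taken from the output above, and the function name missing_condor_daemons is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a process listing on stdin and print the name of
# every daemon from the listing above that does not appear in it.
missing_condor_daemons() {
    listing=$(cat)
    for daemon in condor_master condor_collector condor_negotiator condor_schedd condor_startd; do
        case "$listing" in
            *"$daemon"*) ;;                 # found somewhere in the listing
            *) printf '%s\n' "$daemon" ;;   # not found: report it
        esac
    done
}
```

Run it as ps -ef | missing_condor_daemons; no output means all five daemons were found.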
Create a job:
Put the following in a text file called mandelbrot16.condor (the filename can be anything really).
# file name: mandelbrot16.condor
# Condor submit description file for mandelbrot
Executable = /data/condor-6.8.7/mandelbrot16/mandelbrot
Universe = vanilla
Error = logs/err.$(cluster)
Output = logs/out.$(cluster)
Log = logs/log.$(cluster)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = files/mandelbrot_00000001
Arguments = mandelbrot_00000001 mandelbrot_00000001_out
Queue
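The submit file above queues a single input file. To queue one job per input file, a generator along these lines could write the submit description for you. This is a sketch under assumptions: the input files are taken to be numbered mandelbrot_00000001 upwards, the function name make_submit_file is made up, and $(process) is added to the error and output file names so that jobs in the same cluster do not overwrite each other.

```shell
#!/bin/sh
# Hypothetical generator: print a submit description that queues one job per
# input file, mandelbrot_00000001 up to the requested count.
make_submit_file() {
    count="$1"        # number of input files (assumed to be numbered from 1)
    executable="$2"   # full path to the mandelbrot executable
    cat <<EOF
Executable = $executable
Universe = vanilla
Error = logs/err.\$(cluster).\$(process)
Output = logs/out.\$(cluster).\$(process)
Log = logs/log.\$(cluster)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
EOF
    i=1
    while [ "$i" -le "$count" ]; do
        name=$(printf 'mandelbrot_%08d' "$i")
        # Each block of settings followed by Queue adds one job to the cluster
        printf 'transfer_input_files = files/%s\n' "$name"
        printf 'Arguments = %s %s_out\n' "$name" "$name"
        printf 'Queue\n'
        i=$((i + 1))
    done
}
```

For example, make_submit_file 16 /path/to/mandelbrot > mandelbrot16.condor would write a description with 16 Queue statements.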
Create a directory to put the job in:
mkdir mandelbrot16
cd mandelbrot16
mkdir logs
mkdir files
Copy the files that you want Condor to work with (such as the input files for the algorithm) into the files directory. Put the algorithm itself in the mandelbrot16 directory.
Submit a job
condor_root/bin/condor_submit mandelbrot16/mandelbrot16.condor
Check on jobs
All jobs:
condor_root/bin/condor_q
A job (say job id 3):
condor_root/bin/condor_q 3
Client Machine
Download condor-6.8.7-linux-x86-rhel.tar.gz and uncompress it.
cd /home/bcg/condor-6.8.7
Same kind of setup as the Master box here. Note that the type is execute only:
mkdir condor_root
mkdir condor_local
sudo ./condor_configure --install-dir=/home/bcg/condor-6.8.7/condor_root/ --type=execute --local-dir=/home/bcg/condor-6.8.7/condor_local/ --owner=bcg --install=/home/bcg/condor-6.8.7/release.tar
export CONDOR_CONFIG=/home/bcg/condor-6.8.7/condor_root/etc/condor_config
Edit condor-6.8.7/condor_root/etc/condor_config
Set UID_DOMAIN = $(FULL_HOSTNAME)
Set FILESYSTEM_DOMAIN=$(FULL_HOSTNAME)
Set HOSTALLOW_ADMINISTRATOR = $(FULL_HOSTNAME)
Set HOSTALLOW_WRITE to *
Edit condor-6.8.7/condor_local/condor_config.local. Set CONDOR_HOST to the IP address of your master machine. Set NETWORK_INTERFACE to the IP address of the client machine you are setting up.
Set CONDOR_HOST = 144.6.40.251
Set UID_DOMAIN and FILESYSTEM_DOMAIN to $(FULL_HOSTNAME)
Set NETWORK_INTERFACE = 144.6.40.115
These settings, in the same file, make the client work on jobs as quickly as possible and with as much effort as possible regardless of user actions on the client machine. Remember, these instructions are for a dedicated cluster; you might not want to do this with your desktop machine.
WANT_SUSPEND = FALSE
CONTINUE = TRUE
SUSPEND = FALSE
PREEMPT = FALSE
START=TRUE
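With a cluster of machines it can be easier to generate the local settings than to type them on every box. The sketch below prints the settings described above for one client; the function name write_local_config is made up for illustration, and the two IP addresses are passed in because they differ per setup.

```shell
#!/bin/sh
# Hypothetical helper: print the condor_config.local settings described above
# for one client machine.
write_local_config() {
    master_ip="$1"   # IP address of the master machine
    client_ip="$2"   # IP address of the client being set up
    # \$ keeps $(FULL_HOSTNAME) literal so that Condor expands it, not the shell
    cat <<EOF
CONDOR_HOST = $master_ip
NETWORK_INTERFACE = $client_ip
UID_DOMAIN = \$(FULL_HOSTNAME)
FILESYSTEM_DOMAIN = \$(FULL_HOSTNAME)
WANT_SUSPEND = FALSE
CONTINUE = TRUE
SUSPEND = FALSE
PREEMPT = FALSE
START = TRUE
EOF
}
```

For example, write_local_config 144.6.40.251 144.6.40.115 >> condor_local/condor_config.local appends the settings for the client used in this post.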
That’s it. Start them up on all the machines and wait for the computation to start. Condor can take several (5 to 10) minutes to get things underway, but once she starts she chugs through the work pretty quickly.
I am doing some research that applies peer-to-peer (P2P) techniques to distributed computing. One assertion central to my thesis is that a P2P swarm of processing nodes can perform as well as a client-server solution. To defend this claim, I need to pit my system against some existing distributed computing systems. BOINC is one of the candidates I have chosen to collect data from while running a set of benchmarks.
BOINC can be a bit tricky to set up, and it took me some time to get it running properly, so here are the notes I took along the way, which may help others embarking on a similar task.
There are a few shortcuts I take here because these machines are hosted on a private network. Someone wanting to set up a public BOINC project may not really want to omit some of the security steps that I have. Please consult the official BOINC wiki for security suggestions.
The following set of instructions shows how to get a non-BOINC executable running within the BOINC system and producing results. It uses the “wrapper” example from BOINC to achieve this. This is useful if you want to use an existing algorithm rather than write a “proper” BOINC algorithm with proper checkpointing, graphical stuff, etc.
The algorithm I have written is a Mandelbrot set generator that takes two command line arguments: (1) an input text file containing the settings for the part of the Mandelbrot set it is going to generate, and (2) the name of an output file for the resulting JPEG. These instructions also contain a lot of details specific to my setup, such as the IP address of my machine and the name of my user account. Obviously you’ll need to adjust these.
These instructions work on a server machine (a nothing-fancy P3 machine running Ubuntu 6.06) with a full complement of development tools and with Apache2, MySQL and PHP installed and working correctly.
Instructions:
Check out the code:
svn co http://boinc.berkeley.edu/svn/trunk/boinc
Revision 14015 was the latest revision at time of experimentation.
Build the code:
./_autosetup
./configure --disable-client
make
Notes on earlier versions of Ubuntu: You may need to upgrade automake etc to get it to configure properly. There are instructions elsewhere on this blog for upgrading automake to the latest version for older Ubuntu boxes.
Create the Project:
cd /data/source/boinc/tools/
./make_project --url_base http://144.6.40.251/ --db_host localhost --db_user root mandelbrot16
Note: url_base is just that: the base URL, under which the project is placed in a directory of its own. So the command above will produce a project URL of http://144.6.40.251/mandelbrot16/
Set up the Project Website:
sudo cp mandelbrot16.httpd.conf /etc/apache2/sites-available/
sudo ln -s /etc/apache2/sites-available/mandelbrot16.httpd.conf /etc/apache2/sites-enabled/mandelbrot16.httpd.conf
sudo apache2ctl restart
cd ~/projects/mandelbrot16
sudo chown www-data ~/projects/ -R
Note: remember, this is for Ubuntu. These instructions won’t work as written if your Linux distribution uses a different user or group name for Apache.
Enable Account Creation:
Edit config.xml in the ~/projects/mandelbrot16/ directory and set the disable_account_creation element to 0.
Edit the Project:
In this example, we’re just using Linux machines, so pare down project.xml to include only one target platform.
Also, add the name of the algorithm executable. In this case it’s mandelbrot.
<boinc>
<platform>
<name>i686-pc-linux-gnu</name>
<user_friendly_name>Linux running on an Intel x86-compatible CPU</user_friendly_name>
</platform>
<app>
<name>mandelbrot</name>
<user_friendly_name>mandelbrot16</user_friendly_name>
</app>
</boinc>
Edit config.xml:
Make sure the db_passwd element contains your db password
Get the code for Wrapper and Compile It:
You can’t just run any old executable in BOINC and expect it to work (I know, I tried). BOINC algorithms need to work with the BOINC framework and link to the BOINC libraries etc. To get around this, there is an example program called wrapper which allows you to run an arbitrary executable. This code is not within the main source tree; it’s in samples.
svn co http://boinc.berkeley.edu/svn/trunk/boinc_samples
cd boinc_samples/wrapper
ln -s `g++ -print-file-name=libstdc++.a`
make
cp boinc_samples/wrapper/wrapper ~/projects/mandelbrot16/apps/mandelbrot/wrapper_5.5_i686-pc-linux-gnu
Copy the Algorithm:
Also make sure to name your algorithm as per BOINC’s naming convention so that it matches the target platforms set out in the project.xml.
mkdir ~/projects/mandelbrot16/apps/mandelbrot/
cp /data/source/sp2p/experiments/mandelbrot/160/comptorrents/working/mandelbrot/mandelbrot ~/projects/mandelbrot16/apps/mandelbrot_5.5_i686-pc-linux-gnu
bin/xadd
bin/update_versions
Add Templates for Work Units and Results:
Put the following into a file called result_template in ~/projects/mandelbrot16/templates/
<file_info>
<name><OUTFILE_0/></name>
<generated_locally/>
<upload_when_present/>
<max_nbytes>100000000</max_nbytes>
<url><UPLOAD_URL/></url>
</file_info>
<result>
<file_ref>
<file_name><OUTFILE_0/></file_name>
<open_name>out</open_name>
<copy_file/>
</file_ref>
</result>
Put the following into a file called work_unit_template in ~/projects/mandelbrot16/templates/
<file_info>
<number>0</number>
</file_info>
<workunit>
<file_ref>
<file_number>0</file_number>
<open_name>in</open_name>
<copy_file/>
</file_ref>
<delay_bound>6000000</delay_bound>
<rsc_fpops_bound>9999999999999999999999999999999999999999999999999999</rsc_fpops_bound>
<rsc_fpops_est>9999999999999999999999999999999999999999999999999999</rsc_fpops_est>
</workunit>
Note: You may not want your algorithm to be allowed quite this long to execute. You may want to consult the BOINC documentation for the delay_bound, rsc_fpops_bound and rsc_fpops_est options and then make an informed judgement as to their appropriate values. As for me, I just want it to do the work and take as long as it wants. My situation here is that I have full control over all of the machines in my cluster. This is not the common situation with BOINC projects, so please adjust these variables for a “real” project.
Add Work Units
The mandelbrot executable takes 2 command line arguments: an infile and an outfile.
The infile stipulates the parameters for the mandelbrot set being generated.
The outfile is a resulting jpeg.
For this example, there are 16 work units, each a file containing the settings for the overall Mandelbrot set as well as the region of the set to be calculated (so it can be split up and computed over multiple machines).
Copy each of these files into the ~/projects/mandelbrot16/download directory.
Now, tell boinc about each one of these files by inserting each one of them as a work unit (using the templates from the previous step).
bin/create_work -appname mandelbrot -wu_name mandelbrot_00000001 -wu_template templates/work_unit_template -result_template templates/result_template -min_quorum 1 -target_nresults 1 mandelbrot_00000001
Repeat this, changing the file and work unit name for each data set file, e.g. mandelbrot_00000002, mandelbrot_00000003 and so on.
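Typing the command once per work unit gets tedious, so a small loop can generate them. The sketch below only prints the commands so they can be checked first; it assumes the 16 input files are named mandelbrot_00000001 to mandelbrot_00000016, and the function name print_create_work_commands is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one create_work command per work unit, using the
# same options as the single command shown above.
print_create_work_commands() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        wu=$(printf 'mandelbrot_%08d' "$i")
        printf 'bin/create_work -appname mandelbrot -wu_name %s -wu_template templates/work_unit_template -result_template templates/result_template -min_quorum 1 -target_nresults 1 %s\n' "$wu" "$wu"
        i=$((i + 1))
    done
}

print_create_work_commands 16
```

Once the list looks right, piping it to sh from the project directory (print_create_work_commands 16 | sh) runs the commands for real.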
Start Boinc:
sudo bin/start
You can also stop it (sudo bin/stop) and get a status (sudo bin/status)
Create a Client User Account:
Go to the project homepage (http://144.6.40.251/mandelbrot16/) and create an account.
If the server does not have public Internet access, as is often the case on a dedicated cluster, this page might take a while to load as it tries to get user info from the main BOINC site. You can hack this out by editing the file user.inc (home/bcg/projects/test1/html/inc/user.inc): modify the function get_other_projects at line 60 to return $user and do nothing else, e.g.:
function get_other_projects($user) {
    /* $cpid = md5($user->cross_project_id . $user->email_addr);
    $url = "http://boinc.netsoft-online.com/get_user.php?cpid=$cpid";
    $f = fopen($url, "r");
    if (!$f) {
        return $user;
    }
    $u = parse_user($f, $user);
    fclose($f);
    return $u; */
    return $user;
}
Since this is a dedicated cluster example, you, like me, will probably not bother setting up email on the servers. You need to edit the created user’s database entry and also get the authentication key so that command line clients can attach to the project. You don’t need to do this if your client machines are going to run the BOINC GUI client. In my case I am controlling 16 machines at the same time using cssh, so I need to use the BOINC command line tools. At this stage, I do not know how to connect to them using a username and password like the GUI client does. If anyone does know how to do this, please leave a comment on this post!
Connect to the database using phpmyadmin or similar. Go to the mandelbrot16 database and browse the results in the user table.
Take note of the authenticator field value (e.g. 84a35ba7615192bd3120019d8861ffac). You will need it to connect shortly.
Update the email_validated field to contain a 1.
Setup Client Software:
Download boinc_5.10.21_i686-pc-linux-gnu.sh from the BOINC website.
Run it on the client machine from your home directory. This will install the runtime files into a BOINC directory.
In a terminal run:
./run_client
Attach to the Project
In another separate terminal window:
./boinc_cmd --project_attach http://144.6.40.251/mandelbrot16/ cc5e5948e2ff9fe00dc2474d271753ad
Use your own authenticator at this point that you noted down earlier.
You can also detach from the project later on by running:
./boinc_cmd --project http://144.6.40.251/mandelbrot16/ detach
That should be it! You should be able to go to your ops page for your project (in this case http://144.6.40.251/mandelbrot16_ops/) and see results coming in.
I am by no means a BOINC expert so if you find any better ways of doing what I have done please make a comment or drop me a line and let me know.
Coming Soon: A set of instructions to do exactly the same thing with a Beowulf cluster.
REFERENCES WHICH HELPED ME:
Building distributed applications with BOINC (PDF) (HIGHLY RECOMMENDED READING FOR BEGINNERS)