"Linux Gazette...making Linux just a little more fun!"


Building an Automatic FTP Retriever

by Manuel Arturo Izquierdo


The Internet is a big, wide world of information, which is really great, but when one works with limited Internet access, retrieving large amounts of data can become a nightmare. This is my particular case. I work at the National Astronomic Observatory, Universidad Nacional de Colombia. Its Ethernet LAN is attached to the university's main ATM network backbone. However, the external connection to the world goes through a 64K bandwidth channel, and that is a real problem when more than 500 users surf the net during the day: the connection becomes painfully slow. Matters change when night comes and nobody is on campus, and the transfer rate rises to acceptable levels. Then one can easily download large quantities of information (for example, a whole Linux distribution). But as we are mortal human beings, it is not very advisable to stay awake every night working at the computer. So a solution arises: program the computer to work while we sleep. Now the question is: how do you program a Linux box to do that? That is the reason I wrote this article.

In this article I cover only ftp connections. I have not yet worked on http connections; if you have, please tell me.

At first sight, a solution comes to mind: use the at command to schedule an action at a given time. Let's recall what a simple ftp session looks like (the user's input is whatever follows the bash$, Name, Password: and ftp> prompts):

bash$ ftp anyserver.anywhere.net
Connected to anyserver.anywhere.net.
220 anyserver FTP server (Version wu-2.4(1) Tue Aug 8 15:50:43 CDT 1995) 
ready.
Name (anyserver:theuser): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:(an e-mail address)
230 Guest login ok, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub
ftp> bin
ftp> get anyfile.tar.gz
150 Opening BINARY mode data connection for anyfile.tar.gz (3217 bytes).
226 Transfer complete.
3217 bytes received in 0.0402 secs (78 Kbytes/sec)
ftp> bye
221 Goodbye.
bash$ 

You can write a small shell script containing the steps that at will execute. To feed ftp its internal commands from within a shell script, you can use a shell feature that embeds in the script the data that would otherwise come from standard input. This is called a "here" document:

#! /bin/sh
echo This will use a "here" document to embed ftp commands in this script
# Begin of "here" document
ftp <<**
open anyserver.anywhere.net
anonymous
[email protected]
cd pub
bin
get anyfile.tar.gz
bye
**
# End of "here" document
echo ftp transfer ended.

Note that everything between the two ** lines is sent to the ftp program as if the user had typed it. So this script would open an ftp connection to anyserver.anywhere.net, log in as anonymous with [email protected] as the password, and retrieve the file anyfile.tar.gz from the pub directory using binary transfer mode. In theory this script looks fine, but in practice it won't work. Why? Because the ftp program does not accept the username and password from a "here" document, so ftp responds with "Invalid command" to both anonymous and [email protected]. And of course the ftp server rejects the session when no login and password are sent.
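To see the "here" document mechanism on its own, without a network connection, the following minimal sketch replaces ftp with sed, which simply echoes each line it reads from standard input with a prefix. The delimiter EOF and the variable name received are arbitrary choices for this illustration; any word (such as the ** used above) works as a delimiter.

```shell
#!/bin/sh
# Stand-in for ftp: sed prefixes every line it receives on standard input,
# which makes visible exactly what the "here" document delivers.
# Quoting the delimiter ('EOF') keeps the shell from expanding $variables
# inside the embedded text.
received=$(sed 's/^/stdin> /' <<'EOF'
open anyserver.anywhere.net
cd pub
bye
EOF
)

echo "$received"
```

Each line between the delimiters reaches the program exactly as if it had been typed at the keyboard, which is why ordinary ftp commands such as cd and get work this way.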

The solution to the login problem lies in a hidden file that ftp uses, called ~/.netrc, which must be located in your home directory. This file contains the information ftp needs to log in to a system, organized in three text lines:

machine  anyserver.anywhere.net
login    anonymous
password [email protected]

For private ftp connections, the password field must hold the account's password instead of an e-mail address as in anonymous ftp. This may open a security hole, so ftp requires that the ~/.netrc file have no read, write, or execute permission for anyone except its owner. This is easily done with the chmod command:

chmod go-rwx ~/.netrc
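As an alternative sketch, the file can be created with restrictive permissions from the start by setting the umask before writing it, so it is never readable by others even briefly. The path NETRC_DEMO is a hypothetical scratch location used here so that a real ~/.netrc is not overwritten, and guest@example.org is only a placeholder password.

```shell
#!/bin/sh
# Sketch: write a netrc-style file that only its owner can read or write.
# NETRC_DEMO is a hypothetical scratch path; the real file would be ~/.netrc.
NETRC_DEMO=$(mktemp -d)/netrc

(
    # Inside this subshell, new files get no group/other permissions.
    umask 077
    cat > "$NETRC_DEMO" <<'EOF'
machine  anyserver.anywhere.net
login    anonymous
password guest@example.org
EOF
)

# Show the permission bits of the new file.
perms=$(ls -l "$NETRC_DEMO" | cut -c1-10)
echo "$perms"    # prints -rw-------
```

The subshell keeps the umask change local, so the rest of the script (or your login session) is not affected.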

Now our shell script looks like this:

#! /bin/sh
echo This will use a "here" document to embed ftp commands in this script
# Begin of "here" document
ftp <<**
open anyserver.anywhere.net
cd pub
bin
get anyfile.tar.gz
bye
**
# End of "here" document
echo ftp transfer ended.

ftp will read the login and password information from ~/.netrc and establish the connection. Say we call this script getdata (and make it executable with chmod ugo+x getdata); we can then schedule its execution at a given time like this:

bash$ at 1:00 am
getdata
(control-D)
Job 70 will be executed using /bin/sh
bash$

When you return in the morning, the requested data will be on your computer!

Another useful way to use this script is:

bash$ nohup getdata &
[2] 131
bash$ nohup: appending output to 'nohup.out'
bash$ 

nohup lets the process it runs (in this case getdata) keep running even after the user logs out. So you can work on something else while a set of files is retrieved in the background, or log out without killing the ftp child process.
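If you would rather choose the log file yourself than rely on nohup.out, you can redirect the output explicitly. The sketch below is only an illustration: a short sh -c command stands in for the getdata script, and LOG is a hypothetical file name.

```shell
#!/bin/sh
# Sketch: run a job under nohup with an explicit log file.
# The "sh -c ..." command stands in for getdata; LOG is a hypothetical path.
LOG=$(mktemp)

# Redirecting stdin, stdout and stderr detaches the job from the terminal,
# so nohup has no reason to create nohup.out.
nohup sh -c 'echo "transfer started"; sleep 1; echo "transfer finished"' \
    < /dev/null > "$LOG" 2>&1 &
job_pid=$!

# In real use you could log out at this point. The demo waits for the job
# instead, so that the log can be displayed.
wait "$job_pid"
status=$?
cat "$LOG"
```

Keeping one log file per download makes it easier to check in the morning which transfers actually completed.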

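In short, you may follow these steps:

1. Create a ~/.netrc file with the machine, login, and password lines for the server.
2. Remove its group and other permissions with chmod go-rwx.
3. Write a shell script that passes the ftp commands through a "here" document, and make it executable.
4. Schedule the script with at, or start it in the background with nohup.

And voilà.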

Additionally, you can add more features to the script so that it automatically updates the ~/.netrc file and writes a log file showing the time used:

#!/bin/sh

# Back up the old ~/.netrc file, if there is one
[ -f "$HOME/.netrc" ] && cp "$HOME/.netrc" "$HOME/netrc.bak"

# Write a new ~/.netrc
rm -f "$HOME/.netrc"
echo "machine anyserver.anywhere.net" > "$HOME/.netrc"
echo "login anonymous" >> "$HOME/.netrc"
echo "password [email protected]" >> "$HOME/.netrc"
chmod go-rwx "$HOME/.netrc"

# Start the log file and record the starting time
echo "scriptname log file" > scriptname.log
echo "Begin connection at:" >> scriptname.log
date >> scriptname.log

# -i turns off interactive prompting during multiple file transfers
ftp -i <<**
open anyserver.anywhere.net
bin
cd pub
get afile.tar.gz
get bfile.tar.gz
bye
**

# Record the ending time
echo "End connection at:" >> scriptname.log
date >> scriptname.log
# End of scriptname script
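The script above logs the start and end times but leaves the subtraction to the reader. A possible refinement, sketched here under the assumption that your date command supports the +%s format (GNU and BSD date do), is to record the elapsed seconds directly. The sleep command is only a stand-in for the ftp "here" document, and LOGFILE is a hypothetical name.

```shell
#!/bin/sh
# Sketch: log the elapsed time of a long-running command in seconds.
# Assumes date(1) understands +%s (seconds since the epoch).
LOGFILE=$(mktemp)

start=$(date +%s)
sleep 1                       # stand-in for the ftp "here" document
end=$(date +%s)

# Shell arithmetic gives the duration of the transfer.
elapsed=$((end - start))
echo "Transfer took $elapsed seconds" >> "$LOGFILE"
cat "$LOGFILE"
```

Dividing the total size of the retrieved files by this figure gives a rough average transfer rate for the night.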

Creating such a script by hand every time we need to download data can be annoying. For this reason I have written a small Tcl/Tk 8.0 application that generates a script tailored to our needs.

You can find detailed information about the ftp command set in its respective man page.



Copyright © 1998, Manuel Arturo Izquierdo
Published in Issue 34 of Linux Gazette, November 1998

