...making Linux just a little more fun!

2-Cent Tips

Two-cent Tip: Retrieving directory contents

Ben Okopnik [ben at linuxgazette.net]


Wed, 3 Feb 2010 22:35:08 -0500

During a recent email discussion regarding pulling down the LG archives with 'wget', I discovered (perhaps mistakenly; if so, I wish someone would enlighten me [1]) that there's no way to tell it to pull down all the files in a directory unless there's a page that links to all those files... and the directory index doesn't count (even though it contains links to all those files). So, after a minute or two of fiddling with it, I came up with the following solution:

#!/bin/bash
# Created by Ben Okopnik on Fri Jan 29 14:41:57 EST 2010
 
[ -z "$1" ] && { printf "Usage: ${0##*/} <URL> \n"; exit; }
 
# Safely create a temporary file
file=`tempfile`
# Extract all the links from the directory listing into a local text file
wget -q -O - "$1" |
URL="${1%/}" perl -wlne'print "$ENV{URL}/$2" if /href=(["\047])([^"\047]+)\1/' > "$file"
# Retrieve the listed links
wget -i "$file" && rm "$file"

To summarize, I used 'wget' to grab the directory listing, parsed it to extract all the links (prefixing each with the site URL), and saved the result into a local tempfile. Then I used that tempfile as the source for 'wget's '-i' option (read the links to be retrieved from a file).
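For example, if you save the script as (hypothetically) 'getdir' and make it executable, pointing it at a directory URL pulls every file listed in that directory's index into the current working directory; the URL below is only a placeholder:

chmod +x getdir
./getdir http://example.com/pub/some-directory/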

I've tested this script on about a dozen directories with a variety of servers, and it seems to work fine.

[1] Please test your proposed solution, though. I'm rather cranky at 'wget' with regard to its documentation; perhaps it's just me, but I often find that the options described in its manpage do something rather different from what they promise to do. For me, 'wget' is a terrific program, but the documentation has lost something in the translation from the original Martian.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *


Two-cent Tip: How big is that directory?

Dr. Parthasarathy S [drpartha at gmail.com]


Tue, 2 Feb 2010 09:57:02 +0530

At times, you may need to know exactly how big a certain directory (say, a top directory) is, along with all its contents and subdirectories (and their contents). You may need this when copying a large directory along with its contents and structure, and you want to check that what you got after the copy is what you sent. Or you may need it when copying material onto a device where space is limited, and you want to make sure the device can accommodate the material you are planning to send.

Here is a cute little script. Calling sequence::

howmuch <top directory name>

You get a summary, which gives the total size, the number of subdirectories, and the number of files (counted from the top directory). Good for book-keeping.

###########start-howmuch-script
# Tells you how many files, subdirectories, and content bytes are in a
# directory
# Usage :: howmuch <directory-path-and-name>
 
# check if there is no command line argument
if [ $# -eq 0 ]
then
    echo "You forgot the directory to be accounted for !"
    echo "Usage :: howmuch <directoryname with path>"
    exit 1
fi
 
echo "***start-howmuch***"
pwd > ~/howmuch.rep
pwd
echo -n "Disk usage of directory :: " >> ~/howmuch.rep
echo "$1" >> ~/howmuch.rep
echo -n "made on :: " >> ~/howmuch.rep
date >> ~/howmuch.rep
du -s "$1" > ~/howmuch1
tree "$1" > ~/howmuch2
tail ~/howmuch1 >> ~/howmuch.rep
tail --lines=1 ~/howmuch2 >> ~/howmuch.rep
cat ~/howmuch.rep
# cleanup
rm ~/howmuch1
rm ~/howmuch2
#Optional -- you can delete howmuch.rep if you want
#rm ~/howmuch.rep
 
echo "***end-howmuch***"
#   
 
 
########end-howmuch-script
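
For readers who don't have 'tree' installed, here is a minimal alternative sketch (my addition, not part of the tip above) that produces the same three pieces of bookkeeping using only 'du' and 'find'; note that the directory count includes the top directory itself:

#!/bin/sh
# Minimal alternative: total size, subdirectory count, file count
dir="${1:?Usage: howmuch <directory>}"
du -sh "$dir"
echo "directories: $(find "$dir" -type d | wc -l)"
echo "files:       $(find "$dir" -type f | wc -l)"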
-- 
---------------------------------------------------------------------------------------------
Dr. S. Parthasarathy                    |   mailto:[email protected]
Algologic Research & Solutions    |
78 Sancharpuri Colony                 |
Bowenpally  P.O                          |   Phone: + 91 - 40 - 2775 1650
Secunderabad 500 011 - INDIA     |
WWW-URL: http://algolog.tripod.com/nupartha.htm
---------------------------------------------------------------------------------------------

[ Thread continues here (5 messages/9.85kB) ]


Two-cent tip: GRUB and inode sizes

René Pfeiffer [lynx at luchs.at]


Wed, 3 Feb 2010 01:07:03 +0100

Hello!

I had a decent fight with a stubborn server today. It was a Fedora Core 6 system (let's not talk about how old it is) that was scheduled for a change of disks. This is fairly straightforward - until you have to write the boot block. Unfortunately, I prepared the new disks before copying the files. As soon as I tried to install GRUB 0.97, it told me that it could not read the stage1 file. The problem is that GRUB only deals with 128-byte inodes, while the prepared / partition had 256-byte inodes. So make sure to use

mkfs.ext3 -I 128 /dev/sda1

when preparing disks intended to co-exist with GRUB. I know this is old news, but I had never encountered this problem before. http://www.linuxplanet.com/linuxplanet/tutorials/6480/2/ has more hints.
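As an aside (not from René's message): if you're unsure how an existing ext2/ext3 partition was formatted, the inode size is recorded in the superblock and can be read with tune2fs, e.g.

tune2fs -l /dev/sda1 | grep -i 'inode size'

If that reports 256 rather than 128, GRUB 0.97 will run into the stage1 problem described above.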

Best, René, who is thinking about moving back to LILO.

[ Thread continues here (3 messages/2.77kB) ]


Two-cent Tip: backgrounding the last stopped job without knowing its job ID

Mulyadi Santosa [mulyadi.santosa at gmail.com]


Mon, 22 Feb 2010 16:14:09 +0700

Most people, to send a job to the background after stopping it, will take note of the job ID and then invoke the "bg" command accordingly, as below:

$ (while (true); do yes  > /dev/null ; done)
^Z
[2]+  Stopped                 ( while ( true ); do
    yes > /dev/null;
done )
 
$ bg %2
[2]+ ( while ( true ); do
    yes > /dev/null;
done ) &

Can we omit the job ID? Yes, we can. Simply replace the above "bg %2" with "bg %%", which refers to the most recently stopped (current) job. This way, typing mistakes in the job ID can be avoided, too.
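For illustration, the equivalent of the "bg %2" step above with the job ID omitted would look like this (bash also accepts "%+" as a synonym for the current job):

$ bg %%
[2]+ ( while ( true ); do
    yes > /dev/null;
done ) &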

-- 
regards,
 
Mulyadi Santosa
Freelance Linux trainer and consultant
 
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

[ Thread continues here (4 messages/4.27kB) ]



Talkback: Discuss this article with The Answer Gang

Copyright © 2010. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 172 of Linux Gazette, March 2010
