The Locate Command

Lost a file? Type:

locate <filename>

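and you'll instantly see where it is. Locate searches a prebuilt index of all the files on your disk, which updatedb refreshes on a schedule (usually once a day, sometimes once a week). That's how it's able to tell you the location of a file instantly, and also why it can miss files created since the last refresh.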

You don't have to type a complete filename. For example, if you want to see all the *.tex files on your computer, type:

locate tex

You'll get a list that includes paper.tex, thesis.tex, but also text.doc. That is, if the string you type matches any part of the name of a file, locate tells you about it.
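
If you only want names that really end in .tex, most locate implementations also accept a shell-style pattern such as locate '*.tex' (quoted so the shell leaves it alone). The sketch below imitates the two kinds of matching with grep on a made-up list of paths, since the contents of a real locate index differ from machine to machine. The paths and variable names are illustrative assumptions, not part of locate.

```shell
# Made-up paths standing in for locate's index (an assumption for this demo).
paths='/home/me/paper.tex
/home/me/thesis.tex
/home/me/text.doc'

# Substring match, as with "locate tex": every path containing "tex".
substring_hits=$(printf '%s\n' "$paths" | grep -c 'tex')

# Suffix match, as with "locate '*.tex'": only paths ending in ".tex".
suffix_hits=$(printf '%s\n' "$paths" | grep -c '\.tex$')

echo "substring: $substring_hits, suffix: $suffix_hits"
```

The substring search counts all three paths, while the suffix search counts only the two real .tex files.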

Reminder dialog boxes

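Here's a small script that pops up a dialog box with a reminder message at some point in the future. It needs at and zenity, and it works on Red Hat/Fedora boxes. Save it as an executable file called remind somewhere on your PATH.

#!/bin/sh
# remind: schedule a pop-up reminder. $1 is a time for at, $2 is the message.
at "$1" <<END
export DISPLAY=:0.0
zenity --info --text="$2" 1> /dev/null 2> /dev/null
END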

You use it by typing things like

remind 'now+5min' 'Pick up print job!'
remind '4pm' 'Go to dentist!'

The time specification is just passed to 'at' so type 'man at' to see all the different ways you can do it.
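
Because the here-document delimiter in the script is unquoted, the shell fills in the message at the moment remind runs, so the text handed to at already contains the literal message. A minimal sketch of that step, using cat in place of at so nothing gets scheduled; show_job is a hypothetical helper invented for this illustration.

```shell
# show_job prints the job text that the remind script would hand to "at".
# It is a hypothetical helper for illustration; nothing is scheduled.
show_job() {
    cat <<END
export DISPLAY=:0.0
zenity --info --text="$1" 1> /dev/null 2> /dev/null
END
}

show_job 'Pick up print job!'
```

One consequence worth knowing: a message that itself contains a double quote would end the --text="..." argument early, so it is safest to keep reminder messages free of double quotes.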

Generic substitutions

This script takes commands like

subst foo=1 bar=2 < infile > outfile

and if infile looks like this:

Foo is <foo> and bar is <bar>

then outfile will look like this:

Foo is 1 and bar is 2

The script doesn't need to know anything about the names of the tags; whatever name you give them on the command line, that's the name the script looks for in the file.

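#!/usr/bin/env perl
# subst: replace each <name> tag on stdin with the value given as name=value.

# Collect the name=value pairs from the command line.
while (@ARGV > 0) {
    $_ = shift(@ARGV);
    # Split on the first "=" only, so values may themselves contain "=".
    @pair = split(/=/, $_, 2);
    $subst{$pair[0]} = $pair[1];
}

# Copy stdin to stdout, substituting every tag on each line.
while (<>) {
    foreach $key (keys %subst) {
        # \Q...\E keeps punctuation in the tag name from acting as regex syntax.
        s/<\Q$key\E>/$subst{$key}/g;
    }
    print;
}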
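
For comparison, roughly the same idea can be sketched as a shell function built on sed rather than Perl. This is a swapped-in technique, not the script above: subst_sh is a hypothetical name, and the sketch assumes that tag names and values contain no "|", "&", or backslash characters, which sed would treat specially.

```shell
# subst_sh is a hypothetical shell counterpart to the Perl script above.
# Assumption: names and values contain no "|", "&" or backslash.
subst_sh() {
    script=''
    for pair in "$@"; do
        key=${pair%%=*}      # text before the first "="
        value=${pair#*=}     # text after the first "="
        # Add one sed substitution per pair, using "|" as the delimiter.
        script="${script}s|<${key}>|${value}|g
"
    done
    sed -e "$script"
}

printf 'Foo is <foo> and bar is <bar>\n' | subst_sh foo=1 bar=2
```

Run on the sample line, this prints "Foo is 1 and bar is 2", matching the outfile shown earlier.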

Xargs

The find command can do complicated searches for files in a directory tree, but it prints one match per line, which is awkward if you want to feed those files to some other command. Use xargs to turn the one-filename-per-line output into a space-separated list of arguments on a single command line. Be aware that xargs splits on any whitespace, so a filename containing spaces arrives as several arguments. For example:

find . -name \*.jpg | xargs display

will find all the .jpg files in the directory tree and then execute display file1.jpg file2.jpg ... on them all.
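
Filenames with spaces are the usual way this goes wrong. A minimal sketch of the problem and one common remedy, the -print0 option of find paired with the -0 option of xargs (extensions found in GNU and BSD versions, so treat their availability as an assumption). The scratch directory and file names are made up, and the sketch assumes the scratch path itself contains no spaces.

```shell
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d)
touch "$tmp/holiday photo.jpg" "$tmp/plain.jpg"

# Each "sh -c" run by xargs just reports how many arguments it was given.
naive=$(find "$tmp" -name '*.jpg' | xargs sh -c 'echo $#' sh)
safe=$(find "$tmp" -name '*.jpg' -print0 | xargs -0 sh -c 'echo $#' sh)

echo "newline-separated: $naive arguments"
echo "NUL-separated:     $safe arguments"
rm -rf "$tmp"
```

The first pipeline hands the command three arguments for two files because "holiday photo.jpg" is split in half; the second hands it exactly two.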

Perl Magic

You have a bunch of files and you want to change all occurrences of "foo" to "bar" in them. There's a classic one-line perl command to do this:

perl -p -i.bak -e 's/foo/bar/g' *.txt

The argument to -e is any perl command, in this case a substitution. The -i tells perl to "edit in place," reading from each file on the command line in turn and writing back to the same file. The .bak means make backup copies of the original files using a .bak extension. Finally the -p wraps the command you gave with -e inside a particular loop which is a common Perl idiom. It basically means "execute the -e command on every line of every file."

Note the possibilities by combining this with xargs and find.

Operating on Many Files

Sometimes you want to do something with a bunch of files, like copy them, but there are so many that you get "argument list too long" if you just try "cp * wherever/". In this case the -l option to xargs comes in handy (newer versions of xargs prefer the spelling -L). It reads the input a fixed number of lines at a time and runs the command once for each chunk. For example, suppose that you want to delete files:

find . -type f | xargs -l100 rm

This deletes the files 100 at a time. However, doing this with cp or mv presents a problem because the destination needs to be at the end of the argument list, and that is exactly where xargs puts the filenames. The developers of GNU mv and cp apparently considered this and provided the option "--target-directory=...", which names the destination up front instead. So, to move all the files, you can do:

find . -type f | xargs -l100 mv --target-directory=dir

Alas, scp does not have that option, so if you are trying to remote-copy all those files, you are still out of luck. To solve this problem, you have to write a small shell script:

#!/bin/sh
# "reverse-scp" copies with the destination first, unlike scp
dest=$1
shift
# Quote the arguments so each filename is passed to scp intact.
scp "$@" "$dest"

Finally, we can do:

find . -type f | xargs -l100 reverse-scp dest

Voila. Or, if you have a copy of rsync locally, use it with ssh as its transport:

rsync -r --rsh=ssh ./ user@destination-host:path/
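
The chunking that xargs does in the commands above can be watched harmlessly by making the command report its argument count instead of touching any files. This sketch uses -n, which caps the number of arguments per run, whereas -l caps the number of input lines; with one name per line the two come to the same thing. The 250 numbers from seq stand in for filenames and are an assumption of the demo.

```shell
# 250 made-up "filenames", one per line, in place of real find output.
# Each run of "sh -c" prints how many arguments xargs handed it.
chunks=$(seq 1 250 | xargs -n 100 sh -c 'echo $#' sh)
echo "$chunks"
```

This prints 100, 100 and 50 on separate lines: three runs of the command instead of one over-long argument list.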

The amazing ADS paper fetching script

Suppose you're boarding a plane and want to fetch a bunch of papers from ADS to read at your leisure. Here's a script to help you get them. It's a little convoluted to use, but here it is anyway. Maybe it'll evolve into a more reasonable tool.

To use it: Do any ADS search you want. For best results, click "Select References with at least one of the following" and then check "Full text" and "Scanned article." Also, click on "sort by citation count" and the script will do something useful with those numbers.

Save the HTML file to disk. The script is not incredibly intelligent about parsing the file, so I've had the best luck with doing "View Source" and then "Save as" to avoid extra line breaks and such. Then feed the HTML to the following script like this:

snarf < ads.html

It will dutifully fetch most of the papers for you and give them useful filenames of the form something-citations-year-title.pdf. The citation count is zero-padded to four digits, so when you list the files they're sorted by citation count and you know which papers you should probably read first.

At long last, here it is:

#!/usr/bin/perl
# Fun with regular expressions.  Figuring out what the hell is going on
# is left as an exercise to the reader.
while (<>) {
    # Find the link to the full text article
    if (/<a href=\"([^\"]*ARTICLE)\"/) {
        $link = $1;
        # Find the number of citations
        if (/>([0-9]+)\.[0-9]+</) {
            $cite = $1;
        }
        # Find the year
        if (/\/([0-9]+)</) {
            $year = $1;
        }
        # read next line for title
        $_ = <>;
        # Find the title ([A-Za-z], because [A-z] also matches "[", "^", "_" etc.)
        if (/>[A-Za-z][^<]*<.*>([A-Za-z][^<]*)</) {
            $title = $1;
            # Replace anything that isn't a letter, digit or underscore
            $title =~ s/\W/_/g;
        }
        # Make the filename, zero-padding the citation count to four digits
        $filename = sprintf "Gebhardt-%04d-$year-$title.pdf", $cite;
        # Fetch the document; the list form of system bypasses the shell
        system "wget", $link, "-O", $filename;
    }
}
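
The filename step is what makes a plain directory listing sort by citation count, so here is that step alone as a small shell sketch. make_name is a hypothetical helper that mimics the sprintf and s/\W/_/g lines of the script above; the citation count, year and title passed to it are invented sample values.

```shell
# make_name mimics the script's filename step (hypothetical helper).
# $1 = citation count, $2 = year, $3 = title
make_name() {
    # Turn every character that is not a letter, digit or "_" into "_".
    title=$(printf '%s' "$3" | tr -c 'A-Za-z0-9_' '_')
    # Zero-pad the count to four digits so names sort numerically.
    printf 'Gebhardt-%04d-%s-%s.pdf\n' "$1" "$2" "$title"
}

make_name 12 1996 'Black holes & bulges'
```

This prints Gebhardt-0012-1996-Black_holes___bulges.pdf. Without the padding, a paper with 12 citations would sort after one with 100, since a listing compares names character by character.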

Answers/Code/Scripting (last edited 2007-09-13 00:48:04 by GregNovak)