The Locate Command
Lost a file? Type:
locate <filename>
and you'll instantly see where it is. Locate searches a prebuilt list of all the files on your disk, which is updated once a week (or sometimes once a day). That's how it's able to tell you the location of the files instantly, and also why it won't know about files created since the last update.
You don't have to type a complete filename. For example, if you want to see all the *.tex files on your computer, type:
locate tex
You'll get a list that includes paper.tex and thesis.tex, but also text.doc. That is, if the string you type matches any part of the name of a file (or of the directory path leading to it), locate tells you about it.
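To cut that list down to files that really end in .tex, one option is to filter locate's output through grep. The sketch below wraps that in a small shell function; the name ends_with is made up for illustration, and it assumes the extension contains no regular-expression characters.

```shell
# ends_with EXT: keep only the lines on stdin that end in ".EXT".
# (ends_with is a hypothetical helper name, not a standard command.)
ends_with() {
    grep -- "\\.$1\$"
}

# Typical use (assumes locate is installed and its database is current):
#   locate tex | ends_with tex
```

With this filter, paper.tex and thesis.tex stay in the list while text.doc drops out.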
Reminder dialog boxes
Here's a small script that pops up a dialog box with some reminder message at some point in the future. It works on Red Hat/Fedora boxes.
at "$1" <<END
export DISPLAY=:0.0
zenity --info --text="$2" 1> /dev/null 2> /dev/null
END
You use it by typing things like
remind 'now+5min' 'Pick up print job!'
remind '4pm' 'Go to dentist!'
The time specification is just passed to 'at' so type 'man at' to see all the different ways you can do it.
Generic substitutions
This script takes commands like
subst foo=1 bar=2 < infile > outfile
and if infile looks like this:
Foo is <foo> and bar is <bar>
then outfile will look like this:
Foo is 1 and bar is 2
The script doesn't need to know anything about the names of the tags; whatever name you give them on the command line, that's the name the script looks for in the file.
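#!/usr/bin/env perl
# Collect the name=value pairs from the command line.
while (@ARGV > 0) {
    $_ = shift(@ARGV);
    # Split on the first "=" only, so a value may itself contain "=".
    @pair = split(/=/, $_, 2);
    $subst{$pair[0]} = $pair[1];
}
# @ARGV is now empty, so <> reads standard input.
while (<>) {
    foreach $key (keys %subst) {
        # \Q...\E makes the tag name match literally.
        s/<\Q$key\E>/$subst{$key}/g;
    }
    print;
}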
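For comparison, roughly the same idea can be sketched as a shell function that builds a sed script from the name=value pairs. This is an illustration rather than a replacement for the Perl script: the name subst_sh is made up, and it assumes that neither the tag names nor the values contain "|", "&", backslashes, or regular-expression characters, which sed would treat specially.

```shell
# subst_sh name=value ...: replace each <name> on stdin with its value.
# (Hypothetical helper; assumes names and values contain no "|", "&" or "\".)
subst_sh() {
    script=
    for pair in "$@"; do
        key=${pair%%=*}    # text before the first "="
        val=${pair#*=}     # text after the first "="
        script="${script}s|<${key}>|${val}|g;"
    done
    sed -e "$script"
}

# Example:
#   subst_sh foo=1 bar=2 < infile > outfile
```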
Xargs
The find command can do complicated searches for files in a directory tree, but it prints one match per line, which is awkward when you want to feed those files to some other command. Use xargs to turn one-filename-per-line input into a space-separated list of files on one line, suitable for feeding to another command. For example:
find . -name \*.jpg | xargs display
will find all the .jpg files in the directory tree and then execute display file1.jpg file2.jpg ... on them all.
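One caveat: by default xargs splits its input on blanks as well as newlines, so a file called "my photo.jpg" would arrive as two arguments. A common workaround, sketched below, is to have find separate names with NUL bytes (-print0) and tell xargs to expect that (-0). Both options are supported by the GNU and BSD tools; the function name with_jpgs is hypothetical.

```shell
# with_jpgs DIR CMD [ARGS...]: run CMD with every .jpg under DIR appended
# as arguments. (Hypothetical helper.) -print0 and -0 separate file names
# with NUL bytes instead of newlines, so spaces in names survive intact.
with_jpgs() {
    dir=$1
    shift
    find "$dir" -name '*.jpg' -print0 | xargs -0 "$@"
}

# Example:
#   with_jpgs . display
```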
Perl Magic
You have a bunch of files and you want to change all occurrences of "foo" to "bar" in them. There's a classic one-line perl command to do this:
perl -p -i.bak -e 's/foo/bar/g' *.txt
The argument to -e is any perl command, in this case a substitution. The -i tells perl to "edit in place," reading from each file on the command line in turn and writing back to the same file. The .bak means make backup copies of the original files using a .bak extension. Finally the -p wraps the command you gave with -e inside a particular loop which is a common Perl idiom. It basically means "execute the -e command on every line of every file."
This one-liner becomes even more useful when you combine it with find and xargs, since find can pick the files to edit from a whole directory tree.
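As a sketch of that combination, the function below runs the same foo-to-bar substitution on every .txt file below a directory. The function name foo_to_bar is made up, and the -print0/-0 pair (GNU and BSD find and xargs) is an addition to cope with spaces in file names.

```shell
# foo_to_bar DIR: run the foo -> bar one-liner on every *.txt file below DIR,
# leaving a .bak copy next to each edited file. (Hypothetical helper.)
foo_to_bar() {
    find "$1" -type f -name '*.txt' -print0 |
        xargs -0 perl -p -i.bak -e 's/foo/bar/g'
}

# Example:
#   foo_to_bar .
```

Since the backups end in .txt.bak, a second run will not pick them up as .txt files.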
Operating on Many Files
Sometimes you want to do something with a bunch of files, like copy them, but there are so many that you get "argument list too long" if you just try to do "cp * wherever/". In this case xargs comes in handy: it splits the argument list into smaller chunks and runs the command on each of them. The -L option caps each chunk at a given number of input lines (GNU xargs also accepts the older, deprecated spelling -l100). For example, suppose that you want to delete files:
find . -type f | xargs -L 100 rm
This deletes the files 100 at a time. However, doing this with cp or mv presents a problem because the destination needs to be at the end of the argument list, and xargs puts the long argument list at the end. The developers of the GNU versions of mv and cp apparently considered this and provided the option "--target-directory=...", which takes the place of the destination at the end. So, to move all the files, you can do:
find . -name '*' | xargs -l100 mv --target-directory=dir
Alas, scp does not have that option, so if you are trying to remote-copy all those files, you are still SOL. To solve this problem, you have to write a small shell script:
#!/bin/sh
# "reverse-scp" copies with the destination first, unlike scp
dest=$1
shift
# "$@" keeps each file name as one argument, even if it contains spaces
scp "$@" "$dest"
Finally, we can do:
find . -name '*' | xargs -l100 reverse-scp dest
Voila. Or, if you have a copy of rsync locally, use it with ssh as its transport:
rsync -r --rsh=ssh ./ user@destination-host:path/
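To watch the chunking from this section happen without touching any files, you can feed xargs a list of numbers and have each invocation report how many arguments it received. This is only a sketch: batch_sizes is a made-up name, seq is assumed to be installed (it is on GNU and BSD systems), and -L is the POSIX spelling of the line-count option.

```shell
# batch_sizes COUNT SIZE: feed COUNT lines to xargs in chunks of at most SIZE
# lines and print how many arguments each invocation received.
# (Hypothetical helper; assumes seq is available.)
batch_sizes() {
    seq 1 "$1" | xargs -L "$2" sh -c 'echo "$#"' sh
}

# Example: 250 items in batches of 100
#   batch_sizes 250 100
```

The example prints 100, 100 and 50 on separate lines: three invocations instead of one over-long command line.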
The amazing ADS paper fetching script
Say you're about to board a plane and you want to fetch a bunch of papers from ADS to read at your leisure. Here's a script to help you get them. Its use is a little convoluted, but here it is anyway. Maybe it'll evolve into a more reasonable tool.
To use it: Do any ADS search you want. For best results, click "Select References with at least one of the following" and then check "Full text" and "Scanned article." Also, click on "sort by citation count" and the script will do something useful with those numbers.
Save the html file to disk. The script is not incredibly intelligent about parsing the file, so I've had the best luck with doing "View Source" and then "Save as" to avoid extra linebreaks and such. Then feed the HTML to the following script like this:
snarf < ads.html
It will dutifully fetch (most) of the papers for you and give them useful filenames of the form: something-citations-year-title.pdf. Then when you list the files, they're sorted by citation count... so you know which papers you should probably read first.
At long last, here it is:
#!/usr/bin/perl
# Fun with regular expressions. Figuring out what the hell is going on
# is left as an exercise to the reader.
while (<>) {
    # Find the link to the full text article
    if (/<a href=\"([^\"]*ARTICLE)\"/) {
        $link = $1;
        # Find the number of citations
        if (/>([0-9]+)\.[0-9]+</) {
            $cite = $1;
        }
        # Find the year
        if (/\/([0-9]+)</) {
            $year = $1;
        }
        # read next line for title
        $_ = <>;
        # Find the title ([A-Za-z] rather than [A-z], which also matches "[", "_" and friends)
        if (/>[A-Za-z][^<]*<.*>([A-Za-z][^<]*)</) {
            $title = $1;
            $title =~ s/\W/_/g;
        }
        # Make the filename; the citation count is zero-padded to four digits
        $filename = sprintf "Gebhardt-%04d-%s-%s.pdf", $cite, $year, $title;
        # Fetch the document (list form of system, so the shell never sees the link)
        system "wget", $link, "-O", $filename;
    }
}
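The reason a plain directory listing comes out ordered by citation count is the zero padding in the file name: with every count written as four digits, alphabetical order and numerical order agree. Here is a small shell sketch of the same naming rule. The function name paper_name is hypothetical, the title in the example is a placeholder, and the tr call only approximates Perl's \W for plain ASCII titles.

```shell
# paper_name CITES YEAR TITLE: print a file name in the script's
# Gebhardt-citations-year-title.pdf pattern. (Hypothetical helper.)
paper_name() {
    # Replace everything except letters, digits and "_" with "_",
    # like the Perl script's s/\W/_/g (for ASCII titles).
    title=$(printf '%s' "$3" | tr -c 'A-Za-z0-9_' '_')
    # %04d pads the citation count to four digits: 12 -> 0012.
    printf 'Gebhardt-%04d-%s-%s.pdf\n' "$1" "$2" "$title"
}

# Example:
#   paper_name 12 1998 'A sample title'
```

The example prints Gebhardt-0012-1998-A_sample_title.pdf. Counts above 9999 would need a wider field to keep sorting correctly.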
