How can I iterate through all the files and directories

kalinga · 10 Sep 2008 15:50

How can I iterate through all the files and directories under the current directory?

Any help?

flabdablet · 20 Jul 2011 15:29

I generally do this by using the find command to output a list of the pathnames I want, then piping that output into a read loop to do the processing I want.

For example, here's a snippet that renames every .JPG, .JpG, .jPG, .JPEG, .jPeG etc. file in the current directory and all its subdirectories to give them all consistent lowercase .jpg extensions:

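# \( ... \) groups the name tests so -type f and -print apply to both
find . -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print |
# IFS= keeps leading and trailing whitespace in names intact
while IFS= read -r name
do
    new="${name%.*}.jpg"
    [ "$name" = "$new" ] || mv "$name" "$new"   # skip already-correct names
done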

Check man find - it's a very flexible file finding tool with a heap of options, and every Unix-like system has it.

In the snippet above I'm using -type f to specify that I only want find to return the names of files, not directories; you can use -type d to get directories only, or leave the -type option out altogether to get both.
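To tie that back to the original question, here is a minimal sketch of a loop that visits both files and directories by leaving -type out and testing each pathname inside the loop. The function name list_tree and the output labels are made up for this example, and like the snippet above it assumes that no filename contains a newline.

```shell
#!/bin/sh
# list_tree DIR: print one labelled line for every entry under DIR,
# including DIR itself. (list_tree is a hypothetical name.)
list_tree() {
    find "$1" |
    while IFS= read -r name
    do
        # Note: [ -d ] and [ -f ] follow symlinks, so a link to a
        # directory is reported as a directory.
        if [ -d "$name" ]; then
            printf 'dir: %s\n' "$name"
        elif [ -f "$name" ]; then
            printf 'file: %s\n' "$name"
        else
            printf 'other: %s\n' "$name"
        fi
    done
}
```

Calling list_tree . walks the current directory; the printf lines are where your own per-file and per-directory processing would go.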

The -iname '*.jpg' part tells find to list only files whose names case-insensitively match the pattern '*.jpg', i.e. end with '.jpg' or '.JPG' or '.Jpg' and so on. The -o -iname '*.jpeg' part extends the matching to cover all variants of '.jpeg' as well (read -o as "or").

Note that those wildcard patterns are quoted to stop the shell from expanding them; unusually for a Unix tool, we actually want find to see patterns, not filenames.

read has the -r option applied to stop it misinterpreting any \ in a filename as an escape character. Even so, read treats a newline as the end of each line of input, so this code will fail if any of the filenames contains a newline character. Fortunately such filenames are very rare in practice (they break lots of shell scripts).
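If names containing newlines do have to be handled, one option is to have find terminate each pathname with a NUL byte (its -print0 action) and tell read to split on that instead. The sketch below assumes bash, since read -d '' is not part of POSIX sh, and a find that supports -print0 and -iname (GNU and BSD find both do). The function name lowercase_jpg_ext is invented for this example.

```shell
#!/bin/bash
# lowercase_jpg_ext DIR: give every *.jpg / *.jpeg file under DIR
# (any letter case) a lowercase .jpg extension.
# Requires bash; lowercase_jpg_ext is a hypothetical name.
lowercase_jpg_ext() {
    find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
    # -d '' makes read stop at each NUL instead of at each newline
    while IFS= read -r -d '' name
    do
        new="${name%.*}.jpg"
        # Nothing to do if the name is already in the wanted form
        [ "$name" = "$new" ] || mv -- "$name" "$new"
    done
}
```

Usage would be lowercase_jpg_ext . to process the current directory and everything below it.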

There are a couple of ways to use find to process a bunch of files without piping its output through a read loop. One is to use its inbuilt -exec option. Here's an example using that to copy every file from the current directory and all its subdirectories into a single directory under /tmp, effectively giving us a "flattened" directory:

mkdir /tmp/foo
find . -type f -exec cp {} /tmp/foo \;

Everything between -exec and \; (non-inclusive) is treated as a command and arguments, invoked once for each file found, with {} replaced by the file's pathname. Note that \; is escaped to stop the shell from treating the semicolon as its own end-of-command marker.
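Because -exec runs the command directly rather than through a shell, shell features such as variable expansion or redirection are not available in it. A common workaround is to have -exec start a small sh and hand it the pathname as a positional parameter. As a sketch, here is a helper that makes a .bak copy next to every file; the name backup_each and the .bak suffix are only illustrative.

```shell
#!/bin/sh
# backup_each DIR: copy every regular file under DIR to "<name>.bak".
# (backup_each is a hypothetical name for this example.)
# The word "sh" after the quoted script becomes $0 of the inner shell,
# and the pathname substituted by find becomes its $1.
# Files already ending in .bak are skipped so that backups are not
# themselves backed up.
backup_each() {
    find "$1" -type f ! -name '*.bak' \
        -exec sh -c 'cp -- "$1" "$1.bak"' sh {} \;
}
```

Since the pathname is passed as an argument and never travels through a pipe, this form copes with spaces and newlines in names. It still starts one shell per file, so it is no faster than a plain -exec.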

Invoking one command per file can be a bit slow when processing lots of files. Rather than use -exec, it can be a lot quicker to pipe the output of find into xargs:

mkdir /tmp/foo
find . -type f -print0 | xargs -0 cp -t /tmp/foo

xargs repeatedly builds and executes a command consisting of its own arguments followed by as many strings read from its standard input as will fit on one command line, and keeps doing so until it runs out of input. Note the use of the -print0 action in find, which tells it to output pathnames terminated by NUL (\0) characters rather than newlines; since NUL cannot occur inside any Unix filename, that makes this pipeline completely bulletproof against weird names. xargs has the -0 option to make it expect its input stream to be formatted in that way, and cp is given the -t option (a GNU extension) so that the destination directory can come before all the source pathnames.
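find can also do this kind of batching by itself: ending -exec with + instead of \; makes it append as many pathnames as will fit to each invocation of the command, with {} as the last argument before the +. The sketch below just counts how many times a command gets run in each form; the helper names runs_per_file and runs_batched are made up for this illustration.

```shell
#!/bin/sh
# Count how often find runs a command for the regular files under DIR.
# Every run of echo prints exactly one line, so counting lines counts
# invocations (assuming no newlines in the pathnames).
# Both function names are hypothetical.
runs_per_file() { find "$1" -type f -exec echo {} \; | wc -l; }
runs_batched()  { find "$1" -type f -exec echo {} +  | wc -l; }
```

For a directory holding a handful of files, runs_per_file reports one run per file while runs_batched reports a single run. Applied to the flattening example, the batched form would be find . -type f -exec cp -t /tmp/foo {} + (which relies on GNU cp for -t, just as the xargs version does).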

SS64 Forum
