#1 03 May 2021 22:43

Rekrul
Member
Registered: 17 Apr 2016
Posts: 98

For: recurse into directories, but list sub-dir contents first?

I want to write a script that generates a formatted list of all the files and sub-directories in a given directory, but I want the sub-directories listed first and then the files. I need to use the For command for this because I need to be able to get the size of each file as well, which will be output as part of the list.

In other words, if you do this;

For /R %%F in (*.*) do Echo %%F >>List.txt

You will get this (spaced for clarity);

G:\File1.ext
G:\File2.ext

G:\Extras\File1.ext
G:\Extras\File2.ext

G:\Extras\Stuff\File1.ext
G:\Extras\Stuff\File2.ext

I want the list formatted with the sub-directories listed first, like this;

G:\Extras\Stuff\File1.ext
G:\Extras\Stuff\File2.ext

G:\Extras\File1.ext
G:\Extras\File2.ext

G:\File1.ext
G:\File2.ext

So that for each nested directory, any sub-dirs are always listed first.

I know there's no magic command to do this, but I'm stumped as to the logic of how I could do this.

If I knew for certain that there would never be nested sub-directories, I could make two passes. On the first pass I would filter out the files in the root dir by putting each path in a variable, removing the filename and the current dir, and then checking to see if the variable is empty, which would leave only the files in the sub-dirs. But I can't see any way to reliably do that with each sub-dir unless I go into each one, and then how many levels deep do I go? If I make it account for three levels of nested directories, it will fail if there are four. I can't think of a good way to write a routine that will burrow down an unlimited number of directories, checking each one to see if it contains more nested directories.

I thought I could use the initial For command and some logic to CD to each sub-directory, but then how do I parse the deepest nested directory in each one? In other words, how do I parse Extras\Stuff\ before Extras\?

I want something that will be automatic and handle any level of nested sub-directories, but always put the deepest level of sub-dirs in each one first.

I don't expect someone to write a whole routine to do this for me, but can someone suggest a way of doing this?
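One way to handle an unlimited number of levels is a subroutine that calls itself. It recurses into every sub-directory of a directory first, and only then lists that directory's own files. The following is a minimal sketch of that idea, not a finished script. The output location in %TEMP% and the label name :Walk are assumptions made for illustration, and the size formatting and dividers are left out.

```batch
@echo off
setlocal
rem Hypothetical output location, kept outside the tree being listed.
set "OUT=%TEMP%\List.txt"
if exist "%OUT%" del "%OUT%"
call :Walk .
echo Wrote "%OUT%"
goto :eof

:Walk
rem %1 = directory to process.
pushd "%~1" || goto :eof
rem Recurse into every sub-directory first...
for /d %%D in (*) do call :Walk "%%D"
rem ...then list this directory's own files: size in bytes, then full path.
for %%F in (*) do >>"%OUT%" echo %%~zF  %%~fF
popd
goto :eof
```

Because each call finishes all of its sub-directories before echoing its own files, Extras\Stuff\ is written before Extras\, which is written before the root. Caveats: a plain FOR skips hidden files, and directory names containing % or ^ may be mangled by CALL, so treat this as a starting point only.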

#2 04 May 2021 18:32

Simon Sheppard
Super Administrator
Registered: 27 Aug 2005
Posts: 1,118

Re: For: recurse into directories, but list sub-dir contents first?

If I'm understanding this correctly then this is a more complex problem than it appears at first glance.

Although you are sorting each directory alphabetically, the directories themselves are sorted by directory depth, starting with the deepest directory and finishing with the top level.

The problem is that until everything has been scanned we don't know what the deepest directory level is going to be, so this will likely involve scanning the directory tree at least twice.

First scan: find the deepest directory level, say 6.
Second scan: output the content of all directories which are 6 levels deep (there may be more than one).
Sort that output and save it to a file.
Third scan: output the content of all directories which are 5 levels deep (there may be more than one).
Sort that output and append it to the file.
Repeat for the remaining levels.

If this is a one-off task I would just import the file listing you have into Excel, convert the subdirectories into columns and sort that way.

If having all the deepest directories together at the beginning is not important then you could use a command like:

DIR /s /b /o-n >filename.txt

This will list everything in reverse name order; then just reverse the order of all the lines in filename.txt.

#3 04 May 2021 20:06

Rekrul
Member
Registered: 17 Apr 2016
Posts: 98

Re: For: recurse into directories, but list sub-dir contents first?

Simon Sheppard wrote:

If I'm understanding this correctly then this is a more complex problem than it appears at first glance.

Yes, I've been struggling with it for a while.

Simon Sheppard wrote:

Although you are sorting each directory alphabetically, the directories themselves are sorted by directory depth, starting with the deepest directory and finishing with the top level.

Correct. The format I want mirrors what you see in Explorer or virtually any file manager. Directories are always shown first, files second. And when you go into one of those directories, any sub-directories are once again shown first and the files after them. No matter how many levels deep you go, the dirs always come first.

Simon Sheppard wrote:

The problem is that until everything has been scanned we don't know what the deepest directory level is going to be, so this will likely involve scanning the directory tree at least twice.

Some of the theories I had involved scanning the directory multiple times, but ultimately, I ran into problems.

Simon Sheppard wrote:

If this is a one-off task I would just import the file listing you have into Excel, convert the subdirectories into columns and sort that way.

Nope, not a one-off task. I archive stuff by burning it to disc. So I can keep track of everything, I generate a neatly formatted text listing for each disc that's easily searchable. For this purpose, it really doesn't need to be in the order I've specified, it's just that I'm picky and my lists up to this point have been that way and I prefer it, so I want to continue to have them formatted this way. Previously I was using a file manager plug-in, which would sort the lists this way, but I'd still have to manually edit them to insert the directories. I want to completely automate the task.

At first I thought I could run one pass and count how many different directories there were, then run a second pass outputting each one to a separate file, then joining them together in reverse order. However that wouldn't work because it would end up with this;

Stuff2\Extras2\File1.ext
Stuff2\Extras2\File2.ext

Stuff2\Extras1\File1.ext
Stuff2\Extras1\File2.ext

Stuff1\Extras2\File1.ext
Stuff1\Extras2\File2.ext

Stuff1\Extras1\File1.ext
Stuff1\Extras1\File2.ext

File1.ext
File2.ext

I don't know how to parse the path itself to separate the individual directories. In other words, I don't know how to take "Stuff1\Extras1\" and split it into "Stuff1\" and "Extras1\" so that I can determine what is inside of what. I can compare the path itself to tell when the directory changes, but that won't tell which one is the longest/deepest. Also, I can't just go by the length of the path, because I could have something like this;

Stuff1\File1.ext
Stuff1\File2.ext

Stuff2\Some-Really-Long-Directory-Name\File1.ext
Stuff2\Some-Really-Long-Directory-Name\File2.ext

I could make some routine to try counting the number of "\" in a variable by comparing the length before and after replacing them, but without parsing the individual parts of the path, I still don't know how they're nested. In other words, I can't tell "Stuff1\Extras1\Crap\" from "Stuff1\Extras1\Junk\". Sure, I can tell that they're different, but unless I can parse "Crap\" and "Junk\", I can't properly sort them.
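On splitting a path into its parts: one common trick is to replace each backslash with `" "` so that a plain FOR sees every component as a separate quoted item. This is a small sketch with a made-up sample path and variable names (P, Depth); it assumes the path has no trailing backslash.

```batch
@echo off
setlocal
rem Hypothetical sample path, used only for illustration (no trailing backslash).
set "P=Stuff1\Extras1\Crap"
set /a Depth=0
rem "%P:\=" "%" expands to "Stuff1" "Extras1" "Crap", one quoted item per component.
for %%A in ("%P:\=" "%") do (
    set /a Depth+=1
    echo %%~A
)
echo Depth: %Depth%
```

This should echo each component on its own line, followed by the number of components, so both the nesting depth and the individual names such as "Crap" or "Junk" become available for comparison.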

For what it's worth, the final listing would look something like this;

List0000 - DVD0000

123,456  Stuff\Extras1\File1.ext
123,456  Stuff\Extras1\File2.ext
        -=-=-=-=-=-=-=-=-=-=-=-=-
123,456  Stuff\Extras2\File1.ext
123,456  Stuff\Extras2\File2.ext
        -=-=-=-=-=-=-=-=-=-=-=-=-
123,456  File1.ext
123,456  File2.ext

This then gets joined into one big master list with all the discs listed in order. I have the code to format the sizes and insert the dividers, I'm just stuck on how to organize the directories the way I want.

#4 08 May 2021 02:05

Rekrul
Member
Registered: 17 Apr 2016
Posts: 98

Re: For: recurse into directories, but list sub-dir contents first?

I believe I've found a relatively simple solution. I haven't written any code yet because it's going to involve toggling Delayed Expansion on/off, multiple variable search/replace operations and a bunch of other crap, and anything moderately complex that I write always takes me a lot of trial and error to get it working right. Anyway, my theory seems sound.

First recursively loop through the current directory, placing the token into a variable minus the filename to leave just the path, pad said variable with a set number of "z"s, and write it to a file. Each new directory is compared to the previous one to avoid writing duplicate lines. When done, sort the file and output it to a second file, so that nested directories will appear above their parent directories. Use a loop to read in each line of the sorted file, remove the "z"s, CD to the directory, first check that files exist in that directory (that it doesn't just contain sub-directories) and if so, output a divider (skipped on the very first dir to avoid having a divider at the top of the list) and output those files to the list.

I tested it by sorting a text file I created to simulate path names and it worked great, I just need to write the actual code.
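A rough sketch of the directory-ordering part of that plan might look like the following. It is a simplification under stated assumptions. It enumerates directories directly with FOR /R instead of deriving them from file paths, so no duplicate check is needed. It uses an 8-character pad and made-up temp-file names. It only echoes the directories in their final order and does not list the files or dividers.

```batch
@echo off
setlocal DisableDelayedExpansion
rem Assumed values: an 8-character pad and temp-file names chosen for illustration.
set "PAD=zzzzzzzz"
set "RAW=%TEMP%\dirs_raw.tmp"
set "SORTED=%TEMP%\dirs_sorted.tmp"
if exist "%RAW%" del "%RAW%"

rem FOR /R with (.) yields every directory as "path\." so dropping the
rem last character leaves "path\", to which the pad is appended.
for /r %%D in (.) do (
    set "P=%%D"
    setlocal EnableDelayedExpansion
    >>"%RAW%" echo(!P:~0,-1!%PAD%
    endlocal
)

rem Sorting puts "dir\sub\zzzzzzzz" ahead of "dir\zzzzzzzz" as long as
rem the sub-directory name sorts before the pad.
sort "%RAW%" /o "%SORTED%"

rem Read the sorted lines back and strip the 8 pad characters again.
for /f "usebackq delims=" %%L in ("%SORTED%") do (
    set "P=%%L"
    setlocal EnableDelayedExpansion
    echo(!P:~0,-8!
    endlocal
)
```

Delayed expansion is switched on only around the lines that need it, so that a "!" in a directory name survives the SET. The ordering still depends on how SORT collates the names, so a directory whose name sorts after the pad would end up below its parent.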

#5 08 May 2021 14:22

RG
Member
From: Minnesota
Registered: 18 Feb 2010
Posts: 362

Re: For: recurse into directories, but list sub-dir contents first?

You may want to consider using a character that cannot appear in a filename to pad your variable to avoid removing "Z"s or trailing "z"s in a filename that really does contain a "z".


Windows Shell Scripting and InstallShield

#6 09 May 2021 05:52

Rekrul
Member
Registered: 17 Apr 2016
Posts: 98

Re: For: recurse into directories, but list sub-dir contents first?

RG wrote:

You may want to consider using a character that cannot appear in a filename to pad your variable to avoid removing "Z"s or trailing "z"s in a filename that really does contain a "z".

That thought had occurred to me, but it shouldn't be a problem. I'll make sure that every path written to the file ends with a backslash, and the "z"s will be appended after that. Also, I'll tell it to remove the exact number of "z"s that I added, so only the final part of the string should match. That number will be large enough to ensure that even the shortest path ends up longer than the longest path in each group, so it's unlikely that any legitimate path will ever contain that many consecutive "z"s.

Regardless, I'm open to using another character. Can you suggest one that would never appear in a path name and that will be sorted last? I tried the usual illegal characters like ? * / \, but they all end up at the top of the list.
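For removing exactly the number of pad characters that were added, a substring by position avoids any search/replace, so a real "z" elsewhere in the path is never touched. A tiny illustration, with a made-up sample line and an assumed pad length of 8:

```batch
@echo off
setlocal
rem Hypothetical padded line, as it might be read back from the sorted file.
set "Line=G:\Extras\Stuff\zzzzzzzz"
rem Drop exactly the last 8 characters (the pad), leaving the path.
set "Dir=%Line:~0,-8%"
echo %Dir%
```

Here %Dir% ends up as G:\Extras\Stuff\ regardless of what the directory names themselves contain.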

#7 09 May 2021 13:50

Simon Sheppard
Super Administrator
Registered: 27 Aug 2005
Posts: 1,118

Re: For: recurse into directories, but list sub-dir contents first?

From the SORT page:
The sort order (collation) is not the same as that used by Windows Explorer (StrCmpLogicalW), so for example the ~ character will sort first in Explorer and last in SORT.
Many Unicode characters will sort after 'Z' in both SORT and Windows Explorer, e.g. the Greek character Xi, Ξ or Omega Ω.
https://ss64.com/nt/sort.html

#8 10 May 2021 04:02

Rekrul
Member
Registered: 17 Apr 2016
Posts: 98

Re: For: recurse into directories, but list sub-dir contents first?

Simon Sheppard wrote:

From the SORT page:
The sort order (collation) is not the same as that used by Windows Explorer (StrCmpLogicalW), so for example the ~ character will sort first in Explorer and last in SORT.
Many Unicode characters will sort after 'Z' in both SORT and Windows Explorer, e.g. the Greek character Xi, Ξ or Omega Ω.
https://ss64.com/nt/sort.html

I saw that. Unfortunately, my text editor of choice, an older version of Textpad, won't let me enter those Greek characters.

Also, I tested the tilde character and it sorts first. Maybe newer versions of Windows handle this differently, but I'm still using an old system and under XP, it comes before letters and numbers. Even among symbols, it comes before +, =, <, and >.

#9 10 May 2021 13:03

Simon Sheppard
Super Administrator
Registered: 27 Aug 2005
Posts: 1,118

Re: For: recurse into directories, but list sub-dir contents first?

Rekrul wrote:

Also, I tested the tilde character and it sorts first. Maybe newer versions of Windows handle this differently, but I'm still using an old system and under XP, it comes before letters and numbers. Even among symbols, it comes before +, =, <, and >.

I just realised that's the same in W10; ~ only sorts last when the locale is 'C'. I'll reword that page to include this.
