For /f documentation


25 Oct 2012 09:42
carlos

Hello.
In the two pages of for /f documentation, for_cmd.html and for_f.html, the part that mentions the 61 tokens shows a caret, but the character you get when you copy and paste it is a different one. The ASCII code for the caret is 0x5E, while both pages use the code 0x88 in that sequence.

Also, the two pages don't mention the undocumented option useback, an abbreviation of usebackq.
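A minimal sketch for checking the abbreviation, assuming it behaves as the post describes. With usebackq, a single-quoted item in the IN clause is treated as a literal string, so if useback is accepted both loops should echo the same text. The sample string is made up for illustration.

```bat
@echo off
rem usebackq is the documented spelling; 'text' is then a literal string.
for /f "usebackq delims=" %%a in ('single-quoted literal') do echo usebackq: %%a

rem useback is the abbreviation reported in this post (an assumption to verify).
for /f "useback delims=" %%a in ('single-quoted literal') do echo useback : %%a
```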

Also, where the pages say: in theory "eol=" should turn this feature off, but in practice this fails.
I would like to add that when you specify the eol option within quotes, the next character is read and used. So if you use the options "eol=", the eol character will be the " (quote), and if you use the options "eol= delims=", the eol character is taken to be the space. So, if you want to disable the eol character, you must not enclose the options in quotes. Instead, escape the special characters (space, equal, colon) and write eol= at the end (important!) without specifying any character, like this (read all tokens, use no delimiter and no eol):

Code: Select all

For /f tokens^=*^ delims^=^ eol^= %%a in (file.txt) do echo.%%a
instead of:

Code: Select all

For /f "tokens=* delims= eol=" %%a in (file.txt) do echo.%%a
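A small sketch of the difference, using a literal string instead of file.txt so it can be run on its own. It assumes the behaviour described in this thread: the default eol character is the semicolon, and the unquoted form with eol^= last disables it. The sample text is made up.

```bat
@echo off
rem Default eol is ";", so this line is treated as a comment and nothing is echoed.
for /f "delims=" %%a in (";starts with a semicolon") do echo default eol : [%%a]

rem Unquoted options with eol^= written last: both delims and eol are disabled.
for /f delims^=^ eol^= %%a in (";starts with a semicolon") do echo eol disabled: [%%a]
```

If the thread's description holds, only the second loop prints a line.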
Anyway, when For /f opens a file for reading, it reads until the end or until it finds the ASCII character 0.
Because of this, if you have a file.txt with these lines:

Code: Select all

;semicolon
"quote
 space
%percent
=equal
NUL_0
^caret
where NUL_0 stands for these bytes in hex: 00 5f 30
and you open the file with for /f, you only read the lines up to the line =equal

Because the hex 0 is a string terminator and this line begins with it, you can't store the line NUL_0 in a variable or print it. To capture that line you need to replace the hex 0 with 0x0D 0x0A using the find command: find /v "" file.txt

In conclusion, the safe way to read all the text lines of a file is the following:

Code: Select all

For /f skip^=2^ tokens^=*^ delims^=^ eol^= %%a in ('Find /v "" file.txt') do echo.%%a
But because text files generally don't contain hex 0 poisoning characters, this is enough:

Code: Select all

For /f tokens^=*^ delims^=^ eol^= %%a in (file.txt) do echo.%%a
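A hedged variation on the find-based form above. When FIND is given a file name it prints a blank line and a "---------- FILE.TXT" header before the content, which is what skip=2 removes. When FIND reads from standard input it prints no header, so the sketch below drops skip. It assumes file.txt is in the current directory; whether FIND treats NUL bytes the same way in this mode is not confirmed in the thread.

```bat
@echo off
rem The redirection must be escaped (^<) inside the FOR /F command string.
rem FIND reads file.txt through stdin, so no header lines need skipping.
for /f delims^=^ eol^= %%a in ('find /v "" ^< file.txt') do echo(%%a
```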
Last edited by carlos (25 Oct 2012 09:51)

----------------------------

#2 25 Oct 2012 14:07
dbenham

Wow, very cool 8-)

The only way I knew to effectively disable both EOL and DELIMS was to set EOL to a linefeed character. This works because FOR /F delimits lines at linefeed, so it can never read one. But the syntax is ugly: http://www.dostips.com/forum/viewtopic.php?p=10729

Code: Select all

for /f eol^=^

^ delims^= %%a in ...
or

Code: Select all

set LF=^


for /f eol^=^%LF%%LF%^ delims^= %%a in ...
But I like your syntax much better. I think all the batch file "experts" got so used to always putting DELIMS at the end so that we could use a space that we never thought to put EOL at the end instead.

Note that your use of TOKENS=* is not needed if DELIMS is disabled. I see many people add that unnecessary bit. All that is needed is:

Code: Select all

for /f delims^=^ eol^= %%a in ...
And to disable EOL when DELIMS is not disabled, then all you need to do is set EOL to one of the delimiter characters. (noted in the earlier link in my post)

If using default DELIMS=space and tab

Code: Select all

for /f "eol= " %%a in ...
If explicitly defining DELIMS

Code: Select all

for /f "eol=, delims=,;" %%a in ...
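A short sketch of the default-delimiter case, using a literal string so it runs without an input file. It assumes the behaviour described in this post: setting EOL to a delimiter character effectively disables it. The sample text is made up.

```bat
@echo off
rem Default eol (;) with default delims: the line counts as a comment and is skipped.
for /f %%a in (";semi colon") do echo default  : [%%a]

rem eol set to a space, which is also a default delimiter.
rem The line should now be parsed, and %%a receives its first token.
for /f "eol= " %%a in (";semi colon") do echo eol=space: [%%a]
```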
------------------

Also good to know about the NUL character issue - thanks.

Last edited by dbenham (25 Oct 2012 14:57)

----------------------------

#3 25 Oct 2012 18:58
carlos

Thanks for the note. You are right: if you disable delims and eol, using tokens=* is unnecessary.

But be careful: if you don't disable delims, the leading delimiter characters are always trimmed from the left.

Test the following code:

Code: Select all

::delims is # eol disabled
For /f delims^=#^ eol^= %%a in ("###trim ###") do echo.%%a

::delims is # eol disabled tokens=*
For /f tokens^=*^ delims^=#^ eol^= %%a in ("###trim ###") do echo.%%a

::delims is # eol is #
For /f "tokens=* delims=# eol=#" %%a in ("###trim ###") do echo.%%a

::delims is # eol is disabled
For /f delims^=#^ eol^= %%a in ("###trim ###") do echo.%%a

::delims is disabled # eol is disabled tokens=* is unnecessary
For /f delims^=^ eol^= %%a in ("###trim ###") do echo.%%a

::delims is disabled # eol is disabled tokens=* is redundant
For /f tokens^=*^ delims^=^ eol^= %%a in ("###trim ###") do echo.%%a
Last edited by carlos (25 Oct 2012 18:58)

----------------------------

#4 25 Oct 2012 20:34
Simon Sheppard

Great sleuthing, Carlos! I've updated both of those pages now.

----------------------------

#5 03 May 2019 19:22
GCRaistlin

Unfortunately find.exe is buggy. Create a file with 3 lines: the first one consists of 4094 chars, the second one is empty and the third one consists of 1 char. Then process it with

Code: Select all

find /v "" test.txt
The second line is missing from the output.
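One possible way to build such a test file, as a sketch. The file name test.txt comes from the post; the fill character x, the single character y and the variable name s are arbitrary choices. The /n switch is added so that line numbers make any merged lines visible.

```bat
@echo off
setlocal enabledelayedexpansion

rem Build a 4096-character string by doubling a single character 12 times.
set "s=x"
for /l %%i in (1,1,12) do set "s=!s!!s!"

rem Line 1: 4094 characters. Line 2: empty. Line 3: one character.
>test.txt echo(!s:~0,4094!
>>test.txt echo(
>>test.txt echo(y

rem Show every line with its number as FIND sees it.
find /v /n "" test.txt
```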

----------------------------

#6 03 May 2019 22:28
Simon Sheppard

Why do you consider that buggy?

find /v "" asks for all lines NOT containing an empty string. The second line of your file is an empty string, so it doesn't get returned.

----------------------------

#7 29 Jul 2019 22:34
GCRaistlin

There's no such thing as "empty substring" so matching against "" should never return a result (hence /v "" should return all lines). And it works generally this way, but not in the case I mentioned.

----------------------------

#8 30 Jul 2019 01:07
Simon Sheppard

On the page for FIND, I have this note:
Although FIND can be used to scan large files, it will not detect any string that is positioned more than 1070 characters along a single line

I think you have found another similar limitation.

When I tested this, it behaved consistently with a line of up to 4091 characters. With a line 4092 characters long, the last 12 characters spill over to the next row, as reported by FIND with /n to display line numbers.
With a line of 4093 characters or longer, FIND displays the second line at the end of the first, i.e. it has lost the CR/LF that should end the previous line. Again, this is shown by using /n to display line numbers.

I will add a note about this to the page, thanks for flagging it up here.

----------------------------

#9 15 Jul 2020 17:02
RaceQ

Even though the "FOR /F" documentation already mentions the double quote as a delimiter (which I did not notice until after I wrote this), here is a detailed example of using the double quote as the "FOR /F" delimiter character.

Sample log data file (".\test\for_demo_quotes.txt") where quoted file names appear on lines:

Code: Select all

"E:\HOME\Media\wallpaper\Cumulonimbus-Clouds-Storm-Clouds-Sunset-Tornadoes.jpg"
; "\\server1\media1\pics\Wallpaper\Share1\Misc\pic2 1280x1024.jpg"
"D:\pics\Wallpaper\Share1\Misc\pic2 1280x1024.jpg"
; "\\server1\media1\pics\Wallpaper\Share1\Sci-fi\pic 3 jump gate v2.jpg"
"D:\pics\Wallpaper\Share1\Sci-fi\pic 3 jump gate v2.jpg"
; "\\server1\media1\pics\Wallpaper\Share1\Locations\pic 4.jpg"
"D:\pics\Wallpaper\Share1\Locations\pic 4.jpg"
"E:\HOME\Media\wallpaper\White-Clouds-Blue-Sky.jpg"

Using this code:

@echo off
for /f tokens^=1-5*^ usebackq^ delims^=^"^ eol^= %%A in (".\test\for_demo_quotes.txt") do ( 
    if "%%B"=="" ( 
        echo %%A
    ) else (
        echo %%B
    )
) 
Output:

Code: Select all

E:\HOME\Media\wallpaper\Cumulonimbus-Clouds-Storm-Clouds-Sunset-Tornadoes.jpg
\\server1\media1\pics\Wallpaper\Share1\Misc\pic2 1280x1024.jpg
D:\pics\Wallpaper\Share1\Misc\pic2 1280x1024.jpg
\\server1\media1\pics\Wallpaper\Share1\Sci-fi\pic 3 jump gate v2.jpg
D:\pics\Wallpaper\Share1\Sci-fi\pic 3 jump gate v2.jpg
\\server1\media1\pics\Wallpaper\Share1\Locations\pic 4.jpg
D:\pics\Wallpaper\Share1\Locations\pic 4.jpg
E:\HOME\Media\wallpaper\White-Clouds-Blue-Sky.jpg

This would also be useful if your input data lines were in formats like:
"fullpath1" "fullpath2" "fullpath3"
keyword1 "fullpath4"
keyword2:"fullpath5"
Parsing by space or colons would split "fullpath[1-5]" strings that contain spaces.
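A sketch of how those formats could be split with the double quote as the delimiter. It uses usebackq with single-quoted literal strings so it runs without an input file; the paths and keywords are made up. With " as the only delimiter, the space between two quoted paths becomes a token of its own, so in the first format the paths are tokens 1, 3 and 5.

```bat
@echo off
rem Format: "path1" "path2" "path3"  (paths are tokens 1, 3 and 5)
for /f usebackq^ tokens^=1^,3^,5^ delims^=^"^ eol^= %%A in ('"C:\dir one\a.jpg" "C:\dir two\b.jpg" "D:\c d\e.jpg"') do (
    echo %%A
    echo %%B
    echo %%C
)

rem Format: keyword "path"  (token 1 is the keyword part, token 2 is the path)
for /f usebackq^ tokens^=1-2^ delims^=^"^ eol^= %%A in ('keyword1 "C:\some dir\file 1.jpg"') do echo %%B

rem Format: keyword:"path"
for /f usebackq^ tokens^=1-2^ delims^=^"^ eol^= %%A in ('keyword2:"C:\some dir\file 2.jpg"') do echo %%B
```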

Last edited by RaceQ (15 Jul 2020 17:23)

----------------------------

#10 16 Jul 2020 00:32
GCRaistlin
RaceQ wrote:

This would also be useful if your input data lines were in formats like

In this case, it's more efficient to replace "?", "*", and "=" with substitute strings in such input lines, then parse them with FOR, then replace the substitute strings back.

----------------------------

#11 16 Jul 2020 00:34
GCRaistlin

I mean "fullpath1" "fullpath2" "fullpath3" case.
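A minimal sketch of the plain FOR part of this suggestion, assuming the line contains no "?" or "*" (plain FOR would treat those as file wildcards, which is why the substitution step is proposed). The variable name line and the paths are made up, and the substitution step itself is not shown.

```bat
@echo off
rem One input line holding several quoted paths.
set "line="C:\dir one\a.jpg" "C:\dir two\b.jpg" "D:\c d\e.jpg""

rem Plain FOR splits on the spaces outside the quotes; %%~a strips the quotes.
for %%a in (%line%) do echo %%~a
```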