Function to Sanitize User Input

Post by Simon Sheppard »

Here's a function that can be fed a string and will strip out all characters not in a pre-defined list.
Intended for use with any user-supplied data, e.g. input read with SET /P.

Code: Select all

@ECHO OFF
Setlocal
:: Name: Michael Wright/Simon Sheppard
:: Date: 2023-05-05
:: Desc: sanitize function
:: Sanitize an input string omitting any non-approved characters from the output

:: Example
Set "_example=Sa@mple'{~£} Text12.34$-[🏍“more”]"
Echo Input: %_example%
CALL :sanitize _example
Set "_result=%_sanitizedResult%"
Echo %_result%
goto:eof

:sanitize
Setlocal
CALL Set "_string=%%%~1%%"
Set "_sanitizedString="

:sanitizeLoopStart
IF "%_string%"=="" (GOTO :sanitizeLoopEnd)
Set _match=false
:: A question mark cannot be listed here: FOR treats it as a filename wildcard
FOR %%A IN (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 ' . - # $ + @ [ ] { } ~ ) DO (
   IF "%_string:~0,1%"=="%%A" (
      Set "_sanitizedString=%_sanitizedString%%%A"
      Set _match=true
   )
)
rem IF %_match%==false (SET "_sanitizedString=%_sanitizedString%~")
Set "_string=%_string:~1%"
GOTO :sanitizeLoopStart

:sanitizeLoopEnd
ENDLOCAL & Set "_sanitizedResult=%_sanitizedString%"

Would be interested in any thoughts on this approach or alternatives.
Also, are there any other characters that should be included/excluded by default?
e.g. if you never use DelayedExpansion then it could be safe to leave exclamation marks in.

One option I'm considering is inserting a placeholder character whenever something is removed from the input string. The problem is that double-byte Unicode characters will result in two placeholders being added to the output, which seems messy.
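
One possible way around the doubled placeholder is to add it only when the output does not already end with one. The following is a rough sketch of that idea, not a tested replacement for the function above: the name _placeholder, the choice of "_" (picked because it is not on the allowed list), the cut-down allowed list and the sample string are all assumptions made for illustration.

```batch
@ECHO OFF
Setlocal
:: Sketch only: _placeholder is an illustrative name, "_" an arbitrary choice
Set "_placeholder=_"
Set "_string=Sales;;Dept=42"
Set "_sanitizedString="

:loopStart
IF "%_string%"=="" (GOTO :loopEnd)
Set "_match=false"
FOR %%A IN (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 . -) DO (
   IF "%_string:~0,1%"=="%%A" (
      Set "_sanitizedString=%_sanitizedString%%%A"
      Set "_match=true"
   )
)
:: Add a placeholder only if the output does not already end with one
IF "%_match%"=="false" IF NOT "%_sanitizedString:~-1%"=="%_placeholder%" (
   Set "_sanitizedString=%_sanitizedString%%_placeholder%"
)
Set "_string=%_string:~1%"
GOTO :loopStart

:loopEnd
Echo %_sanitizedString%
```

The trade-off is that any run of removed characters collapses into a single placeholder, so the two semicolons in the sample string should leave one "_" rather than two, and the output no longer shows how many characters were dropped.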

Re: Function to Sanitize User Input

Post by jeb »

Nice try,

... but try to get it working with this simple test string :o

Code: Select all

Set "_example=jeb1: Part1 ^ & Part2 !"^&" & Part3"

To avoid the first problem when showing the _example variable, I changed the code a bit:

Code: Select all

REM Replace: Echo Input: %_example%
<nul set /p ".=Your Input: "
set "_example"

You cannot solve it without using delayed expansion at some point (okay, you could, but then you need a temporary file and ... and ... and ...)
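
For reference, a delayed-expansion version could look roughly like the sketch below. It is not code posted in this thread: _input, _allowed, _char and _clean are made-up names, and the allowed list is cut down to letters, digits, space, dot and hyphen for illustration. The idea is that the raw input is only ever read through !var! expansion, which happens after the line has been parsed, so quotes, carets and ampersands in the input are never seen by the parser.

```batch
@ECHO OFF
Setlocal DisableDelayedExpansion
:: Sketch only: _input, _allowed, _char and _clean are illustrative names
Set "_input="
Set /P "_input=Please enter a Department: "

Setlocal EnableDelayedExpansion
:: 65 approved characters (indexes 0 to 64), the space is deliberate
Set "_allowed=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 .-"
Set "_clean="

:nextChar
If not defined _input goto :done
:: Take the first character, then drop it from the input
Set "_char=!_input:~0,1!"
Set "_input=!_input:~1!"
:: Keep the character only if it equals one of the approved characters
:: (IF without /I is case-sensitive)
For /L %%i in (0,1,64) do if "!_char!"=="!_allowed:~%%i,1!" set "_clean=!_clean!!_char!"
goto :nextChar

:done
:: _clean now holds approved characters only, so percent expansion is safe here
Endlocal & Set "_sanitizedResult=%_clean%"
Echo Result: %_sanitizedResult%
```

Returning the result through %_clean% on the Endlocal line relies on the allowed list containing no exclamation marks, carets or quotes. If any of those are added to the list, that last step would need rethinking.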

Re: Function to Sanitize User Input

Post by Simon Sheppard »

Hi Jeb, welcome to the forum :D
You raise a good point, it's probably better not to echo the original input string at all, that's just asking for trouble.

I think it is always going to be necessary to filter out a couple of the more obvious poison characters and then use the function to remove everything else.

Code: Select all

@ECHO OFF
Setlocal
:: Sanitize an input string omitting any non-approved characters from the output
CALL Set "_string=%%%~1%%"
Set "_sanitizedString="

:sanitizeLoopStart
IF "%_string%"=="" (GOTO :sanitizeLoopEnd)
:: A question mark cannot be listed here: FOR treats it as a filename wildcard
FOR %%A IN (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 ' . - # $ + @ [ ] { } ~ ) DO (
   IF "%_string:~0,1%"=="%%A" (
      Set "_sanitizedString=%_sanitizedString%%%A"
   )
)
Set "_string=%_string:~1%"
GOTO :sanitizeLoopStart

:sanitizeLoopEnd
ENDLOCAL & Set "_sanitizedResult=%_sanitizedString%"

Example calling the script:

Code: Select all

Set /P _ans=Please enter a Department:

:: remove &'s and quotes from the answer (via string replace)
Set _ans=%_ans:&=%
Set _ans=%_ans:"=%

:: Sanitize
CALL sanitize.cmd _ans
Echo %_sanitizedResult%

There are probably a few more ASCII punctuation characters which could be on the allowed list, like '!' if not using delayed expansion.
I put a copy of this on the main site here, but I'm still open to any improvements anyone can offer.