Intended for use with any user-supplied data, e.g. input read with SET /P
@ECHO OFF
Setlocal
:: Name: Michael Wright/Simon Sheppard
:: Date: 2023-05-05
:: Desc: sanitize function
:: Sanitize an input string omitting any non-approved characters from the output
:: Example
Set "_example=Sa@mple'{~£} Text12.34$-[🏍“more”]"
Echo Input: %_example%
CALL :sanitize _example
Set "_result=%_sanitizedResult%"
Echo %_result%
goto:eof
:sanitize
Setlocal
CALL Set "_string=%%%~1%%"
Set "_sanitizedString="
:sanitizeLoopStart
IF "%_string%"=="" (GOTO :sanitizeLoopEnd)
Set _match=false
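rem ? cannot be listed in a plain FOR set (FOR treats ? and * as filename wildcards), so it is tested separately below.
FOR %%A IN (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 ' . - # $ + @ [ ] { } ~ ) DO (
IF "%_string:~0,1%"=="%%A" (
Set "_sanitizedString=%_sanitizedString%%%A"
Set _match=true
)
)
IF "%_string:~0,1%"=="?" (
Set "_sanitizedString=%_sanitizedString%?"
Set _match=true
)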
rem IF %_match%==false (SET "_sanitizedString=%_sanitizedString%~")
Set "_string=%_string:~1%"
GOTO :sanitizeLoopStart
:sanitizeLoopEnd
ENDLOCAL & Set "_sanitizedResult=%_sanitizedString%"
Also, are there any other characters that should be included or excluded by default?
For example, if you never use DelayedExpansion then it could be safe to leave exclamation marks in.
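To illustrate why exclamation marks matter, here is a minimal sketch. The variable name _input and the sample text are made up for the demo, and it assumes no variable called Really is defined. With delayed expansion disabled the text should be echoed unchanged; with it enabled, the !Really! part is treated as a variable reference and should disappear, leaving only Wow.

```batch
@ECHO OFF
Setlocal DisableDelayedExpansion
rem Hypothetical sample input containing exclamation marks
Set "_input=Wow!Really!"
rem Delayed expansion is off here, so the ! characters are plain text
Echo Disabled: %_input%
Setlocal EnableDelayedExpansion
rem After %_input% is expanded, !Really! is parsed as a (missing) variable
Echo Enabled:  %_input%
Endlocal
Endlocal
```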
One option I'm considering is inserting a placeholder character whenever something is removed from the input string. The problem is that a double-byte Unicode character results in two placeholders being added to the output, which seems messy.
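One possible way around the doubled placeholders is to add a placeholder only at the start of each run of rejected characters. The sketch below is a variation on the function above, not a tested replacement. The label names, the _pending flag and the underscore placeholder are my own assumptions. The underscore is used because it is not in the approved list. ? is left out of the FOR set because FOR treats it as a filename wildcard.

```batch
@ECHO OFF
Setlocal
rem Hypothetical sample input; the space is not an approved character
Set "_example=Sa@mple Text12.34"
CALL :sanitizeWithPlaceholder _example
Echo %_sanitizedResult%
goto :eof

:sanitizeWithPlaceholder
Setlocal
CALL Set "_string=%%%~1%%"
Set "_sanitizedString="
rem _pending is true while we are inside a run of rejected characters
Set "_pending=false"
:placeholderLoopStart
IF "%_string%"=="" (GOTO :placeholderLoopEnd)
Set "_match=false"
FOR %%A IN (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 ' . - # $ + @ [ ] { } ~) DO (
  IF "%_string:~0,1%"=="%%A" (
    Set "_sanitizedString=%_sanitizedString%%%A"
    Set "_match=true"
  )
)
rem An approved character ends the current run of rejected characters
IF %_match%==true Set "_pending=false"
rem Only the first rejected character of a run adds a placeholder
IF %_match%==false IF %_pending%==false (
  Set "_sanitizedString=%_sanitizedString%_"
  Set "_pending=true"
)
Set "_string=%_string:~1%"
GOTO :placeholderLoopStart
:placeholderLoopEnd
ENDLOCAL & Set "_sanitizedResult=%_sanitizedString%"
goto :eof
```

With this approach a multi-byte character, or several removed characters in a row, should collapse into a single underscore. The trade-off is that the output no longer shows how many characters were removed.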