- @Echo off
- For /f "tokens=4 delims=: " %%G in ('CHCP')Do Set "Restore_Codepage=CHCP %%G > nul"
- Set "Return[Len]=" & Set "Return[String]=" & Set "{input}=" & Set "Modified="
- Setlocal DISABLEDelayedExpansion
- REM the label marker ":#" is used within this script to delimit help output.
- :#
- :# ========================= ASCII string filter v3.1 by T3RRY ======================
- Rem - This script iterates over an input string character by character and tests
- Rem each character against a whitelist of printable ASCII characters, with
- Rem successful matches used to build a new string containing only printable
- Rem ASCII characters.
- Rem - Execution time increases as string length increases. Each character in the
- Rem string is tested against a whitelist containing 95 printable ASCII characters (TAB is handled separately by substitution).
- :#
- :# Usage: Filepath <"String"> [ /P | /R | /T ] | [ -? | /? | -help ]
- :#
- :# Rem to use from another batch file:
- :# For /f delims^= %%G in ('FilePath "string"')Do Echo(%%G
- :#
- :# Accepts input String via doublequoted argument - reads %* and trims trailing " /P" " /T" or " /R"
- :# switches if present at EOL
- :# - No escaping of characters in the argument is required
- :# - If unbalanced doublequotes exist in the string all doublequotes will be Removed.
- :#
- :# Use Switch /P to preserve original spaces
- :# - Default behaviour is to Remove all double spaces from the string.
- :# Errorlevels:
- :# 0 : String contained only printable ASCII characters; Return[String]
- :# contains the original input string.
- :# -1 : String contained NonASCII or nonprintable ASCII characters;
- :# Return[String] contains only printable ASCII characters
- :# from the input string.
- :#
- :# Use Switch /R to reject input containing NonASCII characters
- :# - Errorlevel 0 : string contains only printable ASCII Characters
- :# - Errorlevel 1 or GTR: string contains one or more characters that are
- :# not ASCII printable characters. The errorlevel corresponds to
- :# the 1 indexed position of first non ASCII character encountered.
- :# Note: the presence of TAB literals in the string will result
- :# in an incorrect position being reported.
- :#
- :# Use Switch /T to truncate strings on first occurrence of a non-ASCII character
- :# - Errorlevel returned is String length
- :# - String returned in Return[String] variable
- :#
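- Rem Example, a sketch only and not part of the original script: testing the /R
- Rem result from a calling batch file. "Filter.bat" is a hypothetical name for this file.
- Rem   Call "Filter.bat" "some user input" /R
- Rem   If Errorlevel 1 Echo Rejected: input contains a non-ASCII or nonprintable character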
- ::::::::::::::::::::::::::::::::::
- Rem Version changes 20/Jan/2021 :
- Rem - Added switch: /T
- Rem Truncates string on occurrence of first non ASCII character
- ::::::::::::::::::::::::::::::::::
- Rem Version changes 11/Dec/2021 :
- Rem - Added TAB to ASCII printable characters. Handled via substitution. See help for more info.
- Rem - Script now differentiates between original paired spaces and paired spaces
- Rem resulting from removal of non ASCII characters.
- ::::::::::::::::::::::::::::::::::
- Rem Version changes 09/Dec/2021 :
- Rem - Changed input method to handle cases where quoted args contain
- Rem standard delims within quotes IE: "string "substring=text""
- Rem - Implemented negative errorlevel return: -1 to flag if
- Rem the input string has been modified. 0 indicates unmodified, -1 modified.
- ::::::::::::::::::::::::::::::::::
- Rem Version changes 08/Dec/2021 :
- Rem - Added Help Switches -? /? and -help
- Rem - Added switch: /R
- Rem - Reject strings containing non ASCII characters. Default: Strip NonASCII
- Rem characters from the string.
- Rem Note: this switch does not define Return[Len] or Return[String]
- ::::::::::::::::::::::::::::::::::
- Rem Version changes 07/Dec/2021 :
- Rem - Rewritten for faster performance - NOTE:
- Rem - Added Switch: /P
- Rem - Preserve all whitespace. Default: multiple spaces truncated to single.
- Rem - Renamed variable for returning String : Return[String]
- Rem - Added variable Return[Len] to return 0 indexed string length.
- Rem - Corrected handling of completely non ASCII strings to return empty / 0 Len
- Rem ** Utilize alternate data stream to store variable containing printable ASCII
- Rem characters so the variable only needs to be generated on first execution.
- Rem ** Requires this batch file to be run from an NTFS drive.
- :# =================================================================================
- Set LF=^
- %= Empty lines above required =%
- For /F eol^=^%LF%%LF%^ delims^= %%A in ('forfiles /p "%~dp0." /m "%~nx0" /c "cmd /c echo(0x09"') do Set "TAB=%%A"
- Set "ASCII= !"
- 2> nul (
- more < "%~f0:ASCII.dat" > nul || (
- Setlocal EnableDelayedExpansion
- For /l %%i in (34 1 126) Do (
- Cmd /c Exit %%i
- Set "ASCII=!ASCII!!=ExitCodeAscii!"
- )
- >"%~f0:ASCII.dat" (Echo(Set ^^"ASCII=!ASCII!")
- ENDLOCAL
- ))
- Set "ASCII="
- For /f "delims=" %%G in ('More ^< "%~f0:ASCII.dat"')Do %%G
- If not Defined ASCII (
- 2> nul (
- Powershell.exe -nologo -noprofile -command "Remove-item -path '%~nx0' -Stream '*'"
- )
- 1>&2 Echo(An error has occurred. Ensure "%~nx0" is located on an NTFS drive.
- Pause
- ENDLOCAL
- Exit /b 1
- )
- Rem Maximum stringlength to support. Modify here to propagate to RemoveChar loop and Return[Len]
- REM maximum 1015 chars due to input reading method.
- Set "SupportLength=1015"
- Set "{input}="
- ::====================================================================================================
- Rem :: input capture method is a modified version of Dave Benham's method:
- Rem :: https://www.dostips.com/forum/viewtopic.php?t=4288#p23980
- SETLOCAL EnableDelayedExpansion
- 1>"%~f0:Params.dat" <"%~f0:Params.dat" (
- SETLOCAL DisableExtensions
- Set prompt=#
- Echo on
- For %%a in (%%a) do rem . %*.
- Echo off
- ENDLOCAL
- Set /p "{input}="
- Set /p "{input}="
- Set "{input}=!{input}:~7,-2!"
- @Rem duplicate {input} for the purpose of counting doublequotes.
- Set "count=!{input}!"
- ) || (
- 1>&2 Echo(%~nx0 requires an NTFS drive system to function as intended.
- CMD /C Exit -1073741510
- ) || Goto:Eof
- ::====================================================================================================
- Rem the below line can be used to Remove the alternate data stream this file creates.
- Rem Powershell -c "Remove-item -path '%~nx0' -Stream '*'"
- CHCP 65001 > nul
- If not defined {input} (
- Echo(Demo:
- Rem escaped for definition in DelayedExpansion environment
- Set "{input}=this is [ ] a demo) * ^! & ☺ ^= ¶ | ^! <. ~ ^^ & %% ▒ ╔ § ♣ This"
- Set {input}
- )
- REM handle help switches
- Set {input} | %SystemRoot%\System32\Findstr.exe /Xli "{input}=\/? {input}=-? {input}=-help" > nul && (
- Setlocal EnableDelayedExpansion
- For /f "tokens=2* delims=#" %%G in ('%SystemRoot%\System32\Findstr.exe /blic:":# " "%~f0"')Do (
- Set "Usage=%%G"
- Echo(!Usage:Filepath=%~f0!
- )
- ENDLOCAL & ENDLOCAL
- Exit /b 0
- )
- REM substitute doublequotes in {input} clone 'count'; count substring in string;
- REM assess if count is even; If false; Remove doublequotes from string.
- Set Div="is=#", "1/(is<<31)"
- Set "{DQ}=0"
- Set ^"count=!count:"={DQ}!"
- 2> nul Set "null=%count:{DQ}=" & Set /A {DQ}+=1& set "null=%"
- Set /A !Div:#={DQ} %% 2! 2> nul && (%= Doublequote count is Odd. =%
- Set ^"{input}=!{input}:"=!"
- )
- REM handle nonhelp switches /R, /P and /T [ mutually exclusive; only enacted if switch terminates commandline input. ]
- Set "ASCIISwitch[R]="
- Set "ASCIISwitch[P]="
- Set "ASCIISwitch[T]="
- If defined {input} (
- Set {input} | %SystemRoot%\System32\findstr.exe /Eli "\/P \/R \/T" > nul && (
- If /I "!{input}:~-3!"==" /P" (
- Set "{input}=!{input}:~0,-3!"
- Set "ASCIISwitch[P]=true"
- ) Else If /I "!{input}:~-3!"==" /R" (
- Set "{input}=!{input}:~0,-3!"
- Set "ASCIISwitch[R]=true"
- ) Else If /I "!{input}:~-3!"==" /T" (
- Set "{input}=!{input}:~0,-3!"
- Set "ASCIISwitch[T]=true"
- )))
- Rem Remove outer doublequotes from input argument if not already removed due to unbalanced quoting.
- If .^%{input}:~0,1%^%{input}:~-1%. == ."". Set "{input}=!{input}:~1,-1!"
- Rem Substitute TAB
- If not defined ASCIISwitch[R] If not defined ASCIISwitch[T] For /f "delims=" %%G in ("!TAB!")Do Set "{input}=!{input}:%%G={TAB}!"
- Rem Substitute Paired spaces prior to character removal
- If not defined ASCIISwitch[R] If not defined ASCIISwitch[T] Set "{input}=!{input}:  ={2xSp}!"
- Rem RemoveChar loop - iterate over input character by character; Compare against each character in whitelist
- Rem Appends ASCII Whitelist characters to New string unless /R switch used, in which case NonASCII characters
- Rem trigger an exit of the script with a positive errorlevel indicating the string is not ASCII.
- Rem the return value is the 1 indexed position of the first non ascii character encountered.
- Set "end=" & Set "New="
- For /l %%i in (0 1 %SupportLength%)Do If not "!{input}:~%%i,1!"=="" (
- Set "Char=!{input}:~%%i,1!"
- Set "ISAscii="
- For /l %%c in (0 1 94)Do If not "!ASCII:~%%c,1!" == "" (
- Set "C_Char=!ASCII:~%%c,1!"
- if "!Char!"=="!C_Char!" (
- Set "New=!New!!Char!"
- Set "ISAscii=true"
- ))
- If Not Defined ISAscii (
- If Defined ASCIISwitch[T] (
- For /f "delims=" %%G in ("!New!")Do (
- Echo(!New!
- Endlocal & Endlocal & Set "Return[string]=%%G"
- %Restore_Codepage%
- )
- Exit /b %%i
- )
- Set "Modified=true"
- If Defined ASCIISwitch[R] (
- Endlocal & Endlocal & %Restore_Codepage%
- For /f "delims=" %%G in ('Set /A %%i+1')Do Exit /b %%G
- )))
- Rem strip new Paired spaces from string if switch /P not used.
- Set "{Input}=!New!"
- If not Defined ASCIISwitch[P] (
- For /l %%i in (0 1 9)Do if defined {Input} Set "{Input}=!{Input}:  = !"
- )
- Rem reinsert original paired spaces and Tab:
- If defined {input} (
- Set "{input}=!{input}:{2xSp}=  !"
- Set "{input}=!{input}:{TAB}=%TAB%!"
- )
- If defined {input} (
- Echo(
- <nul Set /p "=!{input}!"
- For /l %%i in (0 1 %SupportLength%)Do If not defined Return[Len] If "!{input}:~%%i,1!"=="" Set "Return[Len]=%%i"
- ) Else (
- ENDLOCAL & ENDLOCAL & Set "Return[Len]=0" & Set "Modified=%Modified%"
- Set "Return[String]="
- )
- If defined {input} For /f "Delims=" %%G in ("!{Input}!")Do (
- ENDLOCAL & ENDLOCAL & Set "Return[Len]=%Return[Len]%" & Set "Return[string]=%%G" & Set "Modified=%Modified%"
- )
- %Restore_Codepage%
- If not defined modified Exit /B 0
- Exit /b -1