Canonicalization mistakes occur when an application makes a security decision based on a name (such as a filename, a directory name, or a URL) that has more than one possible representation. If the check examines one representation while the resource is reached through another, the security check can be bypassed.
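As an illustration of the idea above, the following minimal sketch reduces a user-supplied path to a single canonical form before making the access decision. The function name is_within_base and the directory /srv/data are hypothetical and chosen only for this example; the sketch assumes decoding of any escapes has already happened, and it is not a complete defense.

```python
import os


def is_within_base(base_dir: str, user_path: str) -> bool:
    """Return True only if user_path resolves to a location inside base_dir.

    Hypothetical helper for illustration: both paths are reduced to one
    canonical form (absolute, symlinks resolved, no ".." segments) before
    the security decision is made on them.
    """
    base = os.path.realpath(base_dir)
    # os.path.join discards base when user_path is absolute, so an absolute
    # path supplied by the caller is canonicalized and compared as-is.
    candidate = os.path.realpath(os.path.join(base, user_path))
    try:
        return os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised, for example, when the paths are on different Windows drives.
        return False


if __name__ == "__main__":
    print(is_within_base("/srv/data", "report.txt"))
    print(is_within_base("/srv/data", "../../autoexec.bat"))
```

The design point is that the comparison happens on the canonical form, not on the string the caller supplied, so "a/../b.txt" and "b.txt" are treated as the same name.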
Test Cases
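1. For every potentially vulnerable API, attempt to read/write files using the following name variations: “FileName::$DATA” (the NTFS default data stream syntax) and “File~Name.txt” (a tilde in the name, as used by 8.3 short file names).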
2. Attempt to read/write a file by using parent paths (e.g. /../../autoexec.bat).
3. Use hexadecimal escape codes (e.g. %20 for the space character) to represent characters in the path in an attempt to read/write a file.
4. Use UTF-8 variable-width encoding to read/write a file. A lenient UTF-8 decoder may accept overlong (non-shortest) byte sequences, so one character can map to several byte representations, which is problematic for name checks. For instance, use %c0%af (an overlong two-byte encoding of /) to read/write a file.
5. Use UCS-2 Unicode encoding (e.g. %u002f for /) and double encoding (e.g. %252f, which decodes first to %2f and then to /) to read/write a file.
6. Use HTML escape codes (e.g. &lt; and &gt;) to read/write a file from a web page (if applicable).
7. Pass a very long file name in an attempt to read/write a file.
Analyze the data management layer for code that accepts file paths as input, and examine it for canonicalization mistakes.
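The name variations in the test cases above can be generated mechanically. The sketch below is one possible way to do that; the function name canonicalization_payloads, its parameters, and the dictionary keys are hypothetical, the %u form assumes characters in the Basic Multilingual Plane, and the default length of 300 is an arbitrary choice rather than a known limit of any particular API.

```python
import html
from urllib.parse import quote


def canonicalization_payloads(filename: str, depth: int = 2, long_len: int = 300) -> dict:
    """Build alternate representations of filename for canonicalization tests.

    Hypothetical helper: each entry corresponds to one of the numbered
    test cases (stream suffix, parent paths, escapes, encodings, length).
    """
    percent = quote(filename, safe="")  # e.g. a space becomes %20
    return {
        # Test case 1: NTFS default data stream suffix.
        "alternate_data_stream": filename + "::$DATA",
        # Test case 2: parent-path traversal, e.g. /../../autoexec.bat.
        "parent_path": "/" + "../" * depth + filename,
        # Test case 3: hexadecimal (percent) escape codes.
        "percent_encoded": percent,
        # Test case 4: overlong UTF-8 encoding of "/" as %c0%af.
        "overlong_utf8_slash": "..%c0%af" * depth + filename,
        # Test case 5: %u-style UCS-2 escapes and double percent-encoding.
        "ucs2_percent_u": "".join(f"%u{ord(c):04x}" for c in filename),
        "double_encoded": quote(percent, safe=""),
        # Test case 6: HTML escape codes for < and >.
        "html_escaped": html.escape("<" + filename + ">"),
        # Test case 7: a very long file name.
        "long_name": "A" * long_len + filename,
    }


if __name__ == "__main__":
    for label, payload in canonicalization_payloads("autoexec.bat").items():
        print(f"{label}: {payload[:60]}")
```

Each payload would then be passed to the API under test, and the tester checks whether the read or write succeeds where the plain name would have been refused.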