What is a path traversal vulnerability?
What are path or directory traversal vulnerabilities, and how can you protect against them?
A path traversal vulnerability (CWE-22), also known as a directory traversal vulnerability, is a type of security flaw that allows attackers to access files and directories they should not be able to reach. These flaws typically occur in web applications, but can affect any software that handles file paths.
Like many other common flaws, path traversal vulnerabilities are theoretically easy to eradicate, but show no signs of going away. They have accounted for roughly 5% of all new CVEs every year since 2014, make up roughly 5% of CISA’s known exploited vulnerabilities catalog, and are so widely abused by criminals that CISA and the FBI published a Secure by Design Alert about them in May 2024.
How they work
Path traversal vulnerabilities exploit the fact that a file on a website has two addresses. The first address is the file’s private location on a computer file system, and the second address is the public URL used to access the file via HTTP.
When a user requests the file using the public URL, web server software maps the public address to the private file system location and retrieves the file.
For example, a website might offer a file called report.pdf, with a public URL that looks like this:
https://www.example.com/?file=report.pdf
This URL maps to the file report.pdf in the directory /var/www/html on the server's file system:
/
└── var/
    └── www/
        └── html/
            └── report.pdf
On a secure, well-configured website or application, the web server enforces rules that restrict the files and directories a website user has access to.
On a website with a path traversal vulnerability, an attacker can manipulate a public HTTP address to access parts of the private file system they should not be able to reach. Typically, they do this using “relative paths,” file paths that contain symbols like ../, which means “go up one level from here”.
So, a path traversal vulnerability might allow a user to replace report.pdf in our example URL with a relative path, like ../../../../etc/passwd.
https://www.example.com/?file=../../../../etc/passwd
Note that the URL might be URL encoded, in which case it would look like this:
https://www.example.com/?file=%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc/passwd
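Both URLs above map to /var/www/html/../../../../etc/passwd on the file system. Three ../ steps are enough to climb from /var/www/html to the root directory, and any extra steps have no further effect, so the path is interpreted as /etc/passwd, the Unix password file.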
/
└── etc/
    └── passwd
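The mapping described above can be sketched in a few lines of Python. This is an illustration only, not how any particular web server works: the function name naive_file_path and the DOCUMENT_ROOT constant are hypothetical, and the code manipulates path strings without touching the disk. It assumes a handler that joins the file parameter onto the document root with no validation, which is the kind of coding error that makes traversal possible.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # document root from the example above


def naive_file_path(url: str) -> str:
    """Map the ?file= parameter to a file system path with no validation (vulnerable)."""
    # parse_qs splits the query string and percent-decodes the values
    query = parse_qs(urlsplit(url).query)
    requested = query["file"][0]
    # Pure string handling: join onto the document root, then collapse any "../"
    return posixpath.normpath(posixpath.join(DOCUMENT_ROOT, requested))


print(naive_file_path("https://www.example.com/?file=report.pdf"))
print(naive_file_path("https://www.example.com/?file=../../../../etc/passwd"))
print(naive_file_path(
    "https://www.example.com/?file=%2E%2E%2F%2E%2E%2F%2E%2E%2F%2E%2E%2Fetc/passwd"
))
```

The first call stays inside the document root, while the plain and URL-encoded traversal strings both resolve to /etc/passwd.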
Path traversal vulnerabilities can be used to access confidential data, such as credentials, and in some cases they can even be used to run arbitrary code, which can lead to a complete takeover of the target.
Protecting against path traversal attacks
Software you use
Preventing path traversal attacks starts with correct server configuration, beginning with the correct file and directory permissions. Publicly accessible files should also be contained within the web server’s document root rather than scattered around the file system. It is worth noting, too, that on Windows relative paths cannot span disk drives, which is why you will see recommendations like: “For Windows IIS servers, the web root should not be on the system disk.” If the web root shares a disk with the system directories, a successful attacker could use a path traversal vulnerability to reach them.
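The point about disk drives can be illustrated with Python's ntpath module, which applies Windows path rules on any platform. The layout here is an assumption made for the example: a web root at D:\wwwroot and system files on C:. The normalization is purely lexical, so treat it as a sketch of the idea rather than a guarantee about every Windows API.

```python
import ntpath  # Windows path rules, usable on any operating system

# Hypothetical layout: web root on D:, system files on C:
traversal = ntpath.join(r"D:\wwwroot", r"..\..\..\Windows\win.ini")
resolved = ntpath.normpath(traversal)
drive = ntpath.splitdrive(resolved)[0]

print(resolved)  # the "..\" steps stop at the root of D:
print(drive)     # the drive letter is unchanged
```

However many ..\ steps are supplied, the resolved path stays on D:. This only covers relative sequences: input that names a drive itself (for example, a value beginning with C:\) still has to be rejected by validation.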
The best way to protect your organization against path traversal vulnerabilities caused by coding errors in software you use is to use a tool like ThreatDown’s vulnerability and patch management. This ensures that patches are applied quickly and efficiently, before an attacker can reverse engineer them to create an exploit. A web application firewall (WAF) can provide a useful second layer of protection, by blocking suspicious-looking requests.
Software you maintain
If your website uses software written in house, or your organization is a software vendor, then you are responsible for ensuring that path traversal vulnerabilities don’t creep into your code.
This begins with validating and sanitizing all user input so that users are restricted to a set of allowed actions. Validation ensures that input matches what’s expected, while sanitization removes or renders safe anything that might be harmful. Note that if you are dealing with URLs, input may arrive URL encoded. (When dealing with URL encoding, watch out for null bytes being used to bypass validation: ?file=report.exe%00.pdf will be interpreted by some applications as ending in “.pdf”, while lower-level code that treats the null byte as the end of the string sees a file ending in “.exe”.)
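As a minimal sketch of validation on arrival, the function below decodes the value, rejects null bytes, and accepts only names that match an allowlist. The function name validate_file_param, the ALLOWED_NAME pattern, and the policy it encodes (a bare file name ending in .pdf or .txt) are assumptions made for illustration. It also assumes the value is still URL encoded; if your framework has already decoded it, the unquote step should be dropped.

```python
import re
from urllib.parse import unquote

# Hypothetical policy: a bare file name with one of a few allowed extensions
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+\.(?:pdf|txt)")


def validate_file_param(raw_value: str) -> str:
    """Return the decoded file name if it matches the allowlist, else raise ValueError."""
    decoded = unquote(raw_value)  # turn %2E, %2F, %00 and so on into characters
    if "\x00" in decoded:
        raise ValueError("null byte in file name")
    # fullmatch must match the whole string, so "/" and ".." cannot slip through
    if ALLOWED_NAME.fullmatch(decoded) is None:
        raise ValueError("file name not allowed")
    return decoded
```

With this policy, report.pdf is accepted, while ..%2F..%2Fetc%2Fpasswd and report.exe%00.pdf both raise ValueError. An allowlist of what is permitted is generally easier to get right than a blocklist of dangerous sequences.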
Input should be validated and sanitized on arrival, and again before it is used in system calls, to protect against attackers providing input in unexpected ways.
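One common way to check input again just before it reaches a system call is to resolve the final path and confirm it is still inside the intended directory. The sketch below assumes the /var/www/html document root from the earlier example; resolve_inside_base and BASE_DIR are hypothetical names, and the check complements input validation rather than replacing it.

```python
from pathlib import Path

BASE_DIR = Path("/var/www/html")  # document root from the earlier example


def resolve_inside_base(file_name: str, base_dir: Path = BASE_DIR) -> Path:
    """Resolve file_name under base_dir, raising ValueError if it escapes."""
    base = base_dir.resolve()
    # resolve() makes the path absolute, collapses ".." and follows symlinks
    candidate = (base / file_name).resolve()
    # relative_to() raises ValueError when candidate is not inside base
    candidate.relative_to(base)
    return candidate
```

Calling resolve_inside_base("report.pdf") returns a path inside the document root, while "../../../../etc/passwd" and the absolute path "/etc/passwd" both raise ValueError. Only the returned path should be passed to open() or similar calls, never the original input.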
Rather than inventing your own validation and sanitization routines, we recommend you use robust, well-maintained libraries and frameworks to do it for you, wherever possible.