Introduction
Your application serves user-uploaded files through a download endpoint. The filename parameter seemed safe until a security researcher submits ../../../etc/passwd as the filename, and your server returns the system password file. The same technique reveals database credentials, API keys, and source code.
Path traversal occurs when applications use user input to construct file paths without proper validation. Attackers manipulate path components to escape intended directories and access arbitrary files. This guide explores attack techniques, prevention strategies, and why AI-powered testing excels at finding these vulnerabilities.
Understanding the Risk
Basic Path Traversal:
# Vulnerable file download endpoint
@app.route('/download')
def download_file():
filename = request.args.get('file')
filepath = f"/var/www/uploads/{filename}" # Direct concatenation
return send_file(filepath)Attack: GET /download?file=../../../etc/passwd
The path /var/www/uploads/../../../etc/passwd resolves to /etc/passwd.
Encoding Bypass Techniques
Applications filtering ../ may be vulnerable to encoded variants:
%2e%2e%2f = ../
%252e%252e%252f = ../ (double encoded)
..%c0%af = ../ (overlong UTF-8)
....// = ../ (if ../ removed once)
..%5c = ..\ (Windows)
Attack Targets
# Linux
/etc/passwd
/proc/self/environ # Environment variables
/var/www/app/.env # Application secrets
# Windows
C:\Windows\System32\config\SAM
C:\inetpub\wwwroot\web.configPrevention Best Practices
Use Path Canonicalization
import os
UPLOAD_DIR = '/var/www/uploads'
def get_safe_path(filename):
requested_path = os.path.join(UPLOAD_DIR, filename)
canonical_path = os.path.realpath(requested_path)
if not canonical_path.startswith(os.path.realpath(UPLOAD_DIR) + os.sep):
raise ValueError("Access denied: path traversal attempt")
return canonical_pathUse Basename Extraction
def download_file_safely(user_filename):
safe_filename = os.path.basename(user_filename)
if not safe_filename:
raise ValueError("Invalid filename")
filepath = os.path.join(UPLOAD_DIR, safe_filename)
return send_file(filepath)Implement File ID Mappings
FILE_REGISTRY = {}
def register_upload(filepath):
file_id = hashlib.sha256(filepath.encode()).hexdigest()[:16]
FILE_REGISTRY[file_id] = filepath
return file_id
def download_by_id(file_id):
if file_id not in FILE_REGISTRY:
raise FileNotFoundError("File not found")
return send_file(FILE_REGISTRY[file_id])Normalize Before Validation
from urllib.parse import unquote
def safe_file_access(user_filename):
decoded = unquote(unquote(user_filename)) # Double decode
if '..' in decoded or '/' in decoded or '\\' in decoded:
raise ValueError("Invalid path characters")
return get_safe_path(decoded)Node.js Example
const path = require('path');
const UPLOAD_DIR = '/var/www/uploads';
function getSecurePath(userFilename) {
const requestedPath = path.join(UPLOAD_DIR, userFilename);
const canonicalPath = path.resolve(requestedPath);
if (!canonicalPath.startsWith(path.resolve(UPLOAD_DIR) + path.sep)) {
throw new Error('Access denied');
}
return canonicalPath;
}Why Traditional Pentesting Falls Short
Path traversal appears in file downloads, uploads, template loading, and backup restoration. The variety of encoding bypass techniques requires testing with hundreds of payload variations across different operating systems and frameworks.
How AI-Powered Testing Solves It
RedVeil's AI agents identify file path handling across your application, testing with comprehensive payload sets covering encoding bypasses and platform-specific path separators. When vulnerabilities are found, RedVeil demonstrates impact by showing accessible files.
Conclusion
Path traversal transforms file-serving features into arbitrary file read capabilities. Effective defense centers on path canonicalization—resolving the full path and verifying it remains within allowed boundaries.
AI-powered penetration testing from RedVeil identifies path traversal vulnerabilities, systematically testing encoding bypasses and platform-specific techniques.
Protect your files from path traversal attacks—test with RedVeil today.