
Extracting files from Burp Intruder
After discovering an authenticated Insecure Direct Object Reference (IDOR) vulnerability that allowed logged-in users to download files, I needed to get those files out of Burp. I could have used a loop with wget or curl, but I was on Windows and it was much easier to handle the cookies within Burp. So I ran the attack in Intruder and then saved the items to a file.
Burp exports an XML file in which both the request and the response are base64-encoded. The document looks like this:
<?xml version="1.0"?>
<!DOCTYPE items [
<!ELEMENT items (item*)>
<!ATTLIST items burpVersion CDATA "">
<!ATTLIST items exportTime CDATA "">
<!ELEMENT item (time, url, host, port, protocol, method, path, extension, request, status, responselength, mimetype, response, comment)>
<!ELEMENT time (#PCDATA)>
<!ELEMENT url (#PCDATA)>
<!ELEMENT host (#PCDATA)>
<!ATTLIST host ip CDATA "">
<!ELEMENT port (#PCDATA)>
<!ELEMENT protocol (#PCDATA)>
<!ELEMENT method (#PCDATA)>
<!ELEMENT path (#PCDATA)>
<!ELEMENT extension (#PCDATA)>
<!ELEMENT request (#PCDATA)>
<!ATTLIST request base64 (true|false) "false">
<!ELEMENT status (#PCDATA)>
<!ELEMENT responselength (#PCDATA)>
<!ELEMENT mimetype (#PCDATA)>
<!ELEMENT response (#PCDATA)>
<!ATTLIST response base64 (true|false) "false">
<!ELEMENT comment (#PCDATA)>
]>
<items burpVersion="2022.12.4" exportTime="Mon Dec 12 10:00:00 CET 2022">
<item>
<time>Mon Dec 12 10:00:00 CET 2022</time>
<url><![CDATA[https://domain.tld/documents/15008/download/]]></url>
<host ip="X.X.X.X">domain.tld</host>
<port>443</port>
<protocol>https</protocol>
<method><![CDATA[GET]]></method>
<path><![CDATA[/documents/15008/download/]]></path>
<extension>null</extension>
<request base64="true"><![CDATA[REDACTED]]></request>
<status>200</status>
<responselength>35419</responselength>
<mimetype></mimetype>
<response base64="true"><![CDATA[REDACTED]]></response>
<comment></comment>
</item>
</items>

To extract the files from it, I used the following ugly script.
import base64
import os
import re
import xml.etree.ElementTree as ET

# Variables
path = 'document-download.burp'
output_path = './extracted/'


def main():
    # Make sure the output directory exists
    os.makedirs(output_path, exist_ok=True)
    # Parse Burp file
    mytree = ET.parse(path)
    myroot = mytree.getroot()
    # Search through each item in the file
    for item in myroot.findall('item'):
        try:
            # Retrieve the response
            based = item.find('response').text
            # Decode the response
            data = base64.b64decode(based)
            # Split headers from content at the first blank line only:
            # a binary body may itself contain b'\r\n\r\n'
            headers, _, content = data.partition(b'\r\n\r\n')
            # Extract the filename from the response headers
            regex_expression = r'filename="?([a-zA-Z0-9\-_.]+)"?'
            filenames = re.findall(regex_expression, headers.decode("utf-8"))
            if len(filenames) <= 0:
                raise Exception("No filename was identified")
            # Get the first match
            filename = filenames[0]
            # Generate the output path
            output_name = os.path.join(output_path, filename)
            print(f"[+] Extracted : {filename}")
            # Write the body to a file using extracted filename
            with open(output_name, "wb") as f:
                f.write(content)
        # If something goes wrong, print the exception
        except Exception as e:
            print(e)


if __name__ == "__main__":
    main()

The script takes the filename from the Content-Disposition HTTP header of the response.
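
As a possible alternative to the regular expression, the Content-Disposition header can be read with the standard library's email parser, which handles quoted and unquoted values. The sketch below is only one way to do it: the helper names (`response_bytes`, `split_response`, `filename_from_headers`, `iter_files`) are made up for this example, and it assumes that a response exported without base64 encoding can be treated as UTF-8 text.

```python
import base64
import os
import xml.etree.ElementTree as ET
from email.parser import BytesParser


def response_bytes(item):
    """Return the raw response of an <item>, honoring its base64 attribute."""
    node = item.find("response")
    text = node.text or ""
    if node.get("base64") == "true":
        return base64.b64decode(text)
    # Assumption: a response that was not base64-encoded is UTF-8 text.
    return text.encode("utf-8")


def split_response(raw):
    """Split a raw HTTP response into (header block, body) at the first blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


def filename_from_headers(head):
    """Return the Content-Disposition filename, or None if there is none."""
    # Drop the status line ("HTTP/1.1 200 OK"); the rest parses like MIME headers.
    _, _, header_lines = head.partition(b"\r\n")
    message = BytesParser().parsebytes(header_lines + b"\r\n\r\n", headersonly=True)
    name = message.get_filename()
    if not name:
        return None
    # Keep only the last path component so the name cannot point outside the output folder.
    return os.path.basename(name.replace("\\", "/")) or None


def iter_files(root):
    """Yield (filename, body) for every item whose response names a file."""
    for item in root.findall("item"):
        head, body = split_response(response_bytes(item))
        name = filename_from_headers(head)
        if name:
            yield name, body
```

Calling `iter_files(ET.parse('document-download.burp').getroot())` would then yield the filename and body pairs to write to disk, skipping items that have no filename in their headers.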