About a week ago, I was asked if I had tools for OneNote files.
I don’t, and I had no time to take a closer look.
But last Thursday night, I found some time and looked at this OneNote maldoc sample.
I opened the file in the binary editor I use often (010 Editor):
I expected to see a magic header, a special sequence of bytes that would tell me which file type is used. I didn’t see that, but I noticed that the first 16 bytes look random. And they were the same for another sample. So this could be a GUID. In Microsoft’s representation, the first three fields of a GUID are stored as little-endian integers and the remaining 8 bytes are stored in order, so the raw bytes don’t read the same as the string form. That’s why 010 Editor has an entry for GUIDs in its inspector tab:
This is the GUID represented as a string: {7B5C52E4-D88C-4DA7-AEB1-5378D02996D3}
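The same conversion can be done in Python. This is a minimal sketch using the standard uuid module; the helper name guid_from_bytes is just for illustration, and the hex string is the little-endian byte form of the GUID above, as it would appear at the start of the file:

```python
import uuid

def guid_from_bytes(data: bytes) -> str:
    """Decode the first 16 bytes of data as a Microsoft-style GUID string."""
    # bytes_le tells uuid that the first three fields are little-endian
    # and the last 8 bytes are stored in order.
    return "{" + str(uuid.UUID(bytes_le=data[:16])).upper() + "}"

# Byte form of the GUID shown above (fields 1-3 byte-swapped).
header = bytes.fromhex("E4525C7B8CD8A74DAEB15378D02996D3")
print(guid_from_bytes(header))
```

Reading the same 16 bytes with uuid.UUID(bytes=...) instead would give a different string, because that treats all fields as big-endian.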
Looking this up with Google:
That’s great, Microsoft has a document [MS-ONESTORE] describing this file format.
Unfortunately, I did a quick search but didn’t find a pure Python module to read this file format. Maybe it exists, but I didn’t find it.
Next I tried my pecheck.py tool to locate the executable inside the OneNote sample. That worked well:
There’s an embedded PE file at position 0x2aa4. Taking a look with the binary editor:
I see the MZ header, and 36 bytes in front of that, another random-looking sequence of 16 bytes. Maybe another GUID:
{BDE316E7-2665-4511-A4C4-8D4D0B7A9EAC}
A bit of Google search:
Turns out that this is a FileDataStoreObject structure.
So by looking for this GUID in any file, one can find (and extract) embedded files. That’s what I quickly coded using my Python template for binary files. (There are some issues with this GUID-search method; I’ll address these in an upcoming blog post or video.)
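To illustrate the idea (this is not the actual onedump.py code), here is a minimal sketch of the GUID search. It assumes the layout I understand from [MS-ONESTORE]: the 16-byte GUID, then an 8-byte little-endian length, then 12 more bytes before the data, which matches the 36 bytes seen in front of the MZ header. The function name find_embedded_files is hypothetical.

```python
import struct
import uuid

# FileDataStoreObject header GUID, in the byte order used on disk.
FDSO_GUID = uuid.UUID("{BDE316E7-2665-4511-A4C4-8D4D0B7A9EAC}").bytes_le
HEADER_SIZE = 36  # assumed: 16 (GUID) + 8 (length) + 12 (unused/reserved)

def find_embedded_files(data: bytes):
    """Return a list of (offset, content) for each candidate embedded file."""
    results = []
    pos = data.find(FDSO_GUID)
    while pos != -1:
        if pos + HEADER_SIZE <= len(data):
            # Assumed: 8-byte little-endian length right after the GUID.
            (length,) = struct.unpack_from("<Q", data, pos + 16)
            start = pos + HEADER_SIZE
            # Skip hits whose length would run past the end of the file.
            if start + length <= len(data):
                results.append((start, data[start:start + length]))
        pos = data.find(FDSO_GUID, pos + 1)
    return results

# Synthetic example: 4 bytes of padding, one structure with 4 bytes of data.
sample = b"junk" + FDSO_GUID + struct.pack("<Q", 4) + b"\x00" * 12 + b"MZ\x90\x00" + b"tail"
for offset, content in find_embedded_files(sample):
    print(hex(offset), content[:2])
```

A plain byte search like this can produce false hits if the same 16 bytes happen to occur elsewhere in the file, so the length check is only a basic sanity test.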
A new tool: onedump.py