This thread is dedicated to consolidating and describing an attempt at reverse engineering (RE) the binary data files produced by Cameca PeakSight versions 5.1 (or 5.0) to 6.4.
The aim is interoperability, and hopefully it can provide some nice additions to PfE (e.g. reading optical, single-channel video, or multichannel/mapping impDat files; I think other options, such as reading wdsDat or calDat, would also be interesting to have).
This forum is used and read mostly by PfE users, but I see that not all forum users are PfE users (myself included, of course), and this RE attempt can be useful for building workflows independent from PfS. It can also be useful for recovering old datasets and for other custom needs involving old data (for users who got PfE but still have lots of old data in PeakSight formats). So this RE is, in essence, done independently of PfE.
There are some previous threads which discussed or mentioned the lack of knowledge about the impDat structure:
There are no questions about qtiDat, calDat or wdsDat on the forum, so I am adding code examples for impDat first.
My personal reasons for this RE, however, lie mostly in wdsDat and impDat together with qtiDat, and the Python code directly implementing these RE descriptions will come first for wdsDat (I need that to prove some points in some other forum threads).
A bit of history and how it evolved into its current shape
This started as an attempt to reverse engineer simple impDat images (produced with the "Save Image" button in PeakSight), but after seeing how complicated impDat files produced by the mapping workflow get, the attempt stalled. Later, with the need to read massive wdsDat files, the attempt was resumed. It looked OK for wdsDat and impDat files produced by PeakSight 6, but did not work for files produced by PeakSight 5. Up to that point the RE was done in the old-school style, that is, by noting offsets while comparing different binary files in hex editors (wxHexEditor, and partly Hexinator) and directly writing or modifying parsing code in the target language (Python). It was hard to find the logic and to handle the differences between versions elegantly, and so the attempt stalled again. I should also mention that the ovl files were the very first ones I reverse engineered (as far as I remember, I was already using that in 2016), and I made an ovl file manager, as I find such functionality lacking in PeakSight, even in v6.4.
And this is where Kaitai Struct came to the rescue
Kaitai Struct (https://kaitai.io/) is a language-agnostic way to describe binary formats (and it is actually an excellent tool for reverse engineering them). This also gives flexibility, as the same format description can be reused if the language chosen for a custom workflow or methods changes.
Kaitai Struct is a declarative (to emphasize: not imperative) way of parsing, which makes comparing different kinds of files and versions much easier (it has its advantages over traditional hex editors there, though it has drawbacks elsewhere). Because of this, the attempt to reverse engineer only particular file types was replaced with a full-scale attempt to parse all .***Dat files, as that gives a larger sample of structures to look into and makes it possible to recognize the common structures shared between those different files. When the .***Dat file descriptions were nearly finalized, I realized that the .***Set files follow a similar philosophy, and while the direct usefulness of reverse engineering them is questionable, successfully parsing them could help cover the .***Dat files better. And so the parser was extended to partially parse .***Set files and .ovl (overlap correction) files. Comparing all of these files makes it clear where the common header structure ends and the different structure types start (comparing only .***Dat files makes it hard to tell where the header ends, as similar structures occupy the sectors after the header in all .***Dat files; this is why including .***Set and .ovl files in the RE workflow is actually very important).
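The difference between the two styles can be sketched in Python. The byte layout below is made up for illustration and is not the PeakSight format; all field names are hypothetical. The first function hard-codes offsets, the way notes taken in a hex editor translate into code. The second walks a field table, which is roughly the idea behind a sequence of fields in a Kaitai Struct description (Kaitai Struct itself is far more capable than this toy).

```python
import struct

# Hypothetical 10-byte record, NOT the PeakSight layout:
# 4-byte magic, uint16 version, float32 value (all little-endian).
SAMPLE = b"DEMO" + struct.pack("<H", 6) + struct.pack("<f", 1.5)


def parse_imperative(buf):
    """Old-school style: every offset is written out by hand."""
    magic = buf[0:4]
    version = struct.unpack_from("<H", buf, 4)[0]
    value = struct.unpack_from("<f", buf, 6)[0]
    return {"magic": magic, "version": version, "value": value}


# Declarative style: the layout is plain data, and the offsets
# follow from the order and size of the fields.
LAYOUT = [
    ("magic", "4s"),
    ("version", "H"),
    ("value", "f"),
]


def parse_declarative(buf, layout=LAYOUT):
    """Walk the field table, advancing the offset by each field's size."""
    result = {}
    offset = 0
    for name, fmt in layout:
        field = struct.Struct("<" + fmt)
        (result[name],) = field.unpack_from(buf, offset)
        offset += field.size
    return result
```

Inserting a field into `LAYOUT` shifts everything after it automatically, whereas the imperative version needs every later offset edited by hand. That is a large part of why comparing file versions is less painful with a declarative description.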
The binary structure description in Kaitai Struct is saved as .ksy files. The descriptions inside a .ksy file are written in YAML (initially I thought it was an acronym for "Yet Another Mark[up/down] Language", but its creators say it stands for "YAML Ain't Markup Language", which I find kind of hilarious).
The repository for this attempt is on GitHub, where the binary description is kept up to date; issues and pull requests are welcome:
https://github.com/sem-geologist/peaksight-binary-parser
The simplest way to check whether your binary files can be parsed with the current state of the format description is to use the Kaitai Struct Web IDE (https://ide.kaitai.io/). It works in the most popular web browsers (Chromium, Firefox, Brave...). After downloading the ksy file from the GitHub repository, drag and drop it into the IDE, and then drag and drop the binary file you wish to inspect. Both appear in the list on the left side of the IDE. Double-clicking the ksy file and then the binary file selects the parser and the file to be parsed; this highlights the structure in the hex viewer and generates the object tree view (the two are nicely interconnected: clicking a highlighted part in the hex viewer brings focus to the corresponding node in the object tree, and it works the other way around).
It is possible to generate parsing code for one of the supported languages by right-clicking the ksy file in the list and choosing the target language from the menu. The generated code appears as a tab above the hex view. Compilation to some languages (e.g. C++, with header and implementation) generates several files in separate tabs. The content can be copied into a new file in a plain text editor. To use the generated code, the Kaitai Struct runtime library for the given target language needs to be downloaded (for some target languages, package managers can be used).
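As a rough sketch of that last step for Python as the target language: the helper below imports a generated module by name and parses one file with it. The helper `parse_with_generated` and the names `cameca` / `Cameca` in the usage note are hypothetical; the real module and class names follow from the `meta.id` in the ksy file. It assumes the Python runtime is installed (for example with `pip install kaitaistruct`); `from_file` is the classmethod that generated classes inherit from `kaitaistruct.KaitaiStruct`.

```python
import importlib
import sys
from pathlib import Path


def parse_with_generated(module_name, class_name, data_path, module_dir="."):
    """Load a Kaitai-generated parser class and parse one file with it.

    module_name / class_name are whatever the compiler produced from the
    ksy file; module_dir is the folder where the generated .py was saved.
    """
    module_dir = str(Path(module_dir).resolve())
    if module_dir not in sys.path:
        # Make the folder holding the generated module importable.
        sys.path.insert(0, module_dir)
    module = importlib.import_module(module_name)
    parser_class = getattr(module, class_name)
    # Generated classes inherit from_file() from the Kaitai Struct runtime.
    return parser_class.from_file(str(data_path))
```

Usage would look something like `parsed = parse_with_generated("cameca", "Cameca", "map.impDat")`, after which the attributes of `parsed` mirror the object tree shown in the Web IDE.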
Demonstration of parsing in the Kaitai Struct Web IDE: