Here are a few examples to show what can be done with Disark.
Classic disassembling without symbol table
Without generating labels
        org #1000
        ld hl,Label1
    Label1 ld b,4
        djnz Label1
        ret
Once assembled with any assembler and disassembled with Disark:
        ld hl,4099          ;You can also ask for hex numbers!
        ld b,4
        djnz $-2
        ret
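To see where `4099` and `djnz $-2` come from, here is a small Python sketch that decodes the two operands by hand. The byte values are an assumption: they are what the disassembled listing above assembles to when loaded at #1000. The helper name `djnz_target` is made up for this illustration and is not part of Disark.

```python
# Bytes assumed to match the disassembled listing, loaded at #1000:
# ld hl,#1003 / ld b,4 / djnz (displacement -4) / ret
CODE = bytes([0x21, 0x03, 0x10, 0x06, 0x04, 0x10, 0xFC, 0xC9])
LOAD_ADDRESS = 0x1000


def djnz_target(code: bytes, offset: int, load_address: int) -> int:
    """Return the absolute address targeted by the DJNZ found at `offset`."""
    assert code[offset] == 0x10, "not a DJNZ opcode"
    displacement = code[offset + 1]
    if displacement >= 0x80:  # the operand is a signed byte
        displacement -= 0x100
    # The displacement is relative to the address of the *next* instruction.
    return load_address + offset + 2 + displacement


djnz_address = LOAD_ADDRESS + 5
target = djnz_target(CODE, 5, LOAD_ADDRESS)
operand = CODE[1] | (CODE[2] << 8)  # little-endian operand of LD HL,nn

print(operand)                                  # the value loaded into HL
print(f"djnz ${target - djnz_address:+d}")      # the jump, relative to "$"
```

Without labels, a disassembler can only express the jump relative to `$` (the address of the current instruction), which is why the output is hard to read.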
Nothing fancy here. Let’s continue, shall we…
        org #1000
        ld hl,Label1
    Label1 ld b,4
        djnz Label1
        ret
Once assembled and disassembled with Disark using the --genLabels and --src16bitValuesInHex (to display 16-bit numbers in hex) options:
        ld hl,#1003
    lab0003 ld b,4          ;This label is generated.
        djnz lab0003
        ret
This is getting better…
Using a symbol table
The real power of Disark shows when the original source comes with a symbol table. Most symbol file formats can be understood, and you can indicate the “column” in which to find the labels and the addresses via the --labelPositionInSymbolFile and --addressPositionInSymbolFile options.
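As a rough illustration of what "label column" and "address column" mean, here is a Python sketch that reads a column-based symbol file. Everything in it is an assumption made for the example: the file layout, the 0-based column indexes and the function name `parse_symbol_file` are hypothetical and do not describe how Disark itself parses symbol files.

```python
def parse_symbol_file(text, label_position=0, address_position=1):
    """Map label names to addresses from a column-based symbol file.

    Positions are 0-based column indexes in this sketch; that convention is
    an assumption and may differ from the one Disark uses.
    """
    symbols = {}
    for line in text.splitlines():
        columns = line.split()
        if len(columns) <= max(label_position, address_position):
            continue  # blank or too-short line
        token = columns[address_position]
        # Accept "#1000" and "0x1000" as hexadecimal, anything else as decimal.
        if token.startswith("#"):
            address = int(token[1:], 16)
        elif token.lower().startswith("0x"):
            address = int(token, 16)
        else:
            address = int(token)
        symbols[columns[label_position]] = address
    return symbols


# Hypothetical symbol file: label in the first column, address in the second.
SYMBOL_FILE = """\
START   #1000
LABEL1  #1009
"""

symbols = parse_symbol_file(SYMBOL_FILE)
print(symbols)
```

If an assembler writes the address first and the label second, only the two position arguments need to change.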
By using a symbol table, you can recreate the original source code from the binary!
        org #1000           ;Original code.
    START ld bc,LABEL1
        ld hl,$
        ld de,LABEL1 + 3
    LABEL1 ld b,4
        djnz LABEL1
        jp START
The result is very close to the input! See for yourself:
        org #1000           ;Generated via the "--genOrg" option.
    START ld bc,LABEL1
        ld hl,START+3       ;Disark chooses the closest label.
        ld de,LABEL1+3
    LABEL1 ld b,4
        djnz LABEL1
        jp START
The command I used is:
Disark <binary file to load> <z80 code to generate> --symbolFile <symbol file to load> --loadAddress 0x1000 --genOrg --src16bitValuesInHex
The --loadAddress option defines the address at which the binary is loaded. This is necessary so that the addresses in the symbol file match those of the instructions.
So far, the disassembled programs consisted only of code. But what if you have byte, word, or even pointer areas? Here is an example:
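The `START+3` and `LABEL1+3` operands in the regenerated source can be reproduced with a short Python sketch. It assumes that "closest label" means the nearest label located at or below the address, which is consistent with the listings on this page but is not confirmed as Disark's exact rule. The label addresses are derived from the instruction sizes of the example, and `closest_label` is a made-up helper name.

```python
import bisect

# Addresses derived from the example loaded at #1000 (three 3-byte instructions
# precede LABEL1).
SYMBOLS = {"START": 0x1000, "LABEL1": 0x1009}


def closest_label(address, symbols):
    """Express `address` as the nearest label at or below it, plus an offset."""
    ordered = sorted((addr, name) for name, addr in symbols.items())
    addresses = [addr for addr, _ in ordered]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return f"#{address:04x}"  # no label at or below: keep the raw number
    base, name = ordered[index]
    offset = address - base
    return name if offset == 0 else f"{name}+{offset}"


print(closest_label(0x1003, SYMBOLS))  # the "ld hl,$" operand
print(closest_label(0x100C, SYMBOLS))  # the "ld de,LABEL1 + 3" operand
print(closest_label(0x1009, SYMBOLS))  # an exact match
```

This also shows why the load address matters: with a wrong base, none of the 16-bit values would line up with the symbol file.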
        org #1000
        ld ix,TABLE
    LABEL1 ld hl,0
    LABEL2 ld de,0
    LABEL3 ld bc,0
    TABLE dw LABEL1         ;Oops, how to reconstruct this??
        dw LABEL2
        dw LABEL3
    LEVELDATA db 1          ;Oops, how to reconstruct this??
        db 2
        db 3
    MUSICPERIODS dw 456     ;Oops, how to reconstruct this??
        dw 147
        dw 999
Disark, just like any disassembler, has no idea that not everything is code. Without more information, Disark will produce this:
        org #1000
        ld ix,TABLE
    LABEL1 ld hl,#0
    LABEL2 ld de,#0
    LABEL3 ld bc,#0         ;So far, so good.
    TABLE inc b             ;ARG! No! This was "dw LABEL1"!
        djnz MUSICPERIODS+1
        djnz MUSICPERIODS+6
    LEVELDATA equ $+1       ;ARG! Our LevelData area is compromised too!
        djnz LEVELDATA+2
        ld (bc),a
        inc bc
    MUSICPERIODS ret z      ;ARG! This is not good!
        ld bc,#93
        rst 32
        inc bc
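Why does the data turn into `inc b` and `djnz`? The Python sketch below builds the bytes of the three data areas and walks through them the way a disassembler without any hints would. The label addresses are derived from the instruction sizes of the example at #1000, and the opcode table is deliberately tiny: it only covers the opcodes that appear in this data and is not a real Z80 decoder.

```python
import struct

# Label addresses derived from the example (ld ix,nn is 4 bytes, ld rr,nn is 3).
LABEL1, LABEL2, LABEL3 = 0x1004, 0x1007, 0x100A

table = struct.pack("<3H", LABEL1, LABEL2, LABEL3)   # dw LABEL1 / LABEL2 / LABEL3
level_data = bytes([1, 2, 3])                        # db 1 / 2 / 3
music_periods = struct.pack("<3H", 456, 147, 999)    # dw 456 / 147 / 999
data = table + level_data + music_periods

# Minimal opcode table: opcode -> (mnemonic, instruction size in bytes).
OPCODES = {
    0x01: ("ld bc,nn", 3),
    0x02: ("ld (bc),a", 1),
    0x03: ("inc bc", 1),
    0x04: ("inc b", 1),
    0x10: ("djnz", 2),
    0xC8: ("ret z", 1),
    0xE7: ("rst 32", 1),
}


def naive_disassemble(raw: bytes):
    """Decode every byte as code, with no knowledge of data areas."""
    mnemonics, index = [], 0
    while index < len(raw):
        mnemonic, size = OPCODES[raw[index]]
        mnemonics.append(mnemonic)
        index += size
    return mnemonics


print(data.hex(" "))
print(naive_disassemble(data))
```

The low byte of `LABEL1` (#04) happens to be the opcode of `inc b`, and the high byte (#10) is the opcode of `djnz`, which swallows the next byte as its operand. From there on, the instruction boundaries no longer match the original data boundaries.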
OK, this is bad. But do not worry, there is a solution: Disark manages all this via label semantics! The original source can contain labels indicating how to reconstruct it faithfully. These labels must follow a specific format so that Disark can locate them in the symbol table. They may be given a prefix, and must be given a suffix. Some examples:
Code regions are the default, so you shouldn’t need to declare them.
    DisarkCodeRegionStart       ;A code region starts here.
    DisarkCodeRegionEnd         ;A code region ends here.
Byte region:
    DisarkByteRegionStart       ;A byte region starts here.
    DisarkByteRegionEnd         ;A byte region ends here.
Example of use:
        org #1000
        ld hl,#1234
        jr AFTER_STUFF
    MyDemo_DisarkByteRegionStart_Stuff  ;Declares a byte area...
        db 1, 2, 3, 4, 5, 6, 7
        db 8, 9, 10, 11, #ff
    MyDemo_DisarkByteRegionEnd_Stuff    ;...that ends here!
    AFTER_STUFF ld de,#4567
        ret
And the result is:
        org #1000
        ld hl,#1234
        jr AFTER_STUFF
        db 1
        db 2
        db 3
        db 4
        db 5
        db 6
        db 7
        db 8
        db 9
        db 10
        db 11
        db 255
    AFTER_STUFF ld de,#4567
        ret
YES! Now the code is perfectly regenerated! Note how the Disark marker labels have been removed from the disassembled source. They are not needed in the code anymore!
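To make the label format more concrete, here is a Python sketch that recognizes marker labels in a symbol table and pairs each start with its end. It is only an illustration under stated assumptions: the regular expression, the rule that a start and an end pair up when their prefix, kind and suffix are identical, and the function name `find_regions` are mine and do not describe Disark's internals. The addresses are derived from the byte-region example assembled at #1000.

```python
import re

# Optional prefix, then the Disark marker, then a mandatory suffix.
MARKER = re.compile(
    r"^(?:(?P<prefix>\w+)_)?"
    r"Disark(?P<kind>Code|Byte|Word|Pointer)Region(?P<edge>Start|End)"
    r"_(?P<suffix>\w+)$"
)

# Symbol table of the byte-region example (addresses derived from the listing).
SYMBOLS = {
    "MyDemo_DisarkByteRegionStart_Stuff": 0x1005,
    "MyDemo_DisarkByteRegionEnd_Stuff": 0x1011,
    "AFTER_STUFF": 0x1011,
}


def find_regions(symbols):
    """Return sorted (start, end, kind) tuples for every complete marker pair."""
    edges = {}
    for name, address in symbols.items():
        match = MARKER.match(name)
        if match:
            key = (match["prefix"], match["kind"], match["suffix"])
            edges.setdefault(key, {})[match["edge"]] = address
    return sorted(
        (pair["Start"], pair["End"], kind)
        for (_, kind, _), pair in edges.items()
        if "Start" in pair and "End" in pair
    )


regions = find_regions(SYMBOLS)
print([(hex(start), hex(end), kind) for start, end, kind in regions])
```

Ordinary labels such as `AFTER_STUFF` do not match the pattern, so they stay in the regenerated source, while the marker labels can be dropped once their regions are known.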
Word region and Pointer region:
    DisarkWordRegionStart       ;A word region starts here.
    DisarkWordRegionEnd         ;A word region ends here.
    DisarkPointerRegionStart    ;A pointer region starts here.
    DisarkPointerRegionEnd      ;A pointer region ends here.
But what is the difference between a Word region and a Pointer region? No reference will be inferred (i.e. “guessed”) from a Word region, but references will be inferred from a Pointer region. In the example above, the music periods must stay “as-is” and not be converted to pointers, or else relocating the code would also modify your music periods!
There are many labels you can use to add semantics to your code; check the list here.
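The effect of that distinction can be modeled numerically. In the Python sketch below, relocating the program from #1000 to #2000 shifts every value declared as a pointer and leaves plain words alone. This is a simplified model: in practice the shift happens when the regenerated source, with its label references, is reassembled at a new address. The function `relocate` and the new base address are assumptions made for the example.

```python
OLD_BASE, NEW_BASE = 0x1000, 0x2000


def relocate(values, kind):
    """Shift pointer values by the relocation distance; keep words untouched."""
    if kind == "pointer":
        return [value - OLD_BASE + NEW_BASE for value in values]
    return list(values)


table = relocate([0x1004, 0x1007, 0x100A], "pointer")   # TABLE: dw LABEL1..3
music_periods = relocate([456, 147, 999], "word")       # MUSICPERIODS

print([f"#{value:04x}" for value in table])
print(music_periods)

# What would go wrong if the periods were declared as pointers instead:
print(relocate([456, 147, 999], "pointer"))
```

The pointer table follows the code to its new location, the periods keep their musical meaning, and the last line shows the kind of corruption a wrongly declared Pointer region could cause.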
Let’s go back to our example
By adding these labels to the original code, we can help Disark determine where the code is and where the data is:
        org #1000
        ld ix,TABLE
    LABEL1 ld hl,0
    LABEL2 ld de,0
    LABEL3 ld bc,0
    TABLE
    MyDemo_DisarkPointerRegionStart_1       ;Declares our Pointer region.
        dw LABEL1
        dw LABEL2
        dw LABEL3
    MyDemo_DisarkPointerRegionEnd_1         ;The prefix and postfix must match!
    LEVELDATA
    MyDemo_DisarkByteRegionStart_LevelData  ;This is the Byte region.
        db 1
        db 2
        db 3
    MyDemo_DisarkByteRegionEnd_LevelData
    MUSICPERIODS
    MyDemo_DisarkWordRegionStart_3          ;This is our Word region.
        dw 456
        dw 147
        dw 999
    MyDemo_DisarkWordRegionEnd_3
Now let’s “disark” the generated binary and…
        org 4096
        ld ix,TABLE
    LABEL1 ld hl,0
    LABEL2 ld de,0
    LABEL3 ld bc,0
    TABLE dw LABEL1         ;Oh yeah! The pointers are correct!
        dw LABEL2
        dw LABEL3
    LEVELDATA db 1          ;Yeeepeee! The DBs are well written!
        db 2
        db 3
    MUSICPERIODS dw 456     ;Hurray! The Words are intact!
        dw 147
        dw 999
I guess you’ve understood how it works (if not, please contact me and I’ll try to give better examples!).