Here are a few examples to show what can be done with Disark.
Classic disassembly without a symbol table
Without generating labels
Original code:
org #1000
ld hl,Label1
ld b,4
Label1
djnz Label1
ret
Once assembled with any assembler and disassembled with Disark:
ld hl,4101 ;You can also ask for hex numbers!
ld b,4
djnz $
ret
Nothing fancy here. Let’s continue, shall we…
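As a side note, the value 4101 in the output is simply the address of Label1. Here is a minimal Python sketch of that arithmetic, using the standard Z80 instruction sizes. The helper name `address_after` is made up for the illustration and is not part of Disark.

```python
# Sizes in bytes of the Z80 instructions used in the example above.
INSTRUCTION_SIZES = {
    "ld hl,nn": 3,  # opcode + 16-bit immediate
    "ld b,n": 2,    # opcode + 8-bit immediate
    "djnz e": 2,    # opcode + 8-bit relative offset
    "ret": 1,
}


def address_after(origin, instructions):
    """Return the address reached after assembling `instructions` from `origin`."""
    return origin + sum(INSTRUCTION_SIZES[name] for name in instructions)


# Label1 sits right after "ld hl,Label1" and "ld b,4", starting from org #1000.
label1 = address_after(0x1000, ["ld hl,nn", "ld b,n"])
print(label1, hex(label1))  # 4101 0x1005
```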
Generating labels
org #1000
ld hl,Label1
ld b,4
Label1
djnz Label1
ret
Once assembled and disassembled with Disark using the --genLabels and --src16bitValuesInHex options (the latter displays 16-bit numbers in hex):
ld hl,#1005
ld b,4
lab0005 djnz lab0005 ;This label is generated.
ret
This is getting better…
Using a symbol table
The real power of Disark shows when the original source comes with a symbol table. Most formats can be understood, and you can indicate the “column” where the labels and the addresses are found in the symbol table, via the --labelPositionInSymbolFile and --addressPositionInSymbolFile options.
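To make the “column” idea concrete, here is a rough Python sketch that reads a symbol table from whitespace-separated columns. Everything in it is an assumption made for illustration: the sample file layout, the zero-based column indices, the accepted hex prefixes and the function name `parse_symbols`. It does not describe how Disark actually parses symbol files.

```python
def parse_symbols(text, label_column, address_column):
    """Map label names to addresses from whitespace-separated columns.

    Assumptions of this sketch: columns are zero-based, and addresses are
    hexadecimal with an optional "0x", "#" or "$" prefix.
    """
    symbols = {}
    for line in text.splitlines():
        columns = line.split()
        if len(columns) <= max(label_column, address_column):
            continue  # Not enough columns: skip the line.
        raw = columns[address_column].lower()
        for prefix in ("0x", "#", "$"):
            if raw.startswith(prefix):
                raw = raw[len(prefix):]
                break
        try:
            symbols[columns[label_column]] = int(raw, 16)
        except ValueError:
            continue  # The address column did not hold a hex number.
    return symbols


# Hypothetical symbol file: label in column 0, address in column 2.
sample = """START equ #1000
LABEL1 equ #1009
"""
symbols = parse_symbols(sample, label_column=0, address_column=2)
print(symbols)
```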
By using a symbol table, you can recreate the original source code from the binary!
org #1000 ;Original code.
START ld bc,LABEL1
ld hl,$
ld de,LABEL1 + 3
LABEL1 ld b,4
djnz LABEL1
jp START
The result is very close to the input! See for yourself:
org #1000 ;Generated via the "--genOrg" option.
START ld bc,LABEL1
ld hl,START+3 ;Disark chooses the closest label.
ld de,LABEL1+3
LABEL1 ld b,4
djnz LABEL1
jp START
The command I used is:
Disark <binary file to load> <z80 code to generate> --symbolFile <symbol file to load> --loadAddress 0x1000 --genOrg --src16bitValuesInHex
By using the --loadAddress option, we define the address at which the binary is loaded. This is necessary so that the addresses in the symbol file match those of the instructions.
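The START+3 and LABEL1+3 operands in the output above can be reproduced with a small Python sketch. It assumes that “closest label” means the nearest label at or below the address, which matches the outputs shown on this page. The function name `to_label_expression` is invented for the example, and Disark's real rule may be more subtle.

```python
from bisect import bisect_right


def to_label_expression(address, symbols):
    """Express `address` as "LABEL" or "LABEL+offset".

    Assumption of this sketch: the nearest label at or below the address wins.
    """
    ordered = sorted(symbols.items(), key=lambda item: item[1])
    addresses = [label_address for _, label_address in ordered]
    index = bisect_right(addresses, address) - 1
    if index < 0:
        return f"#{address:04x}"  # No label at or below: keep the raw value.
    name, base = ordered[index]
    offset = address - base
    return name if offset == 0 else f"{name}+{offset}"


# Addresses from the example: three 3-byte instructions precede LABEL1.
symbols = {"START": 0x1000, "LABEL1": 0x1009}
print(to_label_expression(0x1003, symbols))  # START+3  (the "ld hl,$")
print(to_label_expression(0x100C, symbols))  # LABEL1+3
```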
Label semantics
So far the disassembled programs have consisted only of code. But what if you have byte, word, or even pointer areas? Here is an example:
org #1000
ld ix,TABLE
LABEL1 ld hl,0
LABEL2 ld de,0
LABEL3 ld bc,0
TABLE
dw LABEL1 ;Oops, how to reconstruct this??
dw LABEL2
dw LABEL3
LEVELDATA
db 1 ;Oops, how to reconstruct this??
db 2
db 3
MUSICPERIODS
dw 456 ;Oops, how to reconstruct this??
dw 147
dw 999
Disark, just like any disassembler, has no idea that not everything is code. Without more information, Disark will produce this:
org #1000
ld ix,TABLE
LABEL1 ld hl,#0
LABEL2 ld de,#0
LABEL3 ld bc,#0 ;So far, so good.
TABLE inc b ;ARG! No! This was "dw LABEL1"!
djnz MUSICPERIODS+1
djnz MUSICPERIODS+6
LEVELDATA equ $+1 ;ARG! Our LevelData area is compromised too!
djnz LEVELDATA+2
ld (bc),a
inc bc
MUSICPERIODS ret z ;ARG! This is not good!
ld bc,#93
rst 32
inc bc
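To see where these bogus instructions come from, it helps to look at the raw bytes of the data area. The Python sketch below rebuilds them from the original source (TABLE is at #100D, since the preceding code takes 13 bytes) and decodes the first two `djnz` targets by hand. The helper `djnz_target` is written for this illustration only.

```python
import struct

TABLE = 0x100D         # 4 bytes for "ld ix,nn" + 3 x 3 bytes for the "ld rr,nn".
MUSICPERIODS = 0x1016  # TABLE + 6 bytes of pointers + 3 bytes of level data.

# Little-endian 16-bit values, exactly as an assembler would emit them.
data = struct.pack("<3H", 0x1004, 0x1007, 0x100A)  # dw LABEL1, LABEL2, LABEL3
data += bytes([1, 2, 3])                           # db 1, 2, 3
data += struct.pack("<3H", 456, 147, 999)          # dw 456, 147, 999


def djnz_target(address, offset_byte):
    """Target of a 2-byte "djnz" at `address`: next address + signed offset."""
    signed = offset_byte - 256 if offset_byte >= 128 else offset_byte
    return (address + 2 + signed) & 0xFFFF


print(" ".join(f"{byte:02x}" for byte in data))
# 04 10 07 10 0a 10 01 02 03 c8 01 93 00 e7 03

# #04 is "inc b", then #10 #07 and #10 #0a are read as two "djnz".
first = djnz_target(TABLE + 1, data[2])
second = djnz_target(TABLE + 3, data[4])
print(first - MUSICPERIODS, second - MUSICPERIODS)  # 1 6
```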
OK, this is bad. But do not worry, there is a solution: Disark handles all this via label semantics! The original source can contain marker labels that tell Disark how to reconstruct it faithfully. These labels must follow a specific format so that Disark can locate them in the symbol table. They may be given a prefix, and must be given a suffix. Some examples:
Code region:
Code regions are the default, so you shouldn’t need to declare them.
DisarkCodeRegionStart a code region starts here.
DisarkCodeRegionEnd a code region ends here.
Byte region:
DisarkByteRegionStart a byte region starts here.
DisarkByteRegionEnd a byte region ends here.
Example of use:
org #1000
ld hl,#1234
jr AFTER_STUFF
MyDemo_DisarkByteRegionStart_Stuff ;Declares a byte area...
db 1, 2, 3, 4, 5, 6, 7
db 8, 9, 10, 11, #ff
MyDemo_DisarkByteRegionEnd_Stuff ;...that ends here!
AFTER_STUFF
ld de,#4567
ret
And the result is:
org #1000
ld hl,#1234
jr AFTER_STUFF
db 1
db 2
db 3
db 4
db 5
db 6
db 7
db 8
db 9
db 10
db 11
db 255
AFTER_STUFF ld de,#4567
ret
YES! Now the code is perfectly regenerated! Note how the Disark marker labels have been removed from the disassembled source. They are not needed in the code anymore!
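For the curious, here is one way such marker labels could be picked out of a symbol table and paired up, sketched in Python. The regular expression, the function name `find_regions` and the choice to treat everything before “Disark” as the prefix are my own assumptions for the illustration. This is not Disark's actual implementation.

```python
import re

# prefix + "Disark" + kind + "Region" + Start/End + suffix
MARKER = re.compile(
    r"^(?P<prefix>.*?)Disark(?P<kind>[A-Za-z]+?)Region"
    r"(?P<edge>Start|End)(?P<suffix>.*)$"
)


def find_regions(symbols):
    """Pair Start/End markers that share the same prefix, kind and suffix.

    Returns a sorted list of (start_address, end_address, kind); the end
    address is exclusive because the End label sits after the last byte.
    """
    edges = {}
    for name, address in symbols.items():
        match = MARKER.match(name)
        if match:
            key = (match["prefix"], match["kind"], match["suffix"])
            edges.setdefault(key, {})[match["edge"]] = address
    return sorted(
        (pair["Start"], pair["End"], kind)
        for (_, kind, _), pair in edges.items()
        if "Start" in pair and "End" in pair
    )


# Symbols of the example above: 5 bytes of code, then 12 data bytes.
symbols = {
    "MyDemo_DisarkByteRegionStart_Stuff": 0x1005,
    "MyDemo_DisarkByteRegionEnd_Stuff": 0x1011,
    "AFTER_STUFF": 0x1011,
}
regions = find_regions(symbols)
print(regions)  # [(4101, 4113, 'Byte')]
```

Ordinary labels such as AFTER_STUFF do not match the pattern and are simply ignored by this sketch.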
Word region and Pointer region:
DisarkWordRegionStart ;A word region starts here.
DisarkWordRegionEnd ;A word region ends here.
DisarkPointerRegionStart ;A pointer region starts here.
DisarkPointerRegionEnd ;A pointer region ends here.
But what is the difference between a Word region and a Pointer region? No references will be inferred (i.e. “guessed”) from a Word region, but they will be from a Pointer region. In the earlier example, the music periods must stay “as-is” and not be converted to pointers, else relocating the code would also modify your music periods!
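The difference can be illustrated with a short Python sketch. It renders 16-bit values as `dw` lines, with or without reference inference. The inference here is a deliberately naive exact-address match, and the period value 4100 is hypothetical, chosen because it happens to equal the address of LABEL1 (#1004). Disark's real inference may work differently.

```python
def render_words(values, labels_by_address, infer_references):
    """Render 16-bit values as "dw" lines.

    When `infer_references` is true, a value equal to a label address is
    replaced by the label name (naive exact match, for illustration only).
    """
    lines = []
    for value in values:
        if infer_references and value in labels_by_address:
            lines.append(f"dw {labels_by_address[value]}")
        else:
            lines.append(f"dw {value}")
    return lines


labels_by_address = {0x1004: "LABEL1", 0x1007: "LABEL2", 0x100A: "LABEL3"}

# Pointer region: references are inferred, so the table follows the labels.
pointer_lines = render_words([0x1004, 0x1007, 0x100A], labels_by_address, True)

# Word region: plain numbers, even when one collides with a label address.
periods = [456, 4100, 999]  # 4100 == #1004, a hypothetical period.
word_lines = render_words(periods, labels_by_address, False)

# What would go wrong if the periods were treated as pointers:
wrong_lines = render_words(periods, labels_by_address, True)

print(pointer_lines)
print(word_lines)
print(wrong_lines)
```

In the last case the period silently becomes `dw LABEL1`, so its value would change as soon as the code is assembled at another address.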
There are many labels you can use to add semantics to your code; check the list here.
Let’s go back to our example
By adding these labels to the original code, we can help Disark determine where the code and the data are:
org #1000
ld ix,TABLE
LABEL1 ld hl,0
LABEL2 ld de,0
LABEL3 ld bc,0
TABLE
MyDemo_DisarkPointerRegionStart_1 ;Declares our Pointer region.
dw LABEL1
dw LABEL2
dw LABEL3
MyDemo_DisarkPointerRegionEnd_1 ;The prefix and suffix must match!
LEVELDATA
MyDemo_DisarkByteRegionStart_LevelData ;This is the Byte region.
db 1
db 2
db 3
MyDemo_DisarkByteRegionEnd_LevelData
MUSICPERIODS
MyDemo_DisarkWordRegionStart_3 ;This is our Word region.
dw 456
dw 147
dw 999
MyDemo_DisarkWordRegionEnd_3
Now let’s “disark” the generated binary and…
org 4096
ld ix,TABLE
LABEL1 ld hl,0
LABEL2 ld de,0
LABEL3 ld bc,0
TABLE dw LABEL1 ;Oh yeah! The pointers are correct!
dw LABEL2
dw LABEL3
LEVELDATA db 1 ;Yeeepeee! The DBs are well written!
db 2
db 3
MUSICPERIODS dw 456 ;Hurray! The Words are intact!
dw 147
dw 999
I guess you’ve understood how it works (if not, please contact me and I’ll try to give better examples!).