Examples

Here are a few examples to show what can be done with Disark.

Classic disassembling without symbol table

Without generating labels

Original code:

    org #1000
    ld hl,Label1
    ld b,4
Label1
    djnz Label1
    ret

Once assembled with any assembler and disassembled with Disark:

    ld hl,4101    ;You can also ask for hex numbers!
    ld b,4
    djnz $
    ret

Nothing fancy here. Let’s continue, shall we…

Generating labels

    org #1000
    ld hl,Label1
    ld b,4
Label1
    djnz Label1
    ret

Once assembled and disassembled with Disark using the --genLabels and --src16bitValuesInHex options (the latter prints 16-bit numbers in hexadecimal):

    ld hl,#1005
    ld b,4
lab0005 djnz lab0005         ;This label is generated.
    ret

This is getting better…

Using a symbol table

The real power of Disark shows when the original source comes with a symbol table. Most formats are understood, and you can indicate the “column” in which the labels and the addresses are found in the symbol table, via the --labelPositionInSymbolFile and --addressPositionInSymbolFile options.

By using a symbol table, you can recreate the original source code from the binary!

        org #1000                ;Original code.
START   ld bc,LABEL1
        ld hl,$
        ld de,LABEL1 + 3
LABEL1  ld b,4
        djnz LABEL1
        jp START

The result is very close to the input! See for yourself:

       org #1000            ;Generated via the "--genOrg" option.
START  ld bc,LABEL1
       ld hl,START+3        ;Disark chooses the closest label.
       ld de,LABEL1+3
LABEL1 ld b,4
       djnz LABEL1
       jp START

The command I used is:

Disark <binary file to load> <z80 code to generate> --symbolFile <symbol file to load> --loadAddress 0x1000 --genOrg --src16bitValuesInHex

The --loadAddress option defines the address at which the binary is loaded. This is necessary so that the addresses in the symbol file match the address of each instruction.
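The "closest label" choice seen in the generated source can be illustrated with a small Python sketch. This is only an illustration under my own assumptions, not Disark's actual code: the function name `closest_label` and the tie-breaking rule are hypothetical. The symbol addresses come from the example (each `ld rr,nn` instruction takes 3 bytes, so LABEL1 sits at #1009).

```python
# Hypothetical sketch: pick the label nearest to an address and express
# the address as "LABEL", "LABEL+n" or "LABEL-n". Not Disark's real code.

def closest_label(symbols, address):
    """Return a label expression for `address` using the nearest symbol."""
    # Sort by distance; on a tie, prefer the lower address (positive offset).
    name, base = min(symbols.items(),
                     key=lambda item: (abs(address - item[1]), item[1]))
    offset = address - base
    if offset == 0:
        return name
    sign = "+" if offset > 0 else "-"
    return f"{name}{sign}{abs(offset)}"


# Symbol addresses from the example above.
symbols = {"START": 0x1000, "LABEL1": 0x1009}

print(closest_label(symbols, 0x1003))  # operand of "ld hl,$"          -> START+3
print(closest_label(symbols, 0x100C))  # operand of "ld de,LABEL1 + 3" -> LABEL1+3
```

The original `ld hl,$` cannot be recovered as such, because the binary only stores the address #1003. The nearest symbol is START, hence `START+3`.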

Label semantics

So far, the disassembled programs have consisted only of code. But what if you have byte, word, or even pointer areas? Here is an example:

        org #1000

        ld ix,TABLE
        
LABEL1  ld hl,0
LABEL2  ld de,0
LABEL3  ld bc,0
        
TABLE
        dw LABEL1       ;Oops, how to reconstruct this??
        dw LABEL2
        dw LABEL3

LEVELDATA
        db 1            ;Oops, how to reconstruct this??
        db 2
        db 3
        
MUSICPERIODS
        dw 456          ;Oops, how to reconstruct this??
        dw 147
        dw 999

Disark, just like any disassembler, has no idea that not everything is code. Without more information, Disark will produce this:

    org #1000
    ld ix,TABLE
LABEL1 ld hl,#0
LABEL2 ld de,#0
LABEL3 ld bc,#0             ;So far, so good.
TABLE inc b                 ;ARG! No! This was "dw LABEL1"!
    djnz MUSICPERIODS+1
    djnz MUSICPERIODS+6
LEVELDATA equ $+1           ;ARG! Our LevelData area is compromised too!
    djnz LEVELDATA+2
    ld (bc),a
    inc bc
MUSICPERIODS ret z          ;ARG! This is not good!
    ld bc,#93
    rst 32
    inc bc
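To see where these bogus instructions come from, it helps to look at the raw bytes the assembler emitted for the three data areas. The Python sketch below rebuilds them; the label addresses are derived from the example (`ld ix,nn` takes 4 bytes, each `ld rr,nn` takes 3), and the variable names are mine.

```python
import struct

# Addresses of the labels in the example, with the code starting at #1000.
LABEL1, LABEL2, LABEL3 = 0x1004, 0x1007, 0x100A

# "<H" packs an unsigned 16-bit value, low byte first (Z80 byte order).
table = struct.pack("<3H", LABEL1, LABEL2, LABEL3)   # dw LABEL1 / LABEL2 / LABEL3
level_data = bytes([1, 2, 3])                        # db 1 / 2 / 3
music_periods = struct.pack("<3H", 456, 147, 999)    # dw 456 / 147 / 999

data = table + level_data + music_periods
print(data.hex(" "))
# -> 04 10 07 10 0a 10 01 02 03 c8 01 93 00 e7 03
```

Read as opcodes, #04 is `inc b`, #10 is `djnz`, #c8 is `ret z` and #01 is `ld bc,nn`, which is exactly the sequence shown above.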

OK, this is bad. But do not worry, there is a solution: Disark manages all this via label semantics! The original source can contain labels that tell Disark how to reconstruct it faithfully. These labels must follow a specific format so that Disark can locate them in the symbol table. A prefix may be added to them, and a suffix is mandatory. Some examples:

Code region:

These are default, so you shouldn’t need to declare them.

DisarkCodeRegionStart        a code region starts here.
DisarkCodeRegionEnd          a code region ends here.

Byte region:

DisarkByteRegionStart        a byte region starts here.
DisarkByteRegionEnd          a byte region ends here.

Example of use:

    org #1000
    ld hl,#1234
    jr AFTER_STUFF
MyDemo_DisarkByteRegionStart_Stuff   ;Declares a byte area...
    db 1, 2, 3, 4, 5, 6, 7
    db 8, 9, 10, 11, #ff
MyDemo_DisarkByteRegionEnd_Stuff     ;...that ends here!
AFTER_STUFF
    ld de,#4567
    ret

And the result is:

    org #1000
    ld hl,#1234
    jr AFTER_STUFF
    db 1
    db 2
    db 3
    db 4
    db 5
    db 6
    db 7
    db 8
    db 9
    db 10
    db 11
    db 255
AFTER_STUFF ld de,#4567
    ret

YES! Now the code is perfectly regenerated! Note how the Disark marker labels have been removed from the disassembled source. They are not needed in the code anymore!

Word region and Pointer region:

DisarkWordRegionStart           ;A word region starts here.
DisarkWordRegionEnd             ;A word region ends here.
DisarkPointerRegionStart        ;A pointer region starts here.
DisarkPointerRegionEnd          ;A pointer region ends here.

But what is the difference between a Word region and a Pointer region? No reference is inferred (i.e. “guessed”) from the values of a Word region, whereas references are inferred from those of a Pointer region. In the earlier example, the music periods must stay “as-is” and not be converted to pointers, else relocating the code would also modify your music periods!
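The consequence of this distinction can be sketched in a few lines of Python. This is a hypothetical illustration of relocating the reconstructed source, not something Disark runs: the `relocate` helper and the new origin #2000 are assumptions of mine, and the label addresses are those of the earlier example.

```python
# Hypothetical illustration of reassembling the source at another address.

OLD_ORG, NEW_ORG = 0x1000, 0x2000
symbols = {"LABEL1": 0x1004, "LABEL2": 0x1007, "LABEL3": 0x100A}

def relocate(address):
    """Move an address by the same distance as the program itself."""
    return address - OLD_ORG + NEW_ORG

# Pointer region: the values are written back as "dw LABELx", so once the
# source is reassembled at NEW_ORG they follow their labels.
pointers = [relocate(address) for address in symbols.values()]

# Word region: the values are written back as plain numbers and never move.
music_periods = [456, 147, 999]

print([hex(p) for p in pointers])   # -> ['0x2004', '0x2007', '0x200a']
print(music_periods)                # -> [456, 147, 999]
```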

There are many labels you can use to add semantics to your code; check the list here.
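The naming convention of the marker labels can be summed up with a small Python sketch. It is an assumption-laden illustration, not Disark's parser: the regular expression only covers the four region kinds shown in this document, and the helper names `parse_marker` and `same_region` are hypothetical.

```python
import re

# Assumed pattern: optional prefix, "Disark", region kind, "Region",
# "Start" or "End", then a mandatory suffix.
MARKER = re.compile(
    r"^(?P<prefix>.*?)Disark(?P<kind>Code|Byte|Word|Pointer)"
    r"Region(?P<edge>Start|End)(?P<suffix>.+)$"
)

def parse_marker(label):
    """Return (prefix, kind, edge, suffix), or None for an ordinary label."""
    match = MARKER.match(label)
    return match.group("prefix", "kind", "edge", "suffix") if match else None

def same_region(start_label, end_label):
    """True if the two labels open and close the same region."""
    start, end = parse_marker(start_label), parse_marker(end_label)
    if start is None or end is None:
        return False
    edges_ok = (start[2], end[2]) == ("Start", "End")
    # Prefix, kind and suffix must all be identical on both labels.
    return edges_ok and (start[0], start[1], start[3]) == (end[0], end[1], end[3])

print(parse_marker("MyDemo_DisarkByteRegionStart_Stuff"))
# -> ('MyDemo_', 'Byte', 'Start', '_Stuff')
print(parse_marker("AFTER_STUFF"))
# -> None
```

Because the suffix is part of the label, several regions of the same kind can coexist in one source without their names clashing.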

Let’s go back to our example

By adding these labels to the original code, we can help Disark determine where the code and the data are:

        org #1000

        ld ix,TABLE
        
LABEL1  ld hl,0
LABEL2  ld de,0
LABEL3  ld bc,0

TABLE
MyDemo_DisarkPointerRegionStart_1    ;Declares our Pointer region.
        dw LABEL1
        dw LABEL2
        dw LABEL3
MyDemo_DisarkPointerRegionEnd_1     ;The prefix and suffix must match!

LEVELDATA
MyDemo_DisarkByteRegionStart_LevelData  ;This is the Byte region.
        db 1
        db 2
        db 3
MyDemo_DisarkByteRegionEnd_LevelData
        
MUSICPERIODS
MyDemo_DisarkWordRegionStart_3   ;This is our Word region.
        dw 456
        dw 147
        dw 999
MyDemo_DisarkWordRegionEnd_3

Now let’s “disark” the generated binary and…

        org 4096
        ld ix,TABLE
LABEL1  ld hl,0
LABEL2  ld de,0
LABEL3  ld bc,0
TABLE   dw LABEL1      ;Oh yeah! The pointers are correct!
        dw LABEL2
        dw LABEL3
LEVELDATA db 1         ;Yeeepeee! The DBs are well written!
        db 2
        db 3
MUSICPERIODS dw 456    ;Hurray! The Words are intact!
        dw 147
        dw 999

I guess you’ve understood how it works (if not, please contact me and I’ll try to give better examples!).