/*
   (c) Lada 'Ray' Lostak
   (c) Unreal64
   (c) 2000-2004

   Content: 6510 family crossassembler fast DOC
*/

0.1 Preface
===================================================================================================
Sorry for typos and language mistakes, this is the 1st version :)

        Ray

1.0. Format
===================================================================================================
Every line holds an instruction or special command, optionally preceded by a label. A line can also
be empty. Anything after ';' is taken as a comment.

        variable = value
                 ldx #start_value [instruction modifier]
        label    dex
                 bne label
        table    .special_cmd ....

An 'expression' can be used in place of an address in instruction operands and in place of a value
in assignments. Expressions can also be used in special directives (like .if).

Every object (function, macro, label, ...) shares the same name space. So you get an error if you
define a macro XXX and then try to use a label called XXX.

1.1 Expressions
===================================================================================================
An expression consists of operators, constants, functions and labels. An expression can be 8 or 16
bits long. The size is determined by the place where the expression is used. For example,
"lda #expression" means 8 bit. Sometimes it is not possible to determine the size. For example,
"lda address" can mean 8 or 16 bit. In this case the expression is processed as 16 bit, and the
result decides whether lda will be zero page (expression < 256) or memory (expression >= 256).
You can also use instruction modifiers to set the instruction size.

The expression itself is processed as 32 or 64 bit internally (depends on the host CPU). There is
no way to use floats in an expression (for counting). If you need float arithmetic (for ex. for
generating sine tables) you have a few ways:

        a. make it in some external thingy & import it with some special directive
        b. (a) and export it as text
        c. define a function in Ant script

(c) is the best way to do it, because you have everything in one file.

The larger internal format brings one potential problem with zero page addressing. Let's say we
have this piece of code:

        zp_start        =$fd                    ; our work place starts here
        pack_dst_lo     =zp_start+0             ; dst memory
        pack_dst_hi     =zp_start+1
        pack_src_lo     =zp_start+2             ; src memory
        pack_src_hi     =zp_start+3
        zp_end          =zp_start+3             ; see down...
                        ...
                        lda #<start
                        ldx #>start
                        sta pack_src_lo         ; this is ZP sta
                        stx pack_src_hi         ; this is __NOT__ ZP stx !!!
                        lda #<dst
                        ldx #>dst
                        sta pack_dst_lo         ; this is ZP sta
                        stx pack_dst_hi         ; this is ZP stx
                        ...
                        lda (pack_src_lo),y     ; this is a mistake ! assembled as lda ($ff),y
                        sta (pack_dst_lo),y     ; this is ok, sta ($fd),y
                        ...

So this piece of code will not work well, but no error is shown (because its syntax is fine).
You need to be aware of these ZP bugs. But as you probably noticed, we added 'zp_end' so that we
can test it:

        .if zp_end>255
        .error "Hey bastard, we overflow zero page !"
        .endif

Knowing 'zp_end' is also useful for backing up ZP - we know the length (zp_end-zp_start+1), so we
can easily swap it around...

Because of the 32/64 bit internal processing, consider this code:

        dst     = $fffa
                lda dst+16

What happens ? The expression has the value $1000a. It is > 255, so lda will be long. The resulting
value is truncated to 16 bit, so $000a. But let's change the expression to:

                lda (dst+16)>>1

The result of THIS expression will be $8005, NOT $0005. So you can safely write expressions using
values higher than $ffff. It is useful for generating tables etc. But keep in mind that the
expression is truncated AT THE END.

The expression is internally always SIGNED. It doesn't mean anything for us, because our internal
format can hold the highest 16 bit unsigned value without problem.
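
The truncate-at-the-end rule described above can be checked outside the assembler. The following
Python sketch is only a model of the arithmetic, not the assembler's own code: the helper name
emit16 is hypothetical, and Python's unbounded integers stand in for the 32/64 bit internal format.

```python
# Model of "wide internally, truncated at the end" expression evaluation.
# emit16 is a hypothetical helper for illustration, not part of the assembler.

def emit16(value):
    """Truncate a finished expression value to the 16 bits that get emitted."""
    return value & 0xFFFF

dst = 0xFFFA

# lda dst+16: the intermediate value is $1000a, the emitted operand is $000a
plain = emit16(dst + 16)

# lda (dst+16)>>1: the shift still sees the full $1000a, so the result is $8005
shifted = emit16((dst + 16) >> 1)

# Truncating before the shift (what a pure 16 bit evaluator would do) gives $0005
early = emit16(emit16(dst + 16) >> 1)

print(f"${plain:04x} ${shifted:04x} ${early:04x}")
```

The third value shows what the text warns about: the difference only appears when an intermediate
result leaves the 16 bit range before a later operator brings it back.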
1.1.1 Numbers - constants
-------------------------
        ccc     - decimal constant (0-9)
        $ccc    - hexadecimal constant (0-9,a-f)
        %ccc    - binary constant (0-1)

1.1.2 Operators
---------------

1.1.2.1 Arithmetic operators
----------------------------
Arithmetic operators. They work directly with numbers :)

        ( )     a*(b+c) - process expression in brackets first
        ~       ~a      - bit NOT
        +       a+b     - add
        -       a-b     - subtract
        /       a/b     - divide
        *       a*b     - multiply
        %       a%b     - remainder from a/b
        &       a&b     - bit AND
        |       a|b     - bit OR
        ^       a^b     - bit XOR
        <<      a<<n    - bit shift left
        >>      a>>n    - bit shift right

1.1.2.2 Boolean operators
-------------------------
These operators return 0 or 1 depending on the expression. They can also be used in a normal
expression where it makes sense.

        !       !a      - logical not (0->1 and 1->0)
        &&      a&&b    - returns 1 if a and b are non-zero
        ||      a||b    - returns 1 if a or b are non-zero
        <       a<b     - returns 1 if a<b else 0
        >       a>b     - returns 1 if a>b else 0
        <=      a<=b    - returns 1 if a<=b else 0
        >=      a>=b    - returns 1 if a>=b else 0
        =       a=b     - returns 1 if a equals b
        !=      a!=b    - returns 1 if a does not equal b

1.1.2.3 Misc operators
----------------------
        <       LO BYTE - returns LO byte from word (returns 8 bit)
        >       HI BYTE - returns HI byte from word (returns 8 bit)

As you can see, < and > can be used for two different actions. How does Asm resolve which one you
mean ? Very simple. If > or < is used in the form A > B it means a BOOLEAN test. If it is used in
the form >A or <A it means HI or LO byte.

        .if ...>1
                lda #<irq_test
                ldx #>irq_test
        .else
                lda #<irq
                ldx #>irq
        .endif
                sta $fffe
                stx $ffff

1.1.2.4 Operator precedence
---------------------------
The table of operator precedence follows. The operators with the highest precedence (processed
first) come first. If several operators are on one line, they're processed left->right.

        Operator group
        --------------
        function_call, others
        ( )
        - (negative sign)  !  ~
        *  /  %
        +  -
        LO/HI
        <<  >>
        <  <=  >  >=
        =  !=
        &
        ^
        |
        &&
        ||

Because of precedence, these expressions are processed as follows:

        expression              processed as
        --------------------------------------------------------------------
        lda #1+const*8          1+(const*8)
        lda #...

        ...

        .if ...>=3
                sta $d500+(channel-3)*7         ; address second SID
                sta $d500+(channel-3)*7
        .else
                sta $d401+channel*7
                sta $d401+channel*7
        .endif
        .endmacro

Macros are very powerful. A macro can use other macros inside, but it can't define another macro
or variable. Referencing a macro multiple times can cause problems with labels. If we have a macro
like:

        ; zero whole SID
        zero_whole_sid  .macro
                        ldx #0
                        txa
        loop            sta $d400,x
                        inx
                        bne loop
                        .endmacro

If we reference the macro a few times, we get an error about 'already defined symbols'. That is
clear. Why aren't all symbols 'automatically' private ? Because sometimes you need to reference
macro labels. How to solve it ? We have 3 ways:

1. use 'bne *-4' instead of 'bne loop' - this is usable only for simple macros

2. use private/public (see 'sharing')

        ; zero whole SID
        zero_whole_sid  .macro
                        .private
                        ldx #0
                        txa
        loop            sta $d400,x
                        inx
                        bne loop
                        .public
                        .endmacro

   This fully solves our problem. But we can't reference 'loop' (in this case we don't need to, but
   sometimes it is useful). We can of course use '.export'. And of course we can pass the 'symbol'
   which we want to access as a macro parameter.

3. use private/public and pass a 'prefix' as parameter - only for cases where we need to 'access'
   macro symbols (special cases - some tables, and similar)

        ; zero whole SID
        zero_whole_sid  .macro ...
                        .if ::cnt>0
                        .private ::1
                        .else
                        .private
                        .endif
                        ldx #0
                        txa
        loop            sta $d400,x
                        inx
                        bne loop
                        .public
                        .endmacro

   And references:

                        zero_whole_sid one
                        ...
                        ...
                        ...
                        zero_whole_sid two

   Now we can reference one@loop and two@loop if necessary. But if we don't need to reference its
   internals we can just write

                        zero_whole_sid

1.4.3 General
-------------
General group...

1.4.3.1. error
--------------
Aborts assembling with a user error. Sometimes very useful.
Mainly for checking constant bounds and similar stuff.

        .error "error showed by assembler"

1.4.3.2. allign/allign_swap/allign_swap_end
-------------------------------------------
Align the next line on the boundary defined by an expression.

        .allign expression

        .allign_swap [flags] [#expression]
        ; data which can be moved in memory
        .allign_swap_end

flags:
        unit - the swap block forms a unit and can't be divided
               (if not given, the block can be divided at every label)

The expression is processed as a number. If you align on $100, the next instruction will start on
an $xx00 boundary. You can also align on an EXACT address, like $2fff. Physical aligning is
discussed later. You need to include #expression to tell the assembler where the block needs to be
placed.

Because aligning can waste memory, it is recommended to set up an .allign_swap / .allign_swap_end
block. Data in this block can be MOVED into the alignment gap to minimize wasted memory. The
assembler assumes that the data in the block can be moved in parts delimited by labels ! If you
need to move the WHOLE block together, add the 'unit' flag.

For many reasons, the assembler MOVES ALL .allign_swap blocks to the END ! So in reality
everything in .allign_swap blocks ends up at the end. If you change the memory location (using the
* operator), asm flushes all .allign_swap blocks which are in the queue. You can align at any
place, including macros, repeats, whatever. You can use .allign inside an .allign_swap block.
.allign_swap blocks can't be nested.

Example: we're going to execute some raster thingy and we can't cross a $100 boundary, so we need
to be aligned at $xx00 (crossing costs 1 additional cycle)

        .allign_swap
        ; some static data, we do not care where it ends up
        colors1         .byte $00,$0b,$0c,$0f,$0e,$01
        colors2         .byte $a1,$a3,$b1,$c6
        .allign_swap_end
                        ....
                        sta $01
                        rts             ; end of 'normal' routines
        ;*
        .allign $100
        ; raster starts here, or some disc communication - whatever needs EXACT cycling
                        ldx #$00
                        ....
                        rts

At the position marked * memory can be wasted because of aligning. Up to (align-1) bytes.
But because we used the .allign_swap and .allign_swap_end block, this PLACE will be filled with
'colors1' and/or 'colors2'. If there is ENOUGH room, both will be MOVED here. If not, only one of
them. As you can see, colors1 and colors2 can't be used at our align place. The assembler
schedules the swapping of these blocks well.

Aligning to a physical address. If you need to align some code to a physical address, you have to
set up a block with an address. In this case the assembler needs to 'terminate' the instruction
stream and put the requested block inside. It terminates the instruction stream and places
'jmp after_inserted_block' there. If there is currently no instruction stream but some other emit
(.byte, ....), the assembler generates an ERROR. The assembler can't guess where an emit can be
terminated. So the place of termination needs to be preceded by at least 3 valid CPU instructions.
This feature is really used only in special, very special cases :)

Example:

        ; we need to PUT $aa at address $2fff but $2000-$3fff is code... what to do ?
                        *=$2000
        .allign_swap #$2fff
                        .byte $aa
        .allign_swap_end
                        ..... ; code, code, code
                        .....

At $2ffc (or $2ffb or $2ffa - depends on the size of the last instruction) 'jmp $3000' will be
placed. Then zeroes follow if there is a gap, and at address $2fff comes our align block. So we
have $aa at $2fff.

***WARNING*** using any form of align with zero page produces an error message

1.4.3.3. offset
---------------
Sets the offset between the code generation address and the 'output' address. Sometimes it is
useful to assemble a program for a different location.

        .offset expression

Normally the offset is 0. You can use the '*' variable in the expression - it will help you. You
will probably also need a bit of arithmetic around it :) See the example.

Example:

        ; some code...
                        ....
                        stx $01
                        rts
        ; now follow the 'packer' routines, moved to $0200, but now they're at $abcd
        pack_routine    =*
                        *=$0200
                        .offset pack_routine-*
        pack_go         lda #$00
                        ldx #$03
                        ......
                        rts
        ; ok, the routines end here... we need to kick the offset back to zero and count the new *
        pack_length     =*-pack_go
                        *=pack_routine+pack_length
                        .offset 0
        ; code continues....
        finish          ldx #$00
        finish_loop     lda pack_routine,x      ; lda $abcd,x
                        sta pack_go,x           ; sta $0200,x
                        inx
                        bne finish_loop         ; assume length $100...
                        jmp pack_go             ; jmp $0200

Very simple and useful. For disc loaders, or wherever you need different addresses for assembling
and running.

1.4.3.4. start
--------------
Sets the address where the code should be executed. Not all output formats can hold a start
address. For ex. a 'memory image' can't hold it. But the obj format which is used for transferring
can hold it. Also, some output formats REQUIRE the use of .start. For example 'run-cruncher' needs
to know where to jump after decompression...

        .start expression

If you do not use .start, $0000 is assumed.

1.4.3.5. symbols
----------------
Sometimes it is necessary to tell asm some information about a symbol, typically when the code
used is located in zero page. Circular references may occur and you will get errors about
inconsistent sizes/instruction lengths during passes. Sometimes you wish to use the long
instruction forms even if the short ones can be used.

        name .symbol options

sets the symbol options for the given symbol 'name'. Available options:

        force16 - force 16 bit addressing (if possible)
        force8  - force 8 bit addressing (if possible)

The symbol needs to be UNDEFINED when setting its options.

Example:

        option  = $a5
                lda option      -> will emit a5 a5

        option  .symbol force16
        option  = $a5
                lda option      -> will emit ad a5 00

This forcing is not hard - it means that if you use force8 on a 16 bit value, the assembler
silently uses the 16 bit form. This is because some instructions don't have both 8 and 16 bit
addressing. If you want to be sure about the emitted form, use an INSTRUCTION modifier rather than
a symbol option. If an expression combines several 'forcing' symbols, 16 bit always wins.
Another example:

        s1      .symbol force16
        s2      .symbol force8
        s1      =$0000
        s2      =$fe
                lda s1+s2       -> $ad $fe $00
                lda s1+1        -> $ad $01 $00
                lda s2+1        -> $a5 $ff
                lda s2+2        -> $ad $00 $01

1.4.4 Including
---------------
Group of specials used for including text/binary files.

1.4.4.1. include
----------------
Includes a text file, which is processed normally (before the lines after the include directive).
The file is processed exactly the same way as if it were written in place of the .include line. It
is recommended to use the .private .public .import .export specials.

        .include "path_to_file"

1.4.4.2. load
-------------
Loads a file. This is a universal thingy. Every 'format' can consist of one or more data blocks.
The majority of formats have one data block.

        .load "file_path" [format_name] [flg_silent] [flg_query] [get.type[.block] variable] [set.type[.block] expression]

If the expression includes SPACES you HAVE TO enclose it in '( )' - like
"set.address (expression where you used spaces)"

When query is used, NO DATA is emitted. But everything else works as with data. If silent is not
specified, the directive changes the current pc. For formats which include more blocks, the PC is
equal to the LAST block ! The last block depends on the 'format' - so for multi-block formats it is
recommended to use *.

By using get you can fetch special information depending on the format or additionals. The symbol
'variable' is CREATED and its value is set. You can specify a block name if needed.

Standard variables:

        name    format  meaning
        -----------------------------------------------
        addr            start memory pos
        size            physical size
        end             address of _LAST_ byte occupied in memory
        init            init address
        play            play address
        sscnt           subsong count
        ssdef           default subsong
        author          song author
        copy            song copyright
        name            song name

set works similarly to get. An expression is used instead of a variable. It sets a block property.
First block (or for address for given type of block. The first block is equal to 'at'. If nothing
is given, the block normally continues after the previous one.
If 0 is given, the block is 'ignored' and is NOT included.

Standard types:

        name    format  meaning
        -----------------------------------------------
        addr            set block memory position
        size            set physical size
        offs            set offset from file

If you wish to load some standardized format, you can specify it. If the format is not given, a
'c64' standard file is assumed.

Default supported formats:

        format_name     dest - what is loaded
        ---------------------------------------------
        psid            PSID song format; the format is read and load address / data / size
                        are fetched from the file
        binar           load address = 0, size = real file size
        c64 (default)   load address = first 2 bytes, size = real size-2
        img             image is converted to physical C64 valid pixels (8x8 pixels -> 1 C64 char)
                        its size has to be (in both axes) divisible by 8
                        in the Y axis, the last 'line' is additional, where pixels from the left
                        have to be plotted with the 'right' color (hires: 2 pixels, multi: 4, etc.)

Example:

                        .load "music.sid" format psid
                        .start *
                        sei             ; this is valid, we don't need to set * because .load did it...
                        ....            ; our code always starts at the end of the song
        ; now start other data
        table           .word 1,2,3,4
        music2          .load "music2.dat" set.addr *   ; we can move this song l8r to the right location

If you write '.load "some_file" format binar' it will be loaded from $0000. That is probably not
what you wanted.

1.4.4.3. insert
---------------
Exactly like load, but by default the load address is equal to * (so it is inserted at the current
pc position).

1.4.5. Sharing
--------------
This group allows sharing between several 'independent' files - their symbols and similar. These
directives are mainly meant to work with the .include directive, but they're also useful standalone.

1.4.5.1. public
---------------
All newly defined symbols/macros/... are 'public' - shared. private & public belong together as a
group. Every .public has to belong to a .private directive. In case of nested private/public,
.public means 'public' in the previous context, not the global one.

        .private
        something
        .private
        something else
        .public
        label           <- this label is not GLOBAL but is public in the context of the first .private directive
        .public

1.4.5.2. private
----------------
All newly defined symbols/macros/... are 'private'. Other files don't see them. Private symbols are
created from macros, repeats, includes, etc. Their prefixes are connected together by '@' and are
ordered left to right in time. It means that if you define a macro in an included file, the prefix
order will be macro@include_file. Every part between @ consists of:

        K:file:line:count       or      K:file:line-uniq:count

where
        K       is the kind - m for macro
                              i for included file
                              r for repeat
        line    is the source line
        uniq    is a unique index for repeated blocks (e.g. repeats)
        count   is the current included file count

It is not important how the prefixes for symbols are created, but it is useful to know when you
have symbols and are looking for something. NEVER reference these automatic symbols directly in
sources. Prefixes may change at any time. In fact, due to the use of ':' and '-' you actually
can't ;)

You can also define your OWN prefixes - and reference them easily and directly. Useful in macros,
etc. If you use a user prefix, YOU have to take care of uniqueness. These user prefixes always
belong to the current 'scope'. They are not 'global'.

Command form:

        .private ["user_prefix"]

Private can be nested:

        .private "test"
        .private "foo"
        xxx     .byte 12
        .public
        .public

The caller can reference 'xxx' as foo@test@xxx.

Always use 'public' and 'private' in pairs.

1.4.5.3. export
---------------
Overrides the scope of not yet defined symbols - and puts them into the 'upper' level of private
(or public if private is one-level). Symbol names can use the standard * convention.

        .export symbol[,symbol2,....]

1.4.5.4. import
---------------
Overrides the scope of already existing symbols - and puts them into the 'upper' level of private
(or public if private is one-level). Symbol names can use the standard * convention.

        .import symbol[,symbol2,....]

Example: We have a some.ass file which includes test.ass.
This file 'exports' 2 functions: Beer_create and Beer_drink. It also exports a constant called
Beer_max, and its assembling can be changed by the Beer_flags symbol. How to process it ?

;---------------------------------------------------------------------------------------------------- some.ass
; some code
Beer_flags      = whatever              ; set properties
                .include "test.ass"
Beer_heap       .rep Beer_max           ; normally use exported 'symbol'
                .byte 0
                .endrep
                jsr Beer_create
                jmp some_loop           ; this 'some_loop' will not conflict with 'some_loop' defined @ test.ass

;---------------------------------------------------------------------------------------------------- test.ass
; we are at normal scope, put exported constants etc. here
Beer_max        = 20
                .export Beer_create,Beer_drink  ; tell asm that these labels will not be affected by .private [these symbols do not exist yet]
                .import Beer_flags              ; tell asm that Beer_flags .... [this symbol is already defined]
; switch to private, everything defined here is only our 'own'
                .private
Beer_create     lda #Beer_flags         ; without .import this symbol would be undefined
                jsr init
                jsr something_else
some_loop       jsr do_something        ; some_loop doesn't conflict with some_loop at some.ass
                cmp #4
                beq some_loop
                rts
Beer_drink      jmp $fce2               ; without .export this would in reality be randomtrash@Beer_drink -
                                        ; some.ass would not be able to reference this private symbol

Simple but powerful for bigger projects. Isn't it ? (o:

1.5. Instruction modifiers
===================================================================================================
Instruction modifiers set the instruction format. In most cases it is chosen automatically, but
sometimes you need to specify it by hand. Remember, this is REALLY REALLY for special use. You will
probably never need it :)

Some instruction modifiers can be combined. They follow after the instruction itself.

        Modifier        when to use     meaning
        ---------------------------------------------------------------------------------------
        long            addressing      the size of the instruction's address operand is ignored
                                        and the instruction gets the long address format
        short           addressing      the reverse...

Note: the short modifier doesn't produce an error message if the address used is > 255. It quietly
truncates the value....

1.5.1 Example - addressing memory
---------------------------------
        start   = $0010
                lda start

Because the size of the lda instruction is unknown, the 'start' expression is processed as 16 bit.
Then asm tests whether the expression is < 256. In this case it is, so 'lda' is assembled as
"lda $10" ($a5 $10). If you need the long format ($ad ...) you have 2 ways. Set 'start' to a value
> 255. It is the best way. But sometimes you can't do this, so you can modify the instruction size
by adding an instruction modifier. So in this case you can use:

        start   = $0010
                lda start long
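
The size selection described in this example can be summarised as a small model. The Python sketch
below is only an illustration under stated assumptions: pick_lda is a hypothetical helper, not
something the assembler provides, and it covers only plain "lda address" with the long and short
modifiers. The only hard facts used are the 6510 opcodes $a5 (lda zero page) and $ad (lda absolute)
and the low-byte-first operand order.

```python
# Model of how the operand size of "lda address [modifier]" is chosen.
# pick_lda is a hypothetical helper for illustration, not part of the assembler.

LDA_ZP, LDA_ABS = 0xA5, 0xAD    # 6510 opcodes: lda zero page / lda absolute

def pick_lda(value, modifier=None):
    """Return the list of bytes emitted for 'lda value [modifier]'."""
    value &= 0xFFFF                                  # expression is truncated at the end
    if modifier == "short":
        return [LDA_ZP, value & 0xFF]                # quietly truncates, no error
    if modifier == "long" or value > 0xFF:
        return [LDA_ABS, value & 0xFF, value >> 8]   # operand is stored low byte first
    return [LDA_ZP, value]

def show(code):
    """Format emitted bytes the way the manual writes them."""
    return " ".join(f"${b:02x}" for b in code)

start = 0x0010
print(show(pick_lda(start)))            # $a5 $10
print(show(pick_lda(start, "long")))    # $ad $10 $00
print(show(pick_lda(0x0123, "short")))  # $a5 $23
```

The last line shows the case from the note on the short modifier: the high byte is dropped
without any error message.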