{"id":978,"date":"2018-12-02T15:41:13","date_gmt":"2018-12-02T23:41:13","guid":{"rendered":"https:\/\/www.pagetable.com\/?p=978"},"modified":"2018-12-02T15:41:13","modified_gmt":"2018-12-02T23:41:13","slug":"a-visual-defragmenter-for-the-commodore-64","status":"publish","type":"post","link":"https:\/\/www.pagetable.com\/?p=978","title":{"rendered":"A Visual Defragmenter for the Commodore 64"},"content":{"rendered":"<p>For decades, PC users have been able to relax by watching the computer defragment a disk. Now C64 users can do the same! Introducing &#8220;defrag1541&#8221;, a disk defragmentation tool for C64 and 1541.<\/p>\n<p>\n<!--\n<video autoplay loop muted inline>\n<source src=\"docs\/defrag1541\/defrag1541.mp4\" type=\"video\/mp4\">\n<\/video>\n--><br \/>\n<img decoding=\"async\" src=\"docs\/defrag1541\/defrag1541.gif\"\/>\n<\/p>\n<ul>\n<li>Download: <a href=\"https:\/\/github.com\/mist64\/defrag1541\/blob\/master\/defrag1541.prg\">defrag1541.prg<\/a> (9 blocks)<\/li>\n<li>Source: <a href=\"https:\/\/github.com\/mist64\/defrag1541\">github.com\/mist64\/defrag1541<\/a>\n<\/ul>\n<p>defrag1541 consists of 49 lines of BASIC, takes about 15-30 minutes to defragment a typical disk, and visualizes its work using colored blocks on the screen. The different colors symbolize the different files, and the character shapes show whether the block is at its original (round) or final (square) position. (Note that this tool is meant for visualization only, do not use it on real data!)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"docs\/defrag1541\/defrag1541_source.png\" width=\"640\" height=\"400\"\/><\/p>\n<p>In the remainder of this article, I will describe how the program works.<\/p>\n<h2>The 1541 Disk Format<\/h2>\n<p>Commodore 1541 floppy disks use the CBM filesystem. A disk consists of 683 sectors of 256 bytes, which are addressed as a track\/sector tuple. There are<\/p>\n<ul>\n<li>35 tracks, numbered 1-35<\/li>\n<li>with 21 to 17 sectors each, numbered from 0<\/li>\n<\/ul>\n<p>The number of sectors depends on the track, higher tracks (closer to the center) have fewer sectors:<\/p>\n<table border=\"1\">\n<th>Tracks<\/th>\n<th>Sectors<\/th>\n<tr>\n<td>1-17<\/td>\n<td>21<\/td>\n<\/tr>\n<tr>\n<td>18-24<\/td>\n<td>19<\/td>\n<\/tr>\n<tr>\n<td>25-30<\/td>\n<td>18<\/td>\n<\/tr>\n<tr>\n<td>31-35<\/td>\n<td>17<\/td>\n<\/tr>\n<\/table>\n<p>The following table visualizes this: The columns are the tracks, the lines are the sectors. The higher numbered tracks don&#8217;t have as many sectors.<\/p>\n<pre>\n              11111111<b>1<\/b>112222222222333333\n   T012345678901234567<b>8<\/b>900123456789012345\n S\n 0  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 1  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 2  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 3  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 4  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 5  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 6  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 7  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 8  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n 9  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n10  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n11  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n12  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n13  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n14  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n15  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n16  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOOOOOOO\n17  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOOOOOOOO\n18  OOOOOOOOOOOOOOOOOO<b>O<\/b>OOOOOOO\n19  OOOOOOOOOOOOOOOOOO\n20  OOOOOOOOOOOOOOOOOO\n<\/pre>\n<p>Track 18 is special, it holds all the metadata. 18\/0 contains the disk name and the Block Availability Map (BAM), which uses one bit per block to keep track of which blocks are allocated, and the remaining sectors contain 32 byte directory entries, so 8 directory entries per block. Every directory entry consists of a file type, a 16 byte file name, and a track\/sector tuple for the first block of the file.<\/p>\n<p>Every file is stored as a linked list. Every block holds the track\/sector tuple of the next block in its first two bytes, followed by 254 data bytes. A track value of 0 indicated the last block of a file.<\/p>\n<h2>The Data Model<\/h2>\n<p>The easiest way to look at the problem of defragmenting a disk is to imagine all files on the disk as concatenated into one long file that needs to be defragmented. Now every used block on disk has a logical block index. Logical block 0 is the first block of the first file, block 1 is the second block of the first file \u2013 if the first file was two blocks or more. Otherwise it&#8217;s the first block of the second file, and so on.<\/p>\n<p>So for every physical block, we need to know what logical block is stored there. Then we build the desired mapping: For every physical block, what logical block <i>should<\/i> be stored there. Our core defragmenter code then moves sectores around and updates links until the two mappings are identical.<\/p>\n<h2>The Data Structures<\/h2>\n<p>At the very beginning, we are defining the arrays. We do this early, so we can&#8217;t run into <tt>OUT OF MEMORY<\/tt> errors at runtime. In fact, this program occupies about 99% of the 38911 bytes available to BASIC on the C64.<\/p>\n<pre>\n10 dimds(17):dimn(143):dimcc(143):dimci(682):dimic(682):dimcj(682):dimjc(682)\n15 dimtr(682):dimsc(682):dimaa(682):dimdd(682):dimoo(682):dimss(682):n$=chr$(0)\n<\/pre>\n<p>The following table contains a small overview of the arrays. We will talk about them more as they are used in the code.<\/p>\n<table border=\"1\">\n<tr>\n<td><tt>ds(17)<\/tt><\/td>\n<td>array of sector numbers of dentry chain<\/td>\n<\/tr>\n<tr>\n<td><tt>n(143)<\/tt><\/td>\n<td>map dentry index -> number of blocks<\/td>\n<\/tr>\n<tr>\n<td><tt>cc(143)<\/tt><\/td>\n<td>map dentry index -> color<\/td>\n<\/tr>\n<tr>\n<td><tt>ci(682)<\/tt><\/td>\n<td>map physical block index -> logical block index &#8211; current<\/td>\n<\/tr>\n<tr>\n<td><tt>ic(682)<\/tt><\/td>\n<td>inverse of ci()<\/td>\n<\/tr>\n<tr>\n<td><tt>cj(682)<\/tt><\/td>\n<td>map physical block index -> logical block index &#8211; desired<\/td>\n<\/tr>\n<tr>\n<td><tt>jc(682)<\/tt><\/td>\n<td>inverse of cj()<\/td>\n<\/tr>\n<tr>\n<td><tt>tr(682)<\/tt><\/td>\n<td>map block index -> track<\/td>\n<\/tr>\n<tr>\n<td><tt>sc(682)<\/tt><\/td>\n<td>map block index -> sector<\/td>\n<\/tr>\n<tr>\n<td><tt>aa(682)<\/tt><\/td>\n<td>map block index -> screen ram location<\/td>\n<\/tr>\n<tr>\n<td><tt>dd(682)<\/tt><\/td>\n<td>map logical block index -> dentry<\/td>\n<\/tr>\n<tr>\n<td><tt>oo(682)<\/tt><\/td>\n<td>map logical block index -> offset<\/td>\n<\/tr>\n<tr>\n<td><tt>ss(682)<\/tt><\/td>\n<td>stack<\/td>\n<\/tr>\n<\/table>\n<h2>The UI<\/h2>\n<p>We create a border around a 35&#215;21 character area on the screen, so that every block can be represented by a character.<\/p>\n<pre>\n20 a=1024:b=1060:poke646,15:v=53280:pokev,0:pokev+1,15:printchr$(147):pokev+1,0\n25 pokea,112:pokea+22*40,109:fori=1to35:pokea+i,67:pokea+22*40+i,67:next\n30 pokeb,110:fori=1to21:pokea+40*i,66:pokeb+40*i,66:next:pokeb+22*40,125\n<\/pre>\n<p>The array <tt>cc()<\/tt> will map a file index to a color. We cycle through the C64 palette skipping black, which is the background color.<\/p>\n<pre>\n35 j=1:fori=0to143:cc(i)=j:j=j+1:ifj=16thenj=1\n40 next\n<\/pre>\n<h2>Reading the Current State<\/h2>\n<p>As a first step, we need to find out where data is stored on disk, i.e. for every physical block, we need to know what logical block (or none) is currently stored there.<\/p>\n<p>The following code iterates over all directory entries and follows the linked lists of all files, recording in the array <tt>ci()<\/tt> the logical block index for each physical block. The reverse mapping (at what physical location is the logical block stored?) is built in the array <tt>ic()<\/tt>. The array <tt>nb()<\/tt> collects the number of blocks each file occupies.<\/p>\n<p>In addition, we are keeping track of the sectors on track 18 that are used for directory entries. We will need this information later to update the directory entries when blocks have moved.<\/p>\n<p>Since we are using linear block indexes for everything, we need to convert the track\/sector tuples that the filesystem works with. This is done by the <tt>t2<\/tt> comparisons in the second half of the code block.<\/p>\n<pre>\n45 open1,8,15,\"i\":fori=0to682:ci(i)=-1:ic(i)=-1:next:s=1:open2,8,2,\"#\"\n50 ds(di)=s:ci(357+s)=-2:di=di+1:print#1,\"u1 2 0 18\"s:get#2,t$,s$:t1=asc(t$+n$)\n55 s1=asc(s$+n$):fori=0to255step32:print#1,\"u1 2 0 18\"s:poke1082+s*40,4\n60 print#1,\"b-p 2\"i+2:get#2,f$:f=asc(f$+n$):iff<>129andf<>130goto105\n65 get#2,t$,s$:t2=asc(t$+n$):s2=asc(s$+n$):ift2=0goto105\n70 print#1,\"u1 2 0\"t2;s2:poke1064+t2+s2*40,81\n75 poke55336+t2+s2*40,cc(ni):ift2<18thena=s2+(t2-1)*21:goto95\n80 ift2<25thena=s2+(t2-18)*19+357:goto95\n85 ift2<31thena=s2+(t2-25)*18+490:goto95\n90 a=s2+(t2-31)*17+598\n95 ci(a)=cd:ic(cd)=a:n(ni)=n(ni)+1\n100 cd=cd+1:goto65\n105 ni=ni+1:next:s=s1:on-(t1<>0)goto50:ci(357)=-2\n<\/pre>\n<p>The <tt>ci(357)<\/tt> and <tt>ci(357+s)<\/tt> assignments mark the directory sectors as occupied, so that the defrag algorithm can&#8217;t use them as temporary storage.<\/p>\n<p>As this code is walking the linked lists, it shows them on the screen in the form of a bullet, in a color that represents the file index.<\/p>\n<h2>Helper Data Structures<\/h2>\n<p>We have to do some programmatic work, which will take about a minute, so we draw progress bars. For this, we first draw &#8220;[&#8221; and &#8220;]&#8221; characters in an empty line at the bottom of the screen.<\/p>\n<pre>\n110 poke1944,27:poke1980,29\n<\/pre>\n<p>When updating links during defragmentation, we will need to convert linear block indexes back into track\/sector tuples, so we create the two arrays <tt>tr()<\/tt> and <tt>sc()<\/tt> for a quick lookup. In addition, in <tt>aa()<\/tt> we calculate the screen position for the character that corresponds to the block.<\/p>\n<pre>\n115 i=0:fort=1to35:readns:fors=0tons:tr(i)=t:sc(i)=s:aa(i)=1064+t+s*40:i=i+1\n120 next:poke1944+t,46:next:data20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20\n125 data20,18,18,18,18,18,18,18,17,17,17,17,17,17,16,16,16,16,16\n<\/pre>\n<p>Later, we will also need to know for every logical block index what file it belongs to and which index it has within the file. This will be stored in <tt>dd()<\/tt> and <tt>oo()<\/tt>.<\/p>\n<pre>\n130 ford=0to143:ifn(d)<>0theno=0:forj=1ton(d):dd(l)=d:oo(l)=o:l=l+1:o=o+1:next\n135 poke1945+int(d\/4.2),81:next\n<\/pre>\n<h2>Desired Allocation<\/h2>\n<p>We have to build a desired allocation array, which has the same format as the current allocation array, just with the data about where we would like the blocks to be. The array <tt>cj<\/tt> maps physical blocks to logical blocks, and <tt>jc()<\/tt> is the reverse of this.<\/p>\n<p>For simplicity, we are just filling the disk linearly, starting with track 1, and not using an interleave. This doesn&#8217;t make the defragmenter very useful, because the disk might actually read more slowly after this, but it makes for a very clean visualization!<\/p>\n<pre>\n140 fori=0to682:cj(i)=-1:jc(i)=-1:next:c=0:b=0:fori=0to143:ifn(i)=0goto155\n145 forj=1ton(i):ifc>=357andc<376thenc=376\n150 cj(c)=b:jc(b)=c:c=c+1:b=b+1:next\n155 poke1945+int(i\/4.2),160:next:fori=357to375:cj(i)=ci(i):next\n<\/pre>\n<p>The code could be easily extended to use a more reasonable strategy. The 1541 operating system for example picks the empty block closest to track 18 as the first block for a new file, then stays on the same track as long as there are free blocks, preferably with an interleave of 10. Then it goes on to the closest track with an empty sector again.<\/p>\n<p>We're done thinking, so let's clear the progress bar:<\/p>\n<pre>\n160 fori=0to34:poke1945+i,32:next:poke1944,32:poke1980,32:sl=0:i=0\n<\/pre>\n<h2>The Main Algorithm<\/h2>\n<p>Now we iterate over all physical blocks and make sure the current allocation matches the desired allocation.<\/p>\n<ul>\n<li>If the current allocation matches the desired allocation, do nothing.<\/li>\n<li>If the block is currently empty, just move the correct block there.<\/li>\n<li>Otherwise, we need to move the logical block that is there to its new location first \u2013 using recursion!<\/li>\n<\/ul>\n<p>For recursion, we use the array <tt>ss()<\/tt> as the stack and <tt>sl<\/tt> as the stack pointer. Recursion can cause a cycle though, e.g. when logical block A is stored where logical block B should be stored and vice versa. So if the block we are looking at is the same as the bottom of the stack, we have to break the cycle.<\/p>\n<p>The following code checks for block identity, empty block as well as a cycle:<\/p>\n<pre>\n165 on-(ci(i)=cj(i))goto190:on-(ci(i)=-1)goto185:on-(sl=0orss(0)<>i)goto180\n<\/pre>\n<p>To break the cycle, we need the help of a temporary block. We find a free block and move the current block there (subroutine <tt>225<\/tt>, we'll talk about it later), then continue.<\/p>\n<pre>\n170 forj=0to682:ifci(j)<>-1thennext\n175 i1=i:i2=j:gosub205:goto165\n<\/pre>\n<p>For recursion, we push the current index on the stack and continue with the physical block that should contain the data that the current block currently contains.<\/p>\n<pre>\n180 ss(sl)=i:sl=sl+1:i=jc(ci(i)):goto165\n<\/pre>\n<p>In case the current block is empty, we move it using subroutine <tt>225<\/tt>.<\/p>\n<pre>\n185 i1=ic(cj(i)):i2=i:gosub205\n<\/pre>\n<p>At the end of the loop, continue with the value from the top of the stack, if there is any, otherwise we take the next block in sequence.<\/p>\n<pre>\n190 ifci(i)>=0thenpokeaa(i),160\n195 ifsl<>0thensl=sl-1:i=ss(sl):goto165\n200 i=i+1:on-(i<683)goto165:close2:close1:open1,8,15,\"v\":end\n<\/pre>\n<p>When we are done, we send a \"V\" (validate) command to the drive, which will follow the links of all files and recreate the block availability map. For simplicity, our code never updates this table, so we have the operating system do it for us at the end.<\/p>\n<h2>Moving a Block<\/h2>\n<p>In order to move a block, in addition to copying the data between the two physical blocks and updating our data structures, we also have to update the track\/sector link tuple from the previous block or the directory entry.<\/p>\n<p>This reads the block into a buffer and writes it to a different location without transfering it to the computer:<\/p>\n<pre>\n205 pokeaa(i1),18:print#1,\"u1 2 0\"tr(i1)sc(i1):pokeaa(i2),23\n210 pokeaa(i2)+54272,cc(dd(i2)):print#1,\"u2 2 0\"tr(i2)sc(i2)\n215 pokeaa(i1),32:pokeaa(i2),160\n<\/pre>\n<p>The <tt>POKE<\/tt> instructions update the screen.<\/p>\n<p>To update the parent link, we have to find out whether this is the first block in a file, in which the parent link is located in the directory entry.<\/p>\n<pre>\n220 c=ci(i1):o=oo(c):t=tr(i2):s=sc(i2):ifo<>0goto240\n<\/pre>\n<p>If that's the case, we find the correct directory entry, read the block and write the new link.<\/p>\n<pre>\n225 d=dd(c):s1=ds(int(d\/8)):pp=peek(aa(357+s1)):pokeaa(357+s1),0\n230 print#1,\"u1 2 0\"18;s1:print#1,\"b-p 2\"(dand7)*32+3\n235 print#2,chr$(t)chr$(s);:print#1,\"u2 2 0\"18;s1:pokeaa(357+s1),pp:goto250\n<\/pre>\n<p>Otherwise, we read the preceding block in the file and update the link there.<\/p>\n<pre>\n240 a=ic(c-1):tt=tr(a):ss=sc(a):print#1,\"u1 2 0\"tt;ss:pp=peek(aa(a))\n245 pokeaa(a),0:print#2,chr$(t)chr$(s);:print#1,\"u2 2 0\"tt;ss:pokeaa(a),pp\n<\/pre>\n<p>Finally, we update the data structures to reflect the fact that this physical block now contains the new data, and the block we moved it from is now empty.<\/p>\n<pre>\n250 v=ci(i1):ic(v)=i2:ci(i2)=v:ci(i1)=-1:return\n<\/pre>\n<h2>Limitations<\/h2>\n<p>defrag1541 is not meant as a serious tool. Nevertheless, here are some limitations and ideas for improvement:<\/p>\n<ul>\n<li>The algorithm for the desired allocation was chosen for its looks, not to make disks actually faster.<\/li>\n<li>Optionally, the algorithm could also place data blocks on track 18. That would give the user 17 more blocks on the disk.<\/li>\n<li>That said, disks that currently have data blocks on track 18 can't get defragmented with the current algorithm. It avoids track 18 and will run out of space.<\/li>\n<li>The current code does not test for read errors at all.<\/li>\n<li>We also do not update BAM ourselves and rely on the operating system doing it at the end. While the code is written so that interrupting the process at any point will not lead to data loss, it is important to run the VALIDATE command at the end if data should ever be written to the disk again.<\/li>\n<li>Of course it could be much faster. A lot of this is CPU bound, because BASIC is so slow. Rewriting everything in assembly would help. Some parts of the code could also directly run inside the disk drive. I doubt the complete data structures would ever fit into the 1541's 2 KB of RAM, so defragmentation can't completely run on the disk drive, but major parts of the logic could still live there, to minimize the lag introduced by communication between the computer and the drive.<\/li>\n<ul>\n","protected":false},"excerpt":{"rendered":"<p>For decades, PC users have been able to relax by watching the computer defragment a disk. Now C64 users can do the same! Introducing &#8220;defrag1541&#8221;, a disk defragmentation tool for C64 and 1541. Download: defrag1541.prg (9 blocks) Source: github.com\/mist64\/defrag1541 defrag1541 consists of 49 lines of BASIC, takes about 15-30 minutes to defragment a typical disk, &#8230; <a title=\"A Visual Defragmenter for the Commodore 64\" class=\"read-more\" href=\"https:\/\/www.pagetable.com\/?p=978\" aria-label=\"Read more about A Visual Defragmenter for the Commodore 64\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41,8,13,16],"tags":[],"class_list":["post-978","post","type-post","status-publish","format-standard","hentry","category-c64","category-commodore","category-floppy-disks","category-github"],"_links":{"self":[{"href":"https:\/\/www.pagetable.com\/index.php?rest_route=\/wp\/v2\/posts\/978","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pagetable.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pagetable.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pagetable.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pagetable.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=978"}],"version-history":[{"count":0,"href":"https:\/\/www.pagetable.com\/index.php?rest_route=\/wp\/v2\/posts\/978\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.pagetable.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=978"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pagetable.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=978"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pagetable.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=978"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}