How Commodore tapes work

What information is stored on a tape?

On a Commodore 64 tape, the waveform looks like this:

[Figure: tape waveform, with the triggers marked by red circles]

The C64 senses when the waveform goes from a value greater than zero to a value less than zero. This event is called a trigger. It sets the FLAG interrupt bit in CIA #1, which can raise an interrupt request (IRQ) to the CPU. The event can be handled by an interrupt handler, or simply detected by testing bit 4 of location $DC0D. The triggers are indicated by red circles in the figure.

The information is stored in the time interval between a trigger and the previous one. This time can be expressed with different time units: number of samples in the WAV file, number of microseconds, number of 6510 CPU clock cycles...

What information is stored in a TAP file?

A byte in the data part of a TAP file represents the duration of the interval between a trigger and the previous one. The unit is 8 CPU clock cycles. For C64 TAP files, this means 1/123156th of a second (in a PAL C64, the CPU frequency is 0.985248 MHz).
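
As an illustration, the conversion from a TAP data byte to a duration can be sketched as follows. This is a minimal sketch: the function and constant names are hypothetical, and one TAP unit is taken to be 8 CPU clock cycles, which is what the 1/123156 s figure implies for a 0.985248 MHz clock.

```python
PAL_C64_CLOCK_HZ = 985_248  # CPU frequency quoted in the text
CYCLES_PER_TAP_UNIT = 8     # assumption: one TAP unit lasts 8 CPU clock cycles


def tap_byte_to_cycles(value):
    """Duration in CPU clock cycles of a nonzero TAP data byte."""
    if not 1 <= value <= 255:
        raise ValueError("expected a nonzero byte value")
    return value * CYCLES_PER_TAP_UNIT


def cycles_to_microseconds(cycles, clock_hz=PAL_C64_CLOCK_HZ):
    """Convert a duration in CPU clock cycles to microseconds."""
    return cycles * 1_000_000 / clock_hz


print(tap_byte_to_cycles(0x30))  # 384
```

The same two helpers can be chained to express a pulse in microseconds, for instance cycles_to_microseconds(tap_byte_to_cycles(0x30)).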

TAP version 1 introduced a byte 0 with a special meaning: it precedes an interval expressed in 3 bytes, with a precision of 1 CPU clock cycle. In theory, this innovation would make it possible to create TAP files whose time measurements are 8 times as precise. In practice, such high precision is not needed. Therefore, the special byte is only used to store intervals longer than 2048 CPU clock cycles, or around 1/480th of a second, because those don't fit into 1 byte. No loader uses such long intervals: they only occur in pauses between one program and the next on the same tape.

That's ironic: a higher precision is used when it is least needed. Pauses do not carry any data, so precise timing is practically unnecessary.
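
The decoding of the data part of a version 1 TAP file into durations could look roughly like this. It is only a sketch: the function name is hypothetical, header parsing is left out, and the byte order of the 3-byte value (least significant byte first) is an assumption that the text above does not state.

```python
def decode_tap_v1_data(data):
    """Return pulse durations, in CPU clock cycles, from TAP v1 data bytes."""
    durations = []
    i = 0
    while i < len(data):
        value = data[i]
        if value != 0:
            # Ordinary byte: duration in units of 8 CPU clock cycles.
            durations.append(value * 8)
            i += 1
        else:
            # Special byte 0: the next 3 bytes hold the duration in
            # single CPU clock cycles (assumed little-endian).
            if i + 3 >= len(data):
                raise ValueError("truncated long pulse at offset %d" % i)
            durations.append(int.from_bytes(data[i + 1:i + 4], "little"))
            i += 4
    return durations


print(decode_tap_v1_data(bytes([0x30, 0x00, 0x00, 0x10, 0x00, 0x42])))
# [384, 4096, 528]
```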

How to turn that into bits?

The interval between two triggers is often called a pulse. The simplest and most widely used way to encode information is this: if the pulse is shorter than a given duration, called the threshold, a 0 bit is received; if it is longer than the threshold, a 1 bit is received. The KERNAL ROM loader uses a more complicated encoding, with three possible pulse lengths (and it is a very slow and inefficient loader).

The waveform in the image translates to the bit sequence 010011101000.
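
The threshold rule described above can be sketched in a few lines. The function name, the pulse values and the threshold in the example are hypothetical and do not belong to any particular loader. The text does not say what happens when a pulse is exactly as long as the threshold; this sketch arbitrarily treats that case as a 0 bit.

```python
def pulses_to_bits(durations, threshold):
    """Map each pulse duration to a bit: longer than threshold -> 1, else 0."""
    return [1 if duration > threshold else 0 for duration in durations]


# Hypothetical pulse lengths in TAP units, with a hypothetical threshold.
print(pulses_to_bits([0x1A, 0x28, 0x1A, 0x1A, 0x28], threshold=0x20))
# [0, 1, 0, 0, 1]
```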

How to turn bits into bytes?

That question can be split in two:

1. How are the bits ordered in a byte?
2. When does a byte start?

The answer to question 1 depends on the loader. Some loaders send the most significant bit first, while others send the least significant bit first. In what follows, we assume "most significant bit first"; the same reasoning applies to the opposite bit order, with the obvious changes.
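
To make the two bit orders concrete, here is a small sketch that packs 8 received bits into a byte either way. The function name and its msb_first parameter are hypothetical.

```python
def bits_to_byte(bits, msb_first=True):
    """Pack exactly 8 bits (0/1 integers, in arrival order) into a byte."""
    if len(bits) != 8:
        raise ValueError("expected exactly 8 bits")
    if not msb_first:
        # The first bit received becomes the least significant one.
        bits = list(reversed(bits))
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


received = [0, 0, 0, 0, 1, 0, 0, 1]
print(hex(bits_to_byte(received)))                   # 0x9
print(hex(bits_to_byte(received, msb_first=False)))  # 0x90
```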

Now let's answer question 2. At the start, the loader is not synchronized: it does not know which bit is the first of a byte. So it implements a shift register: a byte in which the most recently received bit is stored in the rightmost (least significant) position, and which is shifted left (towards the most significant position) as new bits arrive.

most                       least
significant -> 10011001 <- significant
(oldest)                   (newest)
bit                        bit

The initial state of the register is not significant. When a new bit arrives:

the most significant bit (the oldest one) is discarded
the other 7 bits are moved one position to the left
the new bit is put in the least significant position

This way, the shift register always contains the last 8 bits received. This goes on until the content of the shift register equals the lead-in byte, a value which is loader-dependent. Now, the loader is in the state of first synchronization: instead of collecting single bits, it collects whole bytes, formed by sequences of 8 bits, the first being the most significant one.
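
The bit-by-bit search for the lead-in byte can be sketched as follows. The function name is hypothetical. The check that at least 8 bits have arrived is a choice of this sketch, made so that the arbitrary initial content of the register cannot produce a match; a real loader may simply compare from the first bit.

```python
def find_lead_in(bits, lead_in):
    """Return how many bits had arrived when the shift register first
    equalled lead_in, or None if that never happens."""
    register = 0  # initial content is not significant; 0 is arbitrary
    for count, bit in enumerate(bits, start=1):
        # Discard the oldest bit, shift left, put the new bit on the right.
        register = ((register << 1) | bit) & 0xFF
        if count >= 8 and register == lead_in:
            return count
    return None


# Hypothetical bit stream: 8 unrelated bits followed by the byte $02.
print(find_lead_in([1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0], 0x02))
# 16
```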

If the byte equals the lead-in byte, the loader stays in the state of first synchronization
If the byte equals the sync byte, the loader goes to the synchronized state
Otherwise, the loader returns to the unsynchronized state

After the sync byte has been read, some loaders can read data safely. But many loaders, to improve reliability, expect a fixed sequence of bytes after the sync byte. If one byte differs from the expected one, they return to the unsynchronized state. After the sync sequence has been received, data transfer begins.

An example: Turbo Tape 64 has a lead-in byte $02 (binary 00000010), a sync byte $09 (binary 00001001) and a sync sequence $08,$07,$06,$05,$04,$03,$02,$01. Suppose the following bit stream arrives:

0010100110010010001110000000001000000010000000100000001000001001
0000100000000111000001100000010100000100000000110000001000000001

Once the first 8 bits have arrived, the shift register takes the following values, one line per incoming bit until the first synchronization and one line per collected byte afterwards:


00101001
01010011
10100110
01001100
10011001
00110010
01100100
11001001
10010010
00100100
01001000
10010001
00100011
01000111
10001110
00011100
00111000
01110000
11100000
11000000
10000000
00000000
00000000
00000001
00000010 <- Got the first synchronization!
            Now let's start collecting whole bytes
00000010
00000010
00000010
00001001 <- Sync byte! Now check sync sequence
00001000
00000111
00000110
00000101
00000100
00000011
00000010
00000001 <- Sync sequence successful, now we are synchronized.
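
The whole procedure can be checked against this example with a short sketch. The function and state names are hypothetical, the sync sequence is the one that appears in the trace ($08 down to $01), and the requirement that 8 bits arrive before the first comparison is a choice of this sketch rather than a documented property of any loader.

```python
def synchronize(bits, lead_in, sync, sync_sequence):
    """Return the index of the first data bit, or None if the loader
    never reaches the synchronized state."""
    UNSYNCED, FIRST_SYNC, CHECK_SEQUENCE = range(3)
    state = UNSYNCED
    register = 0       # shift register; its initial content is arbitrary
    bits_in_byte = 0   # bits collected since the last byte boundary
    seq_pos = 0        # position reached in sync_sequence
    for index, bit in enumerate(bits):
        register = ((register << 1) | bit) & 0xFF
        if state == UNSYNCED:
            # Bit-by-bit search for the lead-in byte.
            if index >= 7 and register == lead_in:
                state, bits_in_byte = FIRST_SYNC, 0
            continue
        bits_in_byte += 1
        if bits_in_byte < 8:
            continue   # still in the middle of a byte
        bits_in_byte = 0
        if state == FIRST_SYNC:
            if register == sync:
                if not sync_sequence:
                    return index + 1
                state, seq_pos = CHECK_SEQUENCE, 0
            elif register != lead_in:
                state = UNSYNCED
        elif register == sync_sequence[seq_pos]:
            seq_pos += 1
            if seq_pos == len(sync_sequence):
                return index + 1
        else:
            state = UNSYNCED
    return None


stream = ("0010100110010010001110000000001000000010000000100000001000001001"
          "0000100000000111000001100000010100000100000000110000001000000001")
bits = [int(c) for c in stream]
print(synchronize(bits, 0x02, 0x09, [8, 7, 6, 5, 4, 3, 2, 1]))  # 128
```

The result 128 equals the length of the stream: synchronization completes on its last bit, so the first data bit would be the next one to arrive.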