The various CD formats are a bit involved, and the official specifications (the "Red Book" for audio CD, the "Yellow Book" for data CD) are not freely available. But you can find many of the details in freely available standards like Ecma-130.
The original audio CD (also called CD-DA) was modelled on the vinyl record, which means it also uses a single spiral track of continuous audio data (later formats such as DVD-ROM kept the spiral layout). Interleaved with this audio data in a very complex way are 8 subchannels (P to W), of which the Q subchannel contains timing information (literally in minutes/seconds/fractions of seconds) and the current track number. For the original purpose this was enough: for continuous play, the lens was just adjusted slightly to follow the track. To seek, the lens would move while the Q subchannel was being decoded, until the right track was found. This positioning is a bit coarse, but completely adequate for listening to music.
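To make the Q-channel timing concrete: the "fractions of seconds" are frames of 1/75 second, and the first addressable sector sits at the time 00:02:00, i.e. 150 frames after time zero. Here is a minimal Python sketch of the usual conversion between such a time and a linear sector number. The function names are made up for illustration.

```python
FRAMES_PER_SECOND = 75   # one frame (sector) is 1/75 of a second
PREGAP_FRAMES = 150      # sector 0 is located at time 00:02:00


def msf_to_lba(minute: int, second: int, frame: int) -> int:
    """Convert a minute/second/frame time to a linear sector number."""
    return (minute * 60 + second) * FRAMES_PER_SECOND + frame - PREGAP_FRAMES


def lba_to_msf(lba: int) -> tuple:
    """Convert a linear sector number back to (minute, second, frame)."""
    total_seconds, frame = divmod(lba + PREGAP_FRAMES, FRAMES_PER_SECOND)
    minute, second = divmod(total_seconds, 60)
    return minute, second, frame
```

For example, msf_to_lba(0, 2, 0) gives sector 0, and lba_to_msf(4350) gives (1, 0, 0).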
Even today, many computer CD drives cannot position the lens and synchronize the decoding circuitry accurately enough for reading of audio samples to start at an exact position. This is why many CD ripping programs have a "paranoia" mode, where they do overlapping reads and compare the results to adjust for this "jitter". As part of the audio stream, the subchannel is also subject to jitter, and that is why you get different subchannel files when you rip on a CD drive that cannot position accurately.
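The overlapping-read idea can be sketched roughly as follows. This is only an illustration of the principle (brute-force search for the offset at which two reads of the same region agree), not the actual algorithm of cdparanoia or any other ripper. The function name find_shift and its parameters are invented for this example.

```python
import random


def find_shift(reference: bytes, candidate: bytes, max_shift: int,
               min_overlap: int = 64):
    """Return s such that candidate[i] == reference[i + s] over the
    overlapping part, trying small shifts first. None if nothing matches."""
    for magnitude in range(max_shift + 1):
        shifts = (0,) if magnitude == 0 else (magnitude, -magnitude)
        for s in shifts:
            if s >= 0:
                ref_part, cand_part = reference[s:], candidate
            else:
                ref_part, cand_part = reference, candidate[-s:]
            n = min(len(ref_part), len(cand_part))
            if n >= min_overlap and ref_part[:n] == cand_part[:n]:
                return s
    return None


# Simulate two reads of the same region; the second starts 8 bytes late.
rng = random.Random(0)
stream = bytes(rng.randrange(256) for _ in range(4096))
first_read = stream[1000:3000]
second_read = stream[1008:3008]
print(find_shift(first_read, second_read, max_shift=32))
```

A real ripper also has to cope with read errors inside the overlap, so an exact byte comparison like this one is the simplest possible case.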
When the data CD (CD-ROM) specification was developed to extend the CD-DA specification, the importance of accurately addressing and reading data was recognized, so the audio frame of 2352 bytes was subdivided into 12 sync bytes and 4 header bytes (for the sector address and mode), leaving the remaining 2336 bytes for data and an additional level of error correction. Using this scheme, sectors can be addressed exactly without having to rely on the Q channel information only. Therefore the jitter effect doesn't apply: you always get the same data when you dump a CD-ROM, and no additional cleverness in dumping is needed.
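As an illustration of that layout, here is a small Python sketch that checks the 12 sync bytes of a raw 2352-byte sector and decodes the 4 header bytes (minute, second, and frame in BCD, followed by the mode byte). It assumes the sector has already been descrambled, which is how drives normally deliver raw sectors. The function names are hypothetical.

```python
SECTOR_SIZE = 2352
# Sync pattern of a CD-ROM sector: 0x00, ten times 0xFF, 0x00
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"


def bcd_to_int(value: int) -> int:
    """Decode one packed BCD byte, e.g. 0x16 -> 16."""
    return (value >> 4) * 10 + (value & 0x0F)


def parse_sector_header(sector: bytes) -> tuple:
    """Return (minute, second, frame, mode) of a raw CD-ROM sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a raw sector of 2352 bytes")
    if sector[:12] != SYNC:
        raise ValueError("sync pattern not found (audio sector or misaligned data?)")
    minute, second, frame = (bcd_to_int(b) for b in sector[12:15])
    mode = sector[15]
    return minute, second, frame, mode
```

Because every sector carries its own address, a dumping program can verify that it got exactly the sector it asked for, which is not possible with plain audio data.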
Edit with more details:
According to Ecma-130, the data is scrambled in stages: 24 bytes make up an F1-Frame, the bytes of 106 of these frames are distributed into 106 F2-Frames, which each get 8 extra bytes of error correction. Those frames in turn each get an extra byte ("control byte") to make them into F3-Frames. The extra byte contains the subchannel information (one subchannel for each bit position). A group of 98 F3-Frames is called a section, and the 98 associated control bytes contain two sync bytes and 96 bytes of real subchannel data, i.e. 96 bits for each subchannel. In addition, the Q subchannel uses 16 of its 96 bits for a CRC, which allows errors to be detected.
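To illustrate the control bytes: the sketch below takes the 96 subchannel bytes of one section, picks out one bit position to get the 96 bits (12 bytes) of a single subchannel, and checks the CRC of the Q subchannel. It assumes the common raw layout where P is bit 7 and Q is bit 6 of each byte, and that the CRC uses the polynomial x^16 + x^12 + x^5 + 1 with a zero start value and is stored inverted. All function names are invented for this example.

```python
def crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021, start value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def extract_subchannel(control_bytes: bytes, bit: int) -> bytes:
    """Collect one bit position from 96 control bytes into 12 bytes."""
    if len(control_bytes) != 96:
        raise ValueError("expected the 96 subchannel bytes of one section")
    out = bytearray(12)
    for i, control in enumerate(control_bytes):
        if control & (1 << bit):
            out[i // 8] |= 0x80 >> (i % 8)   # first bit goes into the MSB
    return bytes(out)


def q_crc_ok(q: bytes) -> bool:
    """Check the 16 CRC bits (stored inverted) at the end of the Q data."""
    return (crc16(q[:10]) ^ 0xFFFF) == int.from_bytes(q[10:12], "big")
```

Usage would be something like q = extract_subchannel(control_bytes, 6) followed by q_crc_ok(q). A failed check is a hint that the subchannel data was damaged or that the read did not start on a section boundary.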
The idea behind this is to distribute the data over the surface of the disc in such a manner that scratches, dirt etc. don't affect a lot of contiguous bits, so the error correction can recover the lost data as long as the scratches are not too big.
As a consequence, the CD drive hardware needs to read a complete section after repositioning the lens to find out where it is in the data stream. The descrambling of the various stages is done by hardware, which needs to sync itself to the 2 sync bytes in the control-byte stream. Different CD drive models need different amounts of time to sync (you can test that by reading from two different drives, if you have them), depending on how the hardware is implemented. Also, many models don't always take exactly the same time to sync, so they can start a little early or late, and the descrambled data they output doesn't always begin at the same byte.
So when the ripping program issues a READ CD (0xBE) command, it supplies a transfer length and a start address (or rather, Q-channel time). The drive positions the lens, descrambles the frames, extracts the Q-channel, compares the time, and when it finds the correct time, it starts to transfer. This transfer doesn't always begin at the same byte, as explained above, so the results of multiple READ CD commands may be shifted against each other. That's why you see different subchannel files from your ripper.
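For reference, this is roughly what the 12-byte READ CD command block looks like, as I understand the MMC layout: opcode, expected sector type, start address as a linear block number, transfer length in sectors, a flag byte that selects which parts of the sector to return, and a subchannel selector. The sketch only builds the bytes. Sending them to a drive needs an OS-specific pass-through interface, which is left out here. The function name and default values are assumptions chosen for an audio rip with raw subchannel data.

```python
import struct

READ_CD = 0xBE


def build_read_cd_cdb(lba: int, sectors: int, sector_type: int = 1,
                      main_flags: int = 0x10, subchannel: int = 0x01) -> bytes:
    """Build a READ CD command block.

    sector_type: 1 = CD-DA (0 would mean "any type")
    main_flags:  0x10 = return user data only
    subchannel:  0x01 = append raw P-W subchannel data, 0x00 = none
    """
    if not 0 <= sectors < (1 << 24):
        raise ValueError("transfer length must fit into 3 bytes")
    return (bytes([READ_CD, (sector_type & 0x07) << 2])
            + struct.pack(">I", lba & 0xFFFFFFFF)   # start address, big-endian
            + sectors.to_bytes(3, "big")            # transfer length
            + bytes([main_flags, subchannel, 0x00]))  # last byte: control


print(build_read_cd_cdb(lba=0, sectors=1).hex())
```

With these defaults, each returned sector would consist of 2352 bytes of audio followed by 96 bytes of subchannel data.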
Depending on the hardware and the circumstances when the lens is adjusted, it's more or less random whether the transfer starts a few samples early or a few samples late. So the only pattern you'll see in the results is that the shifts are a multiple of the transfer length.
Some drive models actually have accurate hardware which will always start the transfer at the same position. The standard defines a bit in mode page 0x2a ("CD/DVD Capabilities and Mechanical Status Page") which indicates if that is the case, but real-world experience shows that some drives claiming to be exact are in fact not. (Under Linux, you can use sg_modes from the sg3-utils package to read the mode pages; I don't know what tool to use under Windows.)
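Once you have the raw bytes of mode page 0x2a (for instance from something like sg_modes --page=0x2a /dev/sr0, where the device path is just an assumption for a typical Linux system), checking the bit is simple. To my understanding it is the "CD-DA Stream is Accurate" bit, bit 1 of byte 5 of the page. The sketch below assumes the data starts at the page code byte, i.e. the mode parameter header has already been stripped. The function name is made up.

```python
CAPABILITIES_PAGE = 0x2A
CDDA_COMMANDS_SUPPORTED = 0x01   # byte 5, bit 0
CDDA_STREAM_ACCURATE = 0x02      # byte 5, bit 1


def cdda_stream_is_accurate(page: bytes) -> bool:
    """Return True if the drive claims to start audio reads exactly."""
    if len(page) < 6:
        raise ValueError("mode page is too short")
    if (page[0] & 0x3F) != CAPABILITIES_PAGE:
        raise ValueError("not mode page 0x2a")
    return bool(page[5] & CDDA_STREAM_ACCURATE)
```

As said above, treat the result as a claim made by the drive, not as a guarantee.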