Secure ripping

Discuss the current and future development of Max.
sbooth
Site Admin
Posts: 2445
Joined: Fri Dec 23, 2005 7:45 am
Location: USA

Secure ripping

Post by sbooth » Mon Jan 23, 2006 9:36 pm

Now that I believe the bugs are ironed out in the 0.5 release series :wink:, I am going to start writing a secure ripper from scratch, à la EAC.

But I am no expert, so I am requesting feedback or referrals to people who may know.

I understand the principle is to read a CD sector as many times as required until a level of certainty is obtained that the data extracted is "true". A CD sector is 2352 bytes, or 18816 bits. I'm a little rusty on probability, so my question is this: what is a mathematically correct algorithm for determining if the sector is correctly extracted?

I know there is more to it than that, though, with C2 error correction and such.

Any information or pointers to information would be much appreciated.

Maurits
Posts: 117
Joined: Sun Jan 29, 2006 1:36 pm
Location: London, Europe

Check out PyRipper thread on HA.org

Post by Maurits » Sun Jan 29, 2006 2:06 pm

You could check out this thread on Hydrogenaudio.

It's about developing a secure ripper for Linux and OS X and has a lot of background info on how to bypass hardware-caches in modern CD/DVD drives.

someone
Posts: 7
Joined: Mon Feb 13, 2006 5:47 am

Post by someone » Mon Feb 13, 2006 5:55 am

I believe there were patches which allowed CDParanoia to bypass the CD drive cache. While I understand that it is very difficult to read code written by others, I would still suggest you look at the source code of cdparanoia and build on top of it. There is interest on the Hydrogenaudio forum in continuing the project, so you might want to contact those people.

Also, consider other options. Pyripper/Rubyripper uses the rip-and-test technique, which I mentioned to you in my bug report. For that, you probably want to use a *less* reliable ripping method and ignore all error reporting.

Lastly, you can try to rip 4-6 MB at once, which might also allow you to bypass the drive cache.

Post by sbooth » Tue Feb 14, 2006 5:21 am

I've been working on this and have hit some fairly large problems.

First of all, it seems that on OS X there is no good way to do exactly what I am trying to do. I've tried using the STUC MMC commands, but some of them fail for CDDA discs, including the ones I am interested in, such as READ_12. I am able to use STUC for the READ CD command, but there is no force unit access (FUA) for that particular one. Ideally I want to use READ_12, but as I said this seems to be unavailable for CDDA discs. So currently I am stumped...

Post by sbooth » Tue Feb 14, 2006 5:30 am

someone wrote: I believe there were patches which allowed CDParanoia to bypass the CD drive cache. While I understand that it is very difficult to read code written by others, I would still suggest you look at the source code of cdparanoia and build on top of it. There is interest on the Hydrogenaudio forum in continuing the project, so you might want to contact those people.

I have actually done this, to a certain degree. The problem I've had is that this code is targeted towards a Linux-type environment with certain ioctls that don't exist on OS X. So getting it to run requires a port, and the necessary features are not in place in OS X to do the port! A giant circle of sorts. There is a good chance I am missing something, though, since I am very new to IOKit programming. Any help would be appreciated.

someone wrote: Also, consider other options. Pyripper/Rubyripper uses the rip-and-test technique, which I mentioned to you in my bug report. For that, you probably want to use a *less* reliable ripping method and ignore all error reporting.

I've grabbed some code that computes the sha256 of arbitrary memory blocks; I have a prototype that extracts some data from a cd and computes the sha256 of the data. Then it does it over again, and compares the two hashes. The problem I've had with this is (again!) that the cache seems to be supplying the data, which basically invalidates the error checking. D'oh!

someone wrote: Lastly, you can try to rip 4-6 MB at once, which might also allow you to bypass the drive cache.

This would probably work on drives with caches smaller than the chunk size. This seems a bit 'hackish' to me, but I guess if it works and nothing else seems to...
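
As a rough illustration of the chunk-size idea, the number of sectors that must be read to exceed a given cache can be worked out as below. This is only a sketch: the cache size is an assumption (drives differ, and no figure is given in this thread), `sectors_to_overflow_cache` is a hypothetical helper name, and whether reading past the cache size really evicts the old data depends on the drive's caching policy.

```python
SECTOR_SIZE = 2352  # bytes in one raw CD audio sector


def sectors_to_overflow_cache(cache_bytes: int) -> int:
    """Return the smallest number of whole sectors whose total size
    is strictly larger than cache_bytes (an assumed drive cache size)."""
    if cache_bytes < 0:
        raise ValueError("cache size cannot be negative")
    return cache_bytes // SECTOR_SIZE + 1


if __name__ == "__main__":
    # Assumed example cache sizes, purely for illustration.
    for megabytes in (2, 4, 6):
        cache = megabytes * 1024 * 1024
        print(megabytes, "MB cache ->", sectors_to_overflow_cache(cache), "sectors")
```

Reading at least that many sectors between two reads of the same sector is the most this approach can offer; it cannot prove the second read came from the disc.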

Post by someone » Wed Feb 15, 2006 6:41 am

sbooth wrote: I've grabbed some code that computes the sha256 of arbitrary memory blocks; I have a prototype that extracts some data from a cd and computes the sha256 of the data. Then it does it over again, and compares the two hashes. The problem I've had with this is (again!) that the cache seems to be supplying the data, which basically invalidates the error checking. D'oh!

Actually, what I really meant is to rip whole tracks in burst mode and compute a hash for each rip. If any subsequent re-rip results in the same hash, then that rip is accepted. I don't believe the hardware cache of any CD drive can hold a whole track.
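
A minimal sketch of the rip-and-test idea described here, under stated assumptions: `read_track` is a hypothetical callable that performs one burst-mode rip of a whole track and returns the raw bytes, and SHA-256 is used only because it is the hash already mentioned in this thread. It is not taken from Pyripper/Rubyripper or any other existing ripper.

```python
import hashlib
from typing import Callable, Optional


def rip_and_test(read_track: Callable[[], bytes],
                 max_attempts: int = 5) -> Optional[bytes]:
    """Rip a track repeatedly until one rip's hash matches an earlier rip.

    read_track is a hypothetical function doing one burst-mode rip.
    Returns the accepted track data, or None if no two rips agreed
    within max_attempts.
    """
    seen_digests = set()
    for _ in range(max_attempts):
        data = read_track()
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_digests:
            # This rip matches an earlier one, so accept it.
            return data
        seen_digests.add(digest)
    return None
```

Only the digests are kept between attempts, so memory use stays at one track regardless of how many re-rips are needed.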

Post by someone » Wed Feb 15, 2006 6:54 am

This might be your only option:

http://tinyurl.com/9kuh3

Post by sbooth » Thu Feb 16, 2006 7:17 pm

It seems ridiculous that one would have to write an in-kernel device driver just to read uncached audio from an MMC-compatible device.

Nonetheless, you may be right. Unfortunately I have no idea how to write device drivers!

Post by sbooth » Thu Feb 16, 2006 7:19 pm

someone wrote:
sbooth wrote: I've grabbed some code that computes the sha256 of arbitrary memory blocks; I have a prototype that extracts some data from a cd and computes the sha256 of the data. Then it does it over again, and compares the two hashes. The problem I've had with this is (again!) that the cache seems to be supplying the data, which basically invalidates the error checking. D'oh!
Actually, what I really meant is to rip whole tracks in burst mode and compute a hash for each rip. If any subsequent re-rip results in the same hash, then that rip is accepted. I don't believe the hardware cache of any CD drive can hold a whole track.

Could you elaborate on what you mean by "burst mode"? I assume that you mean simply reading data without performing any error correction (just a READ CD SCSI command).

Post by someone » Fri Feb 17, 2006 3:02 am

"Burst mode" is a term used in EAC. I believe it is basically read with all error correction turned off, but you might want to ask Andre for more details.

Apple does not allow developers to directly send SCSI commands to drives from their application. For more information, see this:
http://developer.apple.com/qa/qa2001/qa1179.html

Before you start developing the driver, maybe it's useful to have a look at this:
http://darwinsource.opendarwin.org/10.1 ... dsDevice.h

tman
Posts: 3
Joined: Tue Mar 14, 2006 10:32 am

Re: Secure ripping

Post by tman » Tue Mar 14, 2006 10:43 am

sbooth wrote: Any information or pointers to information would be much appreciated.

A few years back, there was a Mac application called AudioCDRescue. It basically took the brute force approach to reading data off of CDs. It would read the CD multiple times and compare the data sector by sector. It would continue to do this until each sector had accumulated a user-definable (the default was 3, I think) number of identical reads.
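
The approach described above can be sketched roughly as follows. This is a guess at the general technique, not AudioCDRescue's actual code: `read_pass` is a hypothetical callable that reads the whole sector range once and returns it as bytes, every pass is assumed to return the same length, and `required` and `max_passes` are illustrative parameter names.

```python
from collections import Counter
from typing import Callable, List, Optional

SECTOR_SIZE = 2352  # bytes in one raw CD audio sector


def vote_on_sectors(read_pass: Callable[[], bytes],
                    required: int = 3,
                    max_passes: int = 10) -> List[Optional[bytes]]:
    """Re-read until every sector has `required` identical reads.

    Returns one entry per sector: the accepted bytes, or None if the
    sector never reached `required` identical reads within max_passes.
    """
    counts: List[Counter] = []             # per-sector tally of contents seen
    accepted: List[Optional[bytes]] = []   # per-sector accepted contents
    for _ in range(max_passes):
        data = read_pass()
        sectors = [data[i:i + SECTOR_SIZE]
                   for i in range(0, len(data), SECTOR_SIZE)]
        if not counts:
            counts = [Counter() for _ in sectors]
            accepted = [None] * len(sectors)
        for index, sector in enumerate(sectors):
            if accepted[index] is None:
                counts[index][sector] += 1
                if counts[index][sector] >= required:
                    accepted[index] = sector
        if all(entry is not None for entry in accepted):
            break
    return accepted
```

Note that the identical reads are not required to be consecutive here; a stricter variant could reset the tally whenever a read disagrees.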

I recall that the author basically took a low-tech approach to the cache problem: eject the disc and require the user to reinsert it between each pass of the disc. Not exactly elegant, but it got the job done.

I haven't been able to find a copy of this program in a while (not that I've looked that hard) but perhaps you can find more info on this program and at least get a jump on version 1 of the secure read capability.

Re: Secure ripping

Post by sbooth » Mon Mar 20, 2006 9:45 pm

tman wrote: I recall that the author basically took a low-tech approach to the cache problem: eject the disc and require the user reinsert it between each pass of the disc. Not exactly elegant, but it got the job done.

Now that's an approach I hadn't thought of! :idea:

tomars
Posts: 13
Joined: Wed Mar 08, 2006 8:34 pm

Post by tomars » Tue Mar 21, 2006 1:57 pm

If it resulted in secure ripping and was optional, I wouldn't mind having to eject the CD in between rips.

Interesting results

Post by sbooth » Wed Mar 22, 2006 11:46 pm

I've hacked together a command-line ripper that basically:
  1. Reads a contiguous set of sectors one at a time off of a CD and saves them to a temporary file
  2. Re-reads the same set of sectors and saves them to a different file
  3. Compares the two files sector by sector (2352 bytes at a time)
  4. Writes matching sectors to a third file, and zeroes out non-matching sectors
  5. Prints out any differences it finds
This was more of a proof of concept than anything. And here is the interesting part: I am unable (in my short testing cycle) to get different read results! It makes no sense to me. First I tried ripping just a single track a few times. When this failed to cause any problems, I ripped a ton of sectors (207943 to be exact). I got this:

    Using cdrom /dev/rdisk2
    Ripping sectors 0 to 207942 (207943 total)
    Performing first read
    Performing second read
    Comparing two reads

Maybe I need to try with some old scratched-up CDs, but nonetheless I was surprised. I don't quite see how not even one byte could be different in two files this large:

    -rw-------    1 me        wheel  489081936 Mar 22 15:32 riposx2GhmbA.rip
    -rw-------    1 me        wheel  489081936 Mar 22 15:28 riposxQrOeHx.rip

I'll be happy to share the code if anyone wants to take a look at it. Would reading the sectors one at a time cause any different result than reading a larger chunk? I imagine that, regardless of the read size, the drive will fill its buffer in anticipation of further reads.
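
For anyone who wants to experiment before the code is shared, steps 3 to 5 of the list above could look something like this. It is a sketch under assumptions, not the command-line ripper from this post: `merge_rips` is a hypothetical name, and the two input files are assumed to be raw dumps made of 2352-byte sectors.

```python
from typing import List

SECTOR_SIZE = 2352  # bytes in one raw CD audio sector


def merge_rips(first_path: str, second_path: str, output_path: str) -> List[int]:
    """Compare two raw rips sector by sector.

    Matching sectors are copied to output_path; non-matching sectors are
    written as zeroes. Returns the indices of the sectors that differed.
    """
    mismatches = []
    with open(first_path, "rb") as first, \
            open(second_path, "rb") as second, \
            open(output_path, "wb") as merged:
        index = 0
        while True:
            a = first.read(SECTOR_SIZE)
            b = second.read(SECTOR_SIZE)
            if not a and not b:
                break  # both files exhausted
            if a == b:
                merged.write(a)
            else:
                # Zero out the sector (also covers a truncated file).
                merged.write(bytes(max(len(a), len(b))))
                mismatches.append(index)
            index += 1
    return mismatches
```

Printing the returned list covers step 5. As a consistency check on the listing above, 207943 sectors of 2352 bytes each come to 489081936 bytes, which matches the size of both .rip files.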

Post by tomars » Fri Apr 14, 2006 8:13 pm

Any updates on this, sbooth?
Looks really interesting.
