Problem with Encodings in .cue file

Discuss Max, an open source CD audio extractor and audio converter.
cloud
Posts: 1
Joined: Fri May 22, 2009 1:49 pm

Problem with Encodings in .cue file

Post by cloud » Fri May 22, 2009 1:55 pm

Hello dear sirs and madams :)
I'm having trouble using Max: the contents of my .cue file are in Russian and cannot be saved in ISO Latin 1 encoding, which I suppose Max uses by default when opening files. So I cannot extract from the .cue file =(
Is there any way to control the encoding Max uses when it opens a .cue file?

sbooth
Site Admin
Posts: 2445
Joined: Fri Dec 23, 2005 7:45 am
Location: USA

Re: Problem with Encodings in .cue file

Post by sbooth » Fri May 22, 2009 6:46 pm

This is a bug in the cue sheet parser that Max uses. For now there isn't a workaround aside from copying/pasting the affected data manually.

mattn
Posts: 152
Joined: Tue Sep 02, 2008 4:21 am

Re: Problem with Encodings in .cue file

Post by mattn » Tue Jul 07, 2009 1:37 am

cloud wrote:I have trouble using Max - the contents of .cue file are written in Russian language and cannot be saved in ISO Latin 1 encoding which I suppose is used by Max as default for opening files. So I cannot extract .cue file =(
In my view this is the single greatest flaw with Max - a flaw which often makes it unusable (so that I have to use XLD instead). The problem appears (from a cursory glance at the code) to be that in CueSheetDocument.m all the C strings from the parser library are read in using

[NSString stringWithCString:... encoding:NSASCIIStringEncoding]

There are two big problems here. First of all, stringWithCString is deprecated and might even stop working in Snow Leopard. Second, ASCII covers a very limited range of characters; if the cue sheet contains any non-ASCII characters, such as a u with an umlaut or an accented e, you're going to get nonsense.
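
To make the second problem concrete, here is a minimal Python sketch. Python only stands in for the Cocoa conversion; this is not Max's code, and the track title and the helper decode_or_none are made up for illustration. It takes the UTF-8 bytes of a Cyrillic title, as a byte-oriented parser would hand them back, and decodes them three ways. Python reports the ASCII failure as an error, which may not match exactly what Cocoa does in the same situation.

```python
# Illustration only: Python stands in for the C-string-to-NSString conversion.
title = "Привет"             # hypothetical Cyrillic track title
raw = title.encode("utf-8")  # the raw bytes a byte-oriented parser would return


def decode_or_none(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_ascii = decode_or_none(raw, "ascii")     # fails: every byte here is above 0x7F
as_latin1 = decode_or_none(raw, "latin-1")  # succeeds, but yields the wrong characters
as_utf8 = decode_or_none(raw, "utf-8")      # recovers the original title

print(as_ascii, repr(as_latin1), as_utf8)
```

The Latin-1 result has one character per byte, so a six-letter Cyrillic title turns into twelve unrelated Western characters.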

Ideally, there needs to be a way to ask the user for the encoding, and then employ that encoding in moving the C strings over to NSStrings. But even in the absence of this, it would be nice to guess some reasonable encoding that would at least work some of the time. Cocoa is very encoding-savvy and can cope with any encoding; but Max isn't making any attempt here to use any encoding. If all those instances of NSASCIIStringEncoding were replaced by NSISOLatin1StringEncoding, then at least Max would show correct strings most of the time, I believe.
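
One way to "guess some reasonable encoding" could look like the Python sketch below. It is an assumption for illustration, not what Max or the parser library does: the function name decode_cue_bytes and the fallback order are hypothetical. It tries strict UTF-8 first and falls back to Latin-1 only if the bytes are not valid UTF-8.

```python
def decode_cue_bytes(data: bytes, fallbacks=("utf-8", "latin-1")) -> str:
    """Try each encoding in order and return the first successful decode."""
    for encoding in fallbacks:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Reached only if no listed encoding accepted the bytes
    # (Latin-1 accepts any byte sequence, so the default list never gets here).
    return data.decode("ascii", errors="replace")


# The same title stored in two different encodings decodes to the same text.
from_utf8 = decode_cue_bytes("Für Elise".encode("utf-8"))
from_latin1 = decode_cue_bytes("Für Elise".encode("latin-1"))
print(from_utf8, from_latin1)
```

Because Latin-1 maps every possible byte to some character, the fallback never fails, but it can still produce the wrong text for a cue sheet saved in, say, a Cyrillic code page. That is why a user-selectable encoding remains the better long-term answer.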

What I do, at present, is open the file with BBEdit, change it to UTF-8, and then open it with XLD and process the tracks. You could do that with your Cyrillic-encoded cue sheet, or you could just tell XLD what encoding the cue sheet is to start with.

But since Max is trying to read as ASCII, that won't work - no matter *what* encoding you save the cue sheet as, Max won't be able to interpret it correctly.

I may be misinterpreting the code, so of course if I'm misstating the source of the problem I stand corrected.

sbooth
Site Admin
Posts: 2445
Joined: Fri Dec 23, 2005 7:45 am
Location: USA

Re: Problem with Encodings in .cue file

Post by sbooth » Tue Jul 07, 2009 4:41 am

Your analysis is essentially correct: the underlying cuetools library has (basically) no concept of encoding, and the cue sheet support in Max is rudimentary. I had planned to replace it with my own cue sheet parser a while back, but I have not gotten around to it yet.

You are right that it would be better to assume an encoding other than ASCII. The question is, UTF-8 or Latin-1?

mattn
Posts: 152
Joined: Tue Sep 02, 2008 4:21 am

Re: Problem with Encodings in .cue file

Post by mattn » Wed Jul 08, 2009 4:15 am

Oh, UTF-8 without a doubt. Win-Latin can only express a few extra characters, whereas UTF-8 can express them all. If I receive a cue sheet in Win-Latin or any other encoding I just open it with BBEdit and save it as UTF-8 and now XLD can read it, Cog can read it, everybody can read it.

If we adjust what Max does with the strings it gets from the C library, I think Max will be able to read them too, and when it saves tag info into AAC files, it will come out correctly in iTunes and all will be well. Note that the C library's attitude is probably totally neutral here; it is just processing bytes, and that works perfectly well. The issue is, I suspect, wholly on the Cocoa side. A Cocoa NSString is a Unicode string, so as we gather the C string bytes and turn them into a Cocoa NSString, we must say what encoding the C string is in. My guess is that if we just change all those occurrences of encoding:NSASCIIStringEncoding to encoding:NSUTF8StringEncoding it should just work. Assuming that the target .cue file *is* UTF-8. But, you see, if it is pure ASCII then it is also UTF-8, and if it has non-ASCII characters, then no matter what encoding it is in, it can be converted to UTF-8 (using BBEdit, TextWrangler, TextEdit, iconv, or any other handy tool) before processing it with Max.
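
As a rough sketch of that conversion step, the Python below re-encodes cue sheet bytes from a known source encoding to UTF-8, which is roughly what the tools listed above do. The helper name to_utf8, the sample lines, and the choice of cp1251 (a common Windows Cyrillic code page) are assumptions for illustration; the real source encoding has to be known or chosen by the user. It also shows that pure-ASCII bytes are already valid UTF-8.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Hypothetical cue sheet line stored in a Windows Cyrillic code page.
line = 'TITLE "Лебединое озеро"'
cp1251_bytes = line.encode("cp1251")
utf8_bytes = to_utf8(cp1251_bytes, "cp1251")

# Pure-ASCII content is byte-for-byte identical in UTF-8.
ascii_line = b"TRACK 01 AUDIO"
ascii_as_utf8 = to_utf8(ascii_line, "ascii")

print(utf8_bytes.decode("utf-8"), ascii_as_utf8 == ascii_line)
```

To convert a whole file, the same function can be applied to the bytes read from the .cue file and the result written back out.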

If you like, I'll make the change and compile and test at my end and let you know if that solves it. And if it does I'll send you a patch. Or if you want to just make the change and test it yourself, I'll send you a cue/flac pair where the cue file is UTF-8 and has "weird" characters so you have something to test with.

RonaldPR
Posts: 433
Joined: Tue May 30, 2006 8:27 am
Location: Amsterdam, Netherlands

Re: Problem with Encodings in .cue file

Post by RonaldPR » Wed Jul 08, 2009 11:08 am

mattn wrote:Oh, UTF-8 without a doubt. Win-Latin can only express a few extra characters, whereas UTF-8 can express them all. If I receive a cue sheet in Win-Latin or any other encoding I just open it with BBEdit and save it as UTF-8 and now XLD can read it, Cog can read it, everybody can read it.
"Everybody"? Compatibility with other software, also on other platforms, should be taken into account. How widely is UTF-8 encoding used for cue sheets?

mattn
Posts: 152
Joined: Tue Sep 02, 2008 4:21 am

Re: Problem with Encodings in .cue file

Post by mattn » Wed Jul 08, 2009 9:02 pm

Sorry, I spoke loosely. Unicode is the native OS X system string encoding, and by "everybody" I meant the cast of characters who live on Mac OS X. Cog expects UTF-8 cue sheets. iTunes expects UTF-8 track and artist strings. TextEdit expects UTF-8 files. And so on. Of course ideally Max would have a preference or option somewhere so the user could specify what encoding this cue sheet is / should be in. But I'm just saying that if we have, for the present, no such preference, UTF-8 gives the greatest latitude, because it is the coin of the realm on Mac OS X, and if the cue sheet comes from or is to go to some other platform, well, anything can be converted to it and it can be converted to anything by many apps, some of which I listed in my previous note.

mattn
Posts: 152
Joined: Tue Sep 02, 2008 4:21 am

Re: Problem with Encodings in .cue file

Post by mattn » Tue Jul 14, 2009 2:38 am

Well, I threw caution to the winds and just dived into CueSheetDocument.m and replaced NSASCIIStringEncoding everywhere with NSUTF8StringEncoding and rebuilt Max and so far this is working great.

Also, my earlier concern that uses of stringWithCString:encoding: need to be replaced completely may have been incorrect, so this could be all that's needed (until we get an interface change where the user can state the encoding).

I'll continue to test and report back...
