Multi byte encoding fix request

Sithgunner
Posts: 40
Joined: Fri May 19, 2006 2:32 pm

Post by Sithgunner » Fri May 19, 2006 2:48 pm

Hi. So far I'm impressed by Max's ease of use and by how full-featured it is as a whole. But there is one show stopper for me: the multi-byte encoding issue.

Ripping and encoding work, but when Max queries CDDB, whether or not the CDDB server supports multi-byte encodings, Max always shows garbage letters where the multi-byte characters should be. In the end, the ripped file is named something like '-1.mp3', and I'm not sure why it ends up like that.

From what I've seen in other ripping applications, CDDB can return data in various encodings, so the ripper has to let the user choose which encoding to accept.
See how Grip [a Linux CD ripper] does it, for example; it handles this issue fine: http://nostatic.org/grip/doc/discdbconfig.png

It's probably done through iconv (I'm not an app developer, so I'm not sure), converting the text into the encoding used by Max or Mac OS X, which is probably UTF-8-Mac, but again, I'm not sure about this, sorry.

For now I can retype the artist name over the garbage characters, but the files still end up as '-1.mp3', so it would be really nice if this issue were solved and I could start using this application.
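
The conversion step described above can be sketched roughly as follows. This is only a minimal Python illustration, not Max's or Grip's actual code: the function name `decode_cddb_field`, the default encoding, and the fallback list are all assumptions made for the example. It tries the encoding the user selected first, then a few fallbacks, and always hands back Unicode text.

```python
def decode_cddb_field(raw, preferred="utf-8", fallbacks=("shift_jis", "euc_jp")):
    """Decode raw CDDB response bytes into a Unicode string.

    Hypothetical helper: tries the user's preferred encoding first, then
    each fallback in order. If nothing decodes cleanly, undecodable bytes
    are replaced with U+FFFD so the caller still gets a string.
    """
    for encoding in (preferred, *fallbacks):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode(preferred, errors="replace")


# The same artist name as a Shift_JIS server and a UTF-8 server might send it.
name = "宇多田ヒカル"
print(decode_cddb_field(name.encode("shift_jis")))
print(decode_cddb_field(name.encode("utf-8")))
```

Guessing by trial decoding can pick the wrong encoding (EUC-JP bytes, for instance, often decode "successfully" as Shift_JIS half-width katakana), which is one reason an explicit user setting like Grip's is the safer design.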

sbooth
Site Admin
Posts: 2445
Joined: Fri Dec 23, 2005 7:45 am
Location: USA

Post by sbooth » Fri May 19, 2006 3:24 pm

I've changed the encoding used for FreeDB from ASCII to UTF-8 in the svn version. Hopefully this will fix the problem.

Would you be able to compile the svn version and give it a test?

Post by Sithgunner » Fri May 19, 2006 11:26 pm

I've downloaded Subversion and Xcode and checked out the sources, but I don't know what to do to compile from here...

So it would be nice if you could build a binary for me or help me get it to compile, so that further testing goes more smoothly.

Sorry for the trouble.

Post by Sithgunner » Fri May 19, 2006 11:49 pm

Sorry, I figured it out: by double-clicking the project file in Finder, I got the svn version to compile and run.

And wow, it works! :D Multi-byte characters now display fine. The reason I got '-1.mp3' was that I hadn't set the file naming format in the preferences; now that I've set it, the output file names are fine too.

Great job, thanks.
I really appreciate the great applications you make.

And just to ask, will this be incorporated in the next stable version?
Hoping to have that version soon.

Post by Sithgunner » Sat May 20, 2006 12:40 am

After testing a bit more, I found that tagging FLAC files works fine, with the proper characters, but with MP3 tagging, all the strings containing multi-byte characters turn into garbage letters again.

Almost everything works flawlessly (Max's screen, the file names, FLAC tagging), but MP3 tagging seems to be broken as of now. I have not tested other file formats that support tags.

I used Cog 0.0.4.1 to view the tags.

To add, this multi-byte encoding issue gets quite tricky. If I want to put the tagged MP3 on a portable hardware player, and that player can only read one specific character encoding (iPod and whatnot), then forcing UTF-8 tags might not give the best results for those of us who are stuck with multi-byte text in our lives. So in the end, users should be able to choose which character encoding is used for the tags, and probably for CDDB as well, though for me CDDB works as is.

To take it further, I usually listen to FLAC on a Mac or Windows, and MP3s usually go on my iPod. Since the Mac (or Cog) wants UTF-8 tags and my iPod wants SJIS (one of the Japanese encodings), it would be best to have separate encoding settings per file type.

Encoding is so complicated and annoying that everyone should just start using Unicode. But Windows is stuck with legacy encodings (like SJIS for Japanese) and drags the software and hardware around it into compatibility, so things probably can't be perfect for people like me in a mixed OS/hardware/language environment. To be perfect, players would have to let you choose the encoding, and encoders would have to do the same; and as soon as I use the proprietary software that comes with the OS, these issues become impossible to fix. So I hope it can at least become usable on a Mac/iPod by making the tag encoding selectable.
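
The per-file-type setting suggested above could look something like this. It is a minimal Python sketch under stated assumptions: the table `TAG_ENCODING_BY_FORMAT` and the function `encode_tag_value` are hypothetical names, and Shift_JIS for MP3 simply mirrors the iPod setup described in this post. FLAC is pinned to UTF-8 because Vorbis comments are defined as UTF-8.

```python
# Hypothetical per-format preference table; a real application would load
# this from the user's settings instead of hard-coding it.
TAG_ENCODING_BY_FORMAT = {
    "flac": "utf-8",      # Vorbis comments are specified as UTF-8
    "mp3": "shift_jis",   # user's choice, e.g. for a player expecting SJIS
}


def encode_tag_value(text, file_format, default="utf-8"):
    """Encode a tag string using the encoding chosen for this file type."""
    encoding = TAG_ENCODING_BY_FORMAT.get(file_format.lower(), default)
    return text.encode(encoding)


title = "ヒカル"
print(encode_tag_value(title, "flac").hex())
print(encode_tag_value(title, "mp3").hex())
```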

Post by sbooth » Sat May 20, 2006 8:13 pm

Sithgunner wrote:
After testing a bit more, I found that tagging FLAC files works fine, with the proper characters, but with MP3 tagging, all the strings containing multi-byte characters turn into garbage letters again.

Almost everything works flawlessly (Max's screen, the file names, FLAC tagging), but MP3 tagging seems to be broken as of now. I have not tested other file formats that support tags.

I used Cog 0.0.4.1 to view the tags.

If you load the file into iTunes or QuickTime (or look at the info in the Finder), do the names display correctly?

Post by Sithgunner » Sat May 20, 2006 11:18 pm

No. I opened the encoded MP3 in both iTunes and QuickTime, and both showed the wrong characters, just as in Cog.

Since they are all wrong in the identical way, those apps are presumably expecting the same character encoding for the tags.

But when I open the FLAC file in Cog, the tags read just fine.

Post by Sithgunner » Sun May 21, 2006 3:40 pm

I think this is getting a little complicated, so for myself I'd be fine if you could just tag MP3 as UTF-8, the way it is done for FLAC, rather than what is probably ASCII in the current situation.

Then I can re-tag them in the encoding I want with a script or something until you have the time to implement a better method.
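
A re-tagging script of that kind mostly comes down to transcoding the tag bytes. Here is a rough Python sketch; `transcode_tag` is a made-up name, and actually reading and writing the tag inside the MP3 file is left out, since that depends on whichever tagging library is used.

```python
def transcode_tag(raw, source="utf-8", target="shift_jis"):
    """Convert tag bytes from one encoding to another.

    Raises UnicodeDecodeError if `raw` is not valid in `source`, and
    UnicodeEncodeError if a character cannot be represented in `target`.
    """
    return raw.decode(source).encode(target)


utf8_tag = "宇多田ヒカル".encode("utf-8")
sjis_tag = transcode_tag(utf8_tag)
print(len(utf8_tag), len(sjis_tag))
```

The strict error handling is deliberate: silently replacing characters that the target encoding lacks would produce tags that look converted but have lost text.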

Post by Sithgunner » Tue May 30, 2006 1:47 pm

In 0.6, I see that FLAC gets tagged fine, while MP3 still gets tagged with garbage letters.

I couldn't find a way to keep the MP3 from being tagged at all, either.
Would it be possible to make the MP3 tagging UTF-8, just as you did with Max's interface?

This is the last thing I need fixed before I can move my entire ripping environment over to Max.

Sorry for asking so much.

Post by sbooth » Tue May 30, 2006 9:05 pm

The MP3 tagging is accomplished with TagLib, which should use UTF8. Max is handing the tags off as UTF8, so I will have to investigate a little further to determine where the problem lies.
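
One place to look is the text-encoding byte that ID3v2 puts at the start of every text frame: 0 means ISO-8859-1, 1 means UTF-16 with a byte-order mark, 2 means UTF-16BE, and 3 means UTF-8 (the last two exist only in ID3v2.4). The Python sketch below is not TagLib's code and makes no claim about what TagLib or Max actually writes; the helper names are invented, and string terminators are ignored for brevity. It only illustrates one possible cause, namely UTF-8 bytes stored in a frame labeled as ISO-8859-1, which a reader would then display as garbage.

```python
# ID3v2 text-encoding byte -> Python codec name (values 2 and 3 are ID3v2.4 only).
ID3_TEXT_CODECS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def build_text_frame_body(text, encoding_byte):
    """Return the encoding byte followed by the text in the matching codec."""
    return bytes([encoding_byte]) + text.encode(ID3_TEXT_CODECS[encoding_byte])


def read_text_frame_body(body):
    """Decode a text frame body the way a reader trusting the encoding byte would."""
    return body[1:].decode(ID3_TEXT_CODECS[body[0]])


artist = "宇多田ヒカル"
good = build_text_frame_body(artist, 3)            # labeled UTF-8, holds UTF-8
mislabeled = bytes([0]) + artist.encode("utf-8")   # labeled Latin-1, holds UTF-8

print(read_text_frame_body(good))                  # round-trips correctly
print(repr(read_text_frame_body(mislabeled)))      # garbage characters
```

If the garbage seen in iTunes, QuickTime, and Cog matches the second line, the bytes are probably fine and only the label (or the ID3v2 version being written) is wrong; that remains a guess until the actual file is inspected.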