Opened 15 years ago
Closed 15 years ago
Last modified 15 years ago
#5890 closed patch (fixed)
libmythsoundtouch: SIMD optimisation patch
Reported by: | Owned by: | Janne Grunau | |
---|---|---|---|
Priority: | minor | Milestone: | 0.22 |
Component: | mythtv | Version: | head |
Severity: | medium | Keywords: | soundtouch SIMD |
Cc: | mark@…, Rob Smith | Ticket locked: | no |
Description
The MMX optimised routines in libmythsoundtouch appear to be disabled since ALLOW_MMX is never defined. Also, the MMX routines were explicitly disabled for x86_64 since they give garbled output on that arch. This patch:
- Defines ALLOW_MMX in soundtouch if (arch == x86 && defined(MMX))
- Adds integer SSE2 routines for x86_64
This results in a huge speed increase - enough to make 1.2x possible with 1080i mpeg2 and 6ch AC-3 audio on a 2.2Ghz amd64 with cycles to spare.
Tested on x86_64 only. Patch applies against fixes and trunk.
Attachments (5)
Change History (26)
Changed 15 years ago by
Attachment: | soundtouch.patch added |
---|
comment:2 follow-up: 3 Changed 15 years ago by
Replying to jppoet@gmail.com:
I did notice, however, that it causes DD5.1 to fall back to DD2.0.
The patch doesn't cause this, timestretching 6 ch AC-3 has always given you 2 (reencoded) channels.
However, this patch will be followed by another that addresses this issue amongst a bunch of other things. Just tidying it up.
comment:3 follow-up: 4 Changed 15 years ago by
Replying to foobum@gmail.com:
Replying to jppoet@gmail.com:
I did notice, however, that it causes DD5.1 to fall back to DD2.0.
The patch doesn't cause this, timestretching 6 ch AC-3 has always given you 2 (reencoded) channels.
Actually, Mark Spieth added DD5.1 support 9 months ago in ticket #1104
However, this patch will be followed by another that addresses this issue amongst a bunch of other things. Just tidying it up.
Cool Thanks for your efforts.
comment:4 Changed 15 years ago by
Replying to jppoet@gmail.com:
Actually, Mark Spieth added DD5.1 support 9 months ago in ticket #1104
Yep, but it downmixed DD5.1 to DD2.0 once timestretch was on. This is fixed in the next patch. Glad it's been useful to you.
comment:5 Changed 15 years ago by
Cc: | Rob Smith added |
---|
comment:6 follow-up: 7 Changed 15 years ago by
Milestone: | unknown → 0.22 |
---|---|
Owner: | changed from Isaac Richards to Janne Grunau |
Status: | new → assigned |
how is the SSE asm related to sse_optimized.cpp in upstream soundtouch?
comment:7 Changed 15 years ago by
Replying to janne:
how is the SSE asm related to sse_optimized.cpp in upstream soundtouch?
It's not.
comment:8 Changed 15 years ago by
Think I missed the gist of your question. sse_optimized.cpp implements filters and calcCrossCorrStereo for float samples. The asm in the patch implements overlap and calcCrossCorr for stereo and multichannel and 16 bit int samples. The existing MMX filter code works fine so we use that for both x86 and x86_64.
comment:9 follow-up: 11 Changed 15 years ago by
yes, I've seen that after looking at the code. Do you care enough to submit your asm to upstream soundtouch?
ALLOW_MMX is defined in STTypes.h, no need to define it in the project file. I also don't like the ARCH_X86_64 ifdefs. I've changed the preference of mmx and sse code in TDStretch.cpp
comment:10 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:11 Changed 15 years ago by
Replying to janne:
yes, I've seen that after looking at the code. Do you care enough to submit your asm to upstream soundtouch?
Perhaps, although I haven't looked at whether their MMX suffers the same problem.
ALLOW_MMX is defined in STTypes.h, no need to define it in the project file. I also don't like the ARCH_X86_64 ifdefs. I've changed the preference of mmx and sse code in TDStretch.cpp
Ah.. Looks fine. Ta.
Changed 15 years ago by
Attachment: | soundtouch-update.patch added |
---|
comment:12 Changed 15 years ago by
An incremental patch that improves the asm, makes it x86 compatible and turns it on for x86 (where it should be noticeably faster than the mmx, unfortunately I can't test).
comment:13 Changed 15 years ago by
Resolution: | fixed |
---|---|
Status: | closed → new |
comment:14 follow-up: 15 Changed 15 years ago by
Status: | new → assigned |
---|
Which SSE version is do you use that?
iirc soundtouch has no runtime CPU detection. you just can't assume that a MMX capable cpu has also SSE?
comment:15 Changed 15 years ago by
Replying to janne:
SSE2 and no, I guess we shouldn't assume that.. I'll add runtime cpu detection and update patch tomorrow.
Changed 15 years ago by
Attachment: | soundtouch-update-1.patch added |
---|
comment:16 Changed 15 years ago by
Sorry, soundtouch does have runtime cpu detection. I have, however, renamed the define and the class SSE -> SSE2 and changed the rt cpu check to check explicitly for SSE2.
Changed 15 years ago by
Attachment: | soundtouch-update-2.patch added |
---|
comment:17 Changed 15 years ago by
Latest (-2) makes the asm more 'correct' - the SSE2 now produces output exactly equal to that of the C (unlike the MMX). Now works properly again when compiled with -O3 too (oops).
comment:18 Changed 15 years ago by
Status: | assigned → accepted |
---|
comment:19 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Changed 15 years ago by
Attachment: | st-fix.patch added |
---|
comment:20 Changed 15 years ago by
Sorry! Last diff, honest.. Missed a couple of lines in OverlapMulti(). This makes OverlapMulti()'s output bit exact too.
This patch does greatly improve timestretch performance under 64bit. Thank you for that!
I did notice, however, that it causes DD5.1 to fall back to DD2.0.