Differences From Artifact [ac56c8b767]:
- File src/OFMatrix4x4.m — part of check-in [cf955413ab] at 2023-11-06 00:59:37 on branch trunk — OFMatrix4x4: SSE1 for -[transformVectors:count:]. This new SSE1 implementation is better than the SSE4.1 implementation, hence this also deletes the SSE4.1 implementation. (user: js, size: 9770)
To Artifact [b53637018a]:
- File src/OFMatrix4x4.m — part of check-in [9ba7594f7b] at 2023-11-06 20:11:51 on branch trunk — OFMatrix4x4: Fix missing vector reload in SSE (user: js, size: 9780)
︙
Old lines 42-48, new lines 42-60 (new side of the side-by-side view):
    __asm__ __volatile__ (
        "test %0, %0\n\t"
        "jz 0f\n"
        "\n\t"
        "movaps (%2), %%xmm0\n\t"
        "movaps 16(%2), %%xmm1\n\t"
        "movaps 32(%2), %%xmm2\n"
        "\n\t"
        "0:\n\t"
        "movaps (%1), %%xmm3\n"
        "\n\t"
        "movaps %%xmm0, %%xmm4\n\t"
        "mulps %%xmm3, %%xmm4\n\t"
        "movaps %%xmm4, (%3)\n\t"
        "addss 4(%3), %%xmm4\n\t"
        "addss 8(%3), %%xmm4\n\t"
        "addss 12(%3), %%xmm4\n"
        "\n\t"
︙
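For readers less used to SSE, the visible part of the hunk can be read as a per-row dot product: mulps multiplies a matrix row with the vector lane by lane, the products are stored to a scratch buffer, and the three addss instructions fold lanes 1 to 3 onto lane 0. Below is a minimal scalar sketch of that reading in plain C. It is not the code from OFMatrix4x4.m: the name transform_vectors, the row-major layout, the in-place output, and the handling of all four rows are assumptions made for illustration, since the hunk only shows three row loads and the first row's sum.

```c
#include <stddef.h>

/*
 * Hypothetical scalar reference, not taken from OFMatrix4x4.m: multiplies
 * `count` 4-component vectors in place by a row-major 4x4 matrix.
 */
static void
transform_vectors(const float matrix[4][4], float (*vectors)[4], size_t count)
{
	for (size_t i = 0; i < count; i++) {
		float vector[4], products[4];

		/* Reload the current vector on every pass through the loop. */
		for (size_t j = 0; j < 4; j++)
			vector[j] = vectors[i][j];

		for (size_t row = 0; row < 4; row++) {
			/* Lane-wise multiply of one matrix row with the vector. */
			for (size_t j = 0; j < 4; j++)
				products[j] = matrix[row][j] * vector[j];

			/* Horizontal add: fold lanes 1 to 3 onto lane 0. */
			vectors[i][row] = ((products[0] + products[1]) +
			    products[2]) + products[3];
		}
	}
}
```

The point the sketch is meant to make is the per-iteration copy into vector: if that load happened only once before the loop, every element would be transformed from the first vector's data, which appears to be the kind of problem the check-in comment "Fix missing vector reload in SSE" refers to.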