Exercise 3.1 (Uebung 3.1): | variants of matrix-matrix multiply. |
Exercise 3.2 (Uebung 3.2): | SSUM and SAXPY implemented using Intel SSE intrinsics. See BLAS Level-1 routines from BLAS. |
Exercise 3.1 (Uebung 3.1) solution: | variants of matrix-matrix multiply problem. |
Exercise 3.2 (Uebung 3.2) solution: | SSUM and SAXPY implemented using Intel SSE intrinsics. See BLAS Level-1 routines from BLAS. |
Altivec FFT in-line: | binary radix FFT using a workspace and Apple Altivec intrinsics. This version expands step in-line; otherwise it is similar to "genericfft.c" below (Section 3.6). |
Altivec dot product: | unit stride sdot for Apple Altivec (Section 3.5.5). See BLAS Level-1 routines from BLAS. |
Altivec isamax: | unit stride isamax0 for Apple Altivec (Section 3.5.7). See BLAS Level-1 routines from BLAS. |
Altivec FFT: | binary radix FFT using a workspace and Apple Altivec intrinsics. It is similar to "genericfft.c" below (Section 3.6). |
Generic FFT: | generic binary radix FFT using a workspace but no SSE or Altivec intrinsics (Section 3.6). |
Multiple tridiagonal: | sub-procedure for solving tridiagonal systems with multiple right-hand sides via a simple one-step recurrence formula, after Forsythe and Moler (see Section 3.5.2). |
SGEFA: | tests variants of the simple parallel version of sgefa in Section 3.4.2. A README file in this gzipped tar file describes the variants. |
Rpoly: | recursive doubling version of polynomial evaluation (Section 3.5). |
SSE FFT in-line: | workspace version of the binary radix FFT with step in-lined; same as genericfft.c above but using Intel SSE intrinsics (from Section 3.6). |
SSE isamax: | SSE example of isamax0, a unit stride isamax (from Section 3.5.7). See BLAS Level-1 routines from BLAS. |
SSE FFT: | SSE version of the workspace version of the binary radix FFT; same as genericfft.c above but using Intel SSE intrinsics (see Section 3.6). |
Tridiagonal system tests: | tridiagonal system solver tests; also compares timings for the simple recurrence method from Forsythe and Moler's book (see Section 3.5.2). |
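
Several entries above (SAXPY, sdot, isamax0) are unit stride BLAS Level-1 operations rewritten with SSE or Altivec intrinsics. As a point of comparison, the following is a minimal scalar sketch of what those operations compute. It is not the code in the archives: the names ref_saxpy, ref_sdot and ref_isamax0 are hypothetical, and the zero-based return value of ref_isamax0 is an assumption about the indexing convention.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for SAXPY with unit stride: y[i] += a * x[i]. */
static void ref_saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Scalar reference for sdot with unit stride: returns sum of x[i] * y[i]. */
static float ref_sdot(size_t n, const float *x, const float *y)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Scalar reference for a unit stride isamax.
 * Returns the index of the first element of largest absolute value.
 * ASSUMPTION: the index is zero-based; the archive code may differ. */
static size_t ref_isamax0(size_t n, const float *x)
{
    assert(n > 0);                      /* an empty vector has no maximum */
    size_t best = 0;
    float bmax = x[0] < 0.0f ? -x[0] : x[0];
    for (size_t i = 1; i < n; i++) {
        float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > bmax) {                 /* strict '>' keeps the first maximum */
            bmax = v;
            best = i;
        }
    }
    return best;
}
```

A vectorized version has to produce the same results as these loops, which makes them convenient for checking the SSE and Altivec variants on small inputs.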
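
The tridiagonal entries above refer to a simple one-step recurrence. The sketch below shows one common form of such a recurrence: forward elimination followed by back substitution, without pivoting, for a single right-hand side. It is an illustration under those assumptions, not the routine in the archive; the name tridiag_solve and the array layout are hypothetical.

```c
#include <stddef.h>

/* Solve a tridiagonal system T x = b by a one-step elimination recurrence.
 *   l[i], i = 0..n-2 : sub-diagonal entry in row i+1
 *   d[i], i = 0..n-1 : diagonal entry in row i (overwritten)
 *   u[i], i = 0..n-2 : super-diagonal entry in row i
 *   b[i], i = 0..n-1 : right-hand side, overwritten with the solution x
 * ASSUMPTION: no pivoting is needed (for example, T is diagonally dominant). */
static void tridiag_solve(size_t n, const float *l, float *d,
                          const float *u, float *b)
{
    if (n == 0)
        return;

    /* Forward elimination: each row depends only on the previous one. */
    for (size_t i = 1; i < n; i++) {
        float m = l[i - 1] / d[i - 1];
        d[i] -= m * u[i - 1];
        b[i] -= m * b[i - 1];
    }

    /* Back substitution, from the last row up to the first. */
    b[n - 1] /= d[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        b[i] = (b[i] - u[i] * b[i + 1]) / d[i];
}
```

With several right-hand sides, the updates of d are shared and only the updates of b are repeated per right-hand side, so the loop over right-hand sides can run with unit stride inside each recurrence step.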
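
The Rpoly entry above mentions recursive doubling for polynomial evaluation. One way to read that idea is sketched below: the powers of x are built in blocks whose width doubles at each stage, and every product inside a block is independent of the others. This is a hedged illustration, not the archive's rpoly code; the name poly_doubling and the caller-supplied workspace p are hypothetical.

```c
#include <stddef.h>

/* Evaluate a[0] + a[1]*x + ... + a[n]*x^n.
 * p is a caller-supplied workspace of at least n + 1 floats;
 * on return p[k] holds x^k for k = 1..n (p[0] is unused). */
static float poly_doubling(size_t n, const float *a, float x, float *p)
{
    if (n == 0)
        return a[0];

    p[1] = x;
    /* Before each stage p[1..w] is known; the stage fills p[w+1..2w]
     * from x^(w+j) = x^w * x^j. The inner iterations are independent. */
    for (size_t w = 1; w < n; w *= 2)
        for (size_t j = 1; j <= w && w + j <= n; j++)
            p[w + j] = p[w] * p[j];

    /* Combine the powers with the coefficients. */
    float s = a[0];
    for (size_t k = 1; k <= n; k++)
        s += a[k] * p[k];
    return s;
}
```

Compared with Horner's rule, this form does more multiplications but needs only about log2(n) dependent stages to form the powers, which is what makes it attractive for vector hardware.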