Testing out code inclusion with the Syntax highlighter backend.
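AVX (Advanced Vector Extensions) is a newer set of vector instructions that use 256-bit-wide registers, extending the vector processing capabilities of SSE, which uses 128-bit registers. The original AVX widens the floating point operations to 256 bits; 256-bit integer operations only arrived later with AVX2.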
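	.file	"main.c"
	.data
	.align 32			# aligned loads of ymm registers need 32-byte alignment
intset1:				# eight packed single-precision floats
	.float 1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8
	.bss
	.align 32
intans:					# room for the eight results
	.rept 8
	.float 0.0
	.endr
	.text
	.align 4
	.globl main
	.type main, @function
main:
	pushq	%rbp
	movq	%rsp, %rbp
	vzeroall				# clear all the vector registers
avx:
	vmovaps	intset1(%rip), %ymm0	# load the eight floats (RIP-relative, so it also links as PIE)
	vmovaps	intset1(%rip), %ymm1	# same data again as the second operand
	nop
	vmulps	%ymm0, %ymm1, %ymm2	# ymm2 = ymm1 * ymm0, eight multiplies at once
	vmovaps	%ymm2, intans(%rip)	# store the eight products
	movl	$0, %eax			# return 0
	popq	%rbp
	ret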
Fixed this code up, so for anyone who has been following along: I'm now seeing the correct output when I run this in the debugger.
Some interesting stuff here: the same approach falls back to SSE, using the 128-bit xmm registers and four floats at a time, if your CPU or kernel doesn't support AVX. (Tested on the web server.)
We move the array of eight (or four) floats into the vector registers ymm0 and ymm1 (or xmm0 and xmm1 for SSE) and vector multiply them into ymm2 (or xmm2). The result gets saved to intans in .bss. Really, there's no way to "see" this code work without the debugger, since the program never prints anything, but the power of doing eight floating point multiplies with a single instruction is amazing.
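For anyone who wants to see the products without firing up the debugger, here is a rough C sketch of the same multiply using compiler intrinsics. It is not the code from this post, just one way to get the same effect: it assumes GCC or Clang on x86-64, and the names mul8 and print_squares are made up for illustration. The runtime check uses the compilers' __builtin_cpu_supports builtin to pick the AVX path, and otherwise does the work as two SSE multiplies of four floats each.

```c
#include <immintrin.h>
#include <stdio.h>

/* AVX path: one 256-bit multiply covers all eight floats (vmulps on ymm).
 * The target attribute lets this compile without passing -mavx. */
__attribute__((target("avx")))
static void mul8_avx(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);   /* unaligned loads, so any float[8] works */
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_mul_ps(va, vb));
}

/* SSE fallback: the same work as two 128-bit multiplies (mulps on xmm). */
static void mul8_sse(const float *a, const float *b, float *out)
{
    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(va, vb));
    }
}

/* Hypothetical helper: use AVX when the builtin reports support, else SSE. */
void mul8(const float *a, const float *b, float *out)
{
    if (__builtin_cpu_supports("avx"))
        mul8_avx(a, b, out);
    else
        mul8_sse(a, b, out);
}

/* Hypothetical helper: multiply the post's data set by itself and print it. */
void print_squares(void)
{
    float intset1[8] = {1.1f, 2.2f, 3.3f, 4.4f, 5.5f, 6.6f, 7.7f, 8.8f};
    float intans[8];

    mul8(intset1, intset1, intans);
    for (int i = 0; i < 8; i++)
        printf("%g * %g = %g\n", intset1[i], intset1[i], intans[i]);
}
```

Call print_squares() from main to get the eight products on stdout. The unaligned loads are a deliberate choice here so the sketch works with ordinary stack arrays; the assembly above uses aligned moves because its data is explicitly aligned to 32 bytes.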