I think you missed the start of my first comment: namely, I was not talking about JS. I was talking about SIMD at the instruction-set level in general.
Adding fuel to your fire: hardware is often implemented with multiple SIMD compute units (transistors are cheap!) and pipelines multiple SIMD instructions. With instruction sets like AVX, the vector width is also increasing (128-bit SSE, 256-bit AVX, 512-bit AVX-512).
SIMD is non-controversial: a SIMD instruction is just another ALU instruction that maps the inputs to the outputs differently. Doing something more exotic, like VLIW, is an uphill battle. However, as seen with GPUs, the more exotic route can be fruitful.
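
To make the "just another ALU instruction" point concrete, here is a rough Python sketch of a scalar 32-bit add next to a 4-lane add on a 128-bit register. It does not model any real instruction set; the names (`scalar_add`, `pack`, `unpack`, `simd_add`) and the lane layout are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # one 32-bit lane
LANES = 4            # assumed: 4 lanes of 32 bits, i.e. a 128-bit register


def scalar_add(a: int, b: int) -> int:
    """One ordinary 32-bit ALU add, wrapping on overflow."""
    return (a + b) & MASK32


def pack(values):
    """Pack four 32-bit values into one 128-bit integer, lane 0 in the low bits."""
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & MASK32) << (32 * i)
    return reg


def unpack(reg: int):
    """Split a 128-bit integer back into its four 32-bit lanes."""
    return [(reg >> (32 * i)) & MASK32 for i in range(LANES)]


def simd_add(ra: int, rb: int) -> int:
    """Lane-wise add: the same adder per lane, with no carry between lanes."""
    return pack([scalar_add(x, y) for x, y in zip(unpack(ra), unpack(rb))])
```

The only difference from a plain 128-bit add is where the carries stop: `simd_add(pack([0xFFFFFFFF, 0, 0, 0]), pack([1, 0, 0, 0]))` wraps lane 0 to zero and leaves lane 1 untouched, whereas a single wide add would carry into it.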