The node selected for fabrication does not equalize power consumption. You could go to the same node with another, inherently more power-efficient architecture and gain even more oomph per watt. Power efficiency stems from the architecture itself; the manufacturing process is a red herring (and a costly one). What you're saying is pretty much like "a seasoned bodybuilder would kick a white-belt karate practitioner's ass, so it's clear that bodybuilding is better than karate."

Compare things that are alike. If you take a manufacturer (say, TSMC), pick its node (say, 16nm FF+) and decide on a package (the physical manifestation of RTL primitives in the silicon), you get better performance per watt on one architecture than on another. ARM and MIPS are inherently very power efficient. You can't just take SPARC and make it more power efficient than these two. It doesn't work like that.

It's also not true that the ISA doesn't matter. The ISA heavily impacts bandwidth requirements. This in turn impacts latency and latency hiding, cache requirements and many other things. In fact, data transfer is typically as costly as (if not more expensive than) computation. Getting data to all the right places on schedule eats power like crazy. This is exactly why ARM has Thumb. It's not like the core internally does anything different than it would with a wide ISA. It's just that the code is more densely packed, which helps tremendously.
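To put rough numbers on the density argument, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: the 1000-instruction workload, the 30% extra instructions for the dense encoding, and the 64-byte cache line are all made-up figures, and `fetch_bytes` and `cache_lines` are hypothetical helper names.

```python
# Rough comparison of instruction-fetch traffic for a wide (32-bit) and a
# dense (16-bit) instruction encoding. All numbers are illustrative assumptions.

def fetch_bytes(instr_count, bytes_per_instr):
    """Bytes that must be fetched to execute instr_count instructions."""
    return instr_count * bytes_per_instr

def cache_lines(total_bytes, line_size=64):
    """Cache lines touched by straight-line code (ceiling division)."""
    return -(-total_bytes // line_size)

wide_count = 1000                        # assumed workload on the 32-bit encoding
dense_count = wide_count * 130 // 100    # assumed 30% more instructions when dense

wide_bytes = fetch_bytes(wide_count, 4)    # 4 bytes per instruction
dense_bytes = fetch_bytes(dense_count, 2)  # 2 bytes per instruction

print("wide: ", wide_bytes, "bytes,", cache_lines(wide_bytes), "cache lines")
print("dense:", dense_bytes, "bytes,", cache_lines(dense_bytes), "cache lines")
```

Even with the dense encoding paying a 30% instruction-count penalty in this toy setup, it still moves noticeably fewer bytes and touches fewer cache lines. Real ratios depend on the code and the compiler.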

Which brings me to my last point. There's an open architecture that's quite nice. It's SuperH (or SH2 in its open-source form), which in turn is what ARM's Thumb is based on. It's not perfect, but it's pretty solid. Omitting it in the OP makes me think the author wasn't very thorough with his research. But everything has to start somewhere. ;)