There's a bunch of gatekeeping to get into ML. Part of it is that ML people don't want non-ML people to know just how much of what they do is drudgery and how little of it is exciting math, and they don't want competition from people with similar skills. And those roles come with a lot of prestige.
I went through all that and am a SWE again instead of an ML engineer. The one thing I learned from all that? "The very best models are distilled from postdoc tears".
Getting state-of-the-art performance in ML requires a lot of intuition about equations, though. I've seen some of the top ML engineers work at Google, and they all have a really good understanding of math, of how formulas translate into measurable results, etc. An ML education or research background seems less important: if you got that intuition from studying physics or math or anything else, it still translates.
I feel the biggest problem for people without an ML background is that they think "I don't know what I'm doing, I can't get hired for this job!", when the fact is that people with ML backgrounds mostly don't know what they are doing either. They just get standard results by applying standard libraries. Any programmer with some math skills could do the same; it is no harder than learning a frontend or backend framework. People just think it would be harder, so they lack confidence about it. There are some gotchas you have to learn, but there are a lot of gotchas in both backend and frontend as well.
And the same can be said of non-ML IT! You can always compare options better when you understand the whole history behind why you write something a certain way, even if you could just learn it on the job by seeing it over and over. It's like how they teach proper sorting by showing you all the bad ways first.
Also, it's not often, but you do have to show creativity at times, to solve a new problem or something, and having an intuitive theoretical understanding goes a long way compared with someone who learned by pure mimicry.
I think instead of gatekeeping we could build bridges: be very clear that salary and responsibilities will be lower at first, and judge on results. If an ML person is brilliant, they won't be threatened by an idiot Java dev. And if a Java dev is able to produce good results, even if the way they reached them is less graceful, then the ML engineer should probably shift into second gear :D
Sure, I've been doing that "intuitions about equations" thing since 1993 (my undergrad thesis was on using gradient descent to train the weights of a dynamic programming algorithm that found E. coli genes). I generally agree: to be a top ML researcher, you need those skills in excess of the average (I worked with quite a few of those people at Google). To do state-of-the-art work? Mostly hard work, lucky guesses, lots of compute power, and a huge support apparatus to make rapid experimentation easier.
But the vast majority of people working in ML don't need that. Sadly, most of the work I did for one of the world's most powerful machine learning systems was literally computing frequencies and then sorting by the frequency, so features that were more common were encoded in smaller varints, saving lots of disk space.
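For anyone wondering what "count, sort, encode in smaller varints" looks like in practice, here is a minimal sketch in Python. It is not the actual system described above: the function names (`encode_varint`, `assign_ids_by_frequency`, `encoded_size`) are made up for illustration, and it assumes a plain unsigned LEB128-style varint (7 payload bits per byte, high bit set when more bytes follow). The idea is just that the most frequent features get the smallest integer IDs, and small integers take fewer bytes.

```python
from collections import Counter


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128-style varint (assumed format)."""
    if n < 0:
        raise ValueError("varint encoding here only supports non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits carry the payload
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def assign_ids_by_frequency(features):
    """Map each feature to an ID; the most frequent feature gets ID 0."""
    counts = Counter(features)
    # Sort by descending count; break ties by name so the result is deterministic.
    ordered = sorted(counts, key=lambda f: (-counts[f], f))
    return {feature: i for i, feature in enumerate(ordered)}


def encoded_size(features, ids):
    """Total bytes needed to store the feature stream as varint-encoded IDs."""
    return sum(len(encode_varint(ids[f])) for f in features)


if __name__ == "__main__":
    # Hypothetical data: 200 rare features seen once, one common feature seen 1000 times.
    stream = [f"rare{i}" for i in range(200)] + ["common"] * 1000
    by_freq = assign_ids_by_frequency(stream)
    first_seen = {f: i for i, f in enumerate(dict.fromkeys(stream))}
    print("frequency-ordered IDs:", encoded_size(stream, by_freq), "bytes")
    print("first-seen IDs:       ", encoded_size(stream, first_seen), "bytes")
```

With this encoding, IDs 0 through 127 fit in one byte and anything larger needs at least two, so which features land below 128 is what decides the on-disk size.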
I agree with you, but I also wish more ML people knew more math. Though I think there's a difference between research and production (I'm in research).