
The humans spent their time building a hideously difficult classification model. Out of the box GPT-3 worked better than the result of a year of their work.


How did they react to this as humans with human pride? Sounds painful.


Since GP's claims come with no supporting evidence, we can speculate they are self-reported statements by people working on non-business-centric applications.

edit: GP is giving out more downvotes than proof


That's interesting, GPT-3 can do classification too? Or did I misunderstand, and you meant your engineers used classification to build a language model that didn't perform as well as GPT-3 (which is less surprising indeed)?


GPT-3 can do classification. For example, you can give it a prompt like "Hacker News is a website. Excel is a Windows program. Visual Studio is a Windows program. Safari is a Mac program. CPU-Z is", and even GPT-2 will complete this with "a Windows program" (with GPT-2 you need to try multiple times, discard useless results, and take the most common answer, but it works and is straightforward to automate).
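The sample-and-vote scheme described above is easy to sketch. Here `sample_completion` is a hypothetical stand-in for an actual model sampling call (e.g. via a GPT-2 API or library); only the voting logic is shown:

```python
# Majority-vote classification over noisy model completions:
# sample several times, discard completions that match no known
# label, and return the most frequent surviving label.
from collections import Counter

PROMPT = (
    "Hacker News is a website. Excel is a Windows program. "
    "Visual Studio is a Windows program. Safari is a Mac program. "
    "CPU-Z is"
)

LABELS = ["a Windows program", "a Mac program", "a website"]

def classify(prompt, sample_completion, n_samples=10):
    """sample_completion(prompt) -> str is assumed to draw one
    completion from the model; here it can be any callable."""
    votes = Counter()
    for _ in range(n_samples):
        completion = sample_completion(prompt).strip()
        for label in LABELS:
            if completion.startswith(label):
                votes[label] += 1
                break  # anything not matching a label is discarded
    return votes.most_common(1)[0][0] if votes else None
```

In practice `sample_completion` would wrap a sampling-temperature model call; the voting step is what makes the noisy completions usable as a classifier.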


Just because many more humans spent many more years and many more $$$ building GPT-3 for your convenience.


Right, but GPT-3 can be used generally. That's the difference. It scales because you don't need to build an entirely new model for each different use case.

You just change the prelude and use it for something new.
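A minimal sketch of what "changing the prelude" means in practice: the same completion endpoint serves unrelated tasks purely by swapping the few-shot prefix. The prelude texts and `build_prompt` helper below are illustrative, not any particular API:

```python
# Two hypothetical preludes for two unrelated tasks. A "new model"
# is just a new prefix; no retraining is involved.
SENTIMENT_PRELUDE = (
    "Tweet: I loved the new movie!\nSentiment: positive\n"
    "Tweet: Worst service ever.\nSentiment: negative\n"
    "Tweet: {text}\nSentiment:"
)

TRANSLATION_PRELUDE = (
    "English: cheese\nFrench: fromage\n"
    "English: house\nFrench: maison\n"
    "English: {text}\nFrench:"
)

def build_prompt(prelude, text):
    """Insert the new input into the chosen few-shot prelude."""
    return prelude.format(text=text)
```

The resulting string is what gets sent to the model; switching tasks is just a matter of which prelude you format.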


It sounds like a big deal. What a tempting idea. And a colleague was mildly annoyed with me for how unimpressed I seemed.

But you have to understand, the use cases you mention are shallow and limited. The heart of GPT, the fine-tuning, is gone. And it looks like even OpenAI gave up on letting users fine-tune, because it means they essentially do build an entirely new, expensive model for each use case.

I wanted to make an HN Simulator, the way that https://www.reddit.com/r/SubSimulatorGPT2/ works. But that's far beyond the capabilities of metalearning (the idea that you describe).


I think the onus is on you to prove that the use cases are shallow and limited. I've seen GPT-3 already being used for diverse and interesting ideas that would not have occurred to me personally.

However, even if they are, the point stands: currently, there are teams of people at companies all over the world tuning models for these shallow and limited use-cases. GPT-3 can replace them all, without OpenAI needing to invest another cent in training for a particular customer's use-case. That is in fact game-changing for the ML/DL world and current applications thereof.

Is it AGI? Obviously not. But the vast majority of ML applications don't need to be.


>However, even if they are, the point stands: currently, there are teams of people at companies all over the world tuning models for these shallow and limited use-cases. GPT-3 can replace them all, without OpenAI needing to invest another cent in training for a particular customer's use-case. That is in fact game-changing for the ML/DL world and current applications thereof.

The counterpoint is that it would be significantly cheaper AND perform better to fine-tune a model for each customer's use case than to just run GPT-3 at inference time.


Clearly that is not true for the commenter that started this thread.


What other proof would you like, other than an example of what I wanted to do and can't?

(https://www.reddit.com/r/SubSimulatorGPT2/ but for HN.)

For a more extensive rebuttal, I wrote one here: https://news.ycombinator.com/item?id=23346972 Though that was more a rebuttal of GPT in general as a path to AGI than of metalearning in particular for generating memes.


GPT-3 not being suitable for your particular use case does not mean that all use cases are shallow and limited?

That being said, I'm not sure I understand why you can't use GPT-3 to make an HN simulator.


What are the diverse and interesting ideas that would not have occurred to you personally?


Were there any concerns about GPT-3's latency? It looks like responses take a long time for online use cases.


So GPT-3 didn't replace your 2 ML engineers, OpenAI did. GPT-3 didn't build itself.


The iPhone didn't replace your flip phone, Apple did. The iPhone didn't build itself.


Yes except they were saying the iPhone replaced Nokia's engineers.

GPT-3 is not doing what the ML engineers were doing (building models), GPT-3 is the end goal. The company just decided to outsource the work to OpenAI and pay a monthly fee to them instead of salaries to their ML engineers.

"We have already found several use cases for it, one of which replaces 2 ML engineers." -> Clearly makes it sound like GPT-3 can do the job their ML engineers were doing.


From a business perspective, this is an irrelevant distinction. The requirement was satisfied in a different way, i.e. the engineers satisfying the requirement were replaced by GPT-3, the tool which satisfies the requirement.

I think everyone understood that.


The thread is not about business perspective, it's about the hype around what GPT-3 is and is not able to do.

One thing GPT-3 is not able to do for example, is replacing 2 ML engineers to build a GPT-3 like model. But OpenAI can do that.


It’s not clear what you are trying to say.



