
So RL is true AI. AlphaGo did make moves inconceivable to the best human minds on the subject.

There was no curve to fit to.



The learning from the game itself was curve fitting. The "Deep" in Deep Reinforcement Learning usually means that some difficult function is replaced by a deep neural network that approximates optimal values (for moves) and is trained on gameplay samples, usually through rewards/punishments for reaching certain states. In games, those signals can rank e.g. good/bad moves, winning states, losing states, etc.
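
To make the "curve fitting" part concrete, here is a toy sketch, not anything AlphaGo actually does. It assumes a made-up game (a random walk on states 0..6 where reaching 6 wins and reaching 0 loses) and swaps the deep network for a straight line fit by least squares. The names `play_episode` and `fit_line` are hypothetical, chosen only for this illustration.

```python
import random

def play_episode(rng, n_states=7, start=3):
    """Play one random-walk game; return (state, final_reward) for every state visited."""
    state, visited = start, []
    while 0 < state < n_states - 1:
        visited.append(state)
        state += rng.choice((-1, 1))  # random policy: step left or right
    reward = 1.0 if state == n_states - 1 else 0.0  # win at the right end, lose at the left
    return [(s, reward) for s in visited]

def fit_line(samples):
    """Ordinary least squares for v(s) ~ a*s + b (stand-in for the neural network)."""
    n = len(samples)
    mean_s = sum(s for s, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in samples)
    var = sum((s - mean_s) ** 2 for s, _ in samples)
    a = cov / var
    return a, mean_r - a * mean_s

rng = random.Random(0)
# Gameplay samples: each visited state is labeled with the outcome of its game.
samples = [pair for _ in range(5000) for pair in play_episode(rng)]
a, b = fit_line(samples)
print(f"v(s) ~ {a:.3f} * s + {b:.3f}")
```

For this walk the true win probability from state s is s/6, so the fitted slope should land near 1/6 and the intercept near 0. The "curve" here comes from the rules of the game plus the policy that generated the samples; the fit only recovers it from observed outcomes.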


Right. But the curve itself was invented by the machine.


I think the curve is defined by the rules of the game, and the machine learned some details of it that humans hadn't figured out yet.


It's curve-fitting with a few extra steps. You can do a lot with curve-fitting though.



