Learning from the game itself was curve fitting. The "Deep" in Deep Reinforcement Learning usually means that some hard-to-compute function is replaced by a deep neural network, which approximates optimal values (for moves) and is trained on gameplay samples. The training signal is usually rewards or punishments for reaching certain states; in games these could rank, e.g., good and bad moves, winning states, losing states, etc.
There was no curve to fit to.
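To make the "values trained on gameplay samples" idea concrete, here is a minimal sketch, not any particular system's method. Everything in it is an assumption chosen for illustration: a toy Nim game (take 1 to 3 stones, taking the last stone wins), purely random play to generate samples, a win/loss reward of +1/-1, and a tiny one-hidden-layer network (hardly "deep") fitted by plain gradient descent. The names `play_random_game` and `value` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def play_random_game(start=10):
    """Random play in toy Nim: take 1-3 stones, taking the last stone wins.
    Returns (stones_left, outcome) pairs, outcome seen by the player to move."""
    stones, history = start, []
    while stones > 0:
        history.append(stones)
        stones -= int(rng.integers(1, min(3, stones) + 1))  # upper bound is exclusive
    # Whoever moved last won (+1); going backwards, the sign alternates.
    samples, reward = [], 1.0
    for s in reversed(history):
        samples.append((s, reward))
        reward = -reward
    return samples

# Gameplay samples: states paired with the reward/punishment they led to.
data = [pair for _ in range(2000) for pair in play_random_game()]
x = np.array([[s / 10.0] for s, _ in data])   # state, scaled to (0, 1]
y = np.array([[r] for _, r in data])          # +1 for a win, -1 for a loss

# Small network standing in for the "difficult function" (state -> value).
W1 = rng.normal(0.0, 1.0, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)

def value(states):
    return np.tanh(states @ W1 + b1) @ W2 + b2

def mse():
    return float(np.mean((value(x) - y) ** 2))

loss_before = mse()
lr = 0.05
for _ in range(1000):
    h = np.tanh(x @ W1 + b1)
    err = (h @ W2 + b2) - y                  # prediction error per sample
    dh = (err @ W2.T) * (1.0 - h ** 2)       # backprop through tanh
    W2 -= lr * (h.T @ err) / len(x); b2 -= lr * err.mean(axis=0)
    W1 -= lr * (x.T @ dh) / len(x);  b1 -= lr * dh.mean(axis=0)
loss_after = mse()

print("loss before:", loss_before, "after:", loss_after)
print("estimated value with 1 stone left:", float(value(np.array([[0.1]]))[0, 0]))
```

The point of the sketch is only that the "curve" being fitted is the mapping from state to expected outcome, and the data points come from playing the game rather than from a fixed dataset. Real systems differ in many ways (better policies than random play, bootstrapped targets, far larger networks).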