Reinforcement Learning

Robot Learns to Flip Pancakes: Really interesting, although it would be harder to do with real pancakes. What reward is the robot actually getting? The weight of the pan after each flip? Or is the reward entered manually after every trial?
A crawling robot: a Q-learning example: A really nice implementation of the Q-learning algorithm on a crawling robot. Thanks to the author for the brief explanation.
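
For readers who have not seen Q-learning before, here is a minimal sketch of the standard tabular update rule. It is not the crawling robot's actual code: the function name `q_update`, the two-state table, and the values chosen for `alpha` (learning rate) and `gamma` (discount factor) are all assumptions made for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a list of lists indexed as q[state][action].
    alpha and gamma are illustrative defaults, not values from the robot.
    """
    # Value of the best action available in the next state.
    best_next = max(q[next_state])
    # Target: immediate reward plus discounted best future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical toy table: 2 states, 2 actions, all estimates start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
first = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
second = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

Repeating the same rewarded transition pulls the estimate for that state-action pair closer to the reward each time, which is roughly how a robot could come to prefer the movements that carry it forward.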


