I think that's what the AlphaGo team did: they trained their agent against itself, and it learned new moves that were never explicitly programmed in! The evaluation function just said whether it was ahead or not ahead.
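To make the self-play idea concrete, here is a toy sketch. It is not AlphaGo's actual method (which used deep networks and tree search); it is a tabular learner for a tiny Nim variant where players take 1 to 3 stones and whoever takes the last stone wins. The game, the function names (`train`, `best_move`), and the hyperparameters are all assumptions made up for illustration. The only feedback is win or loss at the end of each game, and both sides share one value table.

```python
import random

def legal_moves(stones):
    # A player may take 1, 2, or 3 stones, but never more than remain.
    return [a for a in (1, 2, 3) if a <= stones]

def choose(q, stones, rng, epsilon):
    # Epsilon-greedy: usually pick the best-known move, sometimes explore.
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda a: q.get((stones, a), 0.0))

def train(episodes=50000, max_stones=8, epsilon=0.2, alpha=0.02, seed=0):
    # q maps (stones, action) to an estimated outcome for the player to move.
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)
        history = []  # moves in order; the two players alternate
        while stones > 0:
            action = choose(q, stones, rng, epsilon)
            history.append((stones, action))
            stones -= action
        # The player who made the last move won (+1). Walking backwards,
        # the sign flips each step because the players alternate.
        outcome = 1.0
        for key in reversed(history):
            old = q.get(key, 0.0)
            q[key] = old + alpha * (outcome - old)
            outcome = -outcome
    return q

def best_move(q, stones):
    # Greedy move under the learned table (epsilon=0 disables exploration).
    return choose(q, stones, random.Random(0), epsilon=0.0)

if __name__ == "__main__":
    q = train()
    for stones in (5, 6, 7):
        print(stones, "stones -> take", best_move(q, stones))
```

Nothing in the code mentions the "leave a multiple of 4" trick for this game; if the agent ends up playing that way, it got there from win/loss feedback alone, which is the point of the comment above.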


