
Usually what happens in cases like this is that the RL network ends up learning a slightly crappier version of A*.

Or it finds a bug in the environment and learns to teleport. Those are fun too.


