
If by "do that" you mean mimic what a real driver would do for a specific set of sensor inputs, that is precisely what ML tries to do.

To understand what the difficulty is, it's important to consider that the size of the sensor input is very large. Don't think of it as twenty range finders around the car, but rather as a 360-degree medium-resolution color + depth image (about 0.5 million data points arriving at 30 fps).
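To get a feel for the scale, here is a back-of-the-envelope calculation of the data rate implied by those numbers. The bytes-per-point figure is my own assumption for illustration, not anything specific to a real sensor stack:

```python
# Rough data rate for the sensor setup described above.
points_per_frame = 500_000   # ~0.5 million color + depth samples per frame
fps = 30                     # frames per second
bytes_per_point = 4          # assumed packing (e.g. color + depth); illustrative only

points_per_second = points_per_frame * fps
data_rate_mb_per_s = points_per_second * bytes_per_point / 1e6

print(points_per_second)     # 15,000,000 points every second
print(data_rate_mb_per_s)    # 60.0 MB/s under these assumptions
```

Even with these conservative assumptions, that's 15 million data points per second the system has to reason about.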

It's difficult because you will never encounter the same set of sensor inputs twice, so you can't treat it like a search-space problem. Once you've accepted that, you're in AI/ML territory. You might try to reason about what the closest known set of sensor inputs and actions would be (classical AI, an expert system), but that is impractically difficult in a 0.5-million-dimensional search space. Alternatively, you can train an ML model to 'reason' about the sensor space and make a decision about the appropriate action.
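A toy sketch of why "find the closest known sensor input" breaks down in very high dimensions: with random points, nearest and farthest neighbors become nearly equidistant, so "closest" stops being informative. This is a generic curse-of-dimensionality illustration, not a claim about any real driving dataset:

```python
import math
import random

random.seed(0)  # reproducible illustration

def neighbor_spread(dim, n_points=200):
    """Relative gap between a random query's nearest and farthest neighbor."""
    query = [random.random() for _ in range(dim)]
    points = [[random.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)

# The gap shrinks as dimension grows: nearest-neighbor lookup loses meaning.
for dim in (2, 100, 10_000):
    print(dim, round(neighbor_spread(dim), 3))
```

At 0.5 million dimensions the effect is far more extreme than this small demo can show.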

Approaches using a small number of sensors can do automatic braking and smarter cruise control, but haven't been shown to succeed at navigating and making strategic decisions. The current belief is that more can be done with denser sensors and more data, and that seems to be the case. There are people working on reducing the sensor-density requirement, but the main focus right now is building a successful and safe self-driving car, regardless of sensor and compute costs.



According to Waymo, most of its miles driven are simulated (2.7 billion miles in 2017). That's an order of magnitude more than actual miles (25K per day) [1]. And even the actual miles mostly don't involve any user input.

Because of this, I'm leaning towards thinking Waymo isn't trying to mimic actual human input.

[1] https://waymo.com/


Anyone interested in this should read the Atlantic feature about it:

https://www.theatlantic.com/technology/archive/2017/08/insid...


Wait... 30fps?! That's the speed of this data? I would have hoped for around 90-120, or at the very least 60...


Movies are filmed at 24fps, so the reasoning is: humans have high confidence that they aren't missing any significant information between the frames, so it should be possible to make a 'mental model' of a road scene at the same fps to human skill level.

In the future we'll likely have superhuman spatial and temporal resolution; right now, more improvement has come from the highest possible spatial resolution at the minimum plausible temporal resolution.


>> Movies are filmed at 24fps, so the reasoning is: humans have high confidence that they aren't missing any significant information between the frames, so it should be possible to make a 'mental model' of a road scene at the same fps to human skill level.

I hope there is a better, more technical explanation that ML researchers are using, because as someone who is somewhat of an expert on human vision and on building products around it, this foundation is godawful if it is to be taken at face value. Then again, I'm sure this is a simplification. Or at least, that's what I'm telling myself.


This whole driverless-car thing reeks of more vaporware: more public-yet-profitless unicorn companies promising to promise to change the world. Aside from surface-level consideration of edge-case accidents with pedestrians (hasn't one of them already killed or seriously injured a pedestrian??), there isn't much deep talk about less straightforward issues, such as (former) members of the trucking industry sabotaging these new fleets, or how liability and insurance are REALLY going to work out.

File the promises and the problems under fiction, because it appears to be more important to keep the world order, its financial system, and these ridiculous media-darling fluff-piece corporations alive while they bleed money.

And no, I'm not closed to the idea of successful work being done on automated driving, but 30 fps? WTF? There's too much going on in the larger context of the world; this shit isn't happening in 2020 or 2024 or whatever else many might say.


Look at it this way: the human ability to act tops out (for elite gamers) at around 300 actions per minute. That's 5 actions per second. So at 30fps, the AI could theoretically already have 1/6th the latency of the most responsive human drivers.
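The arithmetic above, worked out explicitly (this just restates the comment's own numbers, not a claim about real reaction times):

```python
# Comparing an elite gamer's action rate to a 30 fps perception loop.
apm = 300                    # actions per minute (elite-gamer figure from above)
human_rate = apm / 60        # 5 actions per second
ai_fps = 30

human_latency_ms = 1000 / human_rate   # 200 ms between actions
ai_latency_ms = 1000 / ai_fps          # ~33 ms between frames

print(ai_latency_ms / human_latency_ms)  # ≈ 1/6, matching the claim above
```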


They are not actually responding at 300 actions per minute to changing input; a large percentage of those clicks are constant selections of team shortcuts.


This is even worse. Humans do not process at 300 APM; they are merely limited physically to outputting 300 APM. You have no idea what the brain's capacity to process and analyze information is behind that 300 APM of output. If you think 5 FPS is the brain's capacity to process vision... well, please don't make a driverless car.


We move and react slowly, but we respond to information that comes in at a much higher rate. I can notice the individual frames at 5 FPS; I know I'm not getting enough info.


I was going to post that the impact of higher FPS is likely too low to justify 4x the processing power. But then, the difference in reaction time between 30fps and 120fps is O(30ms). At 60mph that translates to almost 1 meter of stopping distance. Tough call.
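Checking that estimate with the numbers in the comment (worst-case wait for the next frame at each rate):

```python
# Extra travel distance from the 30fps-vs-120fps frame-latency gap at 60 mph.
frame_delay_30 = 1 / 30            # ~33.3 ms between frames
frame_delay_120 = 1 / 120          # ~8.3 ms between frames
latency_delta = frame_delay_30 - frame_delay_120   # 25 ms difference

speed_mps = 60 * 1609.34 / 3600    # 60 mph ≈ 26.8 m/s
extra_distance = speed_mps * latency_delta

print(round(extra_distance, 2))    # ~0.67 m of extra travel before reacting
```

So the gap is roughly two-thirds of a meter, in line with the "almost 1 meter" figure above.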


It isn't just stopping power. It's also fidelity, confidence, and completeness of data. A camera sweeping across a panorama while opening and closing its shutter every 1/30th of a second loses a ton of data compared to one opening its shutter every 1/120th of a second.


Yeah, but then you have to have 4x the processing power. It's pretty easy to scale frame rates once you have the rest figured out, and I can see very good reasons not to try to get up to 120+fps right away. In addition, I'd imagine the ML they're using has a much harder time distinguishing valid motion when the per-frame motion is 4x as small.

It's probably a very valid tradeoff.



