Not trying to nitpick here, but does anyone find it strange that a webpage/website like this doesn’t have a graph of the data, but just a poorly designed table?
Is this a trend I’ve missed … some sort of post-D3 nadir of the datavis hype curve, where graphs are a cringey thing that SEO'd clickbait articles or news pages do?
We are really more into shipping fast than shipping perfect, but we improve over time. The fact that we introduced Mac and iPad into the chart makes the graph a bit misleading, as it's not in chronological order.
CoreML is really very good, both on its own and in its tools for importing models from other platforms and compressing them. I wrote a book earlier this year on Swift and added a few CoreML examples (https://leanpub.com/SwiftAI). Google provides something similar.
Federated privacy preserving learning, local models, etc. all help keep your private data on your devices. Good stuff.
A bit of a tangent, but where are we at when it comes to energy efficiency in AI?
Suppose I had one or two cameras attached to a computer and ran software that would detect which object I'm pointing at and name it, how much power would that use?
The human brain would probably need around 0.5s - 1s to come up with an answer, consuming around 5 milliwatt hours of energy in that time.
How much power would the computer need to at least give it a fair shot compared to the human?
If we assume that a human is pretty close to the best theoretically achievable limit of overall usefulness vs energy usage (while, unlike current AI, having the ability to learn ad-hoc, self-correct and maintain itself), "work per watt" may give us an idea of how advanced our current technology really is compared to what already existed, and how far we can still go.
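To make the brain-side numbers concrete, here's a back-of-envelope sketch. The ~20 W brain power draw is an assumed figure (a commonly cited estimate, not from the thread's measurements):

```python
# Back-of-envelope: energy for one "name that object" inference by a human.
# Assumption: brain draws ~20 W continuously; answer takes 0.5-1 s.
BRAIN_POWER_W = 20.0

def energy_mwh(power_w, seconds):
    """Energy in milliwatt-hours for a given power draw and duration."""
    return power_w * seconds / 3600.0 * 1000.0

low = energy_mwh(BRAIN_POWER_W, 0.5)   # ~2.8 mWh
high = energy_mwh(BRAIN_POWER_W, 1.0)  # ~5.6 mWh
print(f"Brain inference: {low:.1f}-{high:.1f} mWh")
```

At the 1 s end this lands right around the ~5 mWh figure mentioned above.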
A human brain requires constant power for itself and its auxiliary systems. It has an inefficient energy input system (food) and requires ~2-3 kWh (depending on weight, sex etc.) of energy per day, and it cannot operate 24/7/365. When not being used for work, it still requires power.
A camera with an Nvidia Jetson might consume 0.5 kWh per day and run nonstop.
Ultimately it is apples to oranges. A human brain can do a lot more than simply classify an object. Security guards watching cameras are evaluating the situation, not annotating images.
> It has an inefficient energy input system (food)
A human body can extract up to ~95% of the energy in that food (depends on the food), which is pretty damn efficient. You may have seen the number 20% thrown around, but that refers to how much of that can be turned into useful mechanical energy.
> and requires ~2-3 kWh (depending on weight, sex etc.) of energy per day and cannot operate 24/7/365.
More like a fifth of that energy (which is what the brain uses). If you're going to look at the entire body, you're going to have to match those features in your hardware. I don't think there's current-gen hardware that could conceivably repair itself and take care of its own needs while being as energy-efficient as a human body.
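Plugging in the thread's own figures (assumed estimates, not measurements), "a fifth" of the body's daily energy works out to roughly the same daily budget quoted for the Jetson box:

```python
# Brain's share of daily body energy, using the thread's rough figures.
body_kwh_day = 2.5   # midpoint of the ~2-3 kWh/day estimate above
brain_share = 0.2    # "a fifth"
brain_kwh_day = body_kwh_day * brain_share
print(f"Brain alone: ~{brain_kwh_day:.1f} kWh/day")
# ~0.5 kWh/day, i.e. the same ballpark as the 0.5 kWh/day Jetson figure.
```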
> A camera with a nvidia jetson might consume 0.5 kwh per day and run nonstop.
The nano?
If I had trained a model on the full Open Images Dataset (so we can get a number of categories that at least approaches what a human could do) are you sure that's going to cut it?
YOLOv3 doesn't even reach 2 fps on the nano (YOLOv3-tiny gets more, but using a crippled version won't win us any prizes), and that one only has 80 categories. The Open Images Dataset has five times that - which is still absolutely nothing compared to what a human can do (and the dataset is also a bit odd: the only specific street sign it knows is "stop sign" and there's weird one-offs like "facial tissue holder" but it can't tell a ferris wheel from a car wheel or steering wheel).
Even if you somehow managed to fit something with such a number of categories and acceptable accuracy on a nano, it would probably blow its energy budget, which is about 2 seconds of operation if it wants to match a human.
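The "about 2 seconds" budget can be sanity-checked. Assuming the Nano draws ~10 W under load (its nominal high-power mode, used here as a round number) and a human answer costs ~5 mWh:

```python
# How long can a Jetson Nano run on the energy a human spends on one answer?
# Assumptions: human inference ~5 mWh; Nano draws ~10 W under load.
human_budget_j = 5e-3 * 3600   # 5 mWh in joules = 18 J
nano_power_w = 10.0
runtime_s = human_budget_j / nano_power_w
print(f"Nano runtime on a human's energy budget: {runtime_s:.1f} s")  # 1.8 s
```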
> Ultimatly it is Apples to Oranges. A human brain can do a lot more than simply classify an object.
Sure, but it's also not going to perform a lot of tasks at the same time. If you ask a human to keep classifying anything you're pointing at, they'll be mostly busy watching you and what you're pointing at, trying to conjure up the appropriate word to name the thing. If not, you're not pointing fast enough.
Though I suppose we also have some sort of passive classification mode that we're using most of the time while we do other things. This mode just deals with concepts - it doesn't bother to inform us that the thing flying at us is called "ball".
Something like https://canaan.io/product/kendryte-k510 will outperform a human on object recognition. Standby usage is 2 mA, and 2 W when in use (< 0.1 s to run a single recognition).
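Taking those K510 figures at face value and comparing against the earlier ~20 W / ~1 s brain estimate gives a rough per-inference energy ratio (all numbers assumed from this thread, not benchmarks):

```python
# Per-inference energy: K510 (2 W, <0.1 s) vs. the ~20 W / ~1 s brain estimate.
k510_j = 2.0 * 0.1     # 0.2 J per recognition
brain_j = 20.0 * 1.0   # ~20 J per "name that object"
print(f"K510 uses ~{brain_j / k510_j:.0f}x less energy per inference")  # ~100x
```

Though as the replies below note, the comparison only holds if the model on the chip actually matches human breadth and accuracy, which it doesn't.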
There's no NN that can outperform a human on accuracy or on the number of categories it knows; the best you could hope for is being significantly worse but faster. Even our best nets know only a tiny fraction of what a human can classify and have noticeably worse accuracy.
You'd want to be running a huge state-of-the-art network trained on large datasets on it to approach human capabilities, and I don't think 2.5 TFLOPS will cut it.
It uses up to 60 W for 270 TFLOPS at full power, but its processing power should be in the right ballpark to at least do decently with something trained on the best datasets there are.
There's a chance much smaller hardware would do if only our software was advanced enough, but it's probably not. I'm not sure where we are really at, hence my original question. You'd need to somehow work out Watts/HumanPerformance.
If you want unlimited categories (a.k.a. zero-shot classification), then CLIP does a pretty good job. I'd be a bit surprised if it can't run on a Jetson, although I guess RAM might be an issue.
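For anyone unfamiliar with how CLIP gets "unlimited" categories: it scores an image embedding against text embeddings of arbitrary label prompts. The sketch below shows only that scoring mechanism with toy vectors; real CLIP produces the embeddings with learned image and text encoders (e.g. via the Hugging Face `transformers` CLIP models), which are not reproduced here:

```python
import numpy as np

# Zero-shot classification scoring, CLIP-style: cosine similarity between one
# image embedding and the embeddings of free-form label prompts.
# The vectors below are toy stand-ins, NOT real CLIP embeddings.
def zero_shot_scores(image_emb, text_embs):
    """Cosine similarity between one image embedding and N label embeddings."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return txt @ img

labels = ["a photo of a cat", "a photo of a dog", "a photo of a ferris wheel"]
text_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])  # toy label vectors
image_emb = np.array([0.9, 0.1])                            # toy "cat-like" image
scores = zero_shot_scores(image_emb, text_embs)
print(labels[int(np.argmax(scores))])  # "a photo of a cat"
```

The key point is that the label set is just a list of strings, so you can swap in any categories at inference time without retraining.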
To truly understand perf, ideally one should compare many types and sizes of models. I suspect some model types perform substantially better on the newer ANE / OS compared with others.