Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

SAM3 is probably a great model to distill from when training smaller segmentation models, but isn't their DINOv2 a better example of a large foundation model to distill from for various computer vision tasks? I've seen it used for as starting point for models doing segmentation and depth estimation. Maybe there's a v3 coming soon?

https://dinov2.metademolab.com/



DINOv3 was released earlier this year: https://ai.meta.com/dinov3/

I'm not sure if the work they did with DINOv3 went into SAM3. I don't see any mention of it in the paper, though I just skimmed it.


We used DINOv2 as the backbone of our RF-DETR model, which is SOTA on realtime object detection and segmentation: https://github.com/roboflow/rf-detr

It makes a great target to distill SAM3 to.


> It makes a great target to distill SAM3 to.

Could you expand on that? Do you mean you're starting with the pretrained DINO model and then using SAM3 to generate training data to make DINO into a segmentation model? Do you freeze the DINO weights and add a small adapter at the end to turn its output into segmentations?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: