Hacker News | theyCallMeSwift's comments

I love this idea, but have a hypothesis that 90% of agents that people actually use today would fail this test inadvertently (false negative).

Industry best practice + standard implementation for most agents right now is to do web browsing / fetching via subagents. Their output is summarized using a cheaper model and then passed back to the parent. Unless the actual content the subagents see is preserved, it's very unlikely that the `CANARY-` strings would be found in the output.
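To make the concern concrete, here's a rough sketch of how a canary check could score raw page content against a summarized version. Everything in it is an assumption for illustration: the canary format (`CANARY-` followed by alphanumerics), the sample page, and `lossy_summarize`, which is a crude stand-in for a cheaper summarizer model and not how any real agent does it.

```python
import re

# Assumed canary format: the prefix from the test plus an alphanumeric ID.
CANARY_PATTERN = re.compile(r"CANARY-[A-Za-z0-9]+")


def find_canaries(text: str) -> set[str]:
    """Return every canary token that appears in the text."""
    return set(CANARY_PATTERN.findall(text))


def lossy_summarize(page: str, max_words: int = 12) -> str:
    """Hypothetical stand-in for a summarizer: keep only the first few words."""
    return " ".join(page.split()[:max_words])


def canary_recall(page: str, returned: str) -> float:
    """Fraction of canaries planted in the page that survive in the returned text."""
    planted = find_canaries(page)
    if not planted:
        return 1.0
    return len(planted & find_canaries(returned)) / len(planted)


# Made-up page with one canary near the top and one near the bottom.
page = (
    "Welcome to the docs. CANARY-HEAD1 marks the intro. "
    "Lots of body text follows here to pad the page out a bit. "
    "CANARY-TAIL9 is hidden near the footer."
)

raw_recall = canary_recall(page, page)
summary_recall = canary_recall(page, lossy_summarize(page))
```

In this toy case the raw content keeps both canaries while the truncated "summary" keeps only the one near the top, so the parent agent would report the second as missing even though the subagent saw it.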

Any thoughts on how you'd change the test structure with this in mind?


Hey there - I'm the test author, and you've hit on one of the main points. For the summarization/relevance-based content return, this is a consideration for some of the agent platforms (although I've found others actually do better here than I expected!) - which is part of the point I'm trying to drive home to folks who aren't as familiar with these systems.

I chose to structure it this way intentionally because this is the finding. Most people are surprised that agents aren't 'seeing' everything that's there, and get frustrated when an agent says something isn't there when it clearly is. Raising awareness of this is one of the main points of the exercise, to me.


This isn't best practice. It's certainly not industry best practice. It would fail some pretty basic tests, like these, resulting in poor UX and poor reviews. There are plenty of half-assed things labelled "agent" that do so, of course.

I think it describes generally how we can picture Claude and OpenAI working, but neglects further implementation details that are hard to see from their blog posts, e.g. a web search tool vs. a web get tool.

(source: maintained a multi-provider x llama.cpp LLM client for 2.5+ years and counting)


Yeah, my colleague and I have been seeing in testing how much this is actually a problem in practice. It has been - surprising, and a little dismaying - how much this negatively impacts content retrieval and results in poor UX.

Anyone have thoughts or insights about why Lego is ending the partnership with FIRST? Thirty years is a long enough track record that it doesn't seem like an overnight decision...


The Lego Mindstorms robotics kit that powered the whole thing was discontinued in 2022. Since they're no longer making the robotics kits they have nothing to donate to the competition (or run the competition on).


The LEGO Education version of MINDSTORMS Robot Inventor, SPIKE Prime, is still available and a new robot kit, Computer Science and AI, is being released this year. After next season, LEGO will be continuing on with their own K-8 robotics program (as will FIRST).


> The LEGO Education version of MINDSTORMS Robot Inventor, SPIKE Prime, is still available

Well, the Spike line is being discontinued also: https://education.lego.com/en-us/spike-update-2026/

But you’re right in that they’ll have another new line—“Lego Education Computer Science & AI”, which is different in a way I don’t really understand and doesn’t fill me with a ton of confidence.


Which is a shame in itself


Major League Hacking (MLH) | Multiple Positions | Full-Time, Part-Time, & Contractor | Remote, New York (NY), Seattle (WA), London, Delhi/Bangalore | https://careers.mlh.io/

Major League Hacking (MLH) is rethinking technical education for the modern world. Now that every company is a technology company, we need faster and more effective ways to train the talent that everyone depends on. Our unique hands-on, community-driven approach has helped more than 500,000 technologists worldwide launch their careers and stay up-to-date with the latest trends. In fact, today 1 in 3 new programmers coming online are alumni of our community! If you're passionate about technology, education, and the future of work – this is the place for you!

Today we're a team of 25 and we're growing quickly! Nearly 80% of our team has a technical background (CS degrees, bootcamp grads, etc). Even if your job doesn't require you to write code every day, we love hiring people who understand what it's like to be an early-career technologist first-hand!

Current openings:

- Head of Engineering (https://careers.mlh.io/o/head-of-engineering?source=hn)
- Head of Hacker Community Marketing (https://careers.mlh.io/o/head-of-community-marketing?source=...)
- Senior Client Success Manager (https://careers.mlh.io/o/senior-client-success-manager?sourc...)
- Solana Blockchain Technical Mentor (https://careers.mlh.io/o/blockchain-technical-mentor?source=...)

If you're an early career technologist yourself, check out the MLH Fellowship (https://fellowship.mlh.io/). It's a 12-week remote internship alternative where you'll learn to contribute to Open Source, add a production-level contribution to your resume, and you could even land a full-time job with companies like GitHub or Amazon at the end!


Major League Hacking (MLH) | Head of Engineering | Full Time | New York, Seattle, or Remote (US) | https://careers.mlh.io/o/head-of-engineering?soure=Hacker%20...

Major League Hacking (MLH) is rethinking technical education for the modern world. Now that every company is a technology company, we need faster and more effective ways to train the talent that everyone depends on. Our unique hands-on, community-driven approach has helped more than 500,000 technologists worldwide launch their careers and stay up-to-date with the latest trends. If you're passionate about technology, education, and the future of work – this is the place for you!

As our Head of Engineering you’ll be responsible for building out a team of developers who will create and maintain engineering products that help aspiring software engineers level up and launch their careers. The software that you and your team write will power the largest community of early career developers in the world and help redefine the future of work & technical education.

Stack: Ruby on Rails, Next.js, PostgreSQL, Redis, Heroku

Apply: https://careers.mlh.io/o/head-of-engineering?soure=Hacker%20...


Major League Hacking (MLH) | VP of Engineering | Full-time | New York / Seattle | Rails & Next.js

Major League Hacking (MLH) is rethinking technical education for the modern world. Now that every company is a technology company, we need faster and more effective ways to train the talent that everyone depends on. Our unique hands-on, community-driven approach has helped more than 500,000 technologists worldwide launch their careers and stay up-to-date with the latest trends. If you're passionate about technology, education, and the future of work – this is the place for you!

As our VP of Engineering you’ll be responsible for building out a team who will create and maintain engineering products that help aspiring software engineers level up and launch their careers. Within your first year, we’ll expect you to have hired a small team of developers and worked with them to create a successful and efficient engineering team. You’ll have added key features to our existing products driven by the needs of our team and feedback from our users.

https://careers.mlh.io/o/vice-president-of-engineering


Major League Hacking (MLH) | VP of Engineering | NYC | Full-Time

Major League Hacking (MLH) is rethinking technical education for the modern world. MLH has become the default destination for top tech talent to gain hands-on experience, build their professional networks, and ultimately level up their careers.

As our VP of Engineering you’ll be responsible for building out a team of developers who will create and maintain engineering products that help aspiring software engineers level up and launch their careers. The software that you and your team write will power the largest community of early career developers in the world and help redefine the future of work & technical education.

https://careers.mlh.io/o/vice-president-of-engineering

No recruiters, agencies, or dev shops.


Major League Hacking (MLH) | VP of Engineering | NYC/Bangalore | Full-time | Rails, Next.js

MLH is rethinking developer education. We've built the largest community of early career developers in the world (8% of US devs are alumni) and now we're helping them launch their careers via engineer apprenticeships. We need your help scaling the technology that makes this all work.

As our VP of Engineering you’ll be responsible for building out a team of developers. Our stack is primarily Ruby on Rails and Next.js. This is a key leadership role in a rapidly growing startup reporting directly to the CEO.

https://careers.mlh.io/o/vice-president-of-engineering


The growth specifically in Post-baccalaureate Certificate program enrollment while general Certificate programs drop is really surprising to me. I wonder if undergrads just have less faith in certificates or if graduates are more worried about CV gaps.

Anyone have thoughts on why this might be happening?


Major League Hacking (MLH) | REMOTE | Open Source Mentor | Contract to FT | https://mlhfellowship.recruitee.com/o/open-source-mentor?sou...

TL;DR: Get paid to help junior developers contribute to Open Source and help them launch their careers.

The MLH Fellowship (https://fellowship.mlh.io/) is a 12-week remote internship alternative for aspiring software engineers. We help fellows level up and learn the skills they need to enter industry by teaching them to contribute to Open Source Software under the guidance of full-time Open Source Mentors (like you!).

Mentors help teams of fellows scope out and make contributions to major Open Source projects like React, Homebrew, and Flask. In addition, you'll help with things like code reviews, pair programming, and career advice.

This is a 3-month, full-time contract role with the opportunity for full-time employment if things go well. There is also an opportunity for an ongoing flexible schedule where you can work for periods of 3 months and then take a month of vacation between batches.

Currently looking for a mentor who can help with contributions to React, React Native, Jest, AWS Amplify, and Docusaurus. Additional openings for Python and JavaScript come up periodically as well!

