Hacker News | viksit's comments

this is a great question. what are the main use cases that you have for this? i’ve been working on a library for something similar and exposing it via an mcp interface. would love to pick your brain on this (@viksit on twitter)


Yes! thanks for the memory haha.


you definitely succeeded in your humorous endeavors ;) i snorted haha


In my fourth post in the series, I tackle how to make multi-step agent workflows learn behavior from data. Most agents today rely on vibes: prompt tuning, hand-written templates, and hope(!). This post is about replacing that with metrics and optimization.

Each branch in the workflow learns how to behave, not just where to route. I show how to set up a reward, plug in an optimizer, and treat agent behavior as something you can tune like a model.
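For a concrete flavor, here's a minimal sketch (not the code from the post) of what "set up a reward, plug in an optimizer" can look like in a DSPy-style setup; the metric, signature, and training example below are made-up placeholders:

    import dspy

    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any configured LM

    def reward(example, prediction, trace=None):
        # the "reward" is just a task metric the optimizer maximizes,
        # e.g. did this branch pick the expected action for the ticket?
        return float(prediction.action == example.expected_action)

    trainset = [  # toy data; real examples would come from logged workflow runs
        dspy.Example(ticket="Where is my refund?",
                     expected_action="escalate_to_billing").with_inputs("ticket"),
    ]

    branch = dspy.ChainOfThought("ticket -> action")       # one branch of the workflow
    optimizer = dspy.MIPROv2(metric=reward, auto="light")  # any DSPy optimizer slots in here
    tuned_branch = optimizer.compile(branch, trainset=trainset)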


mostly aligned on this. couple of thoughts:

- raw accuracy is now a "vanity" metric. so the benchmarks need to get more sophisticated, and i think they're going to have to be far more task-specific than hotpot or hover. they've become like the mnist of multi-hop.

- in my use of MIPROv2 and SIMBA, I see a fair amount of improvement on multi-hop tasks (published some of these on hn before). I'm going to try GEPA and see how it performs. so I think we're at the start of what I would call "meta learning": tuning across a huge search surface rather than tweaking one prompt. hyperparam search for higher-dimensional spaces. (rough sketch of what that swap looks like below)

- tokens burned should be a reported result
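Rough sketch of swapping optimizers over the same program and metric; exact constructor arguments vary by DSPy version, so treat the kwargs as placeholders:

    import dspy

    def multihop_metric(example, prediction, trace=None):
        return float(example.answer.lower() in prediction.answer.lower())

    program = dspy.ChainOfThought("question -> answer")
    trainset = [  # toy example; real multi-hop sets would be task-specific
        dspy.Example(question="Which city hosts the university where Turing studied?",
                     answer="Cambridge").with_inputs("question"),
    ]

    # same program + metric, different search strategy over the prompt/program space
    optimizers = {
        "miprov2": dspy.MIPROv2(metric=multihop_metric, auto="light"),
        "simba": dspy.SIMBA(metric=multihop_metric),
        # GEPA slots in the same way, but wants a reflection LM in recent versions
    }
    tuned = {name: opt.compile(program, trainset=trainset) for name, opt in optimizers.items()}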


vJEPA models: lecun's approach to world models, which has been derided by a lot of naysayers. (personally I think that's the way to go)


they’ve already written one! see omar’s x account for details!


Here's a link to a repost Omar made referencing it: https://x.com/DSPyOSS/status/1950733300420510006


Based on the feedback from my last HN post on differentiable routing, I ran a follow-up benchmark: local RNN vs GPT-4o for tool selection in LLM workflows.

Same accuracy, 40% lower cost. Appreciate all the suggestions; this post builds on them.
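For anyone curious what "local RNN for tool selection" means concretely, here's a toy sketch of the shape of the model (not the benchmark code; the tool names and sizes are made up):

    import torch
    import torch.nn as nn

    TOOLS = ["search_docs", "create_ticket", "check_order_status", "fallback_to_llm"]  # hypothetical

    class ToolSelector(nn.Module):
        """Tiny GRU classifier: tokenized query in, tool id out."""
        def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_tools=len(TOOLS)):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, num_tools)

        def forward(self, token_ids):              # token_ids: (batch, seq_len)
            x = self.embed(token_ids)
            _, h = self.rnn(x)                     # final hidden state: (1, batch, hidden_dim)
            return self.head(h.squeeze(0))         # logits over tools

    model = ToolSelector()
    logits = model(torch.randint(0, 10_000, (1, 12)))   # stand-in for a tokenized query
    print(TOOLS[logits.argmax(dim=-1).item()])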


Following up on my last post about optimizing tool selection with differentiable programming, I’ve been thinking about how to extend those ideas to full agent workflows. This post shares some early experiments using DSPy to optimize routing and structure end-to-end for a sample customer service agent workflow. Feedback welcome!
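Rough shape of the experiment, in case it helps (not the post's exact code; the signature fields and branch names are illustrative assumptions):

    import dspy

    class Route(dspy.Signature):
        """Pick which branch should handle a customer message."""
        message: str = dspy.InputField()
        branch: str = dspy.OutputField(desc="one of: billing, shipping, other")

    class SupportAgent(dspy.Module):
        def __init__(self):
            super().__init__()
            self.route = dspy.Predict(Route)                       # learnable routing step
            self.billing = dspy.ChainOfThought("message -> reply")
            self.shipping = dspy.ChainOfThought("message -> reply")
            self.other = dspy.ChainOfThought("message -> reply")

        def forward(self, message):
            choice = self.route(message=message).branch.strip().lower()
            handler = {"billing": self.billing, "shipping": self.shipping}.get(choice, self.other)
            return handler(message=message)

    # the whole module (router + branches) can then be compiled with a DSPy optimizer
    # against an end-to-end metric, which is what the post experiments with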


would you have a link?


