Hacker News | viksit's comments

this is a great question. what are the main use cases that you have for this? i’ve been working on a library for something similar and exposing it via an mcp interface. would love to pick your brain on this (@viksit on twitter)


Yes! thanks for the memory haha.


you definitely succeeded in your humorous endeavors ;) i snorted haha


In my fourth post in the series, I tackle how to make multi-step agent workflows learn behavior from data. Most agents today rely on vibes: prompt tuning, hand-written templates, and hope(!). This post is about replacing that with metrics and optimization.

Each branch in the workflow learns how to behave, not just where to route. I show how to set up a reward, plug in an optimizer, and treat agent behavior as something you can tune like a model.
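For a concrete flavor, here's a minimal sketch (not the code from the post) of what "set up a reward, plug in an optimizer" can look like in a DSPy-style setup; the metric, signature, and training example below are made-up placeholders:

    import dspy

    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any configured LM

    def reward(example, prediction, trace=None):
        # the "reward" is just a task metric the optimizer maximizes,
        # e.g. did this branch pick the expected action for the ticket?
        return float(prediction.action == example.expected_action)

    trainset = [  # toy data; real examples would come from logged workflow runs
        dspy.Example(ticket="Where is my refund?",
                     expected_action="escalate_to_billing").with_inputs("ticket"),
    ]

    branch = dspy.ChainOfThought("ticket -> action")       # one branch of the workflow
    optimizer = dspy.MIPROv2(metric=reward, auto="light")  # any DSPy optimizer slots in here
    tuned_branch = optimizer.compile(branch, trainset=trainset)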


mostly aligned on this. couple of thoughts:

- raw accuracy is now a "vanity" metric. so the benchmarks need to get more sophisticated, and i think they're going to have to be far more task-specific than hotpot or hover. they've become like the mnist of multi-hop.

- in my use of MIPROv2 and SIMBA, I see a fair amount of improvement on multi-hop tasks (published some of these on hn before). I'm going to try GEPA and see how it performs. so I think we're at the start of what I would call "meta learning": tuning across a huge search surface rather than tweaking one prompt. hyperparam search for higher-dimensional spaces. (rough sketch of what that swap looks like below)

- tokens burned should be a reported result
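Rough sketch of swapping optimizers over the same program and metric; exact constructor arguments vary by DSPy version, so treat the kwargs as placeholders:

    import dspy

    def multihop_metric(example, prediction, trace=None):
        return float(example.answer.lower() in prediction.answer.lower())

    program = dspy.ChainOfThought("question -> answer")
    trainset = [  # toy example; real multi-hop sets would be task-specific
        dspy.Example(question="Which city hosts the university where Turing studied?",
                     answer="Cambridge").with_inputs("question"),
    ]

    # same program + metric, different search strategy over the prompt/program space
    optimizers = {
        "miprov2": dspy.MIPROv2(metric=multihop_metric, auto="light"),
        "simba": dspy.SIMBA(metric=multihop_metric),
        # GEPA slots in the same way, but wants a reflection LM in recent versions
    }
    tuned = {name: opt.compile(program, trainset=trainset) for name, opt in optimizers.items()}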


vJEPA models: lecun's approach to world models, which has been derided by a lot of naysayers. (personally I think that's the way to go)


they’ve already written one! see omar’s x account for details!


Here's a link to a repost Omar made referencing it: https://x.com/DSPyOSS/status/1950733300420510006


Based on the feedback from my last HN post on differentiable routing, I ran a follow-up benchmark: local RNN vs GPT-4o for tool selection in LLM workflows.

Same accuracy, 40% lower cost. Appreciate all the suggestions; this post builds on them.
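For anyone curious what "local RNN for tool selection" means concretely, here's a toy sketch of the shape of the model (not the benchmark code; the tool names and sizes are made up):

    import torch
    import torch.nn as nn

    TOOLS = ["search_docs", "create_ticket", "check_order_status", "fallback_to_llm"]  # hypothetical

    class ToolSelector(nn.Module):
        """Tiny GRU classifier: tokenized query in, tool id out."""
        def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_tools=len(TOOLS)):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, num_tools)

        def forward(self, token_ids):              # token_ids: (batch, seq_len)
            x = self.embed(token_ids)
            _, h = self.rnn(x)                     # final hidden state: (1, batch, hidden_dim)
            return self.head(h.squeeze(0))         # logits over tools

    model = ToolSelector()
    logits = model(torch.randint(0, 10_000, (1, 12)))   # stand-in for a tokenized query
    print(TOOLS[logits.argmax(dim=-1).item()])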


Following up on my last post about optimizing tool selection with differentiable programming, I’ve been thinking about how to extend those ideas to full agent workflows. This post shares some early experiments using DSPy to optimize routing and structure end-to-end for a sample customer service agent workflow. Feedback welcome!
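Rough shape of the experiment, in case it helps (not the post's exact code; the signature fields and branch names are illustrative assumptions):

    import dspy

    class Route(dspy.Signature):
        """Pick which branch should handle a customer message."""
        message: str = dspy.InputField()
        branch: str = dspy.OutputField(desc="one of: billing, shipping, other")

    class SupportAgent(dspy.Module):
        def __init__(self):
            super().__init__()
            self.route = dspy.Predict(Route)                       # learnable routing step
            self.billing = dspy.ChainOfThought("message -> reply")
            self.shipping = dspy.ChainOfThought("message -> reply")
            self.other = dspy.ChainOfThought("message -> reply")

        def forward(self, message):
            choice = self.route(message=message).branch.strip().lower()
            handler = {"billing": self.billing, "shipping": self.shipping}.get(choice, self.other)
            return handler(message=message)

    # the whole module (router + branches) can then be compiled with a DSPy optimizer
    # against an end-to-end metric, which is what the post experiments with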


would you have a link?


