
DORA released a report last year: https://dora.dev/research/2025/dora-report/

The gains are a ~17% increase in individual effectiveness, but ~9% extra instability.

In my experience using AI-assisted coding for a bit longer than 2 years, the benefit is close to what DORA reported (maybe a bit higher, around 25%). Nothing close to an average of 2x, 5x, 10x. There's a 10x in some very specific tasks, but also a negative factor in others, as seemingly trivial but high-impact bugs get to production that would normally have been caught very early in development or in code reviews.

Obviously it depends on what one does. Using AI to build a UI to share cat pictures has a different risk appetite than building a payments backend.


The full report can be found here: https://services.google.com/fh/files/misc/2025_state_of_ai_a...

That 17% increase is in self-reported effectiveness. The software delivery throughput only went up 3%, at a cost of that 9% extra instability. So you can build 3% faster with 9% more bugs, if I'm reading those numbers right.


Those aren't even percentage increases, but standardized effect sizes. So if you take an individual survey respondent and all you know is that they self-reported higher AI usage, you can guess their answers to the self-reported individual effectiveness slightly more accurately, but most of the variation will be due to unrelated factors.

The question that people are actually interested in, "After adopting this specific AI tool, will there be a noticeable impact on measures we care about?" is not addressed by this model at all, since they do not compare individual respondents' answers over time, nor is there any attempt to establish causality.
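To make the unit concrete: a standardized effect size expresses a difference in units of standard deviation, not as a percentage. A minimal illustration using Cohen's d (the numbers below are invented, and DORA's model is actually a regression with standardized coefficients, not this exact statistic):

```java
// Illustrative only: Cohen's d for two invented samples of a 1-10
// "effectiveness" survey score, split by self-reported AI usage.
public class EffectSize {
    static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    static double variance(double[] xs) {
        double m = mean(xs), s = 0;
        for (double x : xs) s += (x - m) * (x - m);
        return s / (xs.length - 1); // sample variance
    }

    // Difference in means divided by the pooled standard deviation.
    static double cohensD(double[] a, double[] b) {
        double pooled = Math.sqrt(((a.length - 1) * variance(a) + (b.length - 1) * variance(b))
                / (a.length + b.length - 2));
        return (mean(a) - mean(b)) / pooled;
    }

    public static void main(String[] args) {
        double[] highAiUse = {7, 8, 6, 7, 9};
        double[] lowAiUse  = {6, 7, 5, 6, 8};
        System.out.printf("d = %.2f%n", cohensD(highAiUse, lowAiUse));
    }
}
```

A d of 0.17 means the groups' score distributions overlap almost entirely, which is why knowing someone's AI usage only slightly improves your guess about their self-reported effectiveness.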


And a 3% difference is at the "the new coffee in the office is kinda shit and developers are annoyed" level of difference.

I think for myself, it's close to 25% if I only take my role as a dev. If I take my 'senior' role it's less, because I spend way more time in reviews or in prod incident meetings.

Three months ago, with Opus 4.5, I would have said that the productivity improvement was ~10% for my whole team.

I now have to contradict myself: juniors, and even experienced new hires with little domain knowledge, don't improve as fast as they used to. I still have to write new tasks/issues like I would for someone we just hired, even after 8 months. I still catch the same issues in reviews that we caught three months ago.

Basically, experience doesn't improve productivity as fast as it used to. On easy stuff it doesn't matter (like frontend changes, where the productivity gains are extremely high, probably 10x), and on specific subjects like red teaming, where a quantity of small tools is better than an integrated solution, I think it can be better than that.

But I'm in a netsec tooling team, we do hard automation work to solve hard engineering issues, and that is starting to be a problem if juniors don't level up fast.


For me it is a 2x or 5x or something, but "high impact bugs get to production that would normally have been caught very early in development or in code reviews" is what takes it back down to a 1.5x.

There are genuinely weeks where I go 5x though, and others where I go 0.5x.


It's not so valuable to assess the current state - what the impact of using AI is today. From personal experience, it feels like the overall impact on productivity was not positive a couple of years ago, might be positive now, and will be positive in a couple of years. That means by assessing the current state of the impact on the product, we're just finding where we are on that change curve. If we accept that the trend is happening, then we know at some point it will pass (or has passed) the threshold where our companies will fall behind if they're not using it. We also know it takes a while to get up to speed and make sure we're making the most of it, so the earlier we start the better. The counter-argument is that we could wait for a later wave to jump on, but that's risky, and the only potential reward is a small percentage of short-term productivity gain.

So you're saying instead of assessing the current capabilities of the technology, we should imagine its future capabilities, "accept" that they will surely be achieved and then assess those?

I would assess the directionality and rate of the trend. If it's getting better fast and we don't see a limit to that trend then it will eventually pass whatever threshold we set for adoption.

Of course, if stability is part of what you're supposed to be delivering, then you can't be 17% more effective.

This was easy because it's a Chinese company.

The largest companies in this space that do similar things (Oxylabs, Bright Data, etc.) have similar tactics but are based in different locations.


Bright Data = Israeli, I think. Oxylabs = Lithuanian, a child of NordVPN.


This is not an issue for me due to my workflow.

I have a script for each of my projects that I run when I open a new terminal window (Alacritty). The scripts set up tmux with 3-8 terminals; each terminal launches a component or a utility, or just sits in a folder from which I later run commands.

Having said that, I use only a few zsh plugins, and have a theme configured to not run commands that add extra latency.


A chunk of the internet is down for me. So far Perplexity, AWS (VPN) and vercel.

Seems AWS is limping: https://news.ycombinator.com/item?id=45640772


Something I find amusing in the Java community is that a good number of senior developers, with anywhere from 5-20 years of experience, who do 'tdd' have never heard of the concept of test doubles and religiously think that a class must be tested in complete isolation, mocking everything else.

The saddest one I saw was a team trying to do functional programming (with Spring). The tech lead was a bit flummoxed when I asked why mocks are not used in functional languages and continued to think that 'mocking functions' is the correct way to do TDD.


Tests knowing about implementation details and testing those implementation details (which is the case 99.999% of the time if you use mocks) is more common than not, even though the main value of automated testing is being able to change those very implementation details, which you now cannot do.

A whole bunch of work spent for no benefit or negative benefit is pretty common.


I use Java and Spring extensively. If I am in a lead role and have say over the code base I won't allow mocking frameworks to be used in tests. If you want a good way to shine a light on poorly structured code, disallow mocking frameworks.

Java added functional interfaces in v8, making it quite easy to code to an interface, and yet in Spring everyone's go-to is just to slap a @Component on a concrete class.
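To illustrate the interface-over-mocks point, here is a minimal sketch (all names are hypothetical): the collaborator is an interface, so tests can use a hand-written fake instead of a mocking framework.

```java
import java.util.ArrayList;
import java.util.List;

// The service depends on an interface, not a concrete class.
interface PaymentGateway {
    boolean charge(String account, long cents);
}

class CheckoutService {
    private final PaymentGateway gateway;

    CheckoutService(PaymentGateway gateway) {
        this.gateway = gateway;
    }

    String checkout(String account, long cents) {
        return gateway.charge(account, cents) ? "PAID" : "DECLINED";
    }
}

// A hand-written test double: records calls and returns a canned result.
class FakeGateway implements PaymentGateway {
    final List<String> charged = new ArrayList<>();
    boolean result = true;

    @Override
    public boolean charge(String account, long cents) {
        charged.add(account + ":" + cents);
        return result;
    }
}

public class FakeGatewayDemo {
    public static void main(String[] args) {
        FakeGateway fake = new FakeGateway();
        CheckoutService service = new CheckoutService(fake);
        System.out.println(service.checkout("acct-1", 499)); // prints PAID
        fake.result = false;
        System.out.println(service.checkout("acct-2", 100)); // prints DECLINED
    }
}
```

The fake asserts on observable behavior (what was charged, what the service returned) rather than on which internal methods were invoked, so the implementation behind the interface can change freely without breaking tests.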


A quick google shows this for Firefox (taken from a Stack Overflow thread):

> In Firefox you can completely disable beforeunload events by setting dom.disable_beforeunload to true in about:config. Extensions may be needed for other browsers.

A word of caution: I'm not 100% sure, but I wonder if some web collaboration tools might use this to ensure data has been synced with a server.


It surely has a lot of legitimate uses, even if it is primarily abused. I’ve used it before to do various cleanup tasks, and to get a more timely “user disconnected” event rather than waiting for some timeout to occur server side.

Having said that, it should never be the end of the world to disable; sites should never suffer data loss due to this event missing, because if they do, they already have a data loss problem when, for instance, the power goes out.


I am not sure if this is implemented using this functionality, but when I am in a console session on Proxmox and hit Ctrl+W out of muscle memory, it's nice to have a warning telling me the tab will be closed. Same with all kinds of remote access tools. That's one legit use case I can think of.


[posted this in another thread, but maybe the author can clarify this]

I wonder how this works when one runs tests in parallel (something I always enable in any project). By this I mean configuring JUnit to run as many tests as there are cores available, to speed up the run of the whole test suite.

I took a peek at the code, and I have the impression it doesn't work that well, as it hooks into when a thread is started. Also, I'm not sure if this works with fibers.
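For context, the parallel mode described above is typically switched on in JUnit 5 via a `junit-platform.properties` file on the test classpath; a minimal configuration looks like this (the `dynamic` strategy sizes the worker pool to the available cores):

```properties
# src/test/resources/junit-platform.properties
junit.jupiter.execution.parallel.enabled = true
junit.jupiter.execution.parallel.mode.default = concurrent
# "dynamic" derives the parallelism from the number of available processors.
junit.jupiter.execution.parallel.config.strategy = dynamic
```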


Yes, Fray controls all application threads, so it runs one test per JVM. But you can always use multiple JVMs to run multiple tests[1].

Fray currently does not support virtual threads. We do have an open issue tracking it, but it is low priority.

[1]: https://docs.gradle.org/current/userguide/java_testing.html#...
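For anyone following link [1], the multi-JVM route is Gradle's test forking; a minimal sketch in the Groovy DSL (the fork count here is just an example):

```groovy
// build.gradle — fan the test suite out across several forked test JVMs
test {
    maxParallelForks = Runtime.runtime.availableProcessors()
}
```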


A side comment: I have found that configuring a few live templates in IntelliJ helps me write a lot of repetitive code with just a handful of keystrokes, regardless of the language.

Structural refactoring is another amazing feature that is worth knowing.
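For anyone curious, live templates are stored as exportable XML; a rough sketch of one (the `logd` abbreviation and the template text are made-up examples, not a built-in template) looks like this:

```xml
<templateSet group="user">
  <!-- Typing "logd" then Tab expands to a logger field for the enclosing class. -->
  <template name="logd"
            value="private static final Logger log = LoggerFactory.getLogger($CLASS$.class);"
            description="Logger field" toReformat="true" toShortenFQNames="true">
    <variable name="CLASS" expression="className()" defaultValue="" alwaysStopAt="false"/>
    <context>
      <option name="JAVA_DECLARATION" value="true"/>
    </context>
  </template>
</templateSet>
```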


I've also got some mileage from live templates for repetitive code. However, at some point I built[0] an IntelliJ IDEA plugin to help me generate setters and field assignments, which I felt live templates weren't a good solution for (in my case). I don't know if JavaFactory solves this kind of problem; keen to try it out.

[0]: https://github.com/nndi-oss/intellij-gensett


I think IntelliJ is a great tool on its own. Recently, they even added a feature that auto-injects dependencies when you declare them as private final — super convenient.

I can’t help but wonder if the folks at JetBrains are starting to feel a bit of pressure from tools like Cursor or Windsurf.


Would love to know whether you found it to be a real key or a fake one.

