AI-generated bug reports are becoming a big waste of time for developers

Alfonso Maruccia

A hot potato: Generative AI services can be used to generate snippets of generic text, uncanny images, or even code scripts in various programming languages. But when LLMs are employed to fake actual bug reports, the result can be largely detrimental to a project's development.

Daniel Stenberg, the original author and lead developer of the curl software, recently wrote about the problematic effects LLMs and AI models are having on the project. The Swedish coder noted that the team has a bug bounty program offering real money as rewards for hackers who discover security issues, but superficial reports created through AI services are becoming a real problem.

Curl's bug bounty has so far paid $70,000 in rewards, Stenberg said. The programmer received 415 vulnerability reports, with 77 of them being "informative" and 64 that were ultimately confirmed as security issues. A significant number of the reported issues (66%) were neither a security problem nor a normal bug.

Generative AI models are increasingly used (or proposed) as a way to automate complex programming tasks, but LLMs are well known for their tendency to "hallucinate" and provide nonsensical results while sounding absolutely confident about their output. In Stenberg's own words, AI-based reports look better and appear to have a point, but "better crap" is still crap.

The better the crap, Stenberg said, the more time and energy the programmers have to spend on the report before closing it. AI-generated crap doesn't help the project at all, as it takes away developer time and energy from something productive. The curl team needs to properly investigate every report, while AI models can drastically reduce the time needed to write a report on a bug that could ultimately be just thin air.

Stenberg quoted two bogus reports that were likely created by AI. The first report claimed to describe an actual security vulnerability (CVE-2023-38545) before it was even disclosed, but it reeked of "typical AI style hallucinations." Facts and details from old security issues were mixed and matched to make up something new that had "no connection" with reality, Stenberg said.

Another recently submitted report on HackerOne described a potential Buffer Overflow flaw in WebSocket Handling. Stenberg tried to post some questions about the report, but he ultimately concluded that the flaw wasn't real and that he was likely talking to an AI model rather than a real human being.

The programmer said that AI can do "a lot of good things," but it can also be exploited for the wrong things. LLMs could theoretically be trained to report security problems in productive ways, but we still have to find "good examples" of this. As AI-generated reports become more common over time, Stenberg said, the team will have to learn to spot "generated-by-AI" signals better and quickly dismiss those bogus submissions.


 
Nothing new here, AI is just trash most of the time, all those so called help bots companies use that simply tell customers the same they have tried before lol, and they hide consumer service numbers/email too

its annoying
 
I still think it could be very helpful, with proper training and thousands of human work to help it become proficient.
Someone has to work on it like it is an app or service.
Will see.
 
Of course, as the AI becomes more "intelligent" it will write glowing reports about itself no matter how deep the crap gets .... the first rule of self-preservation ...... LOL
 