Why artificial intelligence is a powerful tool for developing software, and why it still cannot replace human review in security, architecture and business logic.
Copilot, Cursor, Claude, ChatGPT, Gemini… By 2026, anyone with an idea and a prompt can generate working code in minutes. And it works. Sometimes it’s even elegant.
The problem isn’t that the AI writes code. The problem is that the code looks correct. It compiles. It passes basic tests. And it hides vulnerabilities that no one has reviewed.
At Undercoverlab we use AI every day. It speeds us up. It allows us to explore solutions faster. But every line it generates goes through human review. Here we explain why.
The data: what the studies say
We’re not talking about opinions. There are recent studies with compelling data on the security of AI-generated code:
| Source | Key data | Detail |
|---|---|---|
| Veracode (2025) | 45% of AI code has security flaws | 100+ tested models, 4 languages |
| CodeRabbit (2025) | 1.57x more security issues | 2.74x more XSS vulnerabilities |
| Opsera (2026) | 15-18% more vulnerabilities | 250,000+ developers analyzed |
| OWASP LLM Top 10 | Prompt injection, supply chain, data poisoning | Global reference framework |
| Georgetown CSET | 40% of Copilot programs vulnerable | CWE Top 25 (MITRE) |
The most worrying fact: larger, more modern models do not generate significantly safer code than older ones. The problem is not solved with more parameters or more training. It is structural.
4 real risks we see every week
These are not theoretical risks. They are problems we find when auditing code that clients bring to us after working with unsupervised AI:
1. Outdated dependencies with known vulnerabilities
The AI suggests packages and libraries based on its training data, which can be months or years old. This means it often recommends versions with already documented vulnerabilities (CVEs). The code works, but you're leaving the door wide open.
Real-world example: A client brought us an AI-generated app whose ML stack included a version of Ray (the framework commonly paired with PyTorch) affected by the known ShadowRay exploit. The app worked perfectly. It was also perfectly attackable.
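The core idea behind dependency auditors like pip-audit, Snyk or Dependabot can be sketched in a few lines: compare each installed version against the first patched release in an advisory feed. The advisory table and package name below are invented for illustration; real tools pull this data from CVE databases.

```python
def parse(version):
    """Turn a version string like '2.0.1' into a comparable tuple (2, 0, 1)."""
    return tuple(int(p) for p in version.split("."))

# Hypothetical advisory data: package -> first patched version.
# Anything older than the patched version is considered vulnerable.
ADVISORIES = {
    "examplelib": "2.0.2",
}

def vulnerable(installed):
    """Return the names of installed packages older than the first fix."""
    return [
        name
        for name, version in installed.items()
        if name in ADVISORIES and parse(version) < parse(ADVISORIES[name])
    ]

print(vulnerable({"examplelib": "2.0.1", "otherlib": "1.0.0"}))
# -> ['examplelib']
```

In practice you would never maintain this table by hand; the point is that the check is cheap and mechanical, which is exactly why it belongs in every pipeline.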
2. Unsanitized SQL and XSS injections
According to Veracode’s report, 86% of AI-generated code samples failed to defend against cross-site scripting (XSS) and 88% were vulnerable to log injection. AI generates database queries that work, but it does not protect them against malicious input.
What this means: Anyone with basic knowledge can manipulate a form on your website to access data they shouldn’t see, modify, or delete.
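The difference between an injectable query and a safe one is often a single line. This sketch uses Python's standard sqlite3 module with a toy table; the pattern AI tools frequently produce is shown commented out, and the parameterized version is what a reviewer should insist on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

user_input = "alice' OR '1'='1"  # a classic injection payload

# Vulnerable pattern AI often produces: string interpolation.
# query = f"SELECT email FROM users WHERE name = '{user_input}'"
# The closing quote in the payload breaks out of the string literal
# and the OR clause matches every row in the table.

# Safe pattern: a parameterized query; the driver escapes the value.
rows = conn.execute(
    "SELECT email FROM users WHERE name = ?", (user_input,)
).fetchall()
print(rows)  # -> [] : the payload is treated as a literal name, matches nothing
```

The same principle applies to XSS: escape output for the context it lands in (HTML, attributes, JavaScript) instead of trusting that the input is clean.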
3. Hardcoded API tokens and keys within the code
AI tends to put credentials directly into the code for simplicity. If that code is uploaded to a repository (GitHub, GitLab), your keys are publicly exposed. Automated bots constantly scan GitHub looking for exactly that.
Consequence: Unauthorized access to payment services, databases, cloud services. We have seen cases where an exposed token has cost thousands of euros in fraudulent charges to AWS.
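The fix is straightforward: credentials live in the environment (populated at deploy time by a secret manager), never in the source. A minimal sketch, where `PAYMENT_API_KEY` and `load_secret` are illustrative names:

```python
import os

def load_secret(name):
    """Fetch a credential from the environment; fail fast if it is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(
            f"{name} is not set. Configure it in the deployment environment "
            "or a secret manager (Vault, AWS Secrets Manager), never in code."
        )
    return value

# Simulating the deployment environment for the example:
os.environ["PAYMENT_API_KEY"] = "injected-at-deploy-time"
print(load_secret("PAYMENT_API_KEY"))
```

Failing fast when the variable is absent matters: a silent `None` credential tends to surface much later as a confusing authentication error in production.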
4. Incorrect business logic that “works”
This is perhaps the most subtle risk. The code compiles, passes tests, the app works… but it doesn’t do what the business needs. The AI doesn’t understand the context of your business. It doesn’t know that a price calculation should include the Andorran IGI, or that an approval workflow needs double validation for amounts over €10,000.
The most expensive code is the one that solves the wrong problem.
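To make this concrete, here is the approval rule mentioned above expressed as code. The threshold, function name and approver names are illustrative; the point is that this rule comes from the business, and an AI-generated version can "work" for every tested case while silently missing it.

```python
# Hypothetical business rule: payments above EUR 10,000 require a
# second, distinct approver. This threshold comes from the business,
# not from anything a model could infer from the prompt.
APPROVAL_THRESHOLD = 10_000  # EUR

def approval_ok(amount_eur, approvers):
    """One approver suffices below the threshold; two distinct ones above it."""
    required = 2 if amount_eur > APPROVAL_THRESHOLD else 1
    return len(set(approvers)) >= required

print(approval_ok(9_500, ["maria"]))            # True
print(approval_ok(12_000, ["maria"]))           # False: needs a second signer
print(approval_ok(12_000, ["maria", "jordi"]))  # True
```

Note the `set()`: without it, the same manager listed twice would pass the double-validation check, which is exactly the kind of "works but wrong" bug only a human who knows the rule will catch.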
Why does AI make these mistakes?
It’s not that AI is “bad.” It’s just that it’s trained to predict the most likely code, not the safest. There’s a fundamental difference:
- AI’s goal: to generate code that compiles and does what you ask it to.
- A developer’s goal: to generate code that works, is secure, is maintainable, scales well, and solves the right problem.
AI optimizes for the first condition. A developer must optimize for all five.
Furthermore, the models are trained with public code from GitHub and StackOverflow. Much of this code does not follow good security practices. AI reproduces the same mistakes that humans already make, but now on an industrial scale.
How we do it at Undercoverlab
We’re not anti-AI. We use it every day. But we use it as a tool within a process, not as the entire process. This is how our workflow works:
- Definition of requirements with the client. Before touching code, we understand the business. What the product should do, for whom, with what restrictions.
- Architecture and design. We decide the structure, technology and integrations. The AI does not make architecture decisions.
- AI-assisted development. We use AI to accelerate the generation of codebase, tests, and documentation. But each output is reviewed.
- Security review. Dependency analysis, input sanitization, secret management, business logic validation.
- Testing and QA. Automated tests + manual review of edge cases and boundary conditions.
- Controlled deployment. CI/CD with automatic security checks before each deployment.
AI participates in step 3. A developer participates in all six.
What you should do if you use AI for your project
If you are developing a digital product and using AI (directly or through your team), here is a minimum security checklist:
- Audit dependencies — Use tools like Snyk, Dependabot, or npm audit to detect packages with known vulnerabilities.
- Sanitize all input — Any user input (forms, URLs, APIs) must be validated and escaped.
- Never hardcode secrets — Use environment variables and secret management services (Vault, AWS Secrets Manager, etc.).
- Review business logic — Don’t trust AI to understand your business context. Manually validate that calculations, workflows, and rules are correct.
- Implement mandatory code review — No code should reach production without being reviewed by a human.
- Put security checks in the pipeline — Integrate SAST/DAST into your CI/CD to detect vulnerabilities before every deployment.
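To show how mechanical some of these checks are, here is a toy rule in the spirit of SAST tools like Bandit or gitleaks: scan source text for patterns that look like hardcoded credentials. Real scanners apply hundreds of rules; this single regex is only an illustration of the technique.

```python
import re

# Toy detection rule: a credential-like name assigned a quoted literal
# of 8+ characters. Real tools use far more precise rule sets.
SECRET_PATTERN = re.compile(
    r"""(api[_-]?key|secret|token|password)\s*=\s*['"][^'"]{8,}['"]""",
    re.IGNORECASE,
)

def find_secrets(source):
    """Return the lines of `source` that match the toy credential rule."""
    return [
        line.strip()
        for line in source.splitlines()
        if SECRET_PATTERN.search(line)
    ]

sample = 'API_KEY = "sk-live-abc123xyz"\ntimeout = 30\n'
print(find_secrets(sample))  # -> ['API_KEY = "sk-live-abc123xyz"']
```

Wired into CI as a pre-merge gate, a check like this turns "never hardcode secrets" from a guideline into something the pipeline enforces automatically.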
Conclusion
AI is not the problem. The problem is trusting it without supervision.
A developer is not just someone who writes code. A developer is someone who understands why that code exists, what happens if it fails, and how to protect it. No language model, no matter how advanced, will do that for you.
Use AI. Take advantage of it. But put a human in the middle.
────────────────────────────────────────
Do you have a digital project and want to make sure the code is correct?
At Undercoverlab we offer a free 20-minute consultation where we review your case and guide you without obligation.
Write to us: hello@undercoverlab.com
────────────────────────────────────────
Sources and references
- Veracode GenAI Code Security Report (2025) — 45% of AI-generated code contains security flaws
- CodeRabbit AI vs Human Code Report (December 2025) — 1.57x more security issues in AI code
- Opsera AI Coding Impact Benchmark Report (2026) — 15-18% more vulnerabilities in AI-generated code
- OWASP Top 10 for LLM Applications (2025) — Framework for security risks in AI applications
- Georgetown CSET (2024) — 40% of Copilot-generated programs with CWE Top 25 vulnerabilities