Can AI Write Secure Code? Evaluating AI-Powered Secure Development

Written By:
Founder & CTO
June 25, 2025

Artificial Intelligence has firmly embedded itself into the modern software development lifecycle. From autocompleting functions to building out entire microservices, the breadth of AI capabilities has expanded significantly. As developers increasingly rely on AI-based tools for scaffolding applications and automating tedious workflows, a pressing concern emerges: can AI write secure code?

This blog aims to explore this question in-depth, examining the current state of AI-powered secure development, its underlying limitations, the technical requirements of secure software engineering, and how AI fits into (or disrupts) that narrative.

1. The Rise of AI in Software Development

AI models, especially transformer-based large language models (LLMs), have drastically changed how developers approach day-to-day programming. Tools like GitHub Copilot, Amazon CodeWhisperer, GoCodeo, and ChatGPT's coding assistant are now commonplace in the developer's toolkit. These agents:

  • Generate application logic based on natural language prompts

  • Suggest code completions in real-time as developers type

  • Offer boilerplate code snippets for common APIs or libraries

  • Refactor legacy code with modern paradigms

  • Scaffold backend services, frontend components, and integration tests

However, most of these capabilities are oriented around functional correctness and syntactic fluency. These tools are trained on massive datasets scraped from public repositories; that data is often inconsistent in quality, sometimes outdated, and rarely annotated with security metadata.

So while AI can write code that "runs," the more critical question is whether it can write code that defends itself against known and emerging threat models. This is where secure development diverges from general-purpose coding.

2. What Is Secure Code? A Developer-Centric View

At its core, secure code is designed not just to perform a task, but to anticipate malicious or unintended misuse, resist it, and fail gracefully when it occurs.

From a developer's perspective, secure code should:

  • Rigorously validate all inputs and enforce type and boundary constraints (a minimal sketch appears at the end of this section)

  • Implement robust authentication and authorization flows, adhering to the principle of least privilege

  • Avoid reliance on security through obscurity

  • Safely manage cryptographic operations with respect to key storage, nonce reuse, and cipher strength

  • Prevent leakage of sensitive information such as PII, API keys, and internal schema structure

  • Be resilient against common classes of vulnerabilities like SQL injection, XSS, CSRF, IDOR, deserialization flaws, and memory corruption bugs

Furthermore, secure code must be auditable. That means maintaining traceability, explaining intent in comments, keeping control flow predictable, and minimizing side effects: traits that AI-generated code often lacks.
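
To make the first of these properties concrete, here is a minimal sketch of input validation with type and boundary constraints. It assumes a TypeScript/Node.js stack and the zod schema library; any comparable validator works the same way, and the field names are illustrative only.

```typescript
// A minimal sketch of input validation with type and boundary constraints.
// Assumes the "zod" schema-validation library; field names are illustrative.
import { z } from "zod";

// Declare the expected shape, types, and bounds of untrusted input up front.
const signupSchema = z.object({
  email: z.string().email().max(254),
  username: z.string().min(3).max(32).regex(/^[A-Za-z0-9_]+$/),
  age: z.number().int().min(13).max(120),
});

export function parseSignup(input: unknown) {
  // safeParse never throws; it returns a result we can branch on explicitly.
  const result = signupSchema.safeParse(input);
  if (!result.success) {
    // Fail closed with a generic message; avoid echoing raw input back.
    throw new Error("Invalid signup payload");
  }
  return result.data; // Typed, validated object.
}
```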

3. Can AI Understand Security? Analyzing LLM Behavior

To answer whether AI can write secure code, we must first examine whether it understands what "security" even means. Current LLMs operate on statistical correlations. When you prompt an LLM with "write a secure login route," it attempts to surface token sequences that have historically been associated with similar prompts.

This pattern-matching approach means that LLMs often reproduce code that "looks" secure but lacks any deeper awareness of:

  • Dynamic threat surfaces: Is the endpoint exposed to the public internet? Is rate limiting configured?

  • Business logic flaws: Can users escalate their privileges by tampering with JWT claims? (See the sketch at the end of this section.)

  • Regulatory compliance: Does the code handle GDPR/CCPA-sensitive data appropriately?

Security is not a discrete concept like sorting or rendering HTML. It is a continuous function of context, architecture, deployment environment, and business risk. As such, LLMs can emulate patterns but cannot reason about security unless explicitly augmented with external rulesets or real-time environment data.
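
To illustrate the business-logic point, consider authorization driven by JWT claims. The sketch below assumes a Node.js/TypeScript service using the jsonwebtoken package; getUserRole is a hypothetical helper standing in for an authoritative user-store lookup.

```typescript
// A sketch of why a signed token alone does not prevent privilege escalation.
// Assumes the "jsonwebtoken" package; getUserRole is a hypothetical helper.
import jwt from "jsonwebtoken";

interface TokenPayload {
  sub: string;
  role?: string;
}

// Hypothetical data-access helper; in a real system this queries your user store.
declare function getUserRole(userId: string): Promise<string>;

export async function authorizeAdmin(token: string): Promise<boolean> {
  // verify() checks the signature and expiry; an invalid token throws.
  const payload = jwt.verify(token, process.env.JWT_SECRET!) as TokenPayload;

  // Anti-pattern: trusting a claim for authorization decisions.
  // if (payload.role === "admin") return true;

  // Safer: treat the token only as proof of identity and re-check the role
  // against the authoritative server-side record on every request.
  const role = await getUserRole(payload.sub);
  return role === "admin";
}
```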

4. AI Coding Tools and Security: Strengths and Limitations
Strengths:
  1. Code Pattern Recognition: AI tools excel at recognizing and reproducing common patterns. If there's a known secure pattern for validating JWTs or hashing passwords with bcrypt, the model will likely recommend it (a minimal sketch follows this list).

  2. Speed: For developers who are familiar with security principles, AI can speed up the writing of secure code by automating boilerplate, scaffolding, and API call setup.

  3. Linting and Static Analysis Integration: Some modern AI coding tools, like GoCodeo, incorporate built-in static analysis tooling. These features allow the agent to validate AI-generated code against predefined secure coding policies.

  4. Autofix Capabilities: When paired with static analysis or linters, AI can suggest one-click fixes for known vulnerable code blocks.
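
The password-hashing pattern mentioned in point 1 is a good example: it is so well represented in public code that models reproduce it reliably. A minimal sketch, assuming the bcrypt npm package and a conventional (not mandated) cost factor:

```typescript
// A minimal sketch of the password-hashing pattern, assuming the "bcrypt"
// npm package. A cost factor of 12 is a common choice, not a mandated one.
import bcrypt from "bcrypt";

const SALT_ROUNDS = 12;

export async function hashPassword(plain: string): Promise<string> {
  // bcrypt generates a salt and embeds it in the returned hash string.
  return bcrypt.hash(plain, SALT_ROUNDS);
}

export async function verifyPassword(plain: string, hash: string): Promise<boolean> {
  // compare() re-hashes the candidate with the stored salt and checks the result.
  return bcrypt.compare(plain, hash);
}
```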

Limitations:
  1. Lack of Architectural Context: AI often cannot see the broader system it is coding for. Without full project context, it cannot ensure that code integrates safely into its environment.

  2. Incorrect Default Behaviors: Many LLMs default to insecure patterns (e.g., interpolated SQL strings, permissive CORS configurations) because those patterns dominate public codebases (contrasted in the sketch after this list).

  3. No Awareness of Runtime Behavior: AI does not analyze runtime performance, memory usage, race conditions, or asynchronous flaws.

  4. Opaque Decision-Making: When AI suggests a piece of code, it rarely explains why, which is a problem for auditing and verification in high-assurance systems.
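
As an example of the permissive-defaults problem, here is a sketch contrasting the wide-open CORS configuration commonly reproduced from public examples with an explicit allow-list. It assumes Express and the cors middleware package; the origin is a placeholder.

```typescript
// A sketch contrasting a permissive default with an explicit allow-list,
// assuming Express and the "cors" middleware package.
import express from "express";
import cors from "cors";

const app = express();

// Insecure-leaning default often reproduced from public examples:
// app.use(cors()); // responds with Access-Control-Allow-Origin: *

// Restrictive configuration: name the origins and methods you actually trust.
app.use(
  cors({
    origin: ["https://app.example.com"], // placeholder origin
    methods: ["GET", "POST"],
    credentials: true,
  })
);
```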

5. Common Vulnerabilities: How Well Does AI Detect or Avoid Them?

Let’s evaluate how well LLMs handle security across some of the most common vulnerability classes:

  • SQL Injection: AI may suggest parameterized queries in languages like Python (via psycopg2) or Java (via PreparedStatement), but it’s not consistent. Many examples still use unsafe string interpolation (see the sketch after this list).

  • Cross-Site Scripting (XSS): AI may or may not escape user-generated content depending on the framework and rendering context. It struggles with client-side nuances like DOM-based XSS.

  • Cross-Site Request Forgery (CSRF): AI rarely includes CSRF tokens unless prompted explicitly and almost never configures CSRF middleware correctly.

  • Authentication Flaws: AI often uses outdated or insecure token patterns (e.g., symmetric JWTs with hardcoded secrets, no token rotation, poor session expiration handling).

  • Sensitive Data Exposure: AI models often suggest hardcoding secrets, neglecting to use environment variables, or storing tokens in plaintext.
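
For reference, the parameterized-query pattern looks like the following. This sketch assumes the pg (node-postgres) client and a hypothetical users table; the point is that the value travels separately from the SQL text.

```typescript
// A sketch of the parameterized-query pattern, assuming the "pg" (node-postgres)
// client and a hypothetical users table.
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from standard PG* env vars

export async function findUserByEmail(email: string) {
  // Vulnerable pattern models still emit: building SQL via string interpolation.
  // const result = await pool.query(`SELECT * FROM users WHERE email = '${email}'`);

  // Parameterized query: the driver sends the value separately from the SQL text,
  // so attacker-controlled input is never parsed as SQL.
  const result = await pool.query(
    "SELECT id, email FROM users WHERE email = $1",
    [email]
  );
  return result.rows[0] ?? null;
}
```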

6. Case Study: Secure Code Suggestions from ChatGPT, Claude, and Copilot

We conducted a controlled experiment with three AI tools: ChatGPT (GPT-4), Claude, and GitHub Copilot. Each was asked to generate a login route in Node.js using Express, complete with secure password handling, token-based authentication, and rate limiting.

  • ChatGPT (GPT-4): Suggested bcrypt for hashing, used JWTs, and recommended express-rate-limit. However, it failed to implement token rotation and didn’t separate access and refresh tokens.

  • Claude: Included middleware for input validation and rate limiting. However, the token generation used a hardcoded secret, and error messages exposed stack traces.

  • Copilot: Generated a syntactically valid login flow but included insecure patterns like logging raw passwords or storing them in memory without hashing.

The conclusion? AI can scaffold a superficially secure implementation but requires manual review and hardening to be production-ready.
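
For comparison, below is a hardened sketch of the route the three tools were asked to produce. It assumes Express with TypeScript plus the bcrypt, jsonwebtoken, and express-rate-limit packages, secrets injected through environment variables, and a hypothetical findUserByEmail lookup; it illustrates the hardening steps discussed above rather than a drop-in production implementation.

```typescript
// A hedged sketch of the login route, addressing the gaps noted above: hashed
// password comparison, separate access/refresh tokens, env-based secrets,
// rate limiting, and generic error responses.
import express from "express";
import bcrypt from "bcrypt";
import jwt from "jsonwebtoken";
import rateLimit from "express-rate-limit";

const app = express();
app.use(express.json());

// Throttle brute-force attempts against the login endpoint.
const loginLimiter = rateLimit({ windowMs: 15 * 60 * 1000, max: 10 });

// Hypothetical user-store lookup; replace with your data layer.
declare function findUserByEmail(
  email: string
): Promise<{ id: string; passwordHash: string } | null>;

app.post("/login", loginLimiter, async (req, res) => {
  const { email, password } = req.body ?? {};
  if (typeof email !== "string" || typeof password !== "string") {
    return res.status(400).json({ error: "Invalid request" });
  }

  const user = await findUserByEmail(email);
  // Compare against a dummy hash when the user is unknown to keep timing uniform.
  const hash = user?.passwordHash ?? (await bcrypt.hash("invalid", 12));
  const ok = await bcrypt.compare(password, hash);
  if (!user || !ok) {
    // Generic message: never reveal whether the email or the password was wrong.
    return res.status(401).json({ error: "Invalid credentials" });
  }

  // Short-lived access token and longer-lived refresh token, signed with
  // separate secrets pulled from the environment rather than hardcoded.
  const accessToken = jwt.sign({ sub: user.id }, process.env.ACCESS_TOKEN_SECRET!, {
    expiresIn: "15m",
  });
  const refreshToken = jwt.sign({ sub: user.id }, process.env.REFRESH_TOKEN_SECRET!, {
    expiresIn: "7d",
  });

  return res.json({ accessToken, refreshToken });
});
```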

7. AI in the SDLC: Where It Fits in a Secure Development Lifecycle

Secure development is not confined to code generation. It's a pipeline involving multiple stages:

  • Design: Threat modeling, data classification, attack surface analysis

  • Development: Secure coding, peer reviews, dependency hygiene

  • Testing: Static (SAST) and dynamic (DAST) analysis, fuzzing, penetration tests

  • Deployment: Secure build pipelines, artifact signing, infrastructure-as-code auditing

  • Monitoring: Runtime anomaly detection, log aggregation, SIEM integration

AI can be injected into each stage to:

  • Assist in writing secure infrastructure-as-code (e.g., Terraform with policy-as-code)

  • Autogenerate threat models or abuse cases

  • Review pull requests with security-focused heuristics

  • Perform regression testing of security assumptions

But AI must act as an augmentation layer, not a replacement. Its value lies in making secure development faster, more consistent, and more accessible, not in making it automatic.

8. MCPs, Toolchains, and Policy Integration

GoCodeo and other advanced AI platforms are exploring the concept of Model Context Protocols (MCPs). MCPs enable AI agents to:

  • Ingest real-time context from the developer's environment (e.g., framework, DB, auth provider)

  • Integrate with policy engines such as OPA (using Rego policies) or custom DSLs to validate AI output

  • Chain tools such as secret scanners, SAST linters, and CI/CD validators into secure feedback loops

This turns the AI from a stateless predictor into a context-bound agent, capable of:

  • Adapting code to internal standards

  • Enforcing consistency with secure patterns across services

  • Connecting with secret managers (e.g., HashiCorp Vault) and compliance toolchains
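
As a rough illustration of such a feedback loop, the sketch below chains two checks, a regex-based secret scan and a small deny-list policy, over an AI-generated snippet before it is accepted. Both checks are deliberately naive, hypothetical stand-ins for real tools such as a secret scanner, an OPA policy evaluation, or a SAST pass.

```typescript
// A hypothetical, deliberately simplified feedback loop: run AI-generated code
// through a chain of checks before accepting it. Real pipelines would call out
// to a secret scanner, a policy engine such as OPA, and SAST tooling instead.
interface Finding {
  check: string;
  message: string;
}
type Check = (code: string) => Finding[];

// Naive stand-in for a secret scanner.
const secretScan: Check = (code) =>
  /(api[_-]?key|secret|password)\s*[:=]\s*["'][^"']+["']/i.test(code)
    ? [{ check: "secret-scan", message: "Possible hardcoded credential" }]
    : [];

// Naive stand-in for a policy engine rule set.
const policyCheck: Check = (code) => {
  const findings: Finding[] = [];
  if (/\beval\s*\(/.test(code)) {
    findings.push({ check: "policy", message: "eval() is disallowed" });
  }
  if (/SELECT .*\$\{/.test(code)) {
    findings.push({ check: "policy", message: "String-interpolated SQL detected" });
  }
  return findings;
};

export function reviewGeneratedCode(code: string): { accepted: boolean; findings: Finding[] } {
  const findings = [secretScan, policyCheck].flatMap((check) => check(code));
  // Only code that clears every check in the chain is passed back to the agent.
  return { accepted: findings.length === 0, findings };
}
```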

9. Human-in-the-Loop: Why AI Still Needs You

Security is a function of intent and trade-offs. AI cannot currently:

  • Reason about business-specific risks

  • Evaluate long-term maintainability vs. short-term patching

  • Detect subtle flaws in authorization logic

  • Ensure adherence to industry-specific standards (e.g., HIPAA, PCI-DSS)

Human expertise is essential for:

  • Conducting security architecture reviews

  • Performing red/blue team exercises

  • Configuring secure-by-default pipelines

  • Validating edge case scenarios and complex attack chains

AI is a productivity tool, not a firewall.

10. Best Practices for Using AI in Secure Coding
  1. Treat AI-generated code as untrusted input. Always review it before merging.

  2. Integrate SAST/DAST tools to validate outputs during CI/CD runs.

  3. Avoid prompt ambiguity. Be specific about desired security practices in prompts.

  4. Define reusable secure patterns for your AI agent to draw from (a small sketch follows this list).

  5. Monitor model drift. AI behavior may change with updates; pin versions when possible.

  6. Use secure sandboxes or linters to restrict AI outputs that violate policy.
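
To illustrate point 4, a team might keep vetted helpers in a shared module and steer the agent toward it through prompts or context protocols. The module below is a hypothetical sketch; the cookie settings and key names are examples, not recommendations tailored to any specific application.

```typescript
// Hypothetical "approved patterns" module a team might maintain and point its
// AI tooling at, so generated code reuses vetted helpers instead of improvising.

// Secure-by-default cookie settings for session or refresh-token cookies.
export const secureCookieOptions = {
  httpOnly: true, // not readable from client-side JavaScript
  secure: true, // only sent over HTTPS
  sameSite: "strict" as const,
  path: "/",
  maxAge: 7 * 24 * 60 * 60 * 1000, // 7 days, in milliseconds
};

// Redact obvious secrets before anything reaches application logs.
const SENSITIVE_KEYS = ["password", "token", "secret", "apiKey"];

export function redactForLogging(payload: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(payload).map(([key, value]): [string, unknown] =>
      SENSITIVE_KEYS.includes(key) ? [key, "[REDACTED]"] : [key, value]
    )
  );
}
```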

11. Final Verdict: Can AI Write Secure Code?

In 2025, the answer is: Partially.

AI is capable of generating code that implements security best practices at a surface level. But without:

  • Contextual awareness of the system architecture

  • Knowledge of the application domain

  • The ability to adapt to business-specific risk thresholds

  • Integration with a robust policy enforcement pipeline

…it cannot guarantee end-to-end secure software development.

To achieve AI-powered secure development, we must:

  • Augment LLMs with security context through MCPs and structured inputs

  • Build human-in-the-loop review processes

  • Enforce policy compliance automatically via integrated CI/CD steps

  • Continuously test, validate, and observe AI-generated code at runtime

AI doesn’t eliminate the need for secure development skills; it amplifies their reach.