The Lost Art of Testing Code in the Age of LLMs and Vibe-Coding
Santiago Valdarrama recently called out a trend many of us have felt firsthand: serious testing is quietly slipping out of software development. As he noted, we’re in an age of impressive demos, where slick presentations often matter more than whether the code is actually solid. This shows up especially when working with Large Language Models (LLMs) or doing what’s commonly called “vibe-coding”—an exploratory, trial-and-error style of writing code.
There’s no denying that LLMs and rapid prototyping can speed things up dramatically. But they also add a layer of unpredictability. Without proper testing, it becomes very easy to ship something that looks fine but behaves unreliably. The real question, then, is how we stay confident in our code. The answer isn’t complicated: disciplined testing practices still matter—arguably more than ever—especially when AI tools are in the mix.
Why Testing Matters More Than Ever
With LLM-generated code, correctness can’t be assumed. The output may look reasonable and even pass a quick manual check, yet fail in subtle or edge-case scenarios. Manually verifying everything doesn’t scale, which is why automated testing becomes essential rather than optional.
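To make that concrete, here is a small, hypothetical illustration of the pattern: a helper an LLM (or a hurried human) might produce that passes a quick manual check, and the automated test that exposes the edge case it missed. The function is made up for this example, not taken from any real project.

# average.py -- looks correct for typical inputs
def average(numbers):
    return sum(numbers) / len(numbers)

# test_average.py
def test_average_typical():
    assert average([2, 4, 6]) == 4   # the happy path passes

def test_average_empty_list():
    # Fails with ZeroDivisionError: the empty-list case was never considered
    assert average([]) == 0

A quick manual check with [2, 4, 6] would have looked fine; only the written-down edge case catches the failure.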
Key Testing Principles:
- Assume it’s broken until proven otherwise – Treat LLM output as a draft, not a guarantee.
- Test early and often – Problems are easier to fix before they spread through the codebase.
- Automate as much as possible – Manual checks are slow, inconsistent, and easy to miss.
Setting Up Tests in VS Code
VS Code makes it relatively painless to build testing into your day-to-day workflow. Here’s a straightforward way to get started:
1. Choose a Testing Framework
- Python: pytest (lightweight, expressive, widely adopted)
- Java: JUnit
- JavaScript/TypeScript: Jest or Mocha
Install your framework using the relevant package manager (for example, pip install pytest or npm install jest).
2. Configure VS Code for Testing
- Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P).
- Search for “Python: Configure Tests” (or the equivalent for your language).
- Select your framework (e.g., pytest).
- VS Code will then auto-detect and run tests directly from the editor.
3. Write Your First Test
For example, using Python with pytest:
# test_example.py
def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
You can run tests by:
- Clicking Run Tests in VS Code’s testing sidebar, or
- Using the terminal:
pytest test_example.py
4. Test LLM-Generated Code Thoroughly
Since LLMs can introduce subtle inconsistencies, edge cases deserve extra attention:
def test_add_edge_cases():
    # Test zero, negative, and large numbers
    assert add(0, 0) == 0
    assert add(-5, -3) == -8
    assert add(1e6, 2e6) == 3e6
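As the list of edge cases grows, pytest's parametrize decorator keeps them readable in a single table-like block. Here's a sketch that reuses the add function from the earlier snippet (the floating-point case is an extra assumption, added to show pytest.approx):

import pytest
from test_example import add  # assumes add() lives in test_example.py, as above

@pytest.mark.parametrize("a, b, expected", [
    (0, 0, 0),                        # zeros
    (-5, -3, -8),                     # negatives
    (1e6, 2e6, 3e6),                  # large numbers
    (0.1, 0.2, pytest.approx(0.3)),   # floating-point rounding
])
def test_add_parametrized(a, b, expected):
    assert add(a, b) == expected

Each tuple becomes its own test case in the report, so a single failing input is easy to spot.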
Documenting and Saving Tests
Tests only add value if they’re easy to understand and kept up to date. A bit of structure goes a long way.
1. Follow a Clear Structure
project/
├── src/
│   └── your_code.py
└── tests/
    ├── test_unit.py
    └── test_integration.py
2. Use Descriptive Test Names
- Bad: test1()
- Good: test_user_login_fails_with_invalid_password()
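A hypothetical login test shows why the name matters: when the second version fails, the report alone tells you which behavior broke (the login function here is illustrative only):

# Vague: the failure report only says "test1 failed"
def test1():
    assert login("alice", "wrong-password") is False

# Descriptive: the report itself explains what behavior regressed
def test_user_login_fails_with_invalid_password():
    assert login("alice", "wrong-password") is False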
3. Document Test Intentions
Short comments explaining why a test exists can save time later:
# This test ensures the API handles null inputs gracefully
def test_api_null_input():
    response = call_api(None)
    assert response.status_code == 400
4. Version Control Your Tests
- Commit tests alongside the code they validate.
- Use clear, purposeful commit messages, such as:
  - "Add tests for user authentication"
  - "Fix edge case in payment processing (with regression test)"
Automating Test Execution: Ensuring Reliability at Every Stage
Automation is the backbone of reliable testing. Manually running tests doesn’t scale, and it’s easy to forget. When tests are part of the pipeline, quality checks happen automatically—before code ever reaches production.
1. Pre-commit Hooks: Catch Errors Early
- What it does: Runs tests before a Git commit is finalized, blocking commits if tests fail.
- Tools:
  - Python: pre-commit (pip install pre-commit) with a .pre-commit-config.yaml such as:

        repos:
          - repo: local
            hooks:
              - id: pytest
                name: Run tests
                entry: pytest
                language: system
                stages: [commit]

  - JavaScript: Husky (npm install husky --save-dev), configured in package.json:

        "husky": {
          "hooks": {
            "pre-commit": "npm test"
          }
        }
- Why it matters: Broken code never even makes it into the repository.
2. CI/CD Pipelines: Gatekeeping Deployment
- What it does: Automatically runs tests on every push or pull request.
- Implementation:
  - GitHub Actions (.github/workflows/tests.yml):

        name: Run Tests
        on: [push, pull_request]
        jobs:
          test:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - name: Set up Python
                uses: actions/setup-python@v5
                with:
                  python-version: '3.11'
              - name: Install dependencies
                run: pip install -r requirements.txt
              - name: Run tests
                run: pytest

  - GitLab CI (.gitlab-ci.yml):

        test:
          image: python:3.11
          script:
            - pip install -r requirements.txt
            - pytest
- Why it matters: Every contribution, from every team member, is held to the same standard.
System Design: Where Testing Fits In
In a scalable system, testing isn’t a single step—it’s layered throughout the architecture:
Local Development
- Pre-commit hooks serve as the first line of defense.
- Developers run tests directly from VS Code or the CLI.
Version Control
- CI pipelines trigger on pull requests and block merges on failure.
- Tests run in clean, isolated environments.
Pre-Deployment
- CD pipelines execute smoke and integration tests.
- Rollbacks trigger automatically if something goes wrong.
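A post-deployment smoke test can be as simple as a pytest file the CD pipeline runs against the freshly deployed environment. A minimal sketch, assuming the service exposes a /health endpoint and the pipeline passes its base URL in an environment variable (both are assumptions for illustration):

# smoke_test.py -- run by the CD pipeline immediately after deployment
import os
import urllib.request

BASE_URL = os.environ.get("APP_BASE_URL", "http://localhost:8000")  # hypothetical variable

def test_health_endpoint_responds():
    # One cheap request: is the new deployment up and answering at all?
    with urllib.request.urlopen(f"{BASE_URL}/health", timeout=5) as response:
        assert response.status == 200

If this fails, the pipeline treats it like any other failing test and triggers the rollback described above.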
Monitoring (Post-Deployment)
- Synthetic monitoring simulates real user behavior.
- A/B testing compares new and existing versions in production.
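Synthetic monitoring is the same idea running on a schedule against production. A sketch of a minimal probe (the URL, latency threshold, and alerting hook are placeholders, not tied to any specific stack):

# synthetic_check.py -- run on a schedule (cron, a serverless function, etc.)
import time
import urllib.request

PROBE_URL = "https://example.com/login"   # placeholder: a real user-facing page
MAX_LATENCY_SECONDS = 2.0                 # placeholder threshold

def probe():
    start = time.monotonic()
    with urllib.request.urlopen(PROBE_URL, timeout=10) as response:
        ok = response.status == 200
    return ok, time.monotonic() - start

if __name__ == "__main__":
    ok, latency = probe()
    if not ok or latency > MAX_LATENCY_SECONDS:
        # In a real setup this line would page someone or post to a channel
        raise SystemExit(f"Synthetic check failed: ok={ok}, latency={latency:.2f}s")
    print(f"Synthetic check passed in {latency:.2f}s")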
Example Architecture Diagram:
Developer Laptop (VS Code)
│
├── Pre-commit Hook → Fail → Fix code
│ └── Pass → Git Commit
│
└── CI Server (GitHub/GitLab)
│
├── Unit Tests → Fail → Notify team
│ └── Pass →
│
└── Integration Tests → Fail → Block Merge
└── Pass → Merge to Main
│
CD Pipeline (AWS/GCP/K8s)
│
├── Deployment → Post-Deploy Tests → Fail → Rollback
│ └── Pass → Traffic Shift
│
└── Monitoring → Alerts → Hotfix
Key Takeaways
- Pre-commit hooks catch issues before code is shared.
- CI/CD pipelines are essential for safe collaboration and deployment.
- Testing should be designed into the system, not bolted on later.
Final Thoughts
LLMs and vibe-coding can dramatically speed up development, but they don’t remove the need for strong testing discipline. Treat automated tests as first-class citizens in your workflow. When tests run at every stage, unpredictability—especially from AI-generated code—becomes something you can measure, manage, and control rather than something you hope won’t break in production.
