Paul's setup is simpler than you'd expect. He uses one tool: Cursor.
No stack of five AI assistants. No rotating between tools depending on the task. Just Cursor, running OpenAI's Codex models, open in his editor all day.
Paul, Web Developer at hosting.com:
I only use AI in work. I generally try and use the Codex models. I feel they are better for coding, and I don't really like how Anthropic is run as a company, and they close source everything.
You won't read that opinion in most tool comparison articles. But it's the kind of honest preference that shapes real tool choices. Developers don't just pick tools on benchmarks. They pick them on trust, values, and what fits their workflow. The feature Paul uses most isn't code generation:
Recently I have been using Plan Mode a lot, especially if something we've been asked to do is complex and risky. That way we can write down a lot of our reasoning for going down a path, and also highlighting risks.
The biggest value Paul gets from his AI coding tool isn't the AI writing code. It's the AI helping him think through complex changes before touching anything. Planning, risk assessment, documenting reasoning. The stuff that happens before a single line gets written.
We currently don't have any AI rules set up in an Agent.md or in Cursor rules, so I do rewrite a lot of stuff to get more into our way, but this happens over many iterations.
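If a team did want to codify that "our way" so the AI picks it up, Cursor reads project-level rules from a `.cursorrules` file or files under `.cursor/rules/`, and many agents read an `AGENTS.md`. The conventions below are invented for illustration, not Paul's actual house style:

```markdown
# AGENTS.md — example house-style rules (illustrative only)

- Prefer CSS custom properties over inline styles.
- Every new component gets a matching test file.
- Never touch files under /legacy without flagging it in the plan first.
- Match existing naming: kebab-case filenames, camelCase functions.
```

Rules like these front-load the rewriting Paul describes: instead of correcting the same stylistic drift over many iterations, you state it once and let the tool apply it on every request.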
He's not unusual. 66% of developers told Stack Overflow that "almost right, but not quite" is their single biggest frustration with AI code. Close enough to look useful. Wrong enough to cost you time fixing it.
We followed up and asked: "Has AI-generated code ever introduced a bug that made it past review?"
Not yet. We're quite thorough with testing. Edge cases can obviously slip through, but most of our work is on static websites. So most code issues can be easily and quickly resolved even when it's a manually coded bug.
Not a blanket endorsement of AI reliability. Just an observation that good testing practices still catch problems regardless of where the code came from. As you'll see later, the research backs him up on this.
What developers have quietly dropped
Remember vibe coding? Describing an app in plain English and letting AI build the whole thing? Great demo. Less great in production, because nobody on the team can read or maintain what comes out. Stack Overflow found nearly 77% of respondents said vibe coding is not part of their professional development work.
Autonomous agents haven't landed either.
Only 31% of developers currently use them, and 38% say they've got no plans to. The pitch is compelling: an AI that plans, executes, and ships without you hovering over it. The reality is that most developers still want to see what's happening before it goes anywhere near production.
It seems that while developers have adopted the assistive features (autocomplete, chat, inline suggestions), they're skeptical of the autonomous ones, or rejecting them outright.
There are dozens of AI development tools available right now. However, here are the ones we think are worth knowing about, with honest assessments rather than just bullet point feature lists. If you're looking for broader trends beyond AI tooling, we covered the full web development landscape for 2026 separately.
Cursor
Cursor has an interesting history: it took VS Code, forked it, and rebuilt the whole thing around AI. Not a plugin. Not an extension. A full IDE where AI is part of every interaction:
Tab completions that predict your next edit
Inline edits triggered by natural language
Composer for multi-file changes in a single prompt
Background agents that can run tasks autonomously
Plan Mode for mapping out complex changes before writing code
The growth has been rapid, with Cursor becoming one of the most talked about AI coding tools in a short space of time. It's earned a strong reputation among developers who want deep AI integration in their editor.
Paul runs Codex models within Cursor rather than the default Anthropic models. He's also aware of the standalone Codex CLI tool but prefers staying in his editor: "I've heard good things about the Codex harness over Claude Code, so maybe if I was to use one of those I'd go that way, but I still prefer to be in my editor, so I'm sticking with Cursor for now."
The catch? In June 2025, Cursor moved from a simple "500 fast requests per month" model to a credit-based system. Costs now vary depending on which AI model your request touches and how complex it is. Cursor's own guidance suggests that daily tab completion users usually stay within the included usage, but daily agent users often land around $60 to $100 per month in total usage. Power users regularly blow past the $20/month plan before the month ends.
I use Cursor at work and it helps me a lot, especially Plan Mode. It's not just the writing of code, Plan Mode does help with planning out tasks and understanding risks of making changes, so this has been a big help for me.
GitHub Copilot
Copilot remains the most widely adopted AI coding assistant, and what started as a glorified autocomplete has grown into something more interesting:
Inline code suggestions across all major IDEs
Copilot Chat for natural language questions about your code
Agent Mode, now available to paid users in VS Code
MCP support, rolled out separately to VS Code users
GitHub's coding agent, available to paid Copilot users for autonomous task handling
Workspace-level context awareness
A free tier that's actually useful (2,000 completions + 50 chat or agent requests per month)
At $10 a month for the Pro plan, Copilot is the best value entry point in this space. Nothing else comes close at that price. Agent Mode is still playing catch-up to Cursor on raw capability, but for most developers who just want solid autocomplete and the ability to ask questions about their codebase, it does the job. Overages on premium requests cost $0.04 each, so at least the pricing surprises stay small.
Windsurf
Cognition AI (the team behind Devin) agreed to acquire Windsurf in July 2025, though the financial terms were not publicly disclosed. The feature that sets it apart is Cascade, which blends the chat interface and the editor into one flow. The AI reads your entire codebase, plans changes, explains its reasoning, and executes while you steer.
It sits at #1 in the LogRocket AI Dev Tool Power Rankings as of February 2026. Windsurf's public pricing now lists Pro at $20/month and Teams at $40/user/month, with a March 2026 move to new self-serve usage-based plans.
Good first AI coding tool if you're getting started with AI-assisted development. The model selection is more restricted than Cursor's, custom rules are more basic, and the community is smaller than Cursor's or Copilot's. The Cognition acquisition also raises questions about long-term direction that nobody can answer yet.
v0 by Vercel
Different category to the others. v0 generates React and Next.js components from natural language prompts. It rebranded from v0.dev to v0.app in August 2025, and its February 2026 update added a new sandbox runtime, Git panel, and database integrations.
The output quality is good. Really good. Components look like they were built by a senior frontend developer with good design taste: proper architecture, Tailwind utilities, shadcn/ui patterns. You can drop them straight into a Next.js project. Vercel now presents v0 as handling UI, backend logic, and team collaboration, so it's broader than the original frontend-only tool.
The limitation: a full-stack generation can burn through your $20/month credits in a handful of prompts. Brilliant for prototyping. Less suited as a primary production workflow.
CodeRabbit (AI code review)
CodeRabbit is an AI-first pull request reviewer. Line-by-line feedback. Most-installed AI app on GitHub. Running on over 2 million repositories. Free for public and open-source repos, with a broader free plan also available. Paid plans start at $24/dev/month (billed annually) or $30 month-to-month.
Not sure we can use AI Code Review because we self-host GitLab behind IP restrictions, not sure our infrastructure team would allow that yet. A couple of YouTubers that I follow rave about CodeRabbit, especially as it's free if your GitHub repo is public and open source.
Worth noting: this isn't just Paul being cautious. Plenty of agencies and in-house teams operate under security accreditations that dictate exactly what software can touch the codebase, and AI tools are still being assessed against those frameworks.
Useful as a first-pass reviewer, especially for teams where PRs sit in queues. Business logic review has limitations, since AI review tools work primarily from the diff context, and self-hosted Git behind network restrictions remains a genuine blocker that cloud-based review tools can't easily reach.
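For teams that can use it, CodeRabbit is configured per repository via a `.coderabbit.yaml` file. The sketch below shows the general shape; the path pattern and instruction text are made up, and key names should be checked against CodeRabbit's current schema:

```yaml
# .coderabbit.yaml — illustrative sketch, verify keys against current docs
reviews:
  auto_review:
    enabled: true          # review every new pull request automatically
  path_instructions:
    - path: "src/**/*.ts"  # hypothetical path pattern
      instructions: "Flag any direct DOM access; we go through our wrapper."
```

Per-path instructions are the part worth exploring first, since they let you encode the house rules a human reviewer would otherwise repeat on every PR.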
Claude Code
Claude Code scores near the top of benchmarks (80.9% on SWE-bench Verified using Claude Opus 4.5) and is used by about 10% of developers among the newly tracked AI-enabled IDE tools in Stack Overflow's survey, but Paul avoids it on principle. Its closest rival, Codex CLI, is a terminal-based agent, though Codex also exists via web, app, and IDE routes. Both are legitimate tools, but neither fits how Paul prefers to work.
Pricing comparison: what you'll actually pay
Flat monthly fees are mostly gone. Credits, tokens, request-based billing. Here's what you're actually looking at as of March 2026:
| Tool | Free tier | Individual | Team | Best for |
| --- | --- | --- | --- | --- |
| Cursor | Hobby (limited) | $20/mo (Pro) | $40/user/mo | Daily driver for VS Code users |
| GitHub Copilot | 2,000 completions + 50 chat/agent requests | $10/mo (Pro) | $19/user/mo | Best value entry point |
| Windsurf | Limited free tier | $20/mo (Pro) | $40/user/mo | AI beginners, guided workflow |
| v0 by Vercel | $5/mo in credits | $20/mo (Premium) | $30/user/mo | UI prototyping, React/Next.js |
| CodeRabbit | Free (public repos + broader free plan) | $24/dev/mo (annual) | Custom | Code review automation |
Sources: Cursor, GitHub Copilot, Windsurf, v0, CodeRabbit. All prices verified March 2026.
Watch the real cost, not the sticker price. Two years ago, $20/month meant $20/month. Now that same plan uses credits that drain faster depending on which AI model your request touches. Cursor's own documentation notes that heavy agent users often end up at $60 to $100 per month. Multiply that across a team of five or ten developers and budgeting becomes a proper headache.
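To make the budgeting point concrete, here's a back-of-envelope calculation using the article's own figures (the $80 is simply the midpoint of Cursor's $60 to $100 guidance, not a quoted price):

```python
def monthly_team_cost(devs: int, per_dev: float) -> float:
    """Total monthly spend for a team at a given per-developer cost."""
    return devs * per_dev

# What the $20/mo sticker price implies for a five-person team:
sticker = monthly_team_cost(5, 20)
# What heavy agent usage actually costs at the midpoint of $60-$100:
realistic = monthly_team_cost(5, 80)

print(f"sticker: ${sticker:.0f}/mo, realistic: ${realistic:.0f}/mo")
```

A 4x gap between the sticker price and realistic usage is exactly why running a full month of real usage before committing is worth the wait.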
For freelancers, it's annoying. For agencies and teams trying to budget across multiple developers, it's worth running your actual usage for a full month before committing.