Alchemy Recipe · Advanced workflow

Cursor vs Claude Code on a 40-file TypeScript refactor: what actually happens

A real-world comparison on a moment.js to date-fns migration across 40 files. Where Cursor's codebase index wins, where Claude Code's explore-and-edit loop wins, and where both quietly miss things you have to catch.

Time saved
Saves 3-8 hrs on large refactors
Monthly cost
Cursor ~£16/$20 flat per month, or Claude Code API-metered
Published

You have a TypeScript codebase with 40 files importing moment. Your bundle analyser is shouting about moment's 70 KB footprint and your lead has blessed a migration to date-fns, which tree-shakes properly. The call patterns vary: some files use moment().format('YYYY-MM-DD'), some chain .add(7, 'days'), some use moment for timezone conversion, some only for duration maths. You can't grep-and-replace your way through it because every usage needs a different date-fns function and some need an import for formatInTimeZone from date-fns-tz.
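To make the variety concrete, here is a sketch of the rewrites involved, using current date-fns and date-fns-tz function names (the call sites are illustrative, not from the real codebase). One gotcha worth knowing up front: moment's YYYY-MM-DD format tokens become yyyy-MM-dd in date-fns, which uses Unicode tokens, so a blind token copy silently changes meaning.

```typescript
import { format, addDays, startOfDay } from "date-fns";
import { formatInTimeZone } from "date-fns-tz";

const d = new Date();

// moment().format('YYYY-MM-DD')  →  note the token case change:
format(d, "yyyy-MM-dd");

// moment().add(7, 'days')  →  a dedicated function per unit:
addDays(d, 7);

// moment().startOf('day')  →
startOfDay(d);

// moment.tz(d, 'Europe/London').format('HH:mm')  →  needs date-fns-tz:
formatInTimeZone(d, "Europe/London", "HH:mm");
```

Each moment method maps to a different named import, which is exactly why this tree-shakes well and exactly why grep-and-replace fails.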

This is a real refactor, the kind that takes two afternoons if you do it by hand and 20 minutes if an AI tool handles it properly. Cursor and Claude Code are both pitched at exactly this job. I ran the same refactor through both and watched how they differ. The short answer: they are good at different parts of it, and picking between them comes down to which failure mode you would rather catch.

What each one actually does

Cursor is a VS Code fork with an AI layer baked into the editor itself. It indexes your whole codebase locally with embeddings when you open the project. When you open Composer (cmd-I) and describe a multi-file change, it uses the embedding index to find the relevant files, proposes a diff across all of them in one view, and waits for you to accept or reject. You see the full blast radius before anything is written.

Claude Code is a terminal agent. It does not pre-index your codebase. When you give it a task, it starts with Glob and Grep to find the relevant files, Reads the ones it thinks matter, and then Edits them one at a time. You see each edit as it happens, with the ability to approve or reject individually. For a 40-file refactor, this means it is going to make 40 or more tool calls before it is done.

Both of them are ultimately driven by Claude (Cursor routes to several backends but most users are on Claude under the hood). So the underlying reasoning is similar. What differs is how the tool gives the model access to your files and how it shows you the changes.

How the moment.js refactor played out in each

Cursor

I opened Composer, picked the project root as the context, and typed: "Replace all moment usages with date-fns. Import the specific date-fns functions needed per file. Keep behaviour identical. Update any test fixtures."

Cursor took about 30 seconds to think, then showed me a single Composer diff covering 37 of the 40 files. It had found the relevant imports across the codebase using its embedding index, proposed the date-fns equivalents, and grouped the changes by file. The missing three files were utility helpers that imported moment transitively via a barrel export, which the embedding retrieval had missed. Composer presented all 37 files as a single atomic change I could accept or reject.

The wins: the review experience was excellent. Seeing all 37 files of proposed changes in one tab, with clear per-file sections, meant I could spot the mistakes quickly. It caught that moment().startOf('day') should become startOfDay(new Date()) and not startOfDay(Date.now()), which would have been wrong. It also caught the date-fns-tz imports needed for two files that did timezone conversion.

The misses: three files were silently not included. This is the thing to watch for with Cursor. Because the retrieval is embeddings-based, if a file does not textually match the query tokens well, it can get skipped. I had to run grep -rn "moment" src/ afterwards to catch the three it missed, then re-run Composer with those paths explicitly added.

It also produced the wrong replacement for one specific pattern: moment.duration(ms).humanize(). Cursor replaced it with formatDistance, which returns the distance between two dates, not a humanised duration. date-fns has no direct equivalent for duration.humanize(), so the correct answer is either to keep a small helper or use the humanize-duration package. Cursor did not flag this as uncertain; it just wrote something that compiled but did not do the same thing.

Claude Code

Same refactor, same codebase, same prompt. Claude Code started by running grep -rn "from 'moment'" src/ and grep -rn "require('moment')" src/, got the full list of 40 files, and began reading them one at a time. After reading each file, it either edited it inline or added a note to a plan it was building.

Three things stood out:

First, it caught all 40 files because it used grep, not embeddings. If moment is imported, grep finds it. No retrieval gap.

Second, when it hit the .humanize() pattern, it stopped, wrote a comment in its plan that said "date-fns has no direct equivalent for moment.duration().humanize(). Options: (1) install humanize-duration, (2) write a 10-line helper. Recommending option 2 for bundle size, proceeding with helper.", and then wrote the helper and used it. It flagged the uncertainty instead of silently guessing.
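For reference, a minimal sketch of the kind of helper it wrote. This is my reconstruction, not Claude Code's exact output, and the thresholds only approximate moment's rounding rules:

```typescript
// Stand-in for moment.duration(ms).humanize(). Thresholds roughly
// follow moment's (45s → "a minute", 45min → "an hour", etc.) but
// are not an exact reimplementation; adjust to taste.
function humanizeDuration(ms: number): string {
  const seconds = Math.round(ms / 1000);
  if (seconds < 45) return "a few seconds";
  if (seconds < 90) return "a minute";
  const minutes = Math.round(seconds / 60);
  if (minutes < 45) return `${minutes} minutes`;
  if (minutes < 90) return "an hour";
  const hours = Math.round(minutes / 60);
  if (hours < 22) return `${hours} hours`;
  const days = Math.round(hours / 24);
  if (days < 26) return days === 1 ? "a day" : `${days} days`;
  const months = Math.round(days / 30);
  if (months < 11) return months === 1 ? "a month" : `${months} months`;
  const years = Math.round(days / 365);
  return years === 1 ? "a year" : `${years} years`;
}
```

Ten-ish lines, no dependency, and the call sites barely change.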

Third, it was slow. Reading 40 files and editing each one took about 8 minutes of real time, and the API calls added up. Cursor took 30 seconds because the embedding index was already built.

The wins: completeness and willingness to flag uncertainty. Claude Code's explore-and-edit loop means it will not miss a file that grep can find, and it tends to ask or note when it hits an ambiguous case.

The misses: speed, and the review experience is worse. You are watching edits land one at a time in your editor. There is no "see all 40 changes as a diff then approve" step, so if you realise at file 25 that the first 24 had a systematic mistake, you are now unwinding via git rather than rejecting a proposal.

Cost over the project

Cursor is a flat £16/mo (about $20) for Pro. For a TypeScript developer doing at least a few refactors a month, this pays for itself in one refactor. The flat fee means you do not think about cost per task.

Claude Code is metered. A 40-file refactor with reads and edits typically runs 200k to 400k tokens end-to-end, which at Sonnet pricing is somewhere between 60p and £1.20 per refactor. Over a month of heavy use, a developer running 20 large tasks might spend £15-30, similar in the end to Cursor. But the cost is visible per task, which can be off-putting even when it is cheaper overall.
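The back-of-envelope arithmetic, assuming Sonnet-class pricing of roughly $3 per million input tokens and $15 per million output tokens (rates change; check current pricing before relying on this):

```typescript
// Rough per-task cost estimate. The $3/$15 per-million rates are an
// assumption based on Sonnet-class pricing at the time of writing.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const INPUT_PER_MILLION = 3;
  const OUTPUT_PER_MILLION = 15;
  return (
    (inputTokens / 1e6) * INPUT_PER_MILLION +
    (outputTokens / 1e6) * OUTPUT_PER_MILLION
  );
}

// A 40-file refactor: file reads dominate input, edits dominate output.
console.log(estimateCostUSD(250_000, 50_000).toFixed(2)); // "1.50"
```

At roughly $1.50 (around £1.20) per large refactor, twenty such tasks a month lands in the same band as Cursor's flat fee.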

When to pick which

Pick Cursor when

  • You want to see the full proposed change before anything is written
  • You are doing a refactor where speed matters (under a minute vs several minutes)
  • You work primarily in an IDE and want the AI embedded there
  • You prefer flat monthly pricing

Pick Claude Code when

  • You want completeness over speed, and the refactor touches files grep can find but embeddings might miss
  • You want the tool to flag ambiguous cases rather than silently guess
  • You are comfortable in the terminal and work across a repo, not just within one folder
  • The task involves running commands (tests, builds, linters) as part of the refactor loop
  • You are doing something the IDE isn't the natural home for (CI config changes, multi-package monorepo coordination, migrations that touch generated files)

The thing neither of them will do for you

Both tools are good at the mechanical rewrite. Neither one will tell you whether the refactor is a good idea, whether date-fns actually saves you the bundle weight you hoped, or whether your test coverage is sufficient to catch regressions. Run your bundle analyser before and after. Run your full test suite. Read the diff yourself. The AI tool is taking away the typing, not the thinking.
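One cheap guard-rail is a parity test: before migrating, record the old moment outputs for a handful of fixed inputs, then assert the migrated code reproduces them. A sketch with vitest, where formatInvoiceDate and its module path are hypothetical stand-ins for your own date helpers:

```typescript
import { describe, expect, it } from "vitest";
// Hypothetical: one of your own helpers, after migration to date-fns.
import { formatInvoiceDate } from "../src/dates";

describe("parity with pre-migration moment output", () => {
  it("formats invoice dates identically", () => {
    // Expected value captured by running the OLD moment code on this input
    // before the migration, so the test pins behaviour, not implementation.
    const fixed = new Date("2026-01-15T12:00:00Z");
    expect(formatInvoiceDate(fixed)).toBe("2026-01-15");
  });
});
```

Pin a few such fixtures per call-pattern family (format, add, startOf, timezone) and the .humanize()-style silent substitutions get caught mechanically instead of by eyeballing the diff.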

A snapshot caveat

This comparison ran in Q1 2026 on a single codebase. Both tools ship weekly and the exact failure modes above may already be fixed by the time you read this. Cursor Composer's retrieval has improved in every release since I started using it. Claude Code's speed on bulk edits is noticeably better than when it launched. If you are picking today for serious production use, run the same refactor through both on a throwaway branch before committing to either. The specific numbers matter less than the question of which failure mode you would rather be debugging at 5pm on a Friday.
