Integrating AI into Your Development Workflow: Part 3 - Managing Cost and Complexity
Author’s Note: I’m a Director of Engineering who’s spent the past year exploring AI coding tools with my team, and personally on nights and weekends. I’m not an AI researcher, or a full-time coder these days, but I’ve logged many hours tinkering with these assistants, and discussing pros and cons with developers on our codebase. This guide shares what I’ve learned - the good, the bad, and the practical - for busy developers who want to get up to speed with AI in their workflow.
Why This Matters
AI assistants can save serious time, but not if you spend 20x the tokens you need, or wind up with half-working code that takes longer to clean up than to write from scratch. If you want long-term success using AI, the key is sustainable use.
1. Choose the Smallest Capable Agent
Don’t always default to the biggest or smartest model. For most developer tasks, fast and cheap is good enough.
- Use Grok or Haiku for:
  - Small file edits
  - Adding tests
  - Explaining code
  - Scaffolding
- Use Composer or Sonnet when:
  - Multiple files are involved
  - You’re dealing with subtle logic
  - You need more architectural reasoning
- Use Codex (in ChatGPT) when:
  - You want to offload a multi-file task (and don’t want to burn Cursor tokens)
  - You already have a ChatGPT Pro subscription and the context isn’t too large
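If you want to make that decision repeatable, the rules above can be written down as a tiny routing function. This is only a sketch: the `Task` fields and the tier labels (`"small"`, `"mid"`, `"offload"`) are names I made up for illustration, not real model IDs or anything Cursor exposes.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a request, used only for routing."""
    files_touched: int
    subtle_logic: bool = False
    needs_architecture: bool = False
    offload: bool = False  # hand the whole task to a separate tool


def pick_tier(task: Task) -> str:
    """Return an illustrative tier label, not a real model identifier."""
    if task.offload:
        return "offload"  # e.g. a multi-file task sent to Codex
    if task.files_touched > 1 or task.subtle_logic or task.needs_architecture:
        return "mid"      # e.g. Composer or Sonnet
    return "small"        # e.g. Grok or Haiku


if __name__ == "__main__":
    print(pick_tier(Task(files_touched=1)))                     # small edit
    print(pick_tier(Task(files_touched=4)))                     # multi-file change
    print(pick_tier(Task(files_touched=1, subtle_logic=True)))  # tricky logic
```

The point is not the code itself but the habit: start at the cheapest tier and only move up when one of the listed conditions is actually true.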
2. Know When to Use MAX Mode
MAX Mode in Cursor unlocks a higher-context version of the Claude model. It brings in more from your repo, searches your cache more aggressively, and allows the model to reason with broader awareness.
Why that matters:
- In larger or more tangled codebases, MAX can reduce the need for repeated context-setting.
- Even though it consumes more tokens per request, it often results in fewer requests and better answers, saving time and total cost. This isn’t theoretical. In practice, we’ve seen MAX generally end up cheaper per completed task than standard mode on large, complex codebases.
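The trade-off is easier to see with some arithmetic. The numbers below are invented purely for illustration (they are not measurements from our codebase or Cursor pricing), and the sketch assumes the same price per token in both modes, which may not hold for your plan.

```python
def task_tokens(tokens_per_request: int, requests: int) -> int:
    """Total tokens spent to finish one task."""
    return tokens_per_request * requests


# Hypothetical figures: standard mode needs many small round trips,
# MAX needs fewer but much larger ones.
standard_total = task_tokens(tokens_per_request=15_000, requests=6)
max_total = task_tokens(tokens_per_request=40_000, requests=2)

if __name__ == "__main__":
    print(f"standard: {standard_total} tokens")
    print(f"MAX:      {max_total} tokens")
    print("MAX cheaper for this task:", max_total < standard_total)
```

With these made-up inputs, each MAX request costs more, yet the task as a whole costs less. Flip the request counts and the conclusion flips too, which is why it is worth checking your own usage numbers rather than assuming.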
Use MAX when:
- Your task spans multiple files or layers
- You’re debugging something that keeps slipping through smaller models
- You want to understand an unfamiliar system without feeding file after file
- You are working in a large, complicated codebase and struggle to keep context sizes manageable
Avoid MAX when:
- You’re working on a small fix or change
- You want to minimize spend and already know what needs doing
- You need total control over which files are loaded
Like most advanced tools, MAX works best when used intentionally: not by default, but as a fallback when standard context runs out of runway.
3. Monitor and Manage Spend
Cursor and other tools often give usage stats. Make them part of your routine.
- Track model usage. Cursor tracks this per session. If you’re burning through Sonnet tokens, switch models.
- Use ChatGPT for planning. If you already pay for it, Codex + ChatGPT can offload a lot of thinking and planning work.
- Use Cursor for targeted execution. It’s better at diffs, tests, and applying structured changes.
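If you copy usage numbers into a notes file or spreadsheet, a few lines of Python are enough to see where the tokens go. This is a rough sketch: the `usage_log` records, the model labels, and the budget figures are all hypothetical, and this is not the format of any export Cursor provides.

```python
from collections import Counter

# Hypothetical hand-recorded usage: (model label, tokens used)
usage_log = [
    ("sonnet", 42_000),
    ("haiku", 6_000),
    ("sonnet", 55_000),
    ("haiku", 4_500),
]


def tokens_by_model(log):
    """Sum tokens per model label."""
    totals = Counter()
    for model, tokens in log:
        totals[model] += tokens
    return totals


def over_budget(totals, budget):
    """Return models whose total exceeds their budget; unlisted models have no cap."""
    return sorted(m for m, t in totals.items() if t > budget.get(m, float("inf")))


if __name__ == "__main__":
    totals = tokens_by_model(usage_log)
    print(dict(totals))
    print(over_budget(totals, {"sonnet": 80_000, "haiku": 20_000}))
```

Anything that shows up in the over-budget list is a candidate for switching to a cheaper model or a tighter scope.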
4. Understand the Architecture Risk
Large monoliths with organic growth and less-than-ideal domain boundaries can present real limitations:
- Tight coupling means agents can’t easily isolate scope and keep context at a reasonable size.
- Poor boundaries can cause unintentional breakage across modules.
- Even smart models will sometimes hallucinate connections that don’t exist.
These problems challenge AI in the same ways they challenge developers. As these architectural concerns are addressed, the quality of AI assistance will go up, especially for tasks with clearer contracts.
So for now:
- Scope your asks tightly.
- Stick to one file or one module at a time.
- Avoid letting agents touch too many areas unless you’re reviewing every line.
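One lightweight way to enforce that last point is to check how many modules a change touches before accepting it. The sketch below assumes a hypothetical repo layout where each top-level directory is one module; the paths and the `max_modules` threshold are made up for illustration.

```python
from pathlib import PurePosixPath


def modules_touched(paths: list[str]) -> set[str]:
    """Treat the first path component as the module name (assumed layout).

    Root-level files count as their own "module". Paths must be non-empty.
    """
    return {PurePosixPath(p).parts[0] for p in paths}


def needs_full_review(paths: list[str], max_modules: int = 1) -> bool:
    """Flag changes that spread across more modules than allowed."""
    return len(modules_touched(paths)) > max_modules


if __name__ == "__main__":
    changed = [
        "billing/invoice.py",
        "billing/tests/test_invoice.py",
        "auth/session.py",
    ]
    print(sorted(modules_touched(changed)))
    print("line-by-line review needed:", needs_full_review(changed))
```

You could feed it the file list from whatever diff your agent produced; the check is crude, but it catches the "touched five areas without telling you" case early.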
5. Review All AI Output
No matter which model wrote the code:
- Assume it has bugs.
- Assume it missed edge cases.
- Assume it ignored some context.
AI is a time-saver, not a QA replacement; just like any other developer on your team, it will make mistakes. Code generated by agents should be reviewed just as rigorously as any PR from a developer on your team. Part of that review should confirm that the code is written to be testable.
Of special note here are tests. Without guidance and iteration, AI engines can often produce brittle, flaky, or even nonsensical tests, giving developers a false sense of security about their codebase and its regression protection. This is another area where you only get out what you put in - make sure that you are clear in your explanations for how to test, and that you review the actual tests being written. I’ve also had great success in making sure that tests are reviewed by a different agent to help highlight some of the questionable practices that I may have missed.
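To make "nonsensical tests" concrete, here is a small contrast. The `apply_discount` function and all three tests are invented for illustration; the first test is the kind to watch for in review, and the other two show what a useful version might look like.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_discount_tautology():
    # Looks like a test, but compares the function to itself. It passes for
    # any deterministic implementation that returns without raising.
    result = apply_discount(100.0, 10.0)
    assert result == apply_discount(100.0, 10.0)


def test_discount_known_values():
    # Checks behavior against values worked out by hand, including boundaries.
    assert apply_discount(100.0, 10.0) == 90.0
    assert apply_discount(80.0, 0.0) == 80.0
    assert apply_discount(80.0, 100.0) == 0.0


def test_discount_rejects_bad_percent():
    # Checks the error path, which generated tests frequently skip.
    for bad in (-1.0, 101.0):
        try:
            apply_discount(50.0, bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for percent={bad}")


if __name__ == "__main__":
    test_discount_tautology()
    test_discount_known_values()
    test_discount_rejects_bad_percent()
    print("all tests passed")
```

All three pass, which is exactly the problem: a green run tells you nothing unless someone has read what the assertions actually check.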
6. Build Reusable Workflows
Once you’ve figured out:
- which agents to use when
- which prompts consistently get good results
- which types of requests are risky
…capture it. Document your prompt templates. Share them in Notion. Create “agent recipes” your team can reuse.
This is how AI becomes a team tool, not just a solo experiment.
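A recipe does not need to be fancy. As a minimal sketch, assuming you keep recipes in a shared Python file rather than (or alongside) Notion, something like the following works. The `Recipe` fields, the tier and risk labels, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Recipe:
    """Hypothetical 'agent recipe': a prompt template plus team guidance."""
    name: str
    tier: str         # which class of agent to use
    risk: str         # how carefully the output needs reviewing
    prompt: Template  # uses $placeholders

    def render(self, **fields: str) -> str:
        # substitute() raises KeyError if a placeholder is left unfilled,
        # so a half-completed prompt never reaches the agent.
        return self.prompt.substitute(**fields)


REFACTOR_RECIPE = Recipe(
    name="single-file refactor",
    tier="small",
    risk="low",
    prompt=Template(
        "Refactor only $target. Constraint: $constraint. "
        "Do not modify any other file. List any assumptions you made."
    ),
)

if __name__ == "__main__":
    print(REFACTOR_RECIPE.render(
        target="billing/invoice.py",
        constraint="keep the public API unchanged",
    ))
```

Keeping the agent tier and risk level next to the prompt means a teammate picking up the recipe also picks up what you learned about when it is safe to use.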
7. One Last Practical Tip
Before running an expensive request, ask:
- Can I do this in two smaller prompts?
- Can I pre-generate a plan with ChatGPT first?
- Is the scope tight enough for a cheap agent?
Parting Words…
I want to finish by reiterating something I said in Part 1 - using AI for coding is FUN! Focus first on how to get it to do the things you enjoy the least about being a developer - that might be documentation, test case analysis, writing fully fleshed out specs - and go from there. AI is a true enabler: it will help you move faster, support you along the way, and take a lot of burden off your shoulders, if you use it properly. There is a learning curve, but it is worth the deep dive, and it will provide value from the very first step if you’re intentional about it and go in with the right mindset. It’s not magic; you still need to do work to get the best out of it, but the results are worth the journey.
It’s not about replacing developers; it’s about freeing them up to do better work, faster, and with more enjoyment.