OpenAI just dropped OpenAI Codex, and naturally I had to try it out. I compared it with Claude Code, Cursor, and, of course, a local LLM through Ollama (IBM Granite 3.3).
I crafted a challenge prompt that tested three things in one go:

Call a public API
Load an image
Build a one-shot GUI app
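To make the challenge concrete, here's a minimal sketch of the kind of app the prompt asks for, in Python with tkinter. The specific endpoints (`official-joke-api.appspot.com`, `cataas.com`), the widget layout, and the nose hit-test are my own assumptions for illustration, not any tool's actual output.

```python
# Hypothetical sketch of the challenge app: fetch a joke from a public API,
# show a cat image, and reveal the joke when the cat's nose is clicked.
# All URLs and coordinates below are assumptions, not from the original repo.
import io
import json
import urllib.request

JOKE_API = "https://official-joke-api.appspot.com/random_joke"  # assumed public API
CAT_IMAGE = "https://cataas.com/cat"  # assumed public cat-image endpoint


def fetch_joke(url: str = JOKE_API) -> str:
    """Call the public joke API and return setup + punchline as one line."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    return f"{data['setup']} {data['punchline']}"


def nose_hit(x: int, y: int, cx: int, cy: int, r: int) -> bool:
    """Return True if a click at (x, y) lands inside a circular 'nose'
    of radius r centered at (cx, cy)."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2


def main() -> None:
    # Imported here so the pure helpers above work even without a GUI toolkit.
    import tkinter as tk

    root = tk.Tk()
    root.title("Cat Joke App")

    raw = urllib.request.urlopen(CAT_IMAGE, timeout=10).read()
    # Note: tk.PhotoImage only decodes PNG/GIF; a JPEG response would need
    # Pillow instead. This is one place a one-shot generation can stumble.
    img = tk.PhotoImage(data=raw)

    canvas = tk.Canvas(root, width=img.width(), height=img.height())
    canvas.create_image(0, 0, anchor="nw", image=img)
    canvas.pack()

    label = tk.Label(root, wraplength=300)
    label.pack()

    # Guessed nose position: image center. A real solution would locate it.
    cx, cy, r = img.width() // 2, img.height() // 2, 20
    canvas.bind(
        "<Button-1>",
        lambda e: label.config(text=fetch_joke())
        if nose_hit(e.x, e.y, cx, cy, r)
        else None,
    )
    root.mainloop()


if __name__ == "__main__":
    main()
```

The nose hit-test is the part every tool had to get right; it's a plain point-in-circle check, which is why a hidden or mispositioned nose button is an easy failure mode.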
The GitHub repo for all examples is here, but here's the breakdown!
Claude Code
The first big AI-vendor CLI agent to go mainstream, it got traction for being fast, vibe-friendly, and having codebase-wide visibility.
Getting Started:
(screenshot/code snippet placeholder)
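For reference, getting Claude Code running locally looks roughly like this (assuming Node.js/npm is installed; the project folder name is hypothetical):

```shell
# Install the Claude Code CLI globally, then launch it from your project
# directory so it gets codebase-wide visibility.
npm install -g @anthropic-ai/claude-code
cd my-project   # hypothetical project folder
claude
```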
Result:
✔️ Worked smoothly
✅ Clean interface
🟰 Moderate code complexity
OpenAI Codex

Result:

🖼️ It generated a cat image!
❌ But the nose-click didn't work on the first try.
↩️ It did fix itself after a prompt nudge. Nice!
⚙️ Most concise code of the bunch, which is promising for long-term efficiency.
✍️ Note: the repo had updates 9 minutes ago; this team is working fast!
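For reference, getting the Codex CLI running locally looks roughly like this (again assuming npm; the folder name is hypothetical):

```shell
# Install the OpenAI Codex CLI globally, then start it in the project folder.
npm install -g @openai/codex
cd my-project   # hypothetical project folder
codex
```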
Cursor
Still my go-to tool for daily coding; Cursor just feels natural. I let it auto-choose the model.
Getting started:

Result:

✔️ Functional app
👻 The cat-nose button was hidden (oops on UI matching)
📏 Longest code of the bunch, at over 70 lines!
Local AI Model (Granite 3.3)
I couldn't resist grabbing the latest IBM Granite 3.3 (8B) model and trying to have it create something too! Of course, I had to copy and paste the code to get it to run, but…
Getting Started:
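Pulling and running the model through Ollama is a two-liner (the `granite3.3` tag defaults to the 8B variant on Ollama; adjust the tag if you want a different size):

```shell
# Download IBM Granite 3.3 and open an interactive chat session with it.
ollama pull granite3.3
ollama run granite3.3
```

Unlike the agentic tools above, this just chats: you paste the generated code into a file yourself and run it by hand.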
Result:

Sadly, probably because it's not an agentic coding system, it wasn't able to get the cat image in there, nor was it able to retrieve a joke. I have had better luck
Conclusion
I love the idea of CLI coding systems, but for most things I still prefer Cursor, where I can see the changes and better learn how it all works. As agents get more complex, though, the CLI will probably become a very efficient way to go as well. Excited to see what's up ahead.