The Tester AI - Can AI do this?

Can AI do this? We test it.

Work related AI tests

To all AI tests

Home related AI tests

Can AI gather financial information?

Can AI gather financial data reliably? I tested Gemini, ChatGPT, and Claude on pulling key metrics for major tech companies. All delivered usable results with solid tables—but with some caveats around reporting periods and market cap accuracy. Verdict: AI works well here, but still needs human oversight.

Work related AI tests

Apr 27

AI robot editor is writing a press release. AI is really good at writing press releases but human oversight is still important.

Can AI write a press release?

Press releases are structured, predictable, and language-heavy—exactly where AI excels. Testing Gemini and Claude confirmed it: both produced high-quality drafts with minimal effort required. Human review still matters, but AI gets you most of the way there instantly.

Work related AI tests

Apr 21

Can AI do my Geometry homework?

I tested Gemini, ChatGPT, and Claude on simple 5th-grade geometry worksheets. Claude delivered perfect, consistent results across both tests, while Gemini and ChatGPT stumbled on triangles. ChatGPT recovered after a retry, but Gemini doubled down on errors. The takeaway: AI can solve geometry, but reliability is still a real issue.

Home related AI tests

Apr 16

Can AI Replace a Controller?

Can AI replace a controller? I tested ChatGPT on a simple parent-subsidiary scenario. It looked confident—but failed key steps, double-counted equity, and broke the balance sheet. It eventually fixed itself after multiple prompts. Verdict: AI still can’t replace a controller.

Work related AI tests

Apr 9

Can AI do your math homework? I guess it can!

Can AI do my math homework?

Can AI handle basic math? I tested ChatGPT, Gemini, and Claude on 5th-grade word problems. All three delivered perfect results—accurate answers, clear explanations, and zero errors. A simple test. Results can still vary.

Home related AI tests

Apr 7

Unicorns in a pasture with San Francisco in the background. Four different AI chat depictions of the same prompt: Gemini, Grok, Claude, and ChatGPT.

Image AI Test: same prompt, different chats

Simple AI image test: ChatGPT, Claude, Grok, and Gemini were all given the exact same prompt with no optimization. The differences were striking, from cartoonish interpretations to near-photorealistic scenes, revealing each model’s instincts, strengths, and blind spots right out of the box.

Home related AI tests

Apr 5

The Tester AI scorecard for evaluating AI performance

How We Score AI

The Tester AI scoring explained: At The Tester AI, every test is built around a simple principle: Can AI actually do the job—not just in theory, but in practice? Each test is evaluated across five core categories, scored on a scale of 1 to 5: Output Delivered – Did the AI complete the task? Accuracy – How correct was the result? Quality – Is the output usable in a real-world setting? Ease of Use – How much effort, prompting, or iteration was required? Reliability – Was the be

Work related AI tests

Apr 4

ChatGPT, Gemini, and Claude failed at turning a trial balance into a P&L and balance sheet. AI can't replace accountants just yet.

Can AI replace an accountant?

Can ChatGPT, Gemini, or Claude replace accountants? Can AI turn a simple trial balance into a P&L and balance sheet? Verdict: ChatGPT showed low effort, Gemini didn't even try, and Claude tried hard but failed harder.

Work related AI tests

Apr 4

Can AI create a logo kit for my site?

Test: Can Gemini AI create a logo and full logo kit from an existing style? It generated a solid concept but failed on execution: no true transparent PNGs, inconsistent outputs, and repeated errors. ChatGPT partially fixed it but wasn’t reliable. Final score: 3/5.

Work related AI tests

Apr 3
