Can AI do my math homework?
- Niv Nissenson
- 4 days ago
- 2 min read
Updated: 2 days ago

5th grade basic level (U.S.).
Test Intro:
Probably one of the first use cases for computers was helping with math. I imagine many resisted the introduction of the first calculators in the same way people might have lamented the printing press killing off the scribing profession. And now we're in the era of AI... Let's see how far math-solving technology has come.
The test: Can AI reliably solve basic 5th grade level math problems?
Success criteria:
Correct answers on all questions
Clear, step-by-step explanations
Proper use of equations alongside verbal reasoning
AI Tested: Gemini, ChatGPT and Claude
Weighted result: Gemini 5/5, ChatGPT 5/5, Claude 5/5
Verdict: AI can do your math homework
Execution:
Setup: I downloaded a couple of math worksheets from Worksheetskids.com and math-salamanders.com (verbal multiplication problems and verbal addition/subtraction problems).
Easy presentation verbal multiplication AI test
Prompt: Please solve the following verbal math questions. The solution should be presented verbally and with the equation

All three chatbots chewed this up with no hitches and provided clear, precise answers with explanations.
ChatGPT solution:

Nuanced presentation verbal additions/subtractions AI test
I then wanted to see whether the AI could handle a task that is mathematically easier but presented in a way that may be harder for an AI to parse: an info table and more nuanced phrasing.

All three models:
Parsed the data correctly
Understood the context
Solved everything instantly
No hallucinations. No confusion.
Gemini:

Claude:

Model | Result |
Gemini | ✅ 5/5 |
ChatGPT | ✅ 5/5 |
Claude | ✅ 5/5 |
While all 3 models were excellent, if I had to choose one I would say that Claude was the most concise.
Verdict: AI can do your math homework (at least 5th grade level).
Can a human do it better?: No, but a human still has to know how to do this level of math well.
Caution: Even though this AI test result was spotless, AI output is non-deterministic by design, so results may vary. You should always spot-check the answers.
For a follow-up test I'll look at geometry.
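One way to spot-check is to redo a few problems yourself and compare them with what the AI gave you. Here is a minimal Python sketch of that idea. The question labels and numbers are made up for illustration and are not taken from the worksheets used in this test, and `spot_check` is a hypothetical helper name.

```python
def spot_check(claimed, recomputed):
    """Return the labels of questions where the AI's answer differs from yours."""
    return [q for q, value in claimed.items() if value != recomputed[q]]


# Hypothetical examples only, not from the worksheets in this test.
claimed = {"q1": 96, "q2": 47}                # answers copied from the AI's response
recomputed = {"q1": 12 * 8, "q2": 125 - 78}   # the same problems redone independently

mismatches = spot_check(claimed, recomputed)
print(mismatches)  # an empty list means every checked answer matched
```

Checking only a handful of questions will not prove the rest are right, but it is a cheap way to catch a bad run.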
Full score card:
Category | Gemini | ChatGPT | Claude | Notes |
Output Delivered | 5 | 5 | 5 | Exact and precise |
Hallucinations | 5 | 5 | 5 | No hallucinations reported |
Quality | 5 | 5 | 5 | Full responses |
Ease of Use | 5 | 5 | 5 | Took single prompt, quick response |
Reliability | 5 | 5 | 5 | All answers correct |
Bottom line | 5/5 | 5/5 | 5/5 | Excellent |


