AI Chatbots Miscalculate US Tax Returns Routinely

With the April 15 filing deadline nearing, tax experts and published tests warn against using general-purpose AI chatbots such as xAI’s Grok to prepare or review tax returns, citing accuracy and privacy concerns. New York Times testing and TaxCalcBench found that chatbots miscalculate refunds by an average of more than $2,000 and often score below 50% on full returns. Practitioners are advised to use purpose-built tax software and to avoid pasting W-2s or 1099s into chatbots.
Key Points
- Finds that general-purpose chatbots miscalculate refunds by an average of more than $2,000 in NYT tests
- Highlights the privacy risk that major providers may use user inputs for model training without default opt-outs
- Advises practitioners to prefer purpose-built tax software and avoid submitting W-2s or 1099s to chatbots
Scoring Rationale
Timely, well-sourced consumer warning on LLM tax risks; limited novelty beyond existing accuracy and privacy concerns.

