Grok Defeated by Basic Math
On 14 March 2025, around 5:30pm, I went out to dinner on a windy, chilly and rainy evening. When I got to the parking lot, I wondered what the temperature and wind speed were (yeah, I’m a nerd). The airfield near where I was reported 52F with a 10mph wind. I asked my Android phone to do the math for windchill and it came up with 43F, which seemed reasonable.
When I got home, I thought I’d see what different sources would say about the windchill. First, I asked Grok (X’s AI assistant) and it came up with an astonishing 34F. I tried NOAA’s online windchill calculator (https://www.weather.gov/epz/wxcalc_windchill) and it wouldn’t work with 52F (presumably because the windchill formula is only defined for temperatures of 50F and below), so I changed the temp to 50F and got a result of 46F, which seemed more reasonable.
I asked Gemini, Google’s AI assistant, and it came up with 40F, a very different result than NOAA’s.
So, I wondered why there was such a radical difference between the sources. I used the formula provided by NOAA at their link, started plugging in the numbers, and noticed a glaring error in Grok’s work. The formula uses W^0.16 (the wind speed raised to the power 0.16). For a 10mph wind I got approximately 1.44 on my calculator; Grok used a staggeringly different number, 2.5119. I even challenged Grok and it STOOD BY ITS WRONG ANSWER. Then I asked it to calculate 10^0.16 separately and it got 1.44, which is correct. I then told it that it’d used the wrong number in its windchill calculation and it got the right answer (compared to NOAA’s 46F).
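For anyone who wants to check the arithmetic themselves, here is a minimal Python sketch. It assumes the standard NWS windchill formula (35.74 + 0.6215T - 35.75W^0.16 + 0.4275TW^0.16, with T in F and W in mph); the coefficients are written from memory of that formula, so verify them against the NOAA link before relying on them. The function name wind_chill_f and its wind_factor parameter are made up for this illustration; wind_factor only exists so that a wrong value of W^0.16 can be plugged in. The last line assumes Grok kept the original 52F, which I have not confirmed from the chat.

```python
def wind_chill_f(temp_f, wind_mph, wind_factor=None):
    """Windchill in degrees F, assuming the standard NWS formula.

    wind_factor is a hypothetical override for wind_mph ** 0.16,
    used here only to see what a wrong value does to the result.
    """
    if wind_factor is None:
        wind_factor = wind_mph ** 0.16
    return (35.74
            + 0.6215 * temp_f
            - 35.75 * wind_factor
            + 0.4275 * temp_f * wind_factor)


# The step Grok got wrong: 10 raised to the power 0.16.
print(round(10 ** 0.16, 4))   # 1.4454

# 50F and 10mph, the inputs I gave the NOAA calculator.
print(round(wind_chill_f(50, 10)))   # 46

# 52F and 10mph, but with Grok's 2.5119 in place of 10^0.16.
print(round(wind_chill_f(52, 10, wind_factor=2.5119)))   # 34
```

If the sketch is right, the single wrong number is enough to account for the whole gap: swapping 2.5119 in for 1.44 lands on the 34F that Grok reported. For what it is worth, 2.5119 is what you get from 10^0.4.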
So, what does this tell me?
I already had a healthy skepticism about AI’s abilities, but this goes beyond cementing my doubts; it turns them into a full-blown mistrust of AI. If AI can’t do simple math, then it shouldn’t be relied upon for anything more complex, and it definitely cannot be relied upon to make decisions where lives or real money are at stake. I wouldn’t want an AI making medical decisions or diagnoses for me or anyone I cared about if it can’t get easy things correct.
Notes:
Here is the link to the Grok chat. I have it saved as plain text in case Grok nukes it later: https://x.com/i/grok/share/BiJO90hN48Aer5k8DLPKOyCUL
I use the term AI here instead of LLM (Large Language Model) because AI is more colloquial and most non-IT/tech people will recognize the term AI instead of LLM.
I wrote up this webpage using Google Docs, so the formatting might be a bit wonky.