Grok Defeated by Hockey
I will start with the link to the conversation with Grok: https://x.com/i/grok/share/bpcSSPTfCFwVmi8HSppEv1VZM
I asked Grok “List all of the NHL players currently in the playoffs that have played for the Detroit Red Wings” and it churned a bit and came up with this list:
WRONG. The Canucks, Blackhawks and Rangers are not in the playoffs. I have no idea where it hallucinated that from; the playoffs have been going for over a week, so it evidently made a bad assumption.
I specifically noticed that Glendening was missing and asked about him, but I spelled his name wrong, my fault. So I asked the question again and got a DIFFERENT answer:
Where are Veleno, Kane and Mrazek? Granted, their teams aren’t in the playoffs, but it DID change its answer and got the right playoff teams the 2nd time, so maybe it learned something? It seems like it might have: its selection of teams was correct this time, whereas the first time it went with “Common teams include…” and hallucinated teams that weren’t in.
I asked it again about Glendening and Grok decided to add him into its final final final answer:
So, overall a mixed bag. The 2nd time I asked, I told it to “start over” since I’d bungled my spelling of Glendening. It gave me the “correct” answer, but it totally blew the context that I was talking about hockey. I’ve noticed that AI can be VERY bad about understanding context; see my context note below. I’m almost willing to give it a pass on this, but not quite, since the answer was out of context. I also have no idea if the 3rd and final answer is even correct, since I had to tell it to add Glendening. If I had to give it a letter grade, it’d be a D, maybe even an F, since I’m not going to go through the rosters of 16 hockey teams to see who has played for my used-to-be-mighty Red Wings.
Both Gemini (Google) and Copilot (Microsoft) AIs wouldn't even attempt to answer the question.
Context note:
I have Gemini AI on my phone and it’s also hit or miss (mostly miss). I asked it to “Text wife” and it asked “What do you want to send?” I said “Please start a pot of coffee in 15 minutes.” It replied “I’m sorry, I can’t control kitchen appliances yet.”