Cursor AI, a programming tool designed to generate and fix source code, has experienced a significant issue with its AI support bot. The bot, intended to assist users with queries, hallucinated a nonexistent policy limitation, causing confusion among users.
The Incident
Users of Cursor AI reported that they could not stay logged in to the subscription service on more than one machine at a time. When they asked about the problem, a reply from the company's support email address told them this was expected behavior. It was later revealed that the reply had been generated by an AI support bot, not a human.
Response from Cursor AI
Michael Truell, co-founder of Anysphere, the company behind Cursor AI, acknowledged the problem and apologized for the confusion. He stated that no policy limits logins to a single machine and that users are free to use Cursor on multiple machines. Truell explained that the AI support bot had given an incorrect response and that the company was investigating the matter.
Investigation and Resolution
Truell revealed that the issue was related to a change made to improve session security, which appeared to have caused unintended session invalidation. He also announced that any AI responses used for email support would be clearly labeled as such going forward. The logout bug, ultimately traced to a race condition that occurred on slow connections, has since been fixed.
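Anysphere has not published the technical details of the bug, but a race of this general shape is easy to reproduce. The minimal Python sketch below is entirely hypothetical: it assumes, for illustration, that the security change rotated a user's session token on every authenticated request. On a slow connection, a second in-flight request can arrive after the first has already rotated the token, so it carries a now-stale credential and is rejected as a forced logout.

```python
import asyncio

# Hypothetical server-side session store: one valid token per user.
# Assume the security-hardening change rotates the token on each request.
sessions = {"user-1": "token-A"}

async def handle_request(user: str, token: str, delay: float) -> str:
    # Simulate network latency: the request reaches the server after `delay` seconds.
    await asyncio.sleep(delay)
    if sessions.get(user) != token:
        return "401 logged out"  # stale token -> spurious forced logout
    # Rotate the token on every authenticated request (the assumed new behavior).
    sessions[user] = f"token-after-{delay}"
    return "200 ok"

async def main():
    # Two requests sent concurrently with the same, then-valid token.
    # The slower request lands after the faster one has rotated the token,
    # so it is rejected: a race condition triggered by connection latency.
    results = await asyncio.gather(
        handle_request("user-1", "token-A", delay=0.1),
        handle_request("user-1", "token-A", delay=0.5),
    )
    print(results)  # ['200 ok', '401 logged out']

asyncio.run(main())
```

A common mitigation for this class of bug is to keep accepting the previous token for a short grace window after rotation, so overlapping requests from the same client do not invalidate each other.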
Expert Insights
Marcus Merrell, principal technical advisor at Sauce Labs, said the incident exposed two main problems with the support bot: hallucinations and non-deterministic results. For a support bot, he argued, such failures are unacceptable, and simply labeling AI-generated responses as such may not be enough to win back user loyalty.
Broader Implications
The incident underscores the ongoing challenge of hallucinations in AI development. As noted in Nature, hallucinations cannot be stopped, only managed. The Hallucination Leaderboard hosted on the Hugging Face model repository documents the phenomenon, comparing how different AI models perform on hallucination benchmarks.