Searched for

AI BENCHMARKS IN MATHEMATICS

- Humans outperform AI at this highly rigorous mathematics test
  A new AI test called First Proof used unpublished math problems. AI systems were tested against human mathematicians. While AI showed some ...
  14 Jun, 2026, 07.36 PM IST
- Google unveils Gemma 4, expands lightweight open model lineup for developers
  The Gemma 4 model offers capabilities such as advanced reasoning, agentic workflows, coding, and support for over 140 languages. The models...
  25 May, 2026, 04.04 PM IST
- US stock market crashes today: Dow Jones, S&P 500, and Nasdaq fall hard as bond yields hit 19-year high - Nvidia, Broadcom, AMD, and Amazon lead massive Wall Street selloff
  US stock market crashes today as Dow Jones, S&P 500 and Nasdaq plunge. The Dow Jones Industrial Average dropped more than 237 points, while...
  19 May, 2026, 09.30 PM IST
- Elon Musk earned more last year than most countries — Tesla's $158 billion pay filing has everyone stunned
  Elon Musk Tesla compensation $158 billion stunned markets with a record-setting payout tied fully to performance. On paper, it is the large...
  01 May, 2026, 07.40 PM IST
- OpenAI launches GPT-5.5 with API pricing starting at $5 per 1 million tokens
  OpenAI has launched GPT-5.5, its most advanced AI model yet, designed for complex, multi-part tasks and acting as an active collaborator. T...
  24 Apr, 2026, 12.09 PM IST
- Researchers unveil ‘Humanity’s Last Exam,’ a test so difficult that today’s AI systems consistently fail it
  Humanity’s Last Exam (HLE) is a groundbreaking 2,500-question assessment created to reveal the limits of advanced AI systems. Developed by ...
  26 Feb, 2026, 08.51 PM IST
- Sarvam unveils two new large language models focused on real-time use, advanced reasoning
  The company said the model is optimised for “efficient thinking”, delivering stronger responses while using fewer tokens — a key factor in ...
  18 Feb, 2026, 02.15 PM IST
- 'Basically zero, garbage': Renowned mathematician Joel David Hamkins declares AI models useless for solving math. Here's why
  One of the world's biggest mathematicians, Joel David Hamkins, has slammed AI models used for solving mathematics and called them basically z...
  06 Jan, 2026, 10.20 AM IST
- Anthropic launches new vibe coding model Claude Sonnet 4.5: All you need to know
  Anthropic launched Claude Sonnet 4.5, its latest AI coding model, which it claims has improved coding, reasoning, and mathematical skills. ...
  30 Sep, 2025, 12.57 PM IST
- High school maths trips Olympiad gold medalist AI models: Google DeepMind CEO answers why
  Demis Hassabis, Google DeepMind CEO, notes AI models like Gemini paradoxically struggle with basic math despite acing advanced Olympiad pro...
  13 Aug, 2025, 11.30 PM IST
- Award-winning variant of Gemini's AI model is live, confirms CEO Sundar Pichai
  The Gemini 2.5 Deep Think model has been rolled out to a small group of mathematicians and academics to take early feedback and enhance the...
  01 Aug, 2025, 09.13 PM IST
- Musk says xAI will make kid-friendly app 'Baby Grok'
  Elon Musk’s AI firm xAI is set to launch "Baby Grok," a separate, kid-friendly chatbot app. Announced via Musk’s X post, the move follows e...
  20 Jul, 2025, 09.50 AM IST
- Can AI quicken the pace of math discovery?
  DARPA’s new Exponentiating Mathematics programme aims to accelerate pure mathematics research by developing AI tools capable of high-level ...
  21 Jun, 2025, 10.30 AM IST
- Telegram, Elon Musk's xAI partner to distribute Grok to messaging app's users
  Telegram CEO Pavel Durov announced a partnership with Elon Musk's AI company xAI to distribute its chatbot Grok to Telegram’s billion-plus ...
  28 May, 2025, 10.05 PM IST
- Sarvam AI unveils multilingual LLM; tepid response poses questions over India’s AI chops
  Sarvam AI launched Sarvam M, a 24-billion-parameter model, built on top of French AI company Mistral's model Small. The startup said the mo...
  27 May, 2025, 01.43 PM IST
- Google unveils Gemini 2.5, claims enhanced reasoning and coding capabilities
  Google describes Gemini 2.5 as a “thinking model”, capable of reasoning through its processes before responding, leading to enhanced accura...
  25 Mar, 2025, 11.46 PM IST
- Alibaba shares surge after it unveils reasoning model
  Qwen, the ecommerce leader's artificial intelligence unit, said on X that its QwQ-32B, with 32 billion parameters, can achieve performance ...
  06 Mar, 2025, 01.17 PM IST
- AI war heats up: OpenAI’s GPT-4.5 tops charts, but Elon Musk says its reign won’t last — Is xAI ready to strike?
  Elon Musk has responded to a report by AI benchmarking platform LMArena, which ranked OpenAI’s GPT-4.5 as the top-performing AI model acros...
  04 Mar, 2025, 08.30 PM IST
- Grok to be a standalone app for macOS, Windows
  The chatbot operated by Musk's AI company xAI has been in the news for releasing its latest version, Grok 3, which claims to be the world's...
  19 Feb, 2025, 04.42 PM IST
- Alibaba releases QwQ-32B-Preview, an AI rival to OpenAI's o1
  This model is focused on advancing AI reasoning capabilities. In contrast to most AI, QwQ-32B-Preview and similar models can fact check the...
  28 Nov, 2024, 04.33 PM IST