DeepSeek's AI models explained: Find out how V3, R1, Janus-Pro-7B function

DeepSeek unveiled its new multimodal AI model Janus-Pro-7B on Monday. This model can handle various types of media, including images.

Chinese AI startup DeepSeek is making waves with its latest trio of AI models: V3, R1, and Janus-Pro-7B. According to reports, the company is pushing the boundaries of artificial intelligence while challenging Silicon Valley's dominance. Here is how these models function.

DeepSeek AI V3 Model

DeepSeek launched its V3 model in late December. According to NBC News, V3 was developed with limited computing power because of US restrictions on the export of top-tier chips. It showed that AI progress doesn't always require billions in investment or the most advanced hardware: V3 took just two months and less than $6 million to develop, a fraction of what American tech giants spend on similar projects, NBC News reported. The model performs on par with Claude 3.5 Sonnet, the report added.

DeepSeek R1 Model

Building on V3's success, DeepSeek released its next major model, R1, in January. R1 is powered by the V3 large language model, reported Business Plus. It adds advanced reasoning capabilities, particularly in logical inference and problem-solving. Because it can articulate its reasoning before providing answers, R1 can tackle complex tasks like math equations and decision-making, as per the Business Plus report. The model has outperformed OpenAI's o1 in benchmarks such as the American Invitational Mathematics Examination (AIME) 2024 and on the Chatbot Arena leaderboard at UC Berkeley, according to the report. Its open-source nature makes it a game-changer, allowing anyone to study, replicate, or improve upon its capabilities, potentially shaking up the entire AI revenue model, reported NBC News.

DeepSeek's Janus-Pro-7B Model

After R1's success, DeepSeek launched Janus-Pro-7B on Monday, according to NBC News. It is a multimodal model that can process a variety of media types, including images. Janus-Pro-7B takes things a step further by combining the power of unified models with task-specific performance, outpacing previous unified models and challenging specialized ones, as per the report.

FAQs

Q1. How much did DeepSeek spend on building the V3 model?
A1. DeepSeek developed the V3 model in just two months and spent less than $6 million on it, a fraction of what American tech giants spend on similar projects.

Q2. What is different in DeepSeek’s Janus-Pro-7B model?
A2. Janus-Pro-7B is a multimodal model that can handle different types of media, such as images. It combines the flexibility of unified models with task-specific performance, outpacing previous unified models and challenging specialized ones.