DeepSeek AI Versions Breakdown: Everything You Need to Know
Overview
DeepSeek has developed multiple iterations of its large language models (LLMs), each iteration with its own purpose and improvements. Here, we describe the various versions, with a description and GitHub link for each.
DeepSeek Coder
Release Date: November 2023
Purpose: The first open-source model focused on programming tasks.
Description: DeepSeek Coder is an open-source series of code language models built to improve code comprehension in software engineering. These models are trained from scratch on a 2-trillion-token dataset that is 87% code and 13% natural language, in English and Chinese.
Github Repository: https://github.com/deepseek-ai/DeepSeek-Coder
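To make this concrete, here is a minimal sketch of code completion with one of these models via the Hugging Face transformers library. The deepseek-ai/deepseek-coder-1.3b-base checkpoint, the prompt, and the generation settings are illustrative choices, not the only way to run the model.

```python
# Minimal sketch: code completion with a small DeepSeek Coder checkpoint.
# Assumes the Hugging Face `transformers` library is installed and that
# the machine can hold a 1.3B-parameter model in memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).eval()

prompt = "# Write a function that checks whether a number is prime\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```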
DeepSeek LLM
Release Date: December 2023
Purpose: DeepSeek's first general-purpose model.
Description: DeepSeek LLM is a state-of-the-art language model with 67 billion parameters, trained from scratch on a 2-trillion-token dataset in English and Chinese. DeepSeek has also open-sourced DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat to encourage research.
Github Repository: https://github.com/deepseek-ai/DeepSeek-LLM
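Because the Chat variants are open-sourced, a chat interaction with the 7B checkpoint can be sketched as below; the model ID and generation settings are assumptions based on common transformers usage, not a prescription.

```python
# Minimal sketch: chatting with DeepSeek LLM 7B Chat via transformers.
# Assumes the deepseek-ai/deepseek-llm-7b-chat checkpoint and a GPU with
# enough memory for a 7B model (quantization may be needed otherwise).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-llm-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain Mixture of Experts briefly."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```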
DeepSeek V2
Release Date: May 2024
Purpose: To deliver stronger performance than its predecessor at lower training cost.
Description: DeepSeek V2 is a sophisticated open-source Mixture of Experts (MoE) model developed by DeepSeek AI, designed for economical training and efficient inference. The model has 236 billion parameters in total, of which 21 billion are activated for each token during processing.
Github Repository: https://github.com/deepseek-ai/DeepSeek-V2
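The efficiency claim follows directly from the activation numbers above: only a small fraction of the total parameters participate in any single token's forward pass. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope: share of DeepSeek V2 parameters active per token.
total_params = 236e9   # 236 billion parameters in total
active_params = 21e9   # 21 billion activated per token

print(f"Active per token: {active_params / total_params:.1%}")
# Active per token: 8.9%
```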
DeepSeek Coder V2
Release Date: July 2024
Parameters: 236 billion
Context Window: 128,000 tokens
Purpose: Tackling difficult programming challenges.
Description: DeepSeek Coder V2 is a high-quality open-source Mixture of Experts (MoE) code language model developed by DeepSeek AI, with the goal of matching GPT-4 Turbo performance on code-oriented tasks.
Github Repository: https://github.com/deepseek-ai/DeepSeek-Coder-V2
DeepSeek V3
Release Date: December 2024
Parameters: 671 billion
Context Window: 128,000 tokens
Purpose: A general-purpose Mixture of Experts model for versatile task handling.
Description: DeepSeek V3 is the latest version of the open-source Mixture of Experts (MoE) language model developed by DeepSeek AI. It was built for high performance while keeping training and inference efficient. DeepSeek V3 has 671 billion parameters in total, of which only 37 billion are activated for each token during processing.
Github Repository: https://github.com/deepseek-ai/DeepSeek-V3
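Beyond the open weights, DeepSeek serves V3 through a hosted, OpenAI-compatible API. The sketch below assumes the deepseek-chat model name and an API key exported as DEEPSEEK_API_KEY; check DeepSeek's API documentation for current details.

```python
# Minimal sketch: calling DeepSeek V3 through the OpenAI-compatible API.
# Assumes the `openai` Python package and a DeepSeek API key in the
# DEEPSEEK_API_KEY environment variable.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # the endpoint backed by DeepSeek V3
    messages=[{"role": "user", "content": "Summarize Mixture of Experts."}],
)
print(response.choices[0].message.content)
```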
DeepSeek R1
Release Date: January 2025
Parameters: 671 billion
Context Window: 128,000 tokens
Purpose: Advanced reasoning tasks, competing directly with OpenAI's models while being more cost effective.
Description: DeepSeek R1 is an open-source reasoning model created by the Chinese AI company DeepSeek. It targets text tasks that benefit from step-by-step reasoning, such as logical inference, mathematical problem solving, and decision making, at a lower cost than comparable models. DeepSeek uses this model to power the DeepThink mode of its chatbot, positioning the company as a competitor to ChatGPT.
Github Repository: https://github.com/deepseek-ai/DeepSeek-R1
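R1 can be reached through the same OpenAI-compatible API under a reasoning-specific model name. In this sketch, the deepseek-reasoner model name and the reasoning_content field (which exposes the model's chain of thought alongside the final answer) follow DeepSeek's API conventions and may change; the code falls back gracefully if the field is absent.

```python
# Minimal sketch: a reasoning request to DeepSeek R1 via the hosted API.
# Assumes the `openai` Python package and a DeepSeek API key in the
# DEEPSEEK_API_KEY environment variable.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "If 3x + 7 = 22, what is x?"}],
)
message = response.choices[0].message
# The reasoner endpoint has exposed the chain of thought separately from
# the final answer; guard the access in case the field is not present.
print(getattr(message, "reasoning_content", None))
print(message.content)
```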
Janus Pro 7B
Release Date: January 2025
Purpose: A multimodal model capable of understanding and generating images.
Description: Janus Pro 7B is a state-of-the-art open-source multimodal AI model created by DeepSeek that performs well in both image comprehension and text-to-image generation. In benchmarks, it has outperformed models such as OpenAI's DALL-E 3 and Stability AI's Stable Diffusion.
Github Repository: https://github.com/deepseek-ai/Janus
Ready to transform your business with our technology solutions? Contact us today to leverage our AI/ML expertise.