DeepSeek

DeepSeek is a Chinese artificial intelligence company that develops large language models and conducts research toward artificial general intelligence. Founded in July 2023 by [[liang-wenfeng]] as a spin-off of the quantitative hedge fund [[high-flyer]], the company is headquartered in Hangzhou, Zhejiang, and is privately held. DeepSeek's models are notable for achieving performance competitive with leading closed-source systems at a fraction of the development cost, relying on older, export-restricted hardware and on architectural innovations.

The company's model lineage began with [[deepseek-coder|DeepSeek Coder]] and [[deepseek-llm|DeepSeek-LLM]] in late 2023 and progressed through [[deepseek-v2]] (May 2024), which introduced the [[multi-head-latent-attention]] mechanism that dramatically reduced inference memory requirements. [[deepseek-v3]] (December 2024), a 671-billion-parameter mixture-of-experts model with a reported training cost of approximately $5.6 million, matched the performance of leading proprietary models. [[deepseek-r1]] (January 2025) demonstrated that advanced reasoning capabilities could be elicited through pure reinforcement learning using [[group-relative-policy-optimization]], and its release as a mobile app triggered a historic one-day selloff in U.S. technology stocks that erased approximately $600 billion from Nvidia's market capitalization alone. The R1 research paper became the first large language model publication to appear on the cover of Nature after peer review. Later releases include [[deepseek-v3|DeepSeek-V3.1 and V3.2]] with hybrid thinking modes and [[deepseek-sparse-attention]]. In January 2026, DeepSeek introduced the [[technologies/mhc-architecture|mHC architecture]], which was adopted in [[deepseek-v4]] (April 2026), a 1.6-trillion-parameter open-weight model. V4 Flash, the smaller variant with 284 billion parameters, became the most-used model on the OpenRouter platform, processing 7.99 trillion tokens in its first month. It also achieved 96.7% accuracy on the AIME 2026 mathematical reasoning benchmark.

In June 2026, DeepSeek completed its first external funding round, raising approximately 51 billion yuan ($7.4 billion) at a valuation exceeding $50 billion. This was the first external capital the company had accepted since its founding, and the valuation made it China's highest-valued AI startup. Investors included Tencent, CATL, NetEase, and JD.com, with founder Liang Wenfeng personally contributing 20 billion yuan. All external investors (except a state-backed fund) were required to invest through a limited partnership managed by Liang, with no voting rights and a five-year lock-up period, preserving his absolute control. The shift away from DeepSeek's long-standing self-funded model was driven by three pressures: rising computing costs for frontier models, talent retention challenges after the departure of several core researchers between late 2025 and early 2026, and the need to productize technology into revenue-generating services.

DeepSeek's open-source strategy, releasing models under the MIT License, has reshaped global AI competition by making frontier-level capabilities freely available and driving down industry pricing. The company complements this with a paid API service that maintains a significant cost advantage over Western rivals. In May 2026, the company formed a "Harness" coding agent engineering team led by former Jane Street engineer [[cui-tianyi|Cui Tianyi]], marking its strategic move from pure model research into end-user productivity tools. In June 2026, DeepSeek launched an image recognition mode on its web and mobile platforms, its first step into multimodal AI, and began grayscale (staged rollout) testing of V4.1, an upgraded version with enterprise tools and Model Context Protocol (MCP) support.