DeepSeek
DeepSeek is a Chinese artificial intelligence company that develops large language models and conducts research toward artificial general intelligence. Founded in July 2023 by [[liang-wenfeng]] as a spin-off of the quantitative hedge fund [[high-flyer]], the company is headquartered in Hangzhou, Zhejiang, and is privately held. DeepSeek's models are notable for achieving performance competitive with leading closed-source systems at a fraction of the development cost, relying on older, export-restricted hardware and architectural innovations.
The company's model lineage began with [[deepseek-coder|DeepSeek Coder]] and [[deepseek-llm|DeepSeek-LLM]] in late 2023 and progressed through [[deepseek-v2]] (May 2024), which introduced the [[multi-head-latent-attention]] mechanism that dramatically reduced inference memory requirements. [[deepseek-v3]] (December 2024), a 671-billion-parameter mixture-of-experts model trained for approximately $5.6 million, matched the performance of leading proprietary models. [[deepseek-r1]] (January 2025) demonstrated that advanced reasoning capabilities could be elicited through pure reinforcement learning using [[group-relative-policy-optimization]], and its release as a mobile app triggered a historic one-day selloff in U.S. technology stocks that erased approximately $600 billion from Nvidia's market capitalization alone. The R1 research paper became the first large language model publication to appear on the cover of Nature after peer review. Later releases include [[deepseek-v3|DeepSeek-V3.1 and V3.2]] with hybrid thinking modes and [[deepseek-sparse-attention]], and [[deepseek-v4]] (April 2026), which at 1.6 trillion parameters is the largest open-weight model available.
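The group-relative idea behind [[group-relative-policy-optimization]] can be illustrated with a minimal sketch: several answers are sampled for the same prompt, and each answer's reward is scored relative to the mean and standard deviation of its own group rather than against a separately learned value model. The function name, the example rewards, the use of the population standard deviation, and the `eps` stabilizer are all assumptions made for illustration; this is not DeepSeek's training code and covers only the advantage computation, not the full policy update.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (hypothetical helper).

    Each advantage is (reward - group mean) / (group std + eps).
    The population standard deviation and eps are illustrative choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt:
# 1.0 means the answer was judged correct, 0.0 means incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Answers above the group mean get positive advantages, those below get negative.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

When every answer in a group receives the same reward, all advantages in this sketch come out as zero, so that group would contribute no learning signal.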
DeepSeek's open-source strategy of releasing models under the MIT License has reshaped global AI competition by making frontier-level capabilities freely available and driving down industry pricing. The company complements this with a paid API service that maintains a significant cost advantage over rival providers. In May 2026, DeepSeek made its V4-Pro price cut permanent, making the model 7 times cheaper on inputs and 17 times cheaper on outputs than comparable Western frontier models. In June 2026, Tencent Cloud became the first major Chinese cloud platform to align pricing with DeepSeek's official rates, slashing V4-Pro cache-hit prices by up to 97.5 percent. By late May 2026, the V4-Flash model had become the most-used model on the OpenRouter platform, and three DeepSeek models ranked in the top nine by token consumption.
In April 2026, DeepSeek began its first external funding round. By June 2026, the round had reached $7.4 billion at a valuation between $52 billion and $59 billion, making it among the largest first-time financing rounds in technology history. Participants included Tencent, CATL, and the state-backed National AI Industry Investment Fund, with founder Liang Wenfeng personally contributing approximately $2.94 billion. The shift away from DeepSeek's long-standing self-funded model was driven by rising computing costs, talent retention challenges, and the need to productize its technology. In May 2026, the company confirmed the formation of a "Harness" coding agent engineering team, marking its strategic move from pure model research into end-user productivity tools.
DeepSeek maintains a small team, which grew from around 160 employees in 2025 to approximately 317 by April 2026, and a flat, academia-inspired culture that prioritizes talent density. The company's models have drawn scrutiny for systematic censorship of politically sensitive topics, and the company has faced allegations of model distillation from U.S. AI firms, which it formally addressed in the peer-reviewed supplementary materials of its R1 paper. The V4 release included a significant technical investment in reducing dependence on Nvidia's ecosystem: more than 200 CUDA operators were rewritten for Huawei's Ascend platform, an effort totaling 30 person-years, and the model achieved native FP4 inference on the Ascend 950 series.