[In-depth] Behind DeepSeek's Success: A Profile of China's Most Technical AI Company
"A Shot Was Fired in the AI Price War": Interview with DeepSeek CEO
xyzlabs.substack.com
I am not sure about that. The conspiracist in me says that somebody profited from the panic selling of overvalued AI stocks. All the big players in AI (OpenAI, Anthropic, Palantir, Microsoft, Tesla) and the AI-hardware names (Nvidia, AMD, SMCI) are vastly overvalued, to the tune of trillions. So if you can trigger panic selling, you can make hundreds of billions in a day.

The mechanics of building another DeepSeek R1 have been well understood for months. The clever part is that they documented, step by step, an easy recipe using only open-source ingredients that any average computer science undergraduate can follow. You can build your own 'DeepSeek Rx' on a good home PC with 16 GB+ of VRAM using Ollama or a similar tool; the recipe is on many GitHub repositories. One guy on YouTube did just that as a demo: he trained his 1.5B Qwen model for reasoning in just 15 minutes on his PC.
GitHub - chrishayuk/chuk-math
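The local-inference half of that recipe is a few commands. A minimal sketch, assuming Ollama is installed and using the `deepseek-r1:<size>` tags from the Ollama model library (the example prompt is illustrative, not from the source):

```shell
# Fetch the 1.5B Qwen-distilled variant of DeepSeek-R1
# (small enough for most home PCs, even CPU-only)
ollama pull deepseek-r1:1.5b

# Run a one-shot reasoning prompt against it
ollama run deepseek-r1:1.5b "Is 9.11 larger than 9.9? Think step by step."

# Larger distills trade speed for quality; this is where 16 GB+ VRAM helps
ollama pull deepseek-r1:14b
```

The fine-tuning half (reproducing the R1-style reasoning training on a small Qwen base model) is what the linked GitHub projects and the YouTube demo walk through.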
Quote from @HaraldEngels (2 days ago):
I have been using DeepSeek releases for over 9 months. The results have been great the whole time, and they keep getting better. I run all the Qwen-based DeepSeek R1 models locally on my Linux PC, and they are all great. The 1.5B model works fantastically when you use it in the q16 variant. It is a real killer. Inference is not very fast, since I run all the models (from 1.5B up to 32B) on my CPU, a Ryzen 5 8600G, WITHOUT a dedicated GPU. The CPU uses up to 40GB of my 64GB of RAM for the 32B model. With good prompting the results are fantastic and save me hours of work every day. The dynamic memory allocation of the 8600G is great and lets me run powerful LLMs on a small budget. My PC cost me $900.
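The ~40GB figure for the 32B model is roughly what a back-of-the-envelope estimate predicts: weights at the quantized bit width, plus runtime overhead for the KV cache and buffers. A minimal sketch (the helper name and the ~20% overhead factor are my assumptions, not from the comment):

```shell
# Rough RAM estimate for running a quantized LLM:
# params (billions) * bits-per-weight / 8 bits-per-byte, plus ~20% overhead
# for KV cache and runtime buffers (assumed factor, not a measured value).
estimate_gb() {
  params_b=$1
  bits=$2
  awk -v p="$params_b" -v b="$bits" 'BEGIN { printf "%.1f\n", p * b / 8 * 1.2 }'
}

estimate_gb 32 8    # 32B at 8-bit: ~38.4 GB, in line with the ~40GB observed
estimate_gb 1.5 16  # 1.5B at 16-bit: ~3.6 GB, easy for any modern PC
```

This is why the 64GB machine handles the 32B model: the weights alone fit with room to spare, and unified memory on the 8600G lets the model use ordinary system RAM instead of dedicated VRAM.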