A New Revolution in the AI Race: Deepseek vs. America’s Dominance!

Rafiul Sabbir

America has left no stone unturned in flexing its power to compete with China. China was about to capture the networking-related market through Huawei—banned. TikTok became the top social media choice among American youth, but since it’s not an American company and thus beyond their monitoring and informational control—banned. Meanwhile, China has also banned most popular American apps in their country, and they have Chinese versions of practically every app.

But America’s biggest blow to China has come over semiconductors. To keep China from advancing in the semiconductor (and AI) race, American giants (Nvidia, AMD, Intel) are barred from selling their most advanced chips to China. On top of that, to prevent China from manufacturing its own chips, extensive restrictions have been placed on Taiwan’s TSMC (Taiwan Semiconductor Manufacturing Company Limited), the world’s largest contract chip manufacturer, and on the Dutch company ASML, which makes the most sophisticated chip-manufacturing equipment. In short, America is using every tool at its disposal to block China’s progress.

Of course, China isn’t sitting idle either. Chinese buyers have reportedly set up shell companies to purchase chips from these American companies in other countries and bring them into China, and they have cracked a lot of things through reverse engineering, which they’ve turned into an art form. But this time, what they’ve done might be the biggest and most impactful feat they’ve accomplished so far.

A Chinese startup has developed an LLM (Large Language Model) that matches or outperforms contemporary LLMs (like ChatGPT, Claude, Llama) on several benchmarks, and the company reports spending only about $6 million over two months on the final training run of the underlying base model. That figure covers compute only and excludes earlier research and experiments. Even so, the leading LLMs mentioned above are estimated to have cost many times more, meaning Deepseek has built a competitive model at a fraction of the cost of American models.

The model is called Deepseek-R1, and the company behind it is Deepseek.

There are many reasons why Deepseek is causing such a stir.

1) As I mentioned above, since Nvidia can’t sell its latest GPUs in China, Deepseek trained its model on less capable Nvidia chips (reportedly the H800, a cut-down version made for the Chinese market). With no room to improve the hardware, they optimized the software instead, writing code that consumes less memory and squeezes every bit of capability out of the chips they had.

2) Normally, when training an AI model, every parameter takes part in every step, even the parts that contribute little to the input at hand, which wastes a massive amount of resources. Deepseek takes a different route. Its model is a Mixture-of-Experts: a router sends each token to only a handful of specialist sub-networks (“experts”), so only about 5% of the parameters (roughly 37 billion out of 671 billion) are active for any given token. A technique called Auxiliary-Loss-Free Load Balancing keeps the work spread evenly across the experts. Far less computation per token means lower cost and faster training, although this is not the same as a literal 95% drop in GPU usage.
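To make point 2 a little more concrete, here is a minimal sketch of the idea as I understand it: the model is split into specialist sub-networks (“experts”), a router picks only a few of them for each token, and a per-expert bias is nudged up or down so that no expert is overloaded. Everything in it is an assumption chosen for illustration (the expert count, the noise model, the step size `gamma`, and the function names `route_top_k` and `simulate`); it is a toy simulation, not Deepseek’s code.

```python
import random


def route_top_k(scores, bias, k):
    """Return the indices of the k experts with the highest score + bias."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    return ranked[:k]


def simulate(num_experts=8, k=2, tokens_per_step=500, steps=60, gamma=0.05, seed=0):
    """Track load imbalance (busiest expert / average) as the biases adapt."""
    rng = random.Random(seed)
    # Assumed skew: experts with a higher index look more attractive to the router.
    popularity = [i / (num_experts - 1) for i in range(num_experts)]
    bias = [0.0] * num_experts
    imbalance = []
    for _ in range(steps):
        load = [0] * num_experts
        for _ in range(tokens_per_step):
            scores = [p + rng.gauss(0.0, 0.5) for p in popularity]
            for expert in route_top_k(scores, bias, k):
                load[expert] += 1  # only k of num_experts experts run for this token
        average = sum(load) / num_experts
        imbalance.append(max(load) / average)
        # No extra loss term: nudge each bias against that expert's current load.
        for i in range(num_experts):
            if load[i] > average:
                bias[i] -= gamma
            elif load[i] < average:
                bias[i] += gamma
    return imbalance


history = simulate()
print(f"imbalance at start: {history[0]:.2f}, at end: {history[-1]:.2f}")
```

In this toy setup only 2 of 8 experts run per token, and the ratio between the busiest expert and the average should shrink as the biases adapt. In the real method the bias is used only to decide which experts are chosen, a detail this sketch leaves out along with the experts themselves.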

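3) Inference, the process where an AI model generates outputs, requires a huge amount of memory, much of it for the key-value (KV) cache that stores attention data for every token processed so far, and that memory is expensive. Deepseek uses a compression technique (Low-Rank Key-Value (KV) Joint Compression) that stores a much smaller compressed vector per token and rebuilds the keys and values from it when needed. This minimizes memory usage, speeds up output, and cuts costs; it’s a win-win all around.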
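A rough sketch of the low-rank compression idea in point 3 might look like the following. Instead of caching a full key and a full value vector for every token, the cache keeps one small latent vector per token and expands it into keys and values on demand. The class name `LatentKVCache`, the helper names, the random projection matrices, and all dimensions are hypothetical; a real model learns these matrices during training and has further details (such as positional encoding) that are omitted here.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights: small random values."""
    return [[rng.gauss(0.0, 1.0) / cols ** 0.5 for _ in range(cols)] for _ in range(rows)]


def full_cache_floats(num_tokens, d_kv):
    """Numbers stored by an ordinary cache: one key and one value vector per token."""
    return 2 * num_tokens * d_kv


class LatentKVCache:
    """Caches one compressed vector per token; keys and values are rebuilt on demand."""

    def __init__(self, d_model, d_latent, d_kv, seed=0):
        rng = random.Random(seed)
        self.w_down = random_matrix(d_latent, d_model, rng)  # joint down-projection
        self.w_up_k = random_matrix(d_kv, d_latent, rng)     # latent -> key
        self.w_up_v = random_matrix(d_kv, d_latent, rng)     # latent -> value
        self.latents = []

    def append(self, hidden):
        # Only the small latent vector is kept in memory for this token.
        self.latents.append(matvec(self.w_down, hidden))

    def keys_values(self):
        keys = [matvec(self.w_up_k, c) for c in self.latents]
        values = [matvec(self.w_up_v, c) for c in self.latents]
        return keys, values

    def floats_cached(self):
        return sum(len(c) for c in self.latents)


cache = LatentKVCache(d_model=32, d_latent=4, d_kv=16)
token_rng = random.Random(1)
for _ in range(10):
    cache.append([token_rng.gauss(0.0, 1.0) for _ in range(32)])
keys, values = cache.keys_values()
print(cache.floats_cached(), "numbers cached instead of", full_cache_floats(10, 16))
```

The saving comes from the ratio between the latent size and the combined size of a key and a value; the price is a little extra computation each time the keys and values are rebuilt.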

4) Instead of training the Deepseek model with all types of tasks in the traditional way, they trained it on tasks where results can be verified. For example, if you ask it to write a piece of code, it provides the code as output. If correct, it’s praised—“good job”—and next time, it’s told to solve similar problems the same way. If incorrect, the mistakes are pointed out, and it’s told to try again until it gets it right.

Doesn’t this seem like a common process? It’s just like how we learn through trial and error from childhood! When we solve a math problem, we remember the process so we can solve similar ones in the future. If we make a mistake, someone points it out and we try again until we get the right answer. This is called reinforcement learning.

Deepseek uses this trial-and-error method to train its model. This not only makes the model smarter quickly, but also boosts its reasoning and problem-solving abilities over time.
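The trial-and-error loop described in point 4 can be sketched in a few lines. In this toy version the “model” only has to learn which of three strategies solves an addition problem; a verifier checks each answer, and strategies that earn a reward become more likely to be chosen. The strategies, the learning rate, and the simple preference update are all assumptions made for illustration; this is a much simpler stand-in, not the training algorithm Deepseek uses.

```python
import math
import random

# Hypothetical candidate strategies the toy "model" can try on a problem (a, b).
STRATEGIES = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
}


def softmax(prefs):
    """Turn preference scores into probabilities."""
    peak = max(prefs.values())
    exps = {name: math.exp(p - peak) for name, p in prefs.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def verify(a, b, answer):
    """Rule-based check: reward 1 only when the answer is verifiably correct."""
    return 1.0 if answer == a + b else 0.0


def train(episodes=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    prefs = {name: 0.0 for name in STRATEGIES}
    baseline = 0.0  # running average reward, used to judge "better than usual"
    for step in range(1, episodes + 1):
        probs = softmax(prefs)
        names = list(probs)
        choice = rng.choices(names, weights=[probs[n] for n in names])[0]
        a, b = rng.randint(1, 9), rng.randint(1, 9)
        reward = verify(a, b, STRATEGIES[choice](a, b))
        advantage = reward - baseline
        # Raise the chosen strategy if it did better than usual, lower it otherwise.
        for name in prefs:
            chosen = 1.0 if name == choice else 0.0
            prefs[name] += lr * advantage * (chosen - probs[name])
        baseline += (reward - baseline) / step
    return softmax(prefs)


final_probs = train()
print({name: round(p, 3) for name, p in final_probs.items()})
```

Nobody tells the policy that “add” is the right strategy; it drifts there because only correct answers pass the check. That is why tasks with verifiable results are so convenient for this kind of training.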

5) They’ve open sourced the model under the MIT license. This means anyone in the world—individuals or organizations—can use the LLM for free, modify it, and develop their own products. That’s a big deal. When such a powerful model is publicly available, the biggest beneficiaries are small entrepreneurs and researchers who can carry out their projects or research at little to no cost.

Ironically, when people like Elon Musk started OpenAI, their goal was similar: to democratize AI, as reflected in the name. Yet OpenAI has since added a for-profit arm and has been moving toward becoming a for-profit company.

What Deepseek has done will go down as a major milestone in history. They have shown, with crystal clarity, how incredible things can be accomplished with just what you have—a feat that can amaze the entire world, at a fraction of the cost. And instead of commercializing this extraordinary achievement 100%, they’ve open sourced it so everyone around the globe has access.

The company was founded by Liang Wenfeng, a 40-year-old who also runs a quant trading hedge fund, High-Flyer. They reportedly first developed this model to automate some of the mathematical aspects of their own quant trading operations. Later, when the model became stable and started outperforming other models, they released it to the world as open source. The mathematical brains behind the model come from two of China’s top universities, Peking University and Tsinghua University. So, this is not some haphazard creation; there are some very sharp minds behind it.

With the arrival of Deepseek, the two companies facing the most immediate risk are OpenAI and Nvidia.

Deepseek’s prices are several times lower than what OpenAI charges for its premium models. That means OpenAI will need to lower prices to remain competitive. However, since OpenAI’s training and inference costs are much higher, cutting prices will deepen its losses rather than reduce them. How OpenAI handles this moving forward is crucial; if it can’t adapt, it’s in trouble.

Nvidia also faces a setback in selling expensive GPUs. In my view Nvidia is overvalued, and its sky-high valuation rests on the near-monopoly its GPUs hold in the AI world. Now that Deepseek has demonstrated you don’t necessarily need the fanciest GPUs for everything, many other companies are likely to follow suit, and we can expect even better, more affordable models in the future. Going forward, companies will focus on how to train models using affordable GPUs. If that happens, will Nvidia’s revenue remain as high as before?

Since OpenAI is not a public company, Nvidia is bearing the brunt of the public market reaction. Yesterday (January 27, 2025), its stock price dropped 17% and its valuation shrank by $593 billion in a single day, the largest one-day loss of market value for any company in U.S. stock market history. Of course, this is a short-term market reaction, and in my personal (not financial) opinion, Nvidia’s price will bounce back. The real question is how OpenAI responds.
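For a sense of scale, the two figures above can be combined with some back-of-the-envelope arithmetic. The sketch below uses only the rounded numbers quoted in this article (a 17% drop and a $593 billion loss), so the results are approximations, and the function names are just illustrative.

```python
def implied_market_caps(loss_billion, drop_fraction):
    """Estimate the valuation before and after a drop from the loss and the percentage."""
    before = loss_billion / drop_fraction
    after = before - loss_billion
    return before, after


def gain_needed_to_recover(drop_fraction):
    """Fractional rise needed to get back to the price before the drop."""
    return drop_fraction / (1.0 - drop_fraction)


before, after = implied_market_caps(593, 0.17)
recovery = gain_needed_to_recover(0.17)
print(f"before: about ${before:,.0f} billion, after: about ${after:,.0f} billion")
print(f"gain needed to bounce back fully: {recovery:.1%}")
```

In other words, the quoted figures imply a company worth roughly $3.5 trillion before the drop, and a full bounce back needs a rise of a little over 20%, not 17%, because the rise starts from a lower base.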

Now, some musings about the future:

1) America will be relentless in trying to give Deepseek trouble. Since Deepseek is open source under the MIT license, a permissive license that originated at an American university, it will be difficult to ban it on security grounds the way Huawei and TikTok were, but they will try everything.

If nothing else works, they might try to shut it down by brute force. This might also lead to increased anti-Chinese sentiment in America.

2) As soon as Trump took office, OpenAI announced a $500 billion ‘Stargate’ project with Oracle and SoftBank. Stargate is a private joint venture rather than legislation, so it does not go through the Senate, but no matter how it looks from the outside, Microsoft has a major stake in its success as OpenAI’s largest investor and a technology partner in the project. Deepseek’s emergence will put even more pressure on Microsoft to make the Stargate project a success.

There are also rumors that Microsoft’s AI Chief Mustafa Suleyman and Sam Altman don’t get along, so it’ll be interesting to see how Microsoft manages this personality clash and builds something better than Deepseek. Maybe we’ll see one of them leaving Microsoft/OpenAI in the coming days.

3) This is a huge deal for AI researchers. With a model of this caliber being open source, AI research will advance much faster, and we’ll soon see research on how to train large-scale AI and run inference on commodity hardware.

AI will become even cheaper, more affordable, and widely accessible.

4) There will be a flood of garbage and scam apps claiming to use Deepseek, or actually using it, as a way to swindle people for money with promises involving crypto, day trading, forex trading, multi-level marketing, and more.

5) China still (apparently) hasn’t managed to produce 3-nanometer chips due to TSMC and ASML restrictions. If they can crack this, and do so within America’s constraints, that will be another huge milestone.

What China will do next if that happens remains to be seen.

6) China has made significant progress in another technology (the Next Big Thing): quantum computing. Since they’re a black box, it’s hard to know from the outside exactly what they’re up to. Unless China announces it, it’s tough to say, but if China achieves a breakthrough in quantum computing by 2025/26, that would be even more threatening for the giant companies (Google, IBM) and the United States itself.

I’ll finish with a story shared by an older brother. One day, in conversation, he said that the way America bosses China around reminds him of how some people in our country treat house helpers or stray cats and dogs: being rude just because they can, even when it’s unnecessary. The day China gets hold of the baton of power, it will dish it back to America so hard that the U.S. won’t even have time to recover. China will repay America for all this, with interest.

The question is—when will the baton be passed? And even if it is, will we ordinary people see real peace? Not the American-style “Middle Eastern peace,” but genuine peace?

NB: Collected from Facebook: https://www.facebook.com/share/p/17UVqrJsSh/

Editor: মোঃ মঞ্জুরুল ইসলাম

Biggani.org connects young audiences with researchers' stories and insights, cultivating a deep interest in scientific exploration.

Copyright 2024 biggani.org