---------------------------------------------------------
Update on February 2nd 2025
---------------------------------------------------------
In the original version of this blog, I shared my brief take on DeepSeek's breakthrough; here is the longer version. I thought it was important to write this text not as a market update (especially since the S&P 500 has already recovered more than half of the move it made on Monday), but because this event may represent a paradigm shift for AI investors. And as I mentioned in my original post, while I'd have nothing to say about a novel oil drilling technology, I do have some opinions when it comes to AI.
Recap
As a recap: over the last weekend, DeepSeek AI, a Chinese AI startup, took the tech world by surprise with the release of its new AI model, R1. The model was not only claimed to be highly competitive with OpenAI's top-of-the-line reasoning models but was also developed and operated at a fraction of the cost and with significantly less computational power.
The buzz around DeepSeek's capabilities was amplified by high-profile endorsements on social media. Tech investor Marc Andreessen described DeepSeek's R1 model as "one of the most amazing and impressive breakthroughs" he had seen, while Chamath Palihapitiya praised its ability to reason step-by-step without massive supervised datasets.
DeepSeek's model rapidly gained popularity, surpassing ChatGPT in downloads on Apple's U.S. App Store. This led to a significant market reaction on Monday, with a broad sell-off in tech stocks. Nvidia, a key player in AI hardware, saw its stock plummet nearly 17%, wiping out almost $600 billion in market value—marking the largest one-day loss for any company in Wall Street history at that time. The Nasdaq fell by 3.1%, reflecting widespread investor concerns over U.S. tech companies' dominance in AI.
The event raised questions about the sustainability of the AI boom, particularly the high costs associated with AI development in the U.S., and whether cheaper, effective alternatives could emerge from China.
More importantly, DeepSeek’s low-cost development demonstrated that the barrier to entry in AI is far lower than previously thought, shaking the conviction that only mega-cap giants like Google or Meta were poised to dominate the space. If advanced AI can be built with fewer resources, it could fundamentally alter the competitive landscape, attracting new players and challenging the monopolistic grip of tech giants.
It was seen as a potential challenge to the narrative that AI development requires enormous resources. DeepSeek's approach suggested that advanced AI capabilities might be achievable with less computational power, potentially affecting the demand for high-end chips and the infrastructure built around AI.
AI Breakthrough?
One of the first questions that spread across the web Monday morning was: did DeepSeek really use so little GPU power to train its latest model (reportedly about 10,000 previous-generation Nvidia A100 chips)? Some respected figures claimed the company actually used five times more than what was disclosed. This theory gained traction when Elon Musk retweeted the story on X.
I have no way to verify this claim, and it could very well be pure speculation. I understand where the skepticism comes from—doubting Chinese achievements when they seem too good to be true is perhaps a natural reflex, given that China hasn’t historically been the most transparent nation. However, we live in an era where conspiracy theories are everywhere, so let’s not fall into that trap. There is simply no evidence to support claims of a fraudulent breakthrough, so let’s assume the disclosed information is accurate—which, in fact, I genuinely hope it is (you will see why later).
So does this mean DeepSeek has upended the current AI market thesis?
Same story ever again
This is not AI's first major leap. If I were to simplify AI history as much as possible, it would look like this:
AI's key milestones started with the artificial neuron and the perceptron, introduced in 1958 by Frank Rosenblatt. But skepticism kicked in when MIT professors Marvin Minsky and Seymour Papert pointed out in their 1969 book Perceptrons that single-layer perceptrons couldn't even solve something as simple as XOR (a basic logical function that outputs true, or 1, only when its two inputs differ, and false, or 0, when they are the same). That pretty much killed neural network research funding for a while.
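To make the XOR limitation concrete, here is a small Python sketch (my own illustration, not anything from Minsky and Papert's formal proof). A brute-force search over a grid of weights shows that a single linear threshold unit can classify at most 3 of the 4 XOR cases, while a tiny hand-wired two-layer network gets all 4:

```python
def step(z):
    return 1 if z > 0 else 0

# The four XOR cases: output is 1 only when the inputs differ.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Brute-force search over a small weight grid: how many XOR points
# can a single-layer perceptron step(w1*a + w2*b + w0) get right?
grid = [x / 2 for x in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
best = 0
for w1 in grid:
    for w2 in grid:
        for w0 in grid:
            correct = sum(step(w1 * a + w2 * b + w0) == y
                          for (a, b), y in points)
            best = max(best, correct)
print("best single-layer score on XOR:", best, "/ 4")  # 3 / 4

# A two-layer network solves it: XOR(a, b) = AND(OR(a, b), NAND(a, b))
def xor_mlp(a, b):
    h_or = step(a + b - 0.5)          # hidden unit computing OR
    h_nand = step(1.5 - a - b)        # hidden unit computing NAND
    return step(h_or + h_nand - 1.5)  # output unit computing AND

print([xor_mlp(a, b) for (a, b), _ in points])  # [0, 1, 1, 0]
```

The grid search is not a proof (that requires the geometric argument that no single line separates the XOR classes), but it makes the point: adding one hidden layer turns an impossible problem into a trivial one.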
A few years later, in the mid-1970s, Paul Werbos developed backpropagation as a way to train multi-layer perceptrons, showing that neural networks could in principle learn non-linear decision boundaries. However, training anything deeper than two layers remained a nightmare due to the vanishing gradient problem. Then came Geoffrey Hinton, Yoshua Bengio, and Yann LeCun, who cracked the code with improvements to backpropagation and training, launching what we now refer to as deep learning.
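To see why depth was such a nightmare, here is a back-of-the-envelope Python sketch of the vanishing gradient problem (an illustration of the general idea, not any specific historical experiment). With sigmoid activations, each layer multiplies the backpropagated gradient by a derivative of at most 0.25, so even in the best case the signal reaching the early layers shrinks geometrically with depth:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_deriv(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # peaks at 0.25, when z = 0

# During backpropagation, the gradient reaching layer k is (roughly)
# a product of one sigmoid derivative per layer above it. Even at the
# maximum derivative of 0.25 per layer, that product collapses fast:
for depth in (2, 5, 10, 20):
    grad_scale = 0.25 ** depth
    print(f"{depth:2d} layers -> gradient scaled by at most {grad_scale:.2e}")
```

At 10 layers the gradient is already attenuated by a factor of about a million, which is why pre-2006 attempts to train deep sigmoid networks tended to stall.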
Fast forward to the transformer architecture, itself built on deep learning, and boom: generative AI was born, a very expensive sport reserved for Big Tech loaded with free cash flow. That was the reality, until we all woke up last weekend to the news that very powerful generative AI can now be built at a low cost.
As you can see, the history of AI is a series of waves, each sparked by a breakthrough in learning and then stalled by a technical limitation in learning—until the next breakthrough comes along. So, seeing a new advancement that dramatically reshapes neural network training efficiency isn’t surprising at all. In fact, the training process, or what we call "learning," has always been the bottleneck of every AI wave.
The last major non-incremental breakthrough happened when Geoffrey Hinton, Yoshua Bengio, and Yann LeCun revolutionized deep learning. Now, the story is more complex than saying they "invented" it together, but their individual contributions unlocked the ability to train neural networks with more than two layers of connected neurons—something previously out of reach due to the vanishing gradient problem. Their work was so transformative that they received the Turing Award in 2018, often called the "Nobel Prize of Computing." And last year, Geoffrey Hinton was awarded the Nobel Prize in Physics for his foundational discoveries and inventions that enable machine learning with artificial neural networks.
So, what happened last time we made a huge technical leap in the efficiency of AI training? Did we continue to do the same things with AI but faster and at a lower cost? No, we used this extra power to train AI on more complex problems. We started to dream about solving so many new things. We envisioned having AI everywhere.
As a robotics researcher, I remember when the deep learning movement really started to gain traction around 2013, and later when Google released TensorFlow, their free deep learning library: people began using AI everywhere. We even started to see papers at academic conferences presenting AI solutions to problems that had already been solved decades earlier with deterministic approaches. I was guilty of that too.
In other words, the improved training capabilities unlocked around 2008 boosted AI for more than a decade, which led to ChatGPT. This time will be no exception. I have no doubt about this because we simply haven’t reached the AI endgame yet.
Not the Endgame
Indeed, the evolution of a technology has consistently led to transformative changes, shaping how we interact with devices and services, until reaching a saturation point where incremental improvements become less noticeable to the majority of users. This pattern is evident across various technological milestones.
Consider the PC: When we moved from Pentium 1 to Pentium 2, developers leveraged this increased computing power to enhance the PC experience. This trend continued with Pentium 3 and Pentium 4, each allowing us to do new things with our PCs. It was around the end of the Pentium 4 era that we sort of reached an endgame; most of the population stopped seeing the benefit of upgrading, turning it into a pricing game to attract people to change their computers since the need wasn't as pressing.
Then there's the internet: when providers started offering high-speed internet with larger usage limits around 1998, people began downloading music. As bandwidth increased, we moved to downloading movies and eventually to streaming music and TV. It was only around that moment, or perhaps after Zoom perfected video conferencing (a technology Skype popularized but never managed to execute flawlessly), that most of us reached what seemed like an endgame and began to care less about faster internet speeds. Nowadays, there are probably several offers from my carrier for faster connections, but I don't really care about upgrading.
Remember the leap from basic Nokia phones to BlackBerry? It was transformative. But then came the iPhone in 2007, which completely revolutionized the game. We eagerly upgraded from the iPhone 3G to the 4 and then the 5 because each improvement felt revolutionary. Now, I'm still using an iPhone 13, even though Apple released the iPhone 16 several months ago. And I would have kept my iPhone 12 mini if not for the battery issues. It seems that Apple has reached what feels like an endgame here too.
You see the pattern?
The cost of AI was already in freefall even before the DeepSeek breakthrough. Indeed, the rate at which AI technology is becoming more affordable is unprecedented. For example, the price for GPT-4 tokens dropped from $36 per million tokens at its launch in March 2023 to just $4 per million tokens within 17 months, marking an annual decrease of approximately 79%. Although this breakthrough is a significant advancement, it occurred within an already accelerating trend.
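The ~79% figure quoted above is easy to sanity-check. Annualizing the 17-month price ratio in Python (the $36 and $4 per-million-token prices are the ones cited in the text):

```python
# Sanity check of the quoted decline: $36 -> $4 per million tokens
# over 17 months, converted to an annualized rate of decrease.
start_price, end_price, months = 36.0, 4.0, 17
annual_ratio = (end_price / start_price) ** (12 / months)
annual_decrease = 1 - annual_ratio
print(f"annualized decrease: {annual_decrease:.1%}")  # ~78.8%
```

So the arithmetic holds: roughly 79% per year, before DeepSeek entered the picture.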
This progress may now be sufficient to achieve low-cost, efficient chatbots, potentially bringing us close to that very specific endgame. But that endgame was already on its way in any case, as several experts in the AI field had predicted that we would run out of high-quality written data by 2026. This potential scarcity might mark the endgame for chatbots as we currently know them, but certainly not for AI as a whole. Recall the Encyclopédie Universalis on a CD-ROM in 1997, which put all human knowledge on a disk: was that the endgame for PCs? No, it was the start! AI is expected to follow a similar trajectory. It has yet to process even a fraction of all the sounds we've recorded. And while AI has been trained on images, it has barely scratched the surface of all available imagery, let alone the vast amount of video content. More efficient, lower-cost training should begin to unlock these capabilities and propel us into the future of AI. Soon, we might even look back and laugh at how primitive GPT-4 will seem by then, given how unnatural it was to interact with an AI at that stage.
End of US exceptionalism in AI ?
The DeepSeek AI breakthrough has notably shaken the market for two main reasons. First, it raised a very rational question: does the lower computational requirement for training a highly performant AI reshape the market's thesis about AI technology? That is what we just discussed. Second, it sparked an irrational fear about the potential end of U.S. leadership in AI.
Indeed, seeing a significant U.S. economic competitor (scoop: it's not Mexico or Canada, as both are strong historical allies) achieve such a breakthrough despite U.S. advanced-chip restrictions has drawn comparisons to historical technological shocks, such as the U.S. reaction to the USSR's launch of Sputnik, the first human-made satellite to orbit the Earth. I even encountered a discussion last week pondering whether this marks the end of U.S. exceptionalism in AI.
It's important to note that this breakthrough did not come as a surprise to those well-acquainted with the field. Even at the onset of the current trend, when OpenAI first made waves with ChatGPT, a leaked memo from a Google executive revealed a significant insight: “We Have No Moat (and Neither Does OpenAI).” It highlighted that while Google and OpenAI were preoccupied with each other, open-source projects were actually solving major AI problems more rapidly and efficiently. At the start of 2023, we were already witnessing models focused on significantly reducing training costs and the price per token. DeepSeek's model fell within this trend of open-source, lower-cost models. The primary difference was that most of these open-source models performed slightly worse than GPT-4, whereas the DeepSeek model was at least on par with OpenAI's most advanced reasoning model.
This leap was anticipated, but many expected it would come from a young U.S. tech startup rather than from a country where advanced GPU training was highly restricted. But is this that surprising?
It reminds me of an example I recently shared with my class about a startup that failed because it had too much money. I discussed Rethink Robotics, a company founded by Rodney Brooks, an MIT professor and co-founder of iRobot (maker of the Roomba robotic vacuum cleaner). Despite receiving over $150 million in funding, Rethink Robotics was ultimately outperformed and shut down, overtaken by Universal Robots, a startup launched on a shoestring budget by three students from Denmark.
One key reason for Rethink Robotics' failure was its excessive funding, which perhaps obscured the essentials and led them to try to build everything in-house. This is interesting considering that their founder often used the PC industry as an analogy for his vision, where the success of the PC was built on IBM's strategy of integrating the best available components—like Intel’s processors and Microsoft’s OS—rather than developing everything internally. In contrast, Rethink Robotics tried to do it all: they developed their own vision cameras and software libraries instead of partnering with established leaders like Cognex, who had decades of expertise.
This reflects a broader trend in the US AI industry, where companies flush with free cash flow may not focus enough on improving training efficiency, concentrating instead on building superior AI to stay ahead in the race. DeepSeek, restricted by US regulations from accessing massive GPU farms, was forced to innovate on efficiency. They didn't invent new types of models but significantly improved the efficiency of training them.
As for whether this marks the end of US exceptionalism in AI, as some have speculated, I believe this is wrong. US exceptionalism in AI isn't solely about being the first or the best in every aspect. Notably, none of the pioneers of deep learning were from the US: Geoffrey Hinton was a British citizen teaching in Toronto, Yoshua Bengio is a Canadian based at the University of Montreal, and Yann LeCun is a French citizen who worked at AT&T before joining NYU. Additionally, Andrew Ng, another significant contributor to deep learning, was born in the UK.
Don't get me wrong, the US has had its fair share of strong contributors to AI, both in foundational and modern contexts. However, my point is to highlight that US exceptionalism isn't necessarily about creating these advancements, but rather in attracting the right people and building commercial value from these developments. For instance, Yann LeCun joined Facebook's AI division, Geoffrey Hinton and Andrew Ng went to Google, while only Yoshua Bengio stayed in academia, rejecting multi-million dollar offers. Yet, some of his best students, like Hugo Larochelle, moved on to high-profile roles at Twitter, Facebook, and Google, drawing salaries comparable to professional athletes.
With the political tensions between China and the US, it's unlikely that Liang Wenfeng, CEO of DeepSeek, will join a US company. But that doesn't really matter either. Kunihiko Fukushima, the inventor of the Neocognitron, a foundational technology for deep learning, remained in Japan, which didn't stop researchers elsewhere from building on his work or US companies from commercializing it. At the current pace, it will take only a few months, perhaps even weeks, before AI departments in the US understand how DeepSeek accomplished its breakthrough, and soon this too will be a story of the past.
Conclusion
The market sell-off in response to DeepSeek's outstanding new model was a natural reaction as it shifted the investment thesis. However, far from breaking the AI trend, this development has likely elevated it to another level. Major companies are expected to maintain their leads. Once they decipher how DeepSeek achieved its breakthrough—which I believe will happen sooner rather than later—they should be able to shift into second gear and build even better AI systems. In fact, there was some talk that we were further from Artificial General Intelligence (AGI) than initially thought, but this breakthrough might have nudged us in the right direction again.
The significant shift in the AI investment thesis, I think, is that the reduced cost of training opens a path to success for smaller companies. In my summer blog post comparing the dotcom bubble to the current AI trend, I noted that we hadn't yet reached the point where all AI companies were seeing their valuations skyrocket, nor were we witnessing a surge of new market participants in AI, similar to what occurred during the 1998-2000 phase of the dot-com bubble. This breakthrough could be the spark that ignites that phase.
In my view, smaller cap companies within the AI sector should begin to perform better, provided the market trend remains upward. As for Nvidia, I understand that a breakthrough in lowering training costs might initially seem like a setback for a company that benefits from high computing costs. However, not only do I think this breakthrough will translate into the training of more complex models by big tech companies, as we are still far from the AI endgame, but it could also bring new customers to Nvidia if it enables smaller companies to succeed. We should also remember that AI computation involves two parts: training and inference. Even if Nvidia experiences a slowdown in GPU sales for training, faster AI adoption will likely lead to much more inference processing.
I am more excited than ever about what we might soon witness with AI—a strong shift in AI capabilities and new ways to use AI that we haven’t yet foreseen, likely emerging from young, innovative companies enabled by the current breakthrough in AI training. After all, it was young startups that, between 1998 and 2010, introduced the most innovative ways of using the internet. I anticipate we will see a similar trend with AI.
P.S. By the way, thanks to everyone who wished me a speedy recovery after my concussion. I am feeling much better now. The neck pain, confusion, and tunnel vision have all cleared up. There's just a very minor headache remaining, which might now be related to the tariffs that were imposed on Canadian goods!
--------------------End of Update-------------------------
Well… yesterday saw a significant gap down, particularly on the Nasdaq. The reason was the release of the new Deepseek open-source model from a Chinese startup. According to its founders, the model was developed in just two months at an incredibly low cost (around $6 million) compared to market-leading models. This model is reportedly on par with, or even outperforms, OpenAI's latest O1 reasoning model—a breakthrough that could drastically lower the barriers to building advanced AI systems.
Lowering these barriers has the potential to threaten the monopoly of major companies that have been the only ones able to afford training such models. It could also reduce the demand for Nvidia’s GPU servers, which are currently indispensable for AI development. As a result, the market has interpreted this breakthrough as a paradigm shift that challenges the existing investment thesis—that AI development was reserved for an exclusive “rich boy club.” Instead, it opens the door to a new reality. Consequently, some major players in the field experienced a sell-off.
If your portfolio was heavily weighted in big AI names like Nvidia, AMD, or Vertiv, yesterday was probably not a good day.
Is this really the beginning of the end for Nvidia?
If there were a paradigm shift in how we drill oil, I’d probably have nothing to say. But as a university professor specializing in AI, I do have an opinion on what happened yesterday.
I had a little ski accident on Sunday night that resulted in a concussion and some bruises. I tried to write this update yesterday to stay in sync with the market turmoil, but I was far too foggy to make it coherent. This morning, I’m attempting to write this update before the market opens to share my take on the situation, though I likely won’t have time to dive into all the implications of this big AI news for the market.
In short, I really don’t think this is the end of the AI trend. In fact, this breakthrough could serve as the catalyst for the next phase of the AI bubble, where more companies join the party. While it may provide an excuse for a rotation out of big tech—especially given the historically high market concentration—on a fundamental level, the cash flow positions of these major companies should keep them in pole position in the AI race.
Regarding Nvidia, which was one of the most affected by yesterday’s drama, I believe its business will continue to grow. Deepseek used GPUs for training its model, and the basis of its model is still the artificial neuron. As I’ve explained in my blog posts about AI, GPUs are perfectly suited for this purpose. This news might imply a reduced need for GPU time for some aspects of AI, but we haven’t reached the “endgame” of AI development yet. A lower barrier to entry for AI could result in even greater adoption and use, which would ultimately lead to increased demand for GPUs—especially for inference.
So, Nvidia should be fine. That doesn’t mean its stock will immediately recover; the market may keep Nvidia in the penalty box for a while until it proves that demand remains strong. But fundamentally, I don’t see Deepseek’s breakthrough as an immediate threat to Nvidia’s business.
I’ll provide a more detailed explanation of my views on this topic later today or tomorrow.
Market Update
Despite yesterday’s market drama, I don’t currently see anything that points to a change in trend. All indicators suggest that the market remains strong. The breakthrough by Deepseek does represent a paradigm shift, but it’s not something that fundamentally alters the market like a trade war would. In fact, it might have the opposite effect—helping the market broaden out, particularly in the AI sector. Good news for my recent UiPath position!
The options market barely flinched yesterday, even though SPY was down -2.25% at one point during the day. Our NYSE and Nasdaq derivative volume remained incredibly bullish,

as did our implied correlation model:

Additionally, our options model only experienced a slight dip, similar to what you’d expect on a typical trading day.

I know the AI news from the weekend was somewhat of a black swan event for the stock market. In such cases, due to lags in some metrics, it might seem normal for everything to remain bullish initially. However, here’s what our options model did on the day of the last FOMC press conference when we had a big red candle:

On that day, it went from a very bullish level to triggering a bearish flag almost immediately. The same thing happened with the implied correlation model:

What this tells us is that the options market didn’t interpret yesterday’s event as a serious threat to the bull market, unlike Powell’s speech, but rather as a potential market shift.
Market breadth also held up fairly well despite how red the markets were at times in the morning:

In fact, this indicator accounts for the worst moments of the day. If it were calculated using only daily candle data with the end-of-day snapshot, I bet it would show improved market breadth. By the end of the day, while Nvidia and a few other stocks stayed deeply in the red, several other tickers turned green.
The VIX ribbon also began to invert (9-day VIX vs. longer-term VIX) and could continue this move until the market gains confidence that the tremor is over. However, I wouldn't be surprised to see it drop back quickly.
Skew, which remains elevated, actually fell by 5 points yesterday. This is excellent news, as black swan events typically push skew higher. For instance, when the yen carry trade unwound on August 5th, skew rose by 6 points. Yesterday’s 5-point drop is a stark contrast to that, suggesting a more contained reaction.

Conclusion
In conclusion, all the current data indicate that the bull market remains intact. While the Deepseek AI breakthrough has been hyped as a "Sputnik AI moment," it doesn’t seem to be something that will permanently disrupt business. It may cause some sector rotation and minor market ripples, but my thesis is that it could ultimately help broaden the market—not just within AI, but across other sectors as well.
By demonstrating that it’s possible to develop a competitive AI solution with less time and money, Deepseek creates opportunities for smaller AI companies. At the same time, it makes investing in big tech seem slightly less appealing, which could reinject capital into other parts of the market. Additionally, the lofty valuations of some companies have been a challenge for the market to rally significantly. Yesterday’s repricing has leveled things out somewhat, and I wouldn’t be surprised if strong earnings reports this week, combined with a less hawkish Jerome Powell compared to December, could spark a market rally.
Apologies for posting this late—I'll provide a more detailed explanation soon about what I believe Deepseek's breakthrough means for AI.
PS. BTW, here was my prompt in ChatGPT for making the cover image: “Could you make me an illustration of Chinese scientists unveiling a super neat panda robot on a stage.” As you can see, I didn't mention DeepSeek, and yet it showed up in the image. Troubling! I guess ChatGPT is scared of it and made too many DeepSeek images yesterday...