Level 10 AI Earthquake

Your Life Is Changing — Again

The DeepSeek AI system hogged the headlines last week: its release crashed the stock price of NVIDIA, the darling of the AI world whose powerful computer chips are designed specifically to run huge AI systems.

In just a few hours, almost $600 billion was erased from NVIDIA’s market cap, the biggest single-day loss for any publicly traded company in the US. Other tech companies with investments in AI were also hit, bringing the tech sector’s combined loss of market cap to a whopping $1 trillion.

The reason for this historic event: DeepSeek, an AI system from a small company based in China, was trained for less than $6 million yet produces outputs comparable to, or better than, those of ChatGPT and the other leading AI systems on the market today, each of which cost more than $100 million to train.

In addition, DeepSeek’s code can run on far less powerful computers. In fact, not only does DeepSeek operate on NVIDIA chips that are inferior to the state-of-the-art, expensive NVIDIA chips designed for heavy AI computations, it runs on the spare capacity of those inferior chips while they are being used for something else!

Naturally, this spooked investors in NVIDIA and other AI companies, because it suggests there is no huge need for top-of-the-line, expensive NVIDIA chips, or for the billions of dollars of investment in future AI systems that these companies have projected.

And here’s the icing on the cake:

DeepSeek has been released as open source software.

This means that anyone can use it, modify it, or build on top of it — for free!

Yes, anyone can download and install DeepSeek on their own servers and run their AI systems completely offline, at the back of their offices, using a fraction of the power and with cheaper, less powerful computer chips than existing AI systems would normally require.

This is an extremely big deal: we can now run our own AI systems on regular computers with huge cost savings and without any loss in performance, while protecting our data’s privacy and confidentiality at the same time (if we run it offline)!

And the market reacted accordingly, selling NVIDIA stock in huge volumes.

AI systems or “AI Agents” are already becoming a huge part of our lives, whether you realise it or not. ChatGPT is an AI Agent that chats with you and provides the information you want much quicker than you could find it yourself by following the links Google gives you on a keyword search. You may already be using it every day.

Note that there will be millions upon millions of AI Agents in use within our lifetimes. Some of the AI Agents you use will be different from mine, and some will be the same.

For example, if I’m a frequent traveller, I’ll use AI Agents that find the cheapest air fares and hotels for me and book them automatically, without my supervision, following the rules I set for them: a maximum amount I’m willing to pay for air fares and hotel bookings, say, and the countries I want to travel to.

If I don’t travel much, this AI Agent will be useless to me.

But if I cook often, my AI Agent can be one that creates new recipes for me based on my preferences, allergies, and so on, and then buys the ingredients for me automatically. This AI Agent wouldn’t be useful to someone who doesn’t cook at all.

As you can imagine, the number of AI Agents that can exist is only limited by your imagination and use cases.

We are entering into an era of ultimate efficiency and efficacy, the likes of which we have never experienced before.

But to get the most out of AI Agents and trust them fully, we need to understand the basics of how AI works.

When you understand how it works, and how it can 10x or even 100x your efforts, you can deploy AI Agents with full confidence.

This is also important if you want to make money with AI, because you will then know what AI Agents can and can’t do, and where to deploy them strategically.

Just like you wouldn’t hand a chainsaw to someone who’s never seen one, you shouldn’t deploy AI blindly. Learn the basics, and you’ll unlock its power to scale your business, save time, and create revenue streams you could never have before.

Let’s begin.

Artificial intelligence today is largely powered by something called Large Language Models, or LLMs. These models are at the heart of many popular AI applications such as chatbots, translation tools, and content generators.

To understand how they work, it helps to start with the basics: neural networks.

A neural network is a computer system inspired by the human brain. It is made up of many simple processing units called neurons that are connected together in layers.

When a computer receives data — such as an image or a sentence — it sends that information into the input layer. Each neuron in this layer takes a small piece of the data and passes it on to the next layer after doing some simple math.

In the hidden layers, the network processes the data by adjusting tiny numbers called weights. These weights determine how much importance each piece of data should have as it moves forward.

Then the processed data reaches the output layer, which produces a final result.

For example, a neural network designed to recognise handwritten numbers might look at the pixels in an image and decide whether the number is a “5” or an “8” based on patterns it has learned.
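To make this concrete, here is a minimal sketch of that forward pass in plain Python. The weights, biases, and inputs are made-up numbers, and a real network has far more neurons, but the flow from input layer to hidden layer to output layer is the same:

```python
def relu(x):
    # A common activation function: negative values become zero.
    return max(0.0, x)

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of its inputs plus a bias, then the activation.
    return [relu(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Input layer: two features (e.g. two pixel intensities).
x = [0.5, 0.8]

# Hidden layer: two neurons, each with one weight per input and a bias.
hidden = layer(x, weights=[[0.2, 0.4], [0.6, -0.1]], biases=[0.1, 0.0])

# Output layer: one neuron producing the final score.
output = layer(hidden, weights=[[0.7, 0.3]], biases=[-0.2])
print(output)  # a single number the network can use to make a decision
```

Training is the process of nudging those weight and bias numbers, over many examples, until the output-layer scores become useful.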

Large Language Models, then, are huge neural networks that have learned about language by reading massive amounts of text in that language. These models have billions of weights and biases: small numbers that have been carefully adjusted during training.

During this training process, the model learns how language works, which means it can predict what word should come next in a sentence or even create entire paragraphs that sound natural. It’s like the predictive text on your phone that completes a word before you have finished typing it, but applied far beyond a single word.

When you interact with an AI chatbot, your text is fed into this neural network. The network processes your input through many layers and then generates a response based on patterns it has seen during training.

The training of these models involves several methods of machine learning. The most common approaches are supervised learning, unsupervised learning, and reinforcement learning.

  1. Supervised Learning

Here the model is trained with examples that include both the input and the correct output. For instance, an image labeled “cat” helps the model learn what features define a cat. The model adjusts its internal numbers so that its predictions match the provided answers as closely as possible.
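A toy sketch of that adjustment process in plain Python: the “model” is a single weight, the labeled examples follow the rule y = 2x, and gradient descent nudges the weight until the predictions match the provided answers:

```python
# Labeled examples: each input paired with its correct output (here y = 2x).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # the model's single adjustable weight
lr = 0.05  # learning rate: how big each adjustment is

for _ in range(200):
    for x, y_true in data:
        y_pred = w * x
        error = y_pred - y_true
        w -= lr * error * x  # nudge w to shrink the error

print(round(w, 3))  # converges to 2.0, the rule hidden in the labels
```

A real model does exactly this, just with billions of weights at once instead of one.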

  2. Unsupervised Learning

Unsupervised learning, on the other hand, deals with data that does not have any labels. In this case, the model looks for patterns and structures on its own. It might group similar pieces of data together or detect recurring themes, much like noticing that certain groups of customers have similar buying habits without being told explicitly which group they belong to.
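The customer-grouping example can be sketched with a simple two-cluster k-means on unlabeled spending figures (all numbers below are illustrative). The algorithm discovers the two groups on its own; no labels are ever given:

```python
# Unlabeled data: daily spend of ten customers, with no group labels.
spend = [12, 15, 11, 14, 90, 95, 88, 13, 92, 10]

# Two-cluster k-means in one dimension: start with two guessed centres,
# then alternate between assigning points and recomputing the centres.
centres = [min(spend), max(spend)]
for _ in range(10):
    groups = [[], []]
    for s in spend:
        # Assign each point to its nearest centre.
        nearest = 0 if abs(s - centres[0]) <= abs(s - centres[1]) else 1
        groups[nearest].append(s)
    centres = [sum(g) / len(g) for g in groups]

# The low spenders and high spenders separate cleanly, unprompted.
print(sorted(groups[0]), sorted(groups[1]))
```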

  3. Reinforcement Learning

In this method, an agent learns by interacting with an environment. It takes actions, observes the outcomes, and receives rewards or penalties as feedback.

Sometimes this feedback comes automatically from the environment, like in a video game where points are awarded for successful moves.

Other times, especially with language models, real humans provide feedback. This is called Reinforcement Learning from Human Feedback (RLHF). In these cases, humans review the model’s responses and assign scores that guide the model toward better performance over time.

This combination of automatic rewards and human feedback allows reinforcement learning to handle tasks where the correct answer isn’t known in advance.
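The action-reward-update loop can be sketched as a toy two-action “bandit” problem (the reward numbers are invented for illustration): the agent tries actions, observes noisy rewards, and gradually learns which action pays better:

```python
import random

random.seed(0)

# Two possible actions; the environment pays a noisy reward for each.
# Action 1 is better on average, but the agent does not know that.
true_mean = [0.2, 0.8]

estimates = [0.0, 0.0]  # the agent's running estimate of each action's value
counts = [0, 0]

for step in range(1000):
    # Explore occasionally; otherwise exploit the best-looking action.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = 0 if estimates[0] > estimates[1] else 1
    reward = true_mean[action] + random.gauss(0, 0.1)
    counts[action] += 1
    # Nudge the estimate toward the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # the estimate for action 1 ends up near 0.8
```

In RLHF, the reward in this loop comes from human reviewers’ scores rather than from a fixed environment.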

Beyond these primary methods, there are other techniques that enhance learning and efficiency.

  4. Semi-Supervised Learning

This method uses a mix of labeled and unlabeled data to improve the model when only a small amount of labeled data is available.

Imagine you’re building a model to identify tumours in medical scans. Labeling each scan is time-consuming and expensive, so you might only have a few hundred scans that are clearly marked as “tumour” or “no tumour.”

With semi-supervised learning, you combine these labeled scans with thousands of unlabeled ones. The model uses the labeled examples to guide its understanding and then leverages patterns in the unlabeled data to improve its accuracy.
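One common semi-supervised technique is pseudo-labeling, sketched below with invented numbers: a simple threshold model is fitted on the few labeled scans, used to label the unlabeled ones, then refitted on the combined set:

```python
# A handful of labeled measurements (e.g. a score extracted from a scan).
labeled = [(1.0, "no_tumour"), (2.0, "no_tumour"), (8.0, "tumour")]

# Many more unlabeled measurements.
unlabeled = [1.5, 2.5, 7.0, 7.5, 9.0, 1.2, 8.5]

def fit_threshold(points):
    neg = [x for x, y in points if y == "no_tumour"]
    pos = [x for x, y in points if y == "tumour"]
    # Classify by a threshold halfway between the two class means.
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2

t = fit_threshold(labeled)  # model from the labeled data alone

# Pseudo-label the unlabeled data with the current model...
pseudo = [(x, "tumour" if x > t else "no_tumour") for x in unlabeled]

# ...then refit on the combined labeled + pseudo-labeled set.
t2 = fit_threshold(labeled + pseudo)
print(t, t2)
```

The refit threshold now reflects the structure of all the data, not just the expensive labeled slice.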

  5. Self-Supervised Learning

This method lets the model generate its own labels by setting up tasks within the data — for example, hiding a word in a sentence and asking the model to predict it.

A popular example comes from natural language processing. Models like BERT are trained by taking sentences, randomly masking some words, and then asking the model to predict the missing words. For example, given the sentence "The cat sat on the ___," the model learns to fill in the blank ("mat") by understanding the context. This process lets the model generate its own “labels” from the data itself without manual annotation.
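A minimal sketch of how such training pairs are generated automatically, with no hand-labeling: hide one word per sentence and keep the hidden word as the label:

```python
import random

random.seed(1)

sentences = [
    "the cat sat on the mat",
    "the dog chased the ball",
]

# Build (input, label) pairs from raw text by hiding one word per sentence.
# The hidden word itself becomes the label, so no human annotation is needed.
pairs = []
for sentence in sentences:
    words = sentence.split()
    i = random.randrange(len(words))
    label = words[i]
    masked = words[:i] + ["[MASK]"] + words[i + 1:]
    pairs.append((" ".join(masked), label))

for masked, label in pairs:
    print(masked, "->", label)
```

Run over billions of sentences, this trick turns plain text into an effectively unlimited supply of labeled training data.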

  6. Transfer Learning

This method involves taking a model that has already learned from one task and fine-tuning it for a related task, saving both time and resources.

For example, a neural network might be pre-trained on ImageNet, a massive dataset containing millions of images across thousands of categories. Once trained, this network has learned to extract useful features from images. Later, it can be fine-tuned to perform a specific task — like identifying types of plant species — using a much smaller, specialised dataset. The network’s prior knowledge from ImageNet helps it quickly adapt to the new task.
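A toy numerical sketch of the idea: a one-weight model is “pre-trained” on a source task (y = 2x), then fine-tuned with only two examples from a related target task (y = 2.5x). With the same tiny fine-tuning budget, starting from the pre-trained weight lands much closer to the target than starting from scratch:

```python
def train(w, data, lr=0.05, epochs=20):
    # Simple gradient descent on squared error for a one-weight model y = w*x.
    for _ in range(epochs):
        for x, y in data:
            w -= lr * (w * x - y) * x
    return w

# "Pre-training": plenty of data from a source task (y = 2x).
source = [(x / 10, 2 * x / 10) for x in range(1, 30)]
w_pretrained = train(0.0, source, epochs=50)

# "Fine-tuning": only two examples from a related target task (y = 2.5x).
target = [(1.0, 2.5), (2.0, 5.0)]
w_from_scratch = train(0.0, target, epochs=2)
w_transferred = train(w_pretrained, target, epochs=2)

# The transferred weight is far closer to the target rule (2.5).
print(w_from_scratch, w_transferred)
```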

  7. Meta-Learning

This means “learning to learn”. It prepares models to quickly adapt to new tasks by training them on a wide range of problems so they can learn new tasks with minimal data.

An example of meta-learning is the Model-Agnostic Meta-Learning (MAML) algorithm. In MAML, a model is trained across a variety of different tasks so that when it encounters a new task (for instance, classifying a new set of handwritten digits), it can quickly adjust using only a few examples. It’s like a student who, after studying many different subjects, becomes exceptionally good at learning new topics rapidly because they have developed an overall strategy for learning.

  8. Chain-of-Thought Prompting

Here, the model is encouraged to generate a step-by-step explanation before arriving at an answer, similar to how a human might talk through their reasoning process. For instance, when solving a math problem, the model is trained to show its intermediate steps rather than just giving the final answer.
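A sketch of what a chain-of-thought style prompt and response can look like (the question and worked steps below are invented for illustration):

```python
question = "A shop sells pens at 3 for $2. How much do 12 pens cost?"

# Plain prompt: asks only for the final answer.
plain = f"{question}\nAnswer:"

# Chain-of-thought style: the model is steered to show its working first,
# then state the answer. An example response might read:
cot = (
    f"{question}\n"
    "Let's think step by step:\n"
    "1. 12 pens is 12 / 3 = 4 groups of 3 pens.\n"
    "2. Each group costs $2, so 4 * 2 = $8.\n"
    "Answer: $8"
)
print(cot)
```

The intermediate steps give the model (and the reader) a checkable path to the answer rather than a bare guess.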

Here’s an example. When I asked DeepSeek to complete a sequence, it started by thinking the problem through. (I mistyped “Complete” by unintentionally omitting the “C”, but it recognised the pattern, so it knew what I meant.) Before giving a final output, it “reasons” or “thinks” step by step; it is trained to create this reasoning. Only after all that thinking does it give me the answer.

Reinforcement Learning From Human Feedback (RLHF) is used to provide feedback or rewards for clear, logical steps. Over time, the model adjusts its behaviour to generate reasoning that aligns better with human expectations.

This process helps the model learn not only what the correct answer is but also how to arrive there in a way that makes sense.

In addition, models are often fine-tuned on data specifically designed to teach reasoning. These data include complex, multi-step problems from areas like mathematics, logic puzzles, or even language-based scenarios where the correct sequence of thoughts leads to the answer.

By training on these examples, the AI model learns patterns and structures that are similar to human reasoning.

As you can see by now, the methods above help train AI systems to reason more like a human when solving problems.

Some of them are actually not too different from a human teaching another human to reason, or humans teaching themselves to reason.

Each of the methods shown above provides a unique way to improve learning efficiency and effectiveness, making it possible to build AI systems that perform well even when data is limited, tasks vary, or the model must adapt quickly to new challenges.

Now that you understand how AI works, what makes DeepSeek so much more efficient and cheaper than ChatGPT that it shook the entire AI industry to its core, while matching ChatGPT’s performance or even outperforming it?

Remember, DeepSeek is restricted to using less powerful NVIDIA chips due to the US banning sales of advanced NVIDIA chips to China.

How would its team then still produce the same or better results from DeepSeek with this seemingly huge restriction?

Furthermore, as stated earlier, the less powerful NVIDIA chips are actually used by another company owned by DeepSeek’s founder to run trading algorithms. Those tasks don’t use the chips to the fullest, leaving spare capacity, which he then used for DeepSeek.

DeepSeek’s team used some clever programming techniques, together with some clever hardware optimisations, to achieve its goals. They include the following:

  1. Lower-Precision Math

Instead of performing every calculation with 32‑bit numbers, DeepSeek decided to work with 8‑bit numbers. While the difference may seem large, 8-bit numbers are often good enough for its purpose.

For example, when processing images, each pixel of an image is commonly represented with 8‑bit values (ranging from 0 to 255) for each colour channel. This is standard practice because the detail provided by 8‑bit values is sufficient for most image tasks, such as object detection or classification.

Other examples include real-time audio processing or video encoding, where 8‑bit or similar lower-precision representations often provide the required fidelity. In these cases, the slight reduction in numerical detail doesn’t affect the final outcome in a meaningful way, making 8‑bit numbers a practical choice that keeps systems much faster and much cheaper to operate.

So 8‑bit numbers are acceptable for many tasks: they provide enough precision for many applications while using much less memory and processing power than the 32‑bit numbers other AI systems currently use (they may have adjusted their algorithms to use lower-bit numbers by now).
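A toy sketch of the idea behind lower-precision numbers (not DeepSeek’s actual scheme): map floating-point weights onto 8‑bit integers in the range 0 to 255, then map them back. Each restored value is within half a quantisation step of the original:

```python
def quantise(values, bits=8):
    # Map floating-point values onto integers in [0, 2**bits - 1].
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (2**bits - 1)
    return [round((v - lo) / scale) for v in values], lo, scale

def dequantise(q, lo, scale):
    # Map the integers back to approximate floating-point values.
    return [lo + x * scale for x in q]

weights = [0.137, -0.522, 0.901, 0.004, -0.318]
q, lo, scale = quantise(weights)
restored = dequantise(q, lo, scale)

for w, r in zip(weights, restored):
    print(f"{w:+.3f} -> {r:+.4f}")
```

Each 8‑bit integer needs a quarter of the memory of a 32‑bit float, and the rounding error stays tiny relative to the range of the weights.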

  2. Knowledge Distillation

In this process, a large, complex model, known as the “teacher,” is used to train a smaller, simpler model called the “student.”

The student model learns to mimic the teacher’s behavior, enabling it to achieve similar results with far fewer resources. This approach is especially valuable when deploying models in resource-constrained environments.
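A toy numerical sketch of distillation (not any real model’s code): the “teacher” is a function we treat as expensive but accurate, and the small “student” is trained to mimic the teacher’s outputs rather than the original labeled data:

```python
def teacher(x):
    # Stand-in for a large, expensive, accurate model.
    return 3.0 * x + 1.0

# Distillation: train the small student on the teacher's outputs.
inputs = [i / 10 for i in range(-20, 21)]
soft_targets = [teacher(x) for x in inputs]

w, b = 0.0, 0.0  # the student: a tiny two-parameter model
lr = 0.01
for _ in range(500):
    for x, t in zip(inputs, soft_targets):
        pred = w * x + b
        err = pred - t
        w -= lr * err * x
        b -= lr * err

print(round(w, 2), round(b, 2))  # student ends up close to 3.0 and 1.0
```

Once trained, the cheap student can be deployed in place of the teacher wherever resources are tight.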

  3. Mixture Of Experts (MoE)

This method divides the model into many smaller, specialised “experts.” When a query is received, only the experts that are most relevant are activated, much like calling in just the right specialist for a specific problem rather than mobilising an entire team, which is what the other AI systems do currently (again, this may have changed now after DeepSeek’s breakthroughs).

This targeted use of resources ensures that the model doesn’t waste energy on parts of the network that aren’t needed for the current task.
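A toy sketch of the routing idea. Real MoE models learn the gating with a neural network and typically activate the top few experts; the keyword routing below is purely illustrative. The point is that only the selected expert runs for a given query:

```python
# Three specialised "experts", each good at one kind of query.
experts = {
    "maths": lambda q: f"maths expert handles: {q}",
    "travel": lambda q: f"travel expert handles: {q}",
    "cooking": lambda q: f"cooking expert handles: {q}",
}

def gate(query):
    # Toy gating function: route by keyword. A real model learns this.
    for name in experts:
        if name in query:
            return name
    return "maths"  # fall back to a default expert

def answer(query):
    expert = gate(query)       # only ONE expert is activated per query
    return experts[expert](query)

print(answer("plan a travel itinerary"))
```

Because the other experts stay idle, the cost per query scales with one expert, not with the whole model.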

  4. Hardware Optimisations

DeepSeek optimises how work is distributed among its multiple chips. It overlaps computation and communication. So while one chip is busy crunching numbers, another chip can simultaneously transfer data or handle other tasks. This minimises idle time and ensures that every chip is used to its full potential.

DeepSeek also uses load-balancing techniques, dynamically assigning tasks to different chips so that no single chip becomes a bottleneck. This efficient scheduling and sharing of work means that even with the same physical hardware, the system can operate much faster and at a lower cost.
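The scheduling idea can be sketched with Python threads (the sleeps stand in for chip work and data transfer; this illustrates overlapping, not DeepSeek’s implementation): while one chunk is being computed, the next chunk’s transfer is already under way:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def compute(chunk):
    time.sleep(0.1)   # stand-in for number crunching on one chip
    return sum(chunk)

def transfer(chunk):
    time.sleep(0.1)   # stand-in for moving data between chips
    return chunk

chunks = [[1, 2], [3, 4], [5, 6]]

# Sequential: transfer a chunk, then compute it, one after another.
start = time.time()
sequential = [compute(transfer(c)) for c in chunks]
seq_time = time.time() - start

# Overlapped: all transfers run in the background while computes proceed.
start = time.time()
with ThreadPoolExecutor() as pool:
    transfers = [pool.submit(transfer, c) for c in chunks]
    overlapped = [compute(t.result()) for t in transfers]
ovl_time = time.time() - start

print(sequential, overlapped, seq_time > ovl_time)
```

Same hardware, same results, less wall-clock time, because no stage sits idle waiting for the other.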

There are more, but you now have an idea of some of the things that have been done to enable DeepSeek to run at a fraction of the cost of the more well-known AI systems, without sacrificing its output quality.

Remember: you can fully host DeepSeek on your own modest servers, offline, so your data remains private and confidential to you and your company.

DeepSeek is completely free to use, as it has been released as an open source project. This means anybody can look into its source code and improve upon it in any way they choose, without having to pay a license fee to the DeepSeek team.

In the words of Marc Andreessen, the Venture Capitalist:

Cheers!

Sen Ze

P.S. I’m in the midst of celebrating the Chinese New Year — it lasts 15 days in total, although most of us will be back to work tomorrow (Monday). Wishing everyone who celebrates, a great new year of the Snake!

P.P.S. My new book, "The 10x Cryptocurrency Investing Strategies" will be launched on Amazon.com next week. This is your last chance to get it FREE. Simply reply to this email with “10x Crypto Book” and I’ll email you a copy:

NOTE:

The 10x Factors For Investors’ content is educational in nature, with examples used to illustrate the learning points. We are not financial advisors and do not provide financial advice. Please speak to your financial advisor before making any investment decision. Note that every investment comes with its own risks and drawbacks. Past results cannot guarantee future returns. Do not invest with money you cannot afford to lose.

This content may contain affiliate links. When you click on these links and make a purchase, we may receive a commission at no additional cost to you. We only promote companies that we have personally used or researched and believe will add value to our readers.
