Elad Raz on Navigating the AI Infrastructure Stack

By F2 Team
March 21, 2024
Operator Series
5 min read

We were thrilled to host Elad Raz, the Founder and CEO of NextSilicon, for an in-depth session on the AI infrastructure layer and predictions for the future of chip companies. Elad also shared general insights on the fundraising process and go-to-market strategies for technology hardware companies.

About Elad Raz:

Elad Raz is the founder and CEO of NextSilicon, a leading Israeli semiconductor company. With over fifteen years of experience in the software and hardware industries, Elad has held leadership positions at startups and larger companies alike. He was the founder and CTO of the Integrity Project, which was acquired by Mellanox in 2014. In July 2017, Raz left Mellanox to establish NextSilicon, a company that develops processing technologies capable of accelerating computing applications at high speed with maximum energy efficiency, ideally suited for the artificial intelligence revolution.

Watch the full video or read through the highlights below:

Understanding why hardware is capital intensive and how to handle long sales cycles (00:33)

“In any company, the most important thing is money in the bank, right? How much runway do I have? How do I utilize that spend? And the most important thing is obviously to get revenues from the end customers. If you can't do that, you sell some equity, right? That's the reality.


Silicon is much more notorious when it comes to spending, as you mentioned. In the past, when designs targeted larger, more mature process nodes, it was easy to go and tape out. "Tape out" refers to handing a finished design off for silicon fabrication at a foundry. The most important and well-known one is TSMC in Taiwan, and there are many other foundries worldwide, such as UMC and GlobalFoundries.


But today, just the cost of making the mask set used to pattern the wafers during manufacturing is almost $10 million on the low end. And when you go to three or two nanometers - without revealing specific numbers - it's somewhere between $20 to $30 million. And that doesn't provide you with a large amount of silicon or many computing chips. It only provides you with the masks and the ability to produce more. Each wafer can then cost thousands of dollars, and on top of that you have high-bandwidth memory. And that is just for manufacturing 100 units. The initial investment is almost like a seed investment in some companies. And then you need to add the fact that you need three years to plan the chip because of the cost of silicon. If there is a malfunction, that means you need to pay another $20 million. So you want to make sure that everything works.


With software and hardware, you easily reach $130 million, and $160 million when you're going to an advanced node. So you need to be very thoughtful and diligent in your go-to-market strategy and how you operate.


On the other hand, it is important to understand that in the compute market, the prize at the end is significant.

None of the compute companies aim to be acquired for $100 or $200 million, which is roughly the amount of investment needed just to reach the go-to-market stage.

However, if you succeed in capturing that market, you have the potential to reach a market cap of tens of billions of dollars, even hundreds of billions, because the outcome is very binary.

Many investors, especially in the early years of the company, do not focus on ARR, which is almost non-existent in semiconductor companies. They focus not on revenue multiples, but on the technology roadmap and the go-to-market.


Another important factor to consider is the difference between the public and private sectors. In the private sector, the main focus is on finding oil and gas, developing new drugs, or building rockets. Those customers are not concerned with building compute infrastructure. Therefore, it is essential to be in the production phase - the second generation, which requires an additional $150 million investment - because it is rare to succeed on the first attempt.


In the public sector, it is crucial to identify potential customers who can pay for the product and devise strategies to generate initial revenues. This is important because once you reach this milestone, you can engage in co-design and secure customers for future generations, even before the first generation is fully developed.”

What early funding rounds look like for hardware companies (5:39)

In the seed round, we raised $5 million. However, the most challenging fundraising was not the seed round but the A round, which amounted to $26 million. The reason is that in the A round, you need to provide proof of concept. You need customers who express their interest and are willing to pre-order your product. Although we had pre-orders, we hadn't generated actual revenues at that point; those came only after the A round.

We had to convince investors: ‘Hey, you need to put in $25 million to get us to a tape out. This is our novel architecture and our roadmap, but it takes a lot of capital and it may fail.’

So that was the hardest one.

No matter how you flip it, your customers, at least your early adopters, will not finance your company. But they can show investors that, ‘Hey, we have an opportunity here in the 2024, 2025, 2026 timeframe. It can be a $200 million computer.’ So roadmap and technology are important, at least until you're able to secure hundreds of millions of dollars in investment, and then you can execute your plan.

How NextSilicon articulated its competitive advantage to investors (7:10)

You've met many founders, so you know how it goes. They deliver their first pitch, which is often less than stellar, a disaster even. But then comes the second one, and they begin to understand what sets them apart.


With us, as we always say, we're looking at a binary outcome. We often state that we represent the next generation of compute. It will either be a complete disaster or the next big thing. This was a pivotal point and very appealing because it justifies investing in companies with the potential for 10X, 20X returns in a few years. It's about taking a leap of faith and acknowledging that you might end up with nothing. Otherwise, the mean value doesn't make sense, right?


With NextSilicon, we are never trying to do more of the same. Many compute companies say, ‘I'm going to build a better CPU. I'm going to build a better GPU. I'm going to do a tailor-made chip just for machine learning, just for computer-vision classification.’ Many companies have said this, but failed to consider that in a few years there would be transformers and LLMs. So we said, 'Hey, we are a general-purpose chip, which means that we can run anything.' We can run transformers and CNNs. We can run future neural network models. We can run HPC (high-performance computing) workloads. And we also focus on software.


All the other companies that have existed up until today have had fixed hardware. They say, ‘This is my chip. Inside the chip, these are the functions I can run. Try to program it.’ At NextSilicon, we decided to pivot into this software play - into a runtime algorithm that understands what's going on, understands how the application behaves - what matters and what doesn't.


So we speak in terms of flows - which flows matter the most - and then we try to optimize what runs the majority of the time. So it's a different play. Baruch Hashem (praise God), we've got help. Now we have all the proof points and we’ve started executing on the plan. It's an exciting period for us.


And one thing that I want to circle back on regarding the challenges of fundraising is that the challenges are usually related to the go-to-market. We decided to go to the high-performance computing (HPC) market. Most of you have probably never heard about that market. Luckily for me, I came from Mellanox - Mellanox acquired my previous company - so I knew that market. It's a massive market: half-billion-dollar machines driving science and much more, answering crucial questions about the origins of the universe and the discovery of drugs. The public and private sectors split it roughly evenly.

Many companies have invested in dedicated AI machine-learning chips. However, the problem is not just the cost of spending $300 million on multi-generation chips, but also the requirement of reaching a high level of maturity to gain recognition. Plus, everyone has tried to build their own chip, making it a very crowded market. All the hyperscalers - Google with its TPUs, AWS with its Inferentia and Trainium chips, Meta and Microsoft developing their own chips - contribute to this crowding. Despite this, market demand is huge. We decided to first conquer HPC, which has always been the foundation. We refer to HPC as the second generation and AI machine learning as the third generation.

How generative AI changed the competitive landscape for chip companies (11:48)

In 1960, the invention of backpropagation revolutionized the field. People successfully applied it to optical character recognition (OCR). However, this progress was followed by a period known as the AI winter, during which advancements in AI were stagnant. It wasn't until 2012, with the introduction of AlexNet and image recognition, that significant breakthroughs were made.

So, why did it take so long? The answer lies in the limitations of computing power. The computational resources required for running complex neural networks were not readily available until recent years.

When you have a neural network that requires around nine billion floating-point operations per inference but you're running it on a megahertz-class processor, you are stuck - even a Raspberry Pi, which now costs $10, runs at multiple gigahertz today. On that older hardware, running today's classification models would take almost an hour, and you would need approximately 300 megabytes of memory just for the weights. So we experienced an AI winter, which led to a better understanding of computing. Accelerated computing has always been synonymous with HPC, where NVIDIA has thrived. Nowadays, there are numerous compute options available, allowing tera- and peta-scale operations to run in FP16. As a result, models have become more complex.
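
The clock-speed argument above can be sketched as a quick back-of-the-envelope calculation. This is not from the talk: the 9-GFLOP model size echoes the figure in the quote, while the clock rates and the one-operation-per-cycle assumption are illustrative simplifications.

```python
# Rough estimate of inference time for a fixed-size network on
# processors of different speeds. Assumes, simplistically, one
# floating-point operation per clock cycle.

FLOPS_PER_INFERENCE = 9e9  # ~9 billion FLOPs, as in the quote


def inference_seconds(clock_hz: float, ops_per_cycle: float = 1.0) -> float:
    """Seconds per inference at a given clock rate."""
    return FLOPS_PER_INFERENCE / (clock_hz * ops_per_cycle)


# A megahertz-class processor from the AI-winter era:
print(f"1 MHz CPU:   {inference_seconds(1e6) / 3600:.1f} hours per inference")
# A modern multi-gigahertz core, Raspberry Pi class:
print(f"1.5 GHz CPU: {inference_seconds(1.5e9):.1f} seconds per inference")
```

Real chips retire many operations per cycle via SIMD and multiple cores, which is exactly why the `ops_per_cycle` knob - and accelerated computing generally - changes the picture so dramatically.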

But the fun part is that we are only at the beginning.

Everyone needs to understand this because all the generative AI models, such as ChatGPT, Midjourney, and DALL-E, are foundation models. These models can generate output from input without any further training or fine-tuning. To try a prompt, you can simply go to ChatGPT, or to Midjourney on Discord.

However, at this stage, we are still in the early phases. The gold rush is not the foundation model itself.

While ChatGPT may have its own advantages, the focus is primarily on the model and creating an API for it. If ChatGPT is unable to fulfill that role, an open alternative can - much as Linux, and specifically Red Hat, did - paired with a vector database. What is a vector database? It involves taking all organizational and enterprise knowledge, applying access control, converting it into embeddings, and storing it in a database. This enables the neural network to access the knowledge with appropriate permissions, run inference whether on-premises or remote, and produce an output. Then we'll begin to explore what the future has in store.
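
The retrieval flow described here - embed knowledge, store it with permissions, query by similarity - can be illustrated with a toy sketch. Nothing below is from the talk: the `embed` function is a hypothetical stand-in (a real system would use a learned embedding model), and the documents and group names are invented.

```python
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Hypothetical embedding: hash character trigrams into a unit vector."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity (vectors are already normalized)."""
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    def __init__(self):
        self.rows = []  # (embedding, text, allowed_groups)

    def add(self, text: str, allowed_groups: set[str]) -> None:
        self.rows.append((embed(text), text, allowed_groups))

    def query(self, question: str, user_groups: set[str], k: int = 1) -> list[str]:
        q = embed(question)
        # Apply access control first, then rank by similarity.
        visible = [(cosine(q, e), t) for e, t, g in self.rows if g & user_groups]
        return [t for _, t in sorted(visible, reverse=True)[:k]]

store = VectorStore()
store.add("Q3 revenue grew 12% on datacenter sales", {"finance"})
store.add("The VPN root password rotates monthly", {"it"})
print(store.query("how did revenue do last quarter?", {"finance"}))
```

In a retrieval-augmented setup, the returned snippets would be prepended to the LLM prompt, so the model only ever sees knowledge the caller is permitted to read.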

This is particularly exciting because in vector search, you can work with large amounts of data, similar to how Google processes vast amounts of data on its network. The key is to crunch and process the data efficiently. And I will say, I'm not a science fiction enthusiast, but rather a software engineer with a strong computational background. But I think it will be an exciting period, at the very least. And just to be clear, it has nothing to do with PyTorch.

NVIDIA's market dominance and opportunities for startups in the AI infrastructure layer

Let's quickly analyze the history and understand why NVIDIA has become a trillion-dollar company, while AMD has surpassed Intel. The answer lies not only in NVIDIA being a market leader but also in being innovative and flexible. They focused on graphics initially and then transitioned to HPC.

We have reached a point where NVIDIA's GPU is not only great in terms of hardware but also in terms of the software ecosystem. NVIDIA developed a domain-specific language, which is generally seen as a disadvantage because no one wants to become dependent on a single vendor. However, NVIDIA has become a singularity point by making CUDA the de facto programming language for accelerated computing.

So, in essence, all future AI workloads are programmable thanks to NVIDIA's investment in academia and its annual billion-dollar investment in software frameworks. These frameworks include not only CUDA, but also libraries like cuDNN for neural networks and cuSPARSE and cuBLAS for linear algebra.

NVIDIA has made significant investments that let people use its resources to conduct their own scientific research, including work on neural networks. Other companies may claim to have superior chips with more floating-point units, but NVIDIA's impact is evident in the sheer scale of today's models: an LLM like Llama can have hundreds of billions of parameters, and generating even a single token - with an input context of a couple of dozen tokens, roughly 15 words - takes hundreds of billions of floating-point operations. And you need numerous tokens, which means processing a vast amount of data. However, when users express their satisfaction with a chip designed for transformers, it opens up possibilities for the future, even if we are uncertain about what that future holds.
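
As a rough check on the scale being described, a common rule of thumb (not from the talk) is that a dense transformer needs about 2 floating-point operations per parameter to generate one token; the model sizes below are illustrative.

```python
# Back-of-the-envelope FLOPs per generated token for a dense LLM.
# Rule of thumb: ~2 floating-point operations per parameter per token
# (one multiply and one add per weight in the forward pass).

def flops_per_token(n_params: float) -> float:
    return 2.0 * n_params

for name, params in [("7B model", 7e9), ("70B model", 70e9), ("270B model", 270e9)]:
    tflops = flops_per_token(params) / 1e12
    print(f"{name}: ~{tflops:.2f} TFLOPs per token")
```

Multiply by tokens per response and responses per day, and the enormous aggregate compute demand the speaker alludes to follows directly.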

I’ll make a prediction - even though prophecies are usually given to the fools. AMD also has a general-purpose processor, and while their software and hardware may be two years behind, it is still quite impressive. Therefore, I predict they can capture 20% of the market. As for startups, I would advise against entering the CPU or GPU market. Instead, I would tell startups to focus on something unique where they have a competitive advantage.

Advice for founders looking to innovate with AI (18:25)

I wouldn't make a silicon chip. As a software engineer, when I first attempted to design a chip, I went through two generations. The amount of pain I had to endure was enormous. However, if you're passionate about it, I’ll provide a general answer that applies to any situation.

First and foremost, choose the right team. Even we have made mistakes in that aspect. It took us a considerable amount of time to go from three people to 10 people and eventually 20 people within a year. These individuals are the ones who will remain committed to the company through both good and bad times. This is particularly crucial in the field of deep tech, where it is easy to veer off track. Therefore, it is essential to have individuals who will not get lost in the process.

The cost of a mistake is higher than, let's say, a storage company that has the wrong team. If you're doing block storage, file system, or object storage, there's a reliable playbook for that. The team is really important in a deep tech company because you need innovation.

It's not about a better CPU or GPU, as I said. It's about looking for something that doesn't exist today and asking yourself, 'Why? Why has no one done it?' If you hear that many have tried and failed, don't go there. If you hear, "Ah, it's simple. I don't know why no one thought about it," that's a winner.

And you know, there is runtime software that can alter hardware. Some people argue that it is logical, but not many have explored this area because it is so demanding: it requires software engineers who understand compilers and hardware experts who know how to execute instructions efficiently. So innovation matters, and the story matters. When thinking about the go-to-market, de-risk it. I say this all the time and I will keep saying it. Think about who your first customers are and laser-focus there. Listen to what makes sense and make sure you can justify it.
