Building Computers for the Cloud with Steve Tuck

Published: Sept. 21, 2023, 10 a.m.


Steve Tuck, Co-Founder & CEO of Oxide Computer Company, joins Corey on Screaming in the Cloud to discuss his work to make modern computers cloud-friendly. Steve describes what it was like going through early investment rounds, and the difficult but important decision he and his co-founder made to build their own switch. Corey and Steve discuss the demand for on-prem computers that are built for cloud capability, and Steve reveals how Oxide approaches their product builds to ensure the masses can adopt their technology wherever they are.


About Steve

Steve is the Co-founder & CEO of Oxide Computer Company. He previously was President & COO of Joyent, a cloud computing company acquired by Samsung. Before that, he spent 10 years at Dell in a number of different roles.


Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is brought to us in part by our friends at Red Hat. As your organization grows, so does the complexity of your IT resources. You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.


Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, I often say it, though not usually on the show, that Screaming in the Cloud is a podcast about the business of cloud, which is intentionally overbroad so that I can talk about basically whatever the hell I want to with whoever the hell I'd like. Today's guest is, in some ways of thinking, about as far in the opposite direction from cloud as it's possible to go and still be involved in the digital world. Steve Tuck is the CEO at Oxide Computer Company. You know, computers: the things we all pretend aren't underpinning those clouds out there that we all use and pay by the hour, gigabyte, second-month-pound or whatever it works out to. Steve, thank you for agreeing to come back on the show after a couple of years, and once again suffer my slings and arrows.


Steve: Much appreciated. Great to be here. It has been a while. I was looking back, I think three years. This was, like, pre-pandemic, pre-interest rates, pre… Twitter going totally sideways.


Corey: And I have to ask, to start with that: it feels, on some level, like toward the start of the pandemic, when everything was flying high and we'd had low interest rates for a decade, that there was a lot of… well, lunacy lurking around in the industry; my own business saw it, too. It turns out that not giving a shit about the AWS bill is in fact a zero-interest-rate phenomenon. And with all that concentrated capital sloshing around, people decided to do ridiculous things with it. I would have thought, on some level, that "We're going to start a computer company in the Bay Area making computers" would have been one of those, but given that we are a year into the correction and things seem to be heading up and to the right for you folks, that take was wrong. How'd I get it wrong?


Steve: Well, I mean, first of all, you got part of it right, which is there were just a litany of ridiculous companies and projects and money being thrown in all directions at that time.


Corey: An NFT of a computer. We're going to have one of those. That's what you're selling, right? Then you had to actually hard pivot to making the real thing.


Steve: That's it. So, we might as well cut right to it, you know. This is… we went through the crypto phase. But you know, when we started the company, it was, yes, a computer company. It's on the tin. It's definitely kind of the foundation of what we're building. But you know, we think about what a modern computer looks like through the lens of cloud.


I was at a cloud computing company for ten years prior to us founding Oxide, as was Bryan Cantrill, our CTO and co-founder. And, you know, we are huge, huge fans of cloud computing, which was an interesting kind of dichotomy in conversations when we were raising for Oxide, because of course, Sand Hill is terrified of hardware. And when we think about what modern computers need to look like, they need to be in support of the characteristics of cloud, and cloud computing being not that you're renting someone else's computers, but that you have fully programmable infrastructure that allows you to slice and dice, you know, compute and storage and networking however software needs. And so, what we set out to go build was a way to give the companies that are running on-premises infrastructure, which, by the way, is almost everyone and will continue to be so for a very long time, access to the benefits of cloud computing. And to do that, you need to build a different kind of computing infrastructure and architecture, and you need to plumb the whole thing with software.


Corey: There are a number of different ways to view cloud computing. And I think that a lot of the, shall we say, incumbent vendors over in the computer manufacturing world tend to sound kind of like dinosaurs, on some level, where they're always talking in terms of, you're a giant company and you already have a whole bunch of data centers out there. But one of the magical pieces of cloud is you can have a ridiculous idea at nine o'clock tonight and by morning, you'll have a prototype, if you're of that bent. And if it turns out it doesn't work, you're out, you know, 27 cents. And if it does work, you can keep going and not have to stop and rebuild on something enterprise-grade.


So, for the small-scale stuff and rapid iteration, cloud providers are terrific. Conversely, when you wind up with giant fleets of millions of computers, in some cases there begin to be economic factors that weigh in, and for some workloads (yes, I know it's true) going to a data center is the economical choice. But my question is: in starting a new company in the direction of building these things, is it purely about economics or is there a capability story tied in there somewhere, too?


Steve: Yeah, economics actually ends up being a distant third or fourth in the list of needs and priorities from the companies that we're working with. Just to be clear, our demographic, the part of the market that we are focused on, is large enterprises: folks that are spending, you know, half a billion or a billion dollars a year on IT infrastructure. Over the last five years, they have moved a lot of the use cases that are great for public cloud out to the public cloud, and they still have this very, very large need, be it for latency reasons or cost reasons, security reasons, regulatory reasons, where they need on-premises infrastructure in their own data centers and colo facilities, et cetera. And it is for those workloads, in that part of their infrastructure, that they are forced to live with enterprise technologies that are 10, 20, 30 years old, you know, that haven't evolved much since I left Dell in 2009. And, you know, when you think about, like, what are the capabilities that are so compelling about cloud computing, one of them is, yes, what you mentioned, which is you have an idea at nine o'clock at night and swipe a credit card, and you're off and running. And that is not the case for an idea that someone has who is going to use the on-premises infrastructure of their company. And this is where you get shadow IT and 16 digits to freedom and all the like.


Corey: Yeah, everyone with a corporate credit card winds up being a shadow IT source in many cases. If your processes as a company don't make it easier to proceed the right way rather than the wrong way, people are going to be fighting against you every step of the way. Sometimes the only stick you've got is that of regulation, which in some industries, great, but in other cases, no, you get to play Whack-a-Mole. I've talked to too many companies that have specific scanners built into their mail systems, looking every month for things that look like AWS invoices.


Steve: [laugh]. Right, exactly. But if you flip it around and you say, well, what if the experience for all of my infrastructure, what I'm running or what I want to provide to my software development teams, be it rented through AWS, GCP, or Azure, or owned for economic or latency reasons, had a similar set of characteristics, where my development team could hit an API endpoint and provision instances in a matter of seconds when they had an idea, and only pay for what they use, charged back to corporate IT? And what if they were able to use the same kind of developer tools they've become accustomed to, be it Terraform scripts or the kinds of access they're used to? How do you make those developers just as productive across the business, instead of just on public cloud infrastructure?
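
To make the "hit an API endpoint and provision instances in a matter of seconds" idea concrete, here is a minimal sketch, in Rust, of what such a call could look like from a developer's script. The endpoint URL, request fields, and token are hypothetical placeholders for illustration, not any particular vendor's API:

    // Minimal sketch: asking a generic provisioning API for a small VM.
    // The URL, JSON fields, and token below are invented for illustration.
    // Requires the `reqwest` crate (blocking + json features) and `serde_json`.
    use reqwest::blocking::Client;
    use serde_json::json;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let client = Client::new();
        let request = json!({
            "name": "late-night-idea-01",
            "vcpus": 4,
            "memory_gib": 16,
            "boot_disk_gib": 100
        });
        let response = client
            .post("https://infra.example.internal/v1/instances") // hypothetical endpoint
            .bearer_auth("REPLACE_WITH_TOKEN")                    // hypothetical auth token
            .json(&request)
            .send()?;
        // A real control plane would return the new instance's ID and state here.
        println!("provisioning request returned HTTP {}", response.status());
        Ok(())
    }

The point is less the specific call than that the same kind of request could front rented or owned hardware alike.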


At that point, you are in a much stronger position, where you can say: for the portion of things that are, as you pointed out, more unpredictable, and where I want to leverage a bunch of additional services that a particular cloud provider has, I can rent that. And where I've got more persistent workloads, or where I want a different economic profile, or I need to have something very low latency to another set of services, I can own it. And that's where I think the real chasm is, because today we take for granted the basic plumbing of cloud computing, you know? Elastic compute, elastic storage, networking and security services. And we in the cloud industry end up wanting to talk a lot more about exotic services and, sort of, higher-up-the-stack capabilities. None of that basic plumbing is accessible on-prem.


Corey: I also am curious as to where exactly Oxide lives in the stack, because I used to build computers for myself in 2000, and having gone down that path a bit recently, it seems like, yeah, that process hasn't really improved all that much. The same off-the-shelf components still exist and that's great. We always used to disparagingly call spinning hard drives "spinning rust" in racks. You named the company Oxide; you're talking about the Rust programming language in public a fair bit of the time, and I'm starting to wonder if maybe words don't mean what I thought they meant anymore. Where do you folks start and stop, exactly?


Steve: Yeah, that's a good question. And when we started, we sort of thought the scope of what we were going to do and then what we were going to leverage was smaller than it has turned out to be. And by that I mean, man, over the last three years, we have hit a bunch of forks in the road where we had questions about do we take something off the shelf or do we build it ourselves. And we did not try to build everything ourselves. So, to give you a sense of kind of where the dotted line is around the Oxide product, what we're delivering to customers is a rack-level computer. So, the minimum size comes in rack form. And I think your listeners are probably pretty familiar with this. But, you know, a rack is…


Corey: You would be surprised. It's basically, what are they, about seven feet tall?


Steve: Yeah, about eight feet tall.


Corey: Yeah, yeah. Seven, eight feet, weighs a couple thousand pounds, you know, make an insulting joke about…


Steve: Two feet wide.


Corey: …NBA players here. Yeah, all kinds of these things.


Steve: Yeah. A big hunk of metal. And in the case of on-premises infrastructure, it's kind of a big, hollow hunk of metal with a bunch of 1U and 2U boxes crammed into it. What the hyperscalers have done is something very different. They started looking, at the rack level, at how you can get much more dense, power-efficient designs, doing things like using a DC bus bar down the back instead of having 64 power supplies with cables hanging all over the place in a rack, which I'm sure is what you're more familiar with.
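
A rough, purely illustrative calculation of why that bus-bar consolidation matters (the efficiency figures are assumed for the sake of the example, not measured): feeding a 15 kW IT load through 64 per-server power supplies at roughly 90 percent conversion efficiency draws about 15 / 0.90 ≈ 16.7 kW at the wall, or about 1.7 kW of loss; a shared rectifier shelf feeding a DC bus bar at roughly 95 percent efficiency draws about 15 / 0.95 ≈ 15.8 kW, or about 0.8 kW of loss, and it also removes dozens of power cables and idle redundant supplies from the airflow path.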


Corey: A tremendous amount of weight as well, because you have the metal chassis for all of those 1U things, and in some cases you wind up with, what, 46U in a rack, assuming you can even handle the cooling needs of all that.


Steve: That's right.


Corey: You have so much duplication, and so much of the weight is just metal separating one thing from the next thing down below it. And there are opportunities for massive improvement, but you need to be at a certain point of scale to get there.


Steve: You do. You do. And you also have to be taking on the entire problem; you can't pick at parts of these things. And that's really what we found. So, we started with the rack level as the design principle for the product itself, and found that that gave us the ability to get to the right geometry: to get as much CPU horsepower and storage and throughput and networking into that kind of chassis for the least amount of wattage required, kind of the most power-efficient design possible.


So, it ships at the rack level, and it ships complete with both our Oxide server sled systems and a pair of Oxide switches. When I talk about design decisions, you know, whether we build our own switch was a big, big, big question early on. We were fortunate: even though we were leaning towards thinking we needed to go do that, we had this prospective early investor who was early at AWS, and he asked a very tough question that none of our other investors had asked to that point, which is, "What are you going to do about the switch?"


And we knew that the right answer to an investor is, "No, we're already taking on too much." We're redesigning a server from scratch in, kind of, the mold of what some of the hyperscalers have learned, doing our own Root of Trust, doing our own operating system, hypervisor, control plane, et cetera. Taking on the switch could be seen as too much, but we told them, you know, we think that to be able to pull through all of the value of the security benefits and the performance and observability benefits, we can't have this [laugh], like, obscure third-party switch rammed into this rack.


Corey: It's one of those things that people don't think about, but it's the magic of cloud. With AWS's network, for example, it's magic: you can get line rate, or damn near it, between any two points, sustained.


Steve: That's right.


Corey: Try that in the data center and you wind up with massive congestion at the top-of-rack switches, where, okay, we're going to parallelize this stuff out over, you know, two dozen racks and we're going to have them all seamlessly transfer information between each other at line rate. It's like, "[laugh] No, you're not, because those top-of-rack switches will melt and become side-of-rack switches, and then bottom-puddle-of-rack switches. It doesn't work that way."


Steve: That's right.
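
To put rough, illustrative numbers on that congestion problem (the port counts here are assumptions for the example, not any real switch's spec): a top-of-rack switch with 48 × 25 Gbps server-facing ports can be offered 48 × 25 = 1,200 Gbps of traffic, while 6 × 100 Gbps uplinks can carry only 600 Gbps toward the rest of the fabric, a 2:1 oversubscription, so under sustained all-to-all traffic across racks, half the offered load cannot move at line rate no matter how good the switch is.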


Corey: And you have to put a lot of thought and planning into it. That is something that I've not heard a traditional networking vendor addressing because everyone loves to hand-wave over it.


Steve: Well, so, we told this particular prospective investor, "We think we have to go build our own switch." And he said, "Great." And we said, "You know, we think we're going to lose you as an investor as a result, but this is what we're doing." And he said, "If you're building your own switch, I want to invest." And his comment really stuck with us, which is that AWS did not stand on their own two feet until they threw out their proprietary switch vendor and built their own.


And that really unlocked, like you just mentioned, their ability, in both hardware and software, to tune and optimize to deliver that kind of line-rate capability. And that is one of the big findings for us as we got into it. Yes, it was really, really hard, but based on a couple of design decisions, P4 being the programming language that we are using as the surround for our silicon, tons of opportunities opened up for us to be able to do similar kinds of optimization and observability. And that has been a big, big win.


But to your question of, like, where does it stop? We are delivering this complete with a baked-in operating system, hypervisor, and control plane. And so, the endpoint of the system, where the customer meets it, is hitting an API or a CLI or a console that gives you the ability to spin up projects. And, you know, if one is familiar with EC2 and EBS and VPC, that VM level of abstraction is where we stop.
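
As a way to picture that stopping point, here is a minimal, hypothetical sketch of the kind of resource model a "VM level of abstraction" implies: projects that own instances, disks, and VPCs, with nothing below the virtual machine exposed. The type and field names are invented for illustration and are not taken from Oxide's actual API:

    // Hypothetical sketch of a VM-level resource model (names invented for
    // illustration): a project owns instances, disks, and VPCs, and the
    // platform exposes nothing below the virtual machine.
    #[derive(Debug)]
    struct Project {
        name: String,
        instances: Vec<Instance>,
        disks: Vec<Disk>,
        vpcs: Vec<Vpc>,
    }

    #[derive(Debug)]
    struct Instance {
        name: String,
        vcpus: u16,
        memory_gib: u32,
        attached_disks: Vec<String>, // names of disks in the same project
        vpc: String,                 // name of the VPC this instance joins
    }

    #[derive(Debug)]
    struct Disk {
        name: String,
        size_gib: u64,
    }

    #[derive(Debug)]
    struct Vpc {
        name: String,
        ipv4_cidr: String, // e.g., "10.0.0.0/16"
    }

    fn main() {
        let project = Project {
            name: "web-tier".to_string(),
            instances: vec![Instance {
                name: "app-01".to_string(),
                vcpus: 8,
                memory_gib: 32,
                attached_disks: vec!["app-01-data".to_string()],
                vpc: "prod".to_string(),
            }],
            disks: vec![Disk { name: "app-01-data".to_string(), size_gib: 200 }],
            vpcs: vec![Vpc { name: "prod".to_string(), ipv4_cidr: "10.0.0.0/16".to_string() }],
        };
        println!("{:#?}", project);
    }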


Corey: That, I think, is a fair way of thinking about it. And a lot of cloud folks are going to pooh-pooh it, saying, "Oh well, just virtual machines. That's old cloud. That just treats the cloud like a data center." And in many cases, yes, it does, because there are ways to build modern architectures that are event-driven on top of things like Lambda and API Gateway and the rest, but when you take a look at what my customers are doing and what drives the spend, it is invariably virtual machines that are largely persistent.


Sometimes they scale up, sometimes they scale down, but there's always a baseline level of load. People like to hand-wave away the fact that what they're fundamentally doing in a lot of these cases is paying the cloud provider to handle the care and feeding of those systems, which can be expensive, yes, but also delivers significant innovation beyond what almost any company is going to be able to deliver in-house. There is no way around it: AWS is better than you are, whoever you happen to be, at replacing failed hard drives. That is a simple fact. They have teams of people who are the best in the world at replacing failed hard drives. You generally do not. They are going to be better at that than you. But that's not the only axis. There's not one calculus that leads to "is cloud a scam or is cloud a great value proposition for us?" The answer is always a deeply nuanced, "It depends."


Steve: Yeah, I mean, I think cloud is a great value proposition for most, and a growing amount, of the software that's being developed and deployed and operated. And I think, you know, one of the myths out there is, hey, turn over your IT to AWS, or another cloud provider, because they have such higher-caliber personnel who are really good at swapping hard drives and dealing with networks and operationally keeping this thing running in a highly available manner that delivers good performance. That is certainly true, but a lot of the operational value in an AWS has been delivered via software, the automation, the observability, and not actual people putting hands on things. And it's an important point, because that's been a big part of what we're building into the product. You know, just because you're running infrastructure in your own data center, it does not mean that you should have to spend, you know, a thousand hours a month across a big team to maintain and operate it. And so, part of that kind of cloud, hyperscaler innovation that we're baking into this product is so that it is easier to operate, with much, much, much lower overhead, in a highly available, resilient manner.


Corey: So, I've worked in a number of data center facilities, but the companies I was working with were always at a scale where these were colocations, where they would, in some cases, rent a rack or two; in other cases, they'd rent a cage and fill it with their own racks. They didn't own the facilities themselves. Those were always handled by other companies. So, my question for you is, if I want to get a pile of Oxide racks into my environment in a data center, what has to change? What are the expectations?


I mean, yes, there are obviously going to be power requirements that data center colocation providers are very conversant with, but Open Compute, for example, had very specific requirements, to my understanding, around things like the airflow construction of the environment that the racks are placed within. How prescriptive is what you've built, in terms of doing a building retrofit to start using you folks?


Steve: Yeah, definitely not. And this was one of the tensions that we had to balance as we were designing the product. For all of the benefits of hyperscaler computing, some of the design center for, you know, the kinds of racks that run in Google and Amazon and elsewhere is hyperscaler-focused: unlimited power and, in some cases, data centers designed around the equipment itself. And where we were headed, which was basically making hyperscaler infrastructure available to, kind of, the masses, the rest of the market, these folks don't have unlimited power and they aren't going to be able to go redesign data centers. And so no, the experience should be (with exceptions maybe for folks that have very, very limited access to power) that you roll this rack into your existing data center. It sits on a standard floor tile; you give it power, you give it networking, and you go.


And we've spent a lot of time thinking about how we can operate in the wide-ranging environmental characteristics that are commonplace in data centers that folks run themselves, colo facilities, and the like. So, that's really on us, so that the customer does not have to go to much work at all to prepare and be ready for it.


Corey: One of the challenges I have is how to think about what you've done because you are rack-sized. But what that means is that my own experimentation at home recently with on-prem stuff for smart home stuff involves a bunch of Raspberries Pi and a [unintelligible 00:19:42], but I tend to more or less categorize you the same way that I do AWS Outposts, as well as mythical creatures, like unicorns or giraffes, where I don't believe that all these things actually exist because I haven't seen them. And in fact, to get them in my house, all four of those things would theoretically require a loading dock if they existed, and that's a hard thing to fake on a demo signup form, as it turns out. How vaporware is what you've built? Is this all on paper and you're telling amazing stories or do they exist in the wild?


Steve: So, last time we were on, it was all vaporware. It was a couple of napkin drawings and a seed round of funding.


Corey: I do recall you not using that description at the time, for what it's worth. Good job.


Steve: [laugh]. Yeah, well, at least we were transparent about where we were as we were going through the raise. We had some napkin drawings and we had some good ideas, we thought, and…


Corey: You formalize those, and that's called Microsoft PowerPoint.


Steve: That's it. A hundred percent.


Corey: The next generative AI play is to take the scrunched-up, stained napkin drawing, take a picture of it, and convert it to a slide.


Steve: Google Docs, you know, one of those. But no, it's got a lot of scars from the build and it is real. In fact, next week, we are going to be shipping our first commercial systems. So, we have got a line of racks out in our manufacturing facility in lovely Rochester, Minnesota. Fun fact: Rochester, Minnesota, is where the IBM AS/400s were built.


Corey: I used to work in that market, of all things.


Steve: Really?


Corey: Selling tape drives into the AS/400 market. I mean, I still maintain there's no real mainframe-migration-to-the-cloud play because there's no AWS/400. A joke that tends to sail over an awful lot of people's heads because, you know, most people aren't as miserable in their career choices as I am.


Steve: Okay, that reminds me. So, when we were originally pitching Oxide and we were fundraising, [laugh] in a particular investor meeting, they asked, you know, "What would be a good comp? Like, how should we think about what you are doing?" And fortunately, we had about 20 investor meetings to go through, so burning one on this was probably okay, but we may have used the AS/400 as a comp, talking about how [laugh] mainframe systems did such a good job of building hardware and software together. And as you can imagine, there were some blank stares in that room.


But you know, there are some good analogs historically in the computing industry, when, you know, the major players in the industry were thinking about how to deliver holistic systems to support end customers. And, you know, we see this in what Apple has done with the iPhone, and you're seeing this as a lot of stuff in the automotive industry is being pulled in-house. I was listening to a good podcast where Jim Farley from Ford was talking about how the automotive industry historically outsourced all of the software that controls cars, right? So, like, Bosch would write the software for the controls for your seats.


And they had all these suppliers writing the software, and what it meant was that innovation was not possible, because you'd have to go out to suppliers to get software changes for any little change you wanted to make. And in the computing industry, in the '80s, you saw this blow apart where, like, firmware got outsourced. In the IBM-and-the-clones kind of race, everyone started outsourcing firmware and outsourcing software. Microsoft started taking over operating systems. And then VMware emerged and was doing a virtualization layer.


And this kind of fragmented ecosystem is the landscape today that every single on-premises infrastructure operator has to struggle with. It's a kit car. And so, pulling it back together, designing things in a vertically integrated manner, is what the hyperscalers have done. And so, you mentioned Outposts, and it's a good example: I mean, the most public cloud of public cloud companies created a way for folks to get their system on-prem.


I mean, if you need anything to underscore the draw and the demand for cloud computing-like infrastructure on-prem, just the fact that that emerged at all tells you that there is this big need. Because you've got, you know, I don't know, a trillion dollars' worth of IT infrastructure out there, and you have maybe 10% of it in the public cloud. And that's up from 5% when Jassy was on stage in '21, talking about 95% of stuff living outside of AWS. But there's going to be a giant market of customers that need to own and operate infrastructure. And again, things have not improved much in the last 10 or 20 years for them.


Corey: They have taken a tone onstage about how, "Oh, those workloads that aren't in the cloud yet, yeah, those people are legacy idiots." And I don't buy that for a second, because, believe it or not (I know that this cuts against what people commonly believe in public), company execs are generally not morons, and they make decisions with context and constraints that we don't see. Things are the way they are for a reason. And I promise that 90% of corporate IT workloads that still live on-prem are not being managed or run by people who've never heard of the cloud. There was a decision made, when some other things were migrating, of: do we move this thing to the cloud or don't we? And the answer at the time was no, we're going to keep this thing on-prem where it is now, for a variety of reasons of varying validity. But I don't view that as a bug. I also, frankly, don't want to live in a world where all the computers are basically run by three different companies.


Steve: You're spot on. It does a total disservice to these smart and forward-thinking teams in every one of the Fortune 1000-plus companies who are working within the constraints that they have, and some of those constraints are not monetary or entirely workload-based. If you want to flip it around, we were talking to a large cloud SaaS company, and their reason for wanting to extend beyond the public cloud is that they want to improve latency for their e-commerce platform. Navigating their way through the complex layers of the networking stack at GCP to get to where the customer assets are, in colo facilities, adds lag time on the platform that can cost them hundreds of millions of dollars. And so, we need to get beyond this notion of, like, "Oh, well, the dark ages are for software that can't run in the cloud, and that's on-prem. And it's just a matter of time until everything moves to the cloud."


In the forward-thinking models of public cloud, it should be both. I mean, you should have a consistent experience, from a certain level of the stack down, everywhere. And then it's like, do I want to rent or do I want to own for this particular use case? In my vast set of infrastructure needs, do I want this to run in a data center that Amazon runs or do I want this to run in a facility that is close to this other provider of mine? And I think that's best for all. And then it's not this kind of false dichotomy of quality infrastructure or ownership.


Corey: I find that there are also workloads where people will come to me and say, "Well, we don't think this is going to be economical in the cloud" (because again, I focus on AWS bills; that is the lens I view things through) "but the AWS sales rep says it will be. What do you think?" And I look at what they're doing, and especially if it involves high volumes of data transfer, I laugh a good hearty laugh and say, "Yeah, keep that thing in the data center where it is right now. You will thank me for it later."


It's, "Well, can we run this in an economical way in AWS?" As long as you're okay with economical meaning six times what you're paying a year right now for the same thing, yeah, you can. I wouldn't recommend it. And the numbers sort of speak for themselves. But it's not just an economic play.


There's also the story of: does this increase their capability? Does it let them move faster toward their business goals? And in a lot of cases, the answer is no, it doesn't. It's one of those business-process things that has to exist for a variety of reasons. You don't get to reimagine it for funsies, and even if you did, it doesn't advance what the company is trying to do, so focus on something that differentiates, as opposed to this thing that you're stuck on.


Steve: That's right. And what we see today is that it is easy to be in the mindset that running things on-premises is kind of backwards-facing, because the experience of it today is still very, very difficult. I mean, talking to folks, they're sharing with us that it takes a hundred days from the time all the different boxes land in their warehouse to actually having usable infrastructure that developers can use. And our goal, what we intend to go hit with Oxide, is that you can roll in this complete rack-level system, plug it in, and within an hour you have developers accessing cloud-like services out of the infrastructure. And that… God, countless stories of firmware bugs that would send all the fans in the data center nonlinear and soak up 100 kW of power.


Corey: Oh, God. And the problems that you had with the out-of-band management systems. For a long time, I thought DRAC stood for "Dell, RMA Another Computer." It was awful having to deal with those things. There was so much room for innovation in that space, which no one really grabbed onto.


Steve: There was a really, really interesting talk at DEFCON that we just stumbled upon yesterday. The NVIDIA folks are giving a talk on BMC exploits… like, a very, very serious BMC exploit. And what most people don't know is, first of all, the BMC, the Baseboard Management Controller, is like the brainstem of the computer. It has access to… it's a backdoor into all of your infrastructure. It's a computer inside a computer, and it's got software and hardware that your server OEM didn't build and doesn't understand very well.


And firmware is even worse, because, you know, firmware written by an American Megatrends or another vendor is a big blob of software that gets loaded into these systems that is very hard to audit, and very hard to ascertain what's happening in. And it's no surprise that, back when we were running all the data centers at a cloud computing company, you'd run into these issues, and you'd go to the server OEM and they'd kind of throw their hands up. Well, first they'd gaslight you and say, "We've never seen this problem before," but when you thought you'd root-caused something down to firmware, it was anyone's guess. And this is kind of the current condition today. And back to, like, the journey to get here: we realized that you had to blow away that old extant firmware layer, and we rewrote our own firmware in Rust. Yes [laugh], I've done a lot in Rust.
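
To illustrate the kind of check that owning the firmware and a hardware Root of Trust makes possible, here is a toy sketch of the measured-boot idea: hash the next boot stage and refuse to hand off unless it matches a known-good measurement. This is an illustration of the concept only, not Oxide's implementation; the expected hash is a placeholder, and a real Root of Trust would keep its measurements or keys in tamper-resistant storage and use signatures rather than a bare constant:

    // Toy sketch of measured boot: measure the next stage's image and only
    // continue if it matches a known-good value. Illustrative only.
    // Requires the `sha2` and `hex` crates.
    use sha2::{Digest, Sha256};

    // Placeholder; a real system would store this measurement (or a signing
    // key) in the hardware Root of Trust, not in source code.
    const EXPECTED_SHA256_HEX: &str =
        "0000000000000000000000000000000000000000000000000000000000000000";

    fn measure(image: &[u8]) -> String {
        let mut hasher = Sha256::new();
        hasher.update(image);
        hex::encode(hasher.finalize())
    }

    fn main() {
        let image = std::fs::read("next-stage-firmware.bin").expect("read firmware image");
        let measured = measure(&image);
        if measured == EXPECTED_SHA256_HEX {
            println!("measurement matches; handing off to next stage");
        } else {
            eprintln!("measurement mismatch ({measured}); refusing to boot");
            std::process::exit(1);
        }
    }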


Corey: No, it was in Rust, but, on some level, that's what Nitro is, as best I can tell, on the AWS side. But it turns out that you don't tend to have the same resources as a one-and-a-quarter (at the moment) trillion-dollar company. That keeps [valuing 00:30:53]. At one point, they lost a comma and that was sad and broke all my logic for that, and I haven't fixed it since. Unfortunate stuff.


Steve: Totally. I think that was another kind of question early on from a lot of investors: "Hey, how are you going to pull this off with a smaller team when there's a lot of surface area here?" Certainly a reasonable question. It definitely was hard. The one advantage, among others, is that when you are designing something in a vertical, holistic manner, those design integration points are narrowed down to just your equipment.


And when someone's writing firmware, when AMI is writing firmware, they're trying to cover hundreds and hundreds of components across dozens and dozens of vendors. And we have the advantage of having this, like, purpose-built system, end-to-end, from the lowest level, from the first boot instruction all the way up through the control plane, and from rack to switch to server. That definitely helped narrow the scope.


Corey: This episode has been fake sponsored by our friends at AWS with the following message: Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton. Thank you for your l-, lack of support for this show. Now, AWS has been talking about Graviton an awful lot, which is their custom in-house ARM processor. Apple moved over to ARM, and instead of talking about benchmarks they won't publish and marketing campaigns with words that don't mean anything, they've let the results speak for themselves. In time, I found that almost all of my workloads have moved over to ARM architecture for a variety of reasons, and my laptop now gets 15 hours of battery life when all is said and done. You're building these things on top of x86. What is the deal there? I do not accept that you hadn't heard of ARM until just now because, as mentioned, Graviton, Graviton, Graviton.


Steve: That's right. Well, so, why x86, to start? And I say "to start" because we have just launched our first-generation products, and our second-generation products that we now have underway are going to be x86 as well. We've built this system on AMD Milan silicon; we are going to be launching a Genoa sled.


But when you're thinking about what silicon to use, obviously there's a bunch of factors that go into the decision. You're looking at applicability to workload, performance, power management, for sure, and if you carve up what you are trying to achieve, x86 is still a terrific fit for the broadest set of workloads that our customers are trying to solve for. And choosing which x86 architecture was certainly an easier choice, come 2019; at that point, AMD had made a bunch of improvements in performance and energy efficiency in the chip itself. We've looked at other architectures, and I think as we incorporate those in the future roadmap, it's just going to be a question of what you are trying to solve for.


You mentioned power management, and that, you know, low-power systems, is commonly where folks have gone beyond x86. As we look toward hardware acceleration products and future products, we'll certainly look beyond x86, but x86 has a long, long road to go. It is still kind of the foundation for what, again, is a general-purpose cloud infrastructure for being able to slice and dice a variety of workloads.


Corey: True. I have to look around my environment and realize that Intel is not going anywhere. And that's not just an insult to their lack of progress on committed roadmaps that they consistently miss. But…


Steve: [sigh].


Corey: Enough on that particular topic because we want to keep this, you know, polite.


Steve: Intel has definitely had some struggles, for sure; very public ones, I think. We were really excited, and continue to be very excited, about their Tofino silicon line. And this came by way of the Barefoot Networks acquisition. I don't know how much you had paid attention to Tofino, but what was really, really compelling about Tofino is the focus on both hardware and software, and on programmability.


So, great chip. And P4 is the programming language that surrounds that. And we have gotten very, very deep on P4, and that is some of the best tech to come out of Intel lately. But from a core silicon perspective for the rack, we went with AMD. And again, that was a pretty straightforward decision at the time. And we're planning on having this anchored around AMD silicon for a while now.


Corey: One last question I have before we wind up calling it an episode: it seems (at least as of this recording, it's still embargoed, but we're not releasing this until that winds up changing) you folks have just raised another round, which means that your napkin doodles have apparently drawn more folks in, and now that you're shipping, you're not just bringing in customers, but also additional investor money. Tell me about that.


Steve: Yes, we just completed our Series A. So, when we last spoke three years ago, we had just raised our seed, $20 million at the time, and we had expected that it was going to take about that to be able to build the team, build the product, and be able to get to market, and [unintelligible 00:36:14] tons of technical risk along the way. I mean, there was technical risk up and down the stack around this [De Novo 00:36:21] server design, the switch design. And software is still the kind of disproportionate majority of what this product is, from hypervisor up through the control plane, the cloud services, et cetera. So…


Corey: We just view it as software with a really, really confusing hardware dongle.


Steve: [laugh]. Yeah. Yes.


Corey: Super heavy. We're talking enterprise and government-grade here.


Steve: That's right. There's a lot of software to write. And so, we had a bunch of milestones, and as we got through them, one of the big ones was getting Milan silicon booting on our firmware. It was funny: this was the thing that, clearly, the industry was most suspicious of, us doing our own firmware, and you could see it when we demonstrated booting this, like, a year and a half ago, and AMD all of a sudden just lit up, going from kind of arm's length to, like, "How can we help? This is amazing." You know? And they could start to see the benefits when you can tie low-level silicon intelligence up through a hypervisor; there's just…


Corey: No, I love the existing firmware I have. It looks like it was written in 1984 and has terrible user ergonomics that haven't been updated at all, and every time an update comes through, it's a 50/50 shot as to whether it fries the box or not. Yeah. No, I want that.


Steve: That's right. And you look at these hyperscale data centers, and it's like, no: you've got intelligence from that first boot instruction through a Root of Trust, up through the software of the hyperscaler, and up to the user level. And so, as we were going through and knocking down each one of these layers of the stack, doing our own firmware, doing our own hardware Root of Trust, getting that all the way plumbed up into the hypervisor and the control plane, number one, on the customer side, folks moved from, "This is really interesting. We need to figure out how we can bring cloud capabilities to our data centers. Talk to us when you have something," to, "Okay, we actually…" Back to the earlier question on vaporware, you know, it was great having customers out here to Emeryville where they can put their hands on the rack and, you know, put their hands on the software, being able to, like, look at real running software and that end cloud experience.


And that led to getting our first couple of commercial contracts. So, we've got some great first customers, including a large department of the federal government and a leading firm on Wall Street, that we're going to be shipping systems to in a matter of weeks. And as you can imagine, along with that, that drew a bunch of renewed interest from the investor community. It's certainly a different climate today than it was back in 2019, but what was great to see is that you still have great investors who understand the importance of making bets in the hard-tech space and in companies that are looking to reinvent certain industries. And so, our existing investors all participated, and we added a bunch of terrific new investors, both strategic and institutional.


And you know, this capital is going to be super important now that we are headed into market and we are beginning to scale up the business and make sure that we have a long road to go. And of course, maybe as importantly, this was a real confidence boost for our customers. They're excited to see that Oxide is going to be around for a long time and that they can invest in this technology as an important part of their infrastructure strategy.


Corey: I really want to thank you for taking the time to speak with me about, well, how far you've come in a few years. If people want to learn more and have the requisite loading dock, where should they go to find you?


Steve: So, we try to put everything up on the site: oxidecomputer.com or oxide.computer. We also, if you remember, did [On the Metal 00:40:07], a tales-from-the-hardware-software-interface podcast that we did when we started. We have shifted that to Oxide and Friends, and the shift there is that we're spending a little bit more time talking about the guts of what we built and why. So, if folks are interested in, like, why the heck did you build a switch and what does it look like to build a switch, we actually go into depth on that. And, you know, what does bring-up on a new server motherboard look like? It's got some episodes out there that might be worth checking out.


Corey: We will definitely include a link to that in the [show notes 00:40:36]. Thank you so much for your time. I really appreciate it.


Steve: Yeah, Corey. Thanks for having me on.


Corey: Steve Tuck, CEO at Oxide Computer Company. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry ranting comment because you are in fact a zoology major, and you're telling me that some animals do in fact exist. But I'm pretty sure of the two of them, it's the unicorn.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
