Fixing What's Broken in Monitoring and Observability with Jean Yang

Published: April 20, 2023, 10 a.m.


Jean Yang, CEO of Akita Software, joins Corey on Screaming in the Cloud to discuss how she went from academia to tech founder, and what her company is doing to improve monitoring and observability. Jean explains why Akita is different from other observability and monitoring solutions, and how it bridges the gap between what people know they should be doing and what they actually do in practice. Corey and Jean explore why the monitoring and observability space has been so broken, and why it's important for people to see monitoring as a chore and not a hobby. Jean also reveals how she took a leap from being an academic professor to founding a tech start-up.

About Jean

Jean Yang is the founder and CEO of Akita Software, providing the fastest time-to-value for API monitoring. Jean was previously a tenure-track professor in Computer Science at Carnegie Mellon University.

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone whose company has… well, let's just say that it has piqued my interest. Jean Yang is the CEO of Akita Software and not only is it named after a breed of dog, which frankly, Amazon service namers could take a lot of lessons from, but it also tends to approach observability slash monitoring from a perspective of solving the problem rather than preaching a new orthodoxy. Jean, thank you for joining me.


Jean: Thank you for having me. Very excited.


Corey: In the world that we tend to operate in, there are so many different observability tools, and as best I can determine, observability is hipster monitoring. Well, if we call it monitoring, we can't charge you quite as much money for it. And whenever you go into any environment of significant scale, we pretty quickly discover that, "What monitoring tool are you using?" The answer is, "Here are the 15 that we use." Then you talk to other monitoring and observability companies and ask them which ones of those they've replaced, and the answer becomes, "We're number 16." Which is less compelling of a pitch than you might expect. What does Akita do? Where do you folks start and stop?


Jean: We want to be—at Akita—your first stop for monitoring, and we want to be all of the monitoring you need, up to a certain level. And here's the motivation. So, we've talked with hundreds, if not thousands, of software teams over the last few years and what we found is there is such a gap between best practice, what people think everybody else is doing, what people are talking about at conferences, and what's actually happening in software teams. And so, what software teams have told me over and over again is, hey, we either don't actually use very many tools at all, or we use 15 tools in name, but it's, you know, one [laugh] one person on the team set this one up, it's monitoring one of our endpoints, we don't even know which one sometimes. Who knows what the thresholds are really supposed to be. We got too many alerts one day, we turned it off.


But there's very much a gap between what people are saying they're supposed to do, what people in their heads say they're going to do next quarter or the quarter after that, and what's really happening in practice. And what we saw was teams are falling more and more into monitoring debt. And so effectively, their customers are becoming their monitoring and it's getting harder to catch up. And so, what Akita does is we're the fastest, easiest way for teams to quickly see what endpoints you have in your system—so that's API endpoints—what's slow and what's throwing errors. And you might wonder, okay, wait, wait, wait, Jean. Monitoring is usually about, like, logs, metrics, and traces. I'm not used to hearing about API—like, what do APIs have to do with any of it?


And my view is, look, we want the most simple form of what might be wrong with your system, we want a developer to be able to get started without having to change any code, make any annotations, drop in any libraries. APIs are something you can watch from the outside of a system. And when it comes to which alerts actually matter, where do you want errors to be alerts, where do you want thresholds to really matter, my view is, look, the places where your system interfaces with another system are probably where you want to start if you've really gotten nothing. And so, Akita's view is, we're going to start from the outside in on this monitoring. We're turning a lot of the views on monitoring and observability on their head and we just want to be the tool that you reach for if you've got nothing, it's the middle of the night, you have alerts on some endpoint, and you don't want to spend a few hours or weeks setting up some other tool. And we also want to be able to grow with you up until you need that power tool that many of the existing solutions out there are today.


Corey: It feels like monitoring is very often one of those reactive things. I come from the infrastructure world, so you start off with, "What do you use for monitoring?" "Oh, we wait till the help desk calls us and users are reporting a problem." Okay, that gets you somewhere. And then it becomes oh, well, what was wrong that time? The drive filled up. Okay, so we're going to build checks in that tell us when the drives are filling up.
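[Editor's note: for readers who want to see the shape of the "enumerate the badness" checks Corey is describing, here is a minimal, hypothetical sketch in Python. The mount points and the 90% threshold are illustrative assumptions, not anyone's real configuration.]

```python
import shutil

# Hypothetical threshold check: alert when a drive crosses 90% full.
# MOUNTS and THRESHOLD are illustrative assumptions.
MOUNTS = ["/", "/var"]
THRESHOLD = 0.90

def check_disks():
    for mount in MOUNTS:
        usage = shutil.disk_usage(mount)
        fraction = usage.used / usage.total
        if fraction >= THRESHOLD:
            # A real system would page someone or emit a metric here.
            print(f"ALERT: {mount} is {fraction:.0%} full")

if __name__ == "__main__":
    check_disks()
```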


And you wind up trying to enumerate all of the different badness. And as a result, if you take that to its logical conclusion, one of the stories that I heard out of MySpace once upon a time—which dates me somewhat—is that you would have a shift, so there were three shifts working around the clock, and each one would open about 5,000 tickets, give or take, for the monitoring alerts that wound up firing off throughout their infrastructure. At that point, it's almost, why bother? Because no one is going to be around to triage these things; no one is going to see any of the signal buried in all of that noise. When you talk about doing this from an API perspective, are you running synthetics against those APIs? Are you shimming them in order to see what's passing through them? What's the implementation side look like?


Jean: Yeah, that's a great question. So, we're using a technology called BPF, Berkeley Packet Filter. The more trendy, buzzy term is eBPF—


Corey: The eBPF. Oh yes.


Jean: Yeah, extended Berkeley Packet Filter. But here's the secret: we only use the BPF part. It's actually a little easier for users to install. The E part is, you know, fancy and often finicky. But um—


Corey: SEBPF then: Shortened Extended BPF. Why not?


Jean: [laugh]. Yeah. And what BPF allows us to do is passively watch traffic from the outside of a system. So, think of it as you're sending API calls across the network. We're just watching that network. We're not in the path of that traffic. So, we're not intercepting the traffic in any way, we're not creating any additional overhead for the traffic, we're not slowing it down in any way. We're just sitting on the side, we're watching all of it, and then we're taking that and shipping an obfuscated version off to our cloud, and then we're giving you analytics on that.
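[Editor's note: Akita's implementation isn't shown in this transcript; the sketch below only illustrates the passive-capture idea Jean describes, using the scapy library, whose filter string is compiled to classic BPF by libpcap. It watches plaintext HTTP from the side rather than sitting in the traffic's path, and it needs root privileges to sniff.]

```python
from scapy.all import Raw, sniff

def on_packet(pkt):
    # Peek only at payloads that look like plaintext HTTP request lines.
    if pkt.haslayer(Raw):
        payload = bytes(pkt[Raw].load)
        if payload[:4] in (b"GET ", b"POST", b"PUT ", b"HEAD"):
            request_line = payload.split(b"\r\n", 1)[0]
            print(request_line.decode(errors="replace"))

# The BPF filter runs in the kernel; the observer never intercepts
# or slows the traffic, it only receives a copy of matching packets.
sniff(filter="tcp port 80", prn=on_packet, store=False)
```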


Corey: One of the things that strikes me as being… I guess, a common trope is there are a bunch of observability solutions out there that offer this sort of insight into what's going on within an environment, but it's, "Step one: instrument with some SDK or some agent across everything. Do an entire deploy across your fleet." Which, yeah, people are not generally going to be in a hurry to sign up for. And further, you also said a minute ago that the idea is that someone could start using this in the middle of the night in the middle of an outage, which tells me that it's not, "Step one: get the infrastructure sparkling. Step two: do a global deploy to everything." How do you go about doing that? What is the level of embeddedness into the environment?


Jean: Yeah, that's a great question. So, the reason we chose BPF is I wanted a completely black-box solution. So, no SDKs, no code annotations. I wanted people to be able to change a config file and have our solution apply to anything that's on the system. So, you could add routes, you could do all kinds of things. I wanted there to be no additional work on the part of the developer when that happened.


And so, we're not the only solution that uses BPF or eBPF. There's many other solutions that say, "Hey, just drop us in. We'll let you do anything you want." The big difference is what happens with the traffic once it gets processed. So, what eBPF or BPF gives you is it watches everything about your system. And so, you can imagine that's a lot of different events. That's a lot of things.


If you're trying to fix an incident in the middle of the night and someone just dumps on you 1,000 pages of logs, like, what are you going to do with that? And so, our view is, the more interesting and important and valuable thing to do here is not make it so that you just have the ability to watch everything about your system, but to make it so that developers don't have to sift through thousands of events just to figure out what went wrong. So, we've spent years building algorithms to automatically analyze these API events to figure out, first of all, what are your endpoints? Because it's one thing to turn on something like Wireshark and just say, okay, here are the thousand API calls I saw—ten thousand—but it's another thing to say, "Hey, 500 of those were actually the same endpoint and 300 of those had errors." That's quite a hard problem.


And before us, it turns out that there was no other solution that even did that to the level of being able to compile together, "Here are all the slow calls to an endpoint," or, "Here are all of the erroneous calls to an endpoint." That was blood, sweat, and tears of developers in the night before. And so, that's the first major thing we do. And then metrics on top of that. So, today we have what's slow, what's throwing errors. People have asked us for other things, like show me what happened after I deployed. Show me what's going on this week versus last week. But now that we have this data set, you can imagine there's all kinds of questions we can now start answering much more quickly on top of it.
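[Editor's note: the endpoint-inference problem Jean describes can be sketched roughly as follows. This is a deliberately naive stand-in for Akita's actual algorithms: it collapses concrete paths such as /users/123 into templates such as /users/{id} by treating numeric or UUID-like segments as parameters, then aggregates error counts and latency per template.]

```python
import re
from collections import defaultdict

# Naive heuristic: a path segment that is all digits or UUID-shaped
# is treated as a parameter rather than part of the endpoint name.
PARAM = re.compile(r"^(\d+|[0-9a-fA-F-]{32,36})$")

def endpoint_of(method: str, path: str) -> str:
    parts = ["{id}" if PARAM.match(p) else p for p in path.strip("/").split("/")]
    return f"{method} /" + "/".join(parts)

def aggregate(calls):
    """calls: iterable of (method, path, status, latency_ms) observations."""
    stats = defaultdict(lambda: {"count": 0, "errors": 0, "total_ms": 0.0})
    for method, path, status, latency_ms in calls:
        s = stats[endpoint_of(method, path)]
        s["count"] += 1
        s["errors"] += int(status >= 500)
        s["total_ms"] += latency_ms
    return stats

# Made-up example observations, purely for illustration.
observed = [
    ("GET", "/users/123", 200, 12.0),
    ("GET", "/users/456", 500, 950.0),
    ("POST", "/orders", 201, 40.0),
]
for endpoint, s in aggregate(observed).items():
    print(endpoint, "-", s["errors"], "errors,", s["total_ms"] / s["count"], "ms avg")
```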


Corey: One thing that strikes me about your site is that when I go to akitasoftware.com, you've got a shout-out section at the top. And because I've been doing this long enough where I find that, yeah, you work at a company; you're going to say all kinds of wonderful, amazing, aspirational things about it, and basically because I have deep-seated personality disorders, I will make fun of those things as my default reflexive reaction. But something that AWS, for example, does very well is when they announce something ridiculous on stage at re:Invent, I make fun of it, as is normal, but then they have a customer come up and say, "And here's the expensive, painful problem that they solved for us."


And that's where I shut up and start listening. Because it's a very different story to get someone else, who is presumably not being paid, to get on stage and say, "Yeah, this solved a sophisticated, painful problem." Your shout-outs page has not just a laundry list of people saying great things about it, but there are folks who have been on the show here, people I know and trust: Scott Johnston over at Docker, Gergely Orosz over at The Pragmatic Engineer, and other folks who have been luminaries in the space for a while. These are not the sort of people that are going to say, "Oh, sure. Why not? Oh, you're going to send me a $50 gift card in a Twitter DM? Sure, I'll say nice things," like one of those accounts spamming nonsense replies to a viral tweet. These are people who have gravitas. It's clear that there's something you're building that is resonating.


Jean: Yeah. And for that, they found us. Everyone that I've tried to bribe to say good things about us actually [laugh] refused.


Corey: Oh, yeah. As it turns out, it's one of those things where people are more expensive than you might think. It's like, "What, you want me to sell my credibility down the road?" Doesn't work super well. But there's something about the unsolicited testimonials that come out of, "this is amazing," once people start kicking the tires on it.


You're currently in open beta. So, I guess my big question for you is, whenever you see a product that says, "Oh, yeah, we solve everything cloud, on-prem, on physical instances, on virtual machines, on Docker, on serverless, everything across the board. It's awesome." I have some skepticism on that. What is your ideal application architecture that Akita works best on? And what sort of things are you a complete nonstarter for?


Jean: Yeah, I'll start with a couple of things we work well on. So, container platforms. We work relatively well there. So, that's your Fargate, that's your Azure Web Apps—you know, things running on what we call container platforms. Kubernetes is also something that a lot of our users have picked us up and had success with us on. I will say our Kubernetes deploy is not as smooth as we would like. We say, you know, you can install us—


Corey: Well, that is Kubernetes, yes.


Jean: [laugh]. Yeah.


Corey: Nothing in Kubernetes is as smooth as we would like.


Jean: Yeah, so we're actually rolling out Kubernetes injection support in the next couple of weeks. So, those are the two that people have had the most success on. If you're running on bare metal or on a VM, we work, but I will say that you have to know your way around a little bit to get that to work. What we don't work on is any Platform as a Service. So, like, a Heroku, a Lambda, a Render at the moment. For those, we haven't found a way to passively listen to the network traffic in a good way right now.


And we also work best for unencrypted HTTP REST traffic. So, if you have encrypted traffic, it's not a non-starter, but you need to fall into a couple of categories: you either need to be using Kubernetes, where you can run Akita as a sidecar, or you're using Nginx. And so, that's something we're still expanding support on. And we do not support GraphQL or gRPC at the moment.


Corey: That's okay. Neither do I. It does seem these days that unencrypted HTTP API calls are increasingly becoming something of a relic, where folks are treating those as anti-patterns to be stamped out ruthlessly. Are you still seeing significant deployments of unencrypted APIs?


Jean: Yeah. [laugh]. So, Corey—


Corey: That is the reality, yes.


Jean: That's a really good question, Corey, because in the beginning, we weren't sure what we wanted to focus on. And I'm not saying the whole deployment is unencrypted HTTP, but there is a place to install Akita to watch where it's unencrypted HTTP. And so, this is what I mean by if you have encrypted traffic but you can install Akita as a Kubernetes sidecar, we can still watch that. But there was a big question when we started: should this be GraphQL, gRPC, or should it be REST? And I've read the "State of the API Report" from Postman for, you know, five years, and I still keep up with it.


And every year, it seemed that not only was REST remaining dominant, it was actually growing. So, [laugh] this was shocking to me as well because people said, "Well, we have this more structured stuff now. There's gRPC, there's GraphQL." But it seems that for the added complexity, people weren't necessarily seeing the value, and so REST continues to dominate. And I've actually even seen a decline in GraphQL since we first started doing this. So, I'm fully on board the REST wagon. And in terms of encrypted versus unencrypted, I would also like to see more encryption as well. That's why we're working on burning down the long tail of support for that.


Corey: Yeah, it's one of those challenges. Whenever you're deploying something relatively new, there's this idea that it should be forward-looking and you, on some level, want to modernize your architecture and infrastructure to keep up with it. An AWS integration story I see that's like that these days is, "Oh, yeah, generate an IAM credential set and just upload those into our system." Yeah, the modern way of doing that is role assumption: define a role and here's how to configure it so that it can do what we need to do. So, whenever you start seeing things that are, "Oh, yeah, just turn the security clock back in time a little bit," that's always a little bit of an eyebrow raise.
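[Editor's note: for contrast, here is a minimal sketch of the role-assumption pattern Corey prefers, using boto3 and STS. The role ARN and session name are hypothetical placeholders; the point is that the integration receives short-lived, scoped credentials instead of a static key pair.]

```python
import boto3

# Assume a role and receive temporary credentials that expire on their own,
# instead of handing a vendor a long-lived IAM access key.
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/VendorIntegrationRole",  # hypothetical
    RoleSessionName="vendor-integration",                            # hypothetical
    DurationSeconds=3600,
)
creds = resp["Credentials"]

# Use the temporary credentials like any other AWS credentials.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```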


I can also definitely empathize with the joys of dealing with anything that even touches networking in a Lambda context. Building the Lambda extension for Tailscale was one of the last big dives I made into that area and I still have nightmares as a result. It does a lot of interesting things right up until you step off the golden path. And then suddenly, everything becomes yaks all the way down, in desperate need of shaving.


Jean: Yeah, Lambda is something we want to handle on our roadmap, but I… believe we need a bigger team before [laugh] we are ready to tackle that.


Corey: Yeah, "we're going to need a bigger boat" is very often [laugh] the story people have when they start looking at entire new architectural paradigms. So, you end up talking about working in containerized environments. Do you find that most of your deployments are living in cloud environments or in private data centers—some people call them private cloud? Where does the bulk of your user applications tend to live these days?


Jean: The bulk of our user applications are in the cloud. So, we're targeting small to medium businesses to start. The reason being, we want to give our users a magical deployment experience. So, right now, a lot of our users are deploying in under 30 minutes. That's in no small part due to automations that we've built.


And so, we initially made the strategic decision to focus on places where we get the most visibility. And so—where one, we get the most visibility, and two, we are ready for that level of scale. So, we found that, you know, for a large business, we've run inside some of their production environments and there are API calls that we don't yet handle well, or it's just such a large number of calls that we're not doing the inference as well and our algorithms don't work as well. And so, we've made the decision to start small, build our way up, and start in places where we can just aggressively iterate because we can see everything that's going on. And so, we've stayed away, for instance, from any on-prem deployments for that reason, because then we can't see everything that's going on. And so, smaller companies that are okay with us watching pretty much everything they're doing have been where we started. And now we're moving up into the medium-sized businesses.


Corey: The challenge that I guess I'm still trying to wrap my head around is, I think that it takes someone with a particularly rosy set of glasses on to look at the current state of monitoring and observability and say that it's not profoundly broken in a whole bunch of ways. Now, where it all falls apart, Tower of Babel-esque, is that there doesn't seem to be consensus on where exactly it's broken. Where do you see, I guess, this coming apart at the seams?


Jean: I agree, it's broken. And so, if I tap into my background, which is that I was a programming languages person in my very recent previous life, programming languages people like to say the problem and the solution all lie in abstraction. And so, computing is all about building abstractions on top of what you have now so that you don't have to deal with so many details and you get to think at a higher level; you're free of the shackles of so many low-level details. What I see is that today, monitoring and observability is a sort of abstraction nightmare. People have just taken it as gospel that you need to live at the lowest level of abstraction possible, the same way that people truly believed that assembly code was the way everybody was going to program forevermore back, you know, 50 years ago.


So today, what's happening is that when people think monitoring, they think logs, not what's wrong with my system, what do I need to pay attention to? They think, "I have to log everything, I have to consume all those logs, we're just operating at the level of logs." And that's not wrong because there haven't been any tools that have given people any help above the level of logs. Although that's not entirely correct, you know? There's also events and there's also traces, but I wouldn't say that's actually lifting the level of [laugh] abstraction very much either.


And so, people today are thinking about monitoring and observability as this full control, like, I'm driving my, like, race car, completely manual transmission, I want to feel everything. And not everyone wants to or needs to do that to get to where they need to go. And so, my question is, how far can we lift the level of abstraction for monitoring and observability? I don't believe that other people are really asking this question because most of the other players in the space, they're asking what else can we monitor? Where else can we monitor it? How much faster can we do it? Or how much more detail can we give the people who really want the power tools?


But the people entering the buyer's market with needs, they're not people—you don't have, like, you know, hordes of people who need more powerful tools. You have people who don't know about the systems they're dealing with and they want easier. They want to figure out if there's anything wrong with their system so they can get off work and do other things with their lives.


Corey: That, I think, is probably the thing that gets overlooked the most. It's that people don't tend to log into their monitoring systems very often. They don't want to. When they do, it's always out of hours, middle of the night, and they're confronted with a whole bunch of upsell dialogs of, "Hey, it's been a while. You want to go on a tour of the new interface?"


Meanwhile, anything with half a brain can see there's a giant spike on the graph or telemetry stopped coming in.


Jean: Yeah.


Corey: It's way outside of normal business hours where this person is and maybe they're not going to be in the best mood to engage with your brand.


Jean: Yeah. Right now, I think a lot of the problem is, you're either working with monitoring because you're desperate, you're in the middle of an active incident, or you're a monitoring fanatic. And there isn't a lot in between. So, there's a tweet that someone in my network tweeted me that I really liked, which is, "Monitoring should be a chore, not a hobby." And right now, it's either a hobby or an urgent necessity [laugh].


And when it gets to the point—so you know, if we think about doing dishes this way, it would be as if, like, only, like, the dish fanatics did dishes, or, like, you would just have piles of dishes, like, all over the place and raccoons and no dishes left, and then you're, like, "Ah, time to do a thing." But there should be something in between where there's a defined set of things that people can do on a regular basis to keep up with what they're doing. It should be accessible to everyone on the team, not just a couple of people who are true fanatics. No offense to the people out there, I love you guys, you're the ones who are really helping us build our tool the most, but you know, there's got to be a world in which more people are able to do the things you do.


Corey: That's part of the challenge is bringing a lot of the fire down from Mount Olympus to the rest of humanity, where at some level, Prometheus was a great name from that—


Jean: Yep [laugh].


Corey: Just from that perspective because you basically need to be at that level of insight. I think Kubernetes suffers from the same overall problem, where it is not reasonably responsible to run a Kubernetes production cluster without some people who really know what's going on. That's rapidly changing, which is for the better, because most companies are not going to be able to afford a multimillion-dollar team of operators who know the ins and outs of these incredibly complex systems. It has to become more accessible and simpler. And we have an entire near-century at this point of watching abstractions get more and more and more complex and then collapsing down in this particular field. And I think that we're overdue for that correction in a lot of the modern infrastructure, tooling, and approaches that we take.


Jean: I agree. It hasn't happened yet in monitoring and observability. It's happened in coding, it's happened in infrastructure, it's happened in APIs, but all of that has made it so that it's easier to get into monitoring debt. And it just hasn't happened yet for anything that's more reactive and more about understanding what the system is that you have.


Corey: You mentioned specifically that your background was in programming languages. That's understating it slightly. You were a tenure-track professor of computer science at Carnegie Mellon before entering industry. How tied is what you're doing now at Akita to your area of academic specialty?


Jean: That's a great question, and there are two answers to that. The first is: very not tied. If it were tied, I would have stayed in my very cushy, highly [laugh] competitive job that I worked for years to get, to do stuff there. And so, like, what we're doing now comes out of thousands of conversations with developers and a desire to build on-the-ground tools. There's some technically interesting parts to it, for sure. I think that our technical innovation is our moat, but is it at the level of publishable papers? Publishable papers are a very narrow thing; I wouldn't be able to say yes to that question.


On the other hand, everything that I was trained to do was about identifying a problem and coming up with an out-of-the-box solution for it. And especially in programming languages research, it's really about abstractions. It's really about, you know, taking a set of patterns that you see of problems people have, coming up with the right abstractions to solve that problem, evaluating your solution, and then, you know, prototyping that out and building on top of it. And so, in that case, you know, we identified, hey, people have a huge gap when it comes to monitoring and observability. I framed it as an abstraction problem: how can we lift it up?


We saw APIs as a great level on which to build a new kind of solution. And our solution, it's innovative, but it also solves the problem. And to me, that's the most important thing. Our solution didn't need to be innovative. If you're operating in an academic setting, it's really about… producing a new idea. It doesn't actually [laugh]—I like to believe that all endeavors really have one main goal, and in academia, the main goal is producing something new. And to me, building a product is about solving a problem, and our main endeavor was really to solve a real problem here.


Corey: I think that it is, in many cases, useful when we start seeing a lot of, I guess, overflow back and forth between academia and industry, in both directions. I think that it is doing academia a disservice when you start looking at it purely as pure theory, and oh yeah, they don't deal with any of the vocational stuff. Conversely, I think the idea that industry doesn't have anything to learn from academia is dramatically misunderstanding the way the world works. The idea of watching some of that ebb and flow and crossover between them is neat to see.


Jean: Yeah, I agree. I think there's a lot of academics I super respect and admire who have done great things that are useful in industry. And it's really about, I think, what you want your main goal to be at the time. Do you want to be optimizing for new ideas or contributing, like, a full solution to a problem at the time? But there's a lot of overlap in the skills you need.


Corey: One last topic I'd like to dive into before we call it an episode is that there's an awful lot of hype around a variety of different things. And right now in this moment, AI seems to be one of those areas that is getting an awful lot of attention. It's clear, too, there's something of value there—unlike blockchain, which has struggled to identify anything that was not fraud as a value proposition for the last decade-and-a-half—but it's clear that AI is offering value already. You have recently, as of this recording, released an AI chatbot, which, okay, great. But what piques my interest is one, it's a dog, which… germane to my interest, by all means, and two, it is marketed as, and I quote, "Exceedingly polite."


Jean: [laugh].


Corey: Manners are important. Tell me about this pupper.


Jean: Yeah, this dog came really out of four or five days of one of our engineers experimenting with ChatGPT. So, for a little bit of background, I'll just say that I have been excited about this latest wave of AI since the beginning. So, I think at the very beginning, a lot of dev tools people were skeptical of GitHub Copilot; there was a lot of controversy around GitHub Copilot. I was very early. And I think all the Copilot people retweeted me because I was just their earlies—like, one of their earliest fans. I was like, "This is the coolest thing I've seen."


I'd actually spent the decade before making fun of AI-based [laugh] programming. But there were two things about GitHub Copilot that made my jaw drop. And that's related to your question. So, for a little bit of background, I did my PhD in a group focused on program synthesis. So, it was really about, how can we automatically generate programs from a variety of means? From constraints—


Corey: Like copying and pasting off a Stack Overflow, or—


Jean: Well, the—I mean, actually one of the projects my group did was literally applying machine learning to terabytes of other example programs to generate new programs. So, it was very similar to GitHub Copilot before GitHub Copilot. It was synthesizing API calls from analyzing terabytes of other API calls. And the thing that I had always been uncomfortable with about these machine-learning approaches in my group was, they were in the compiler loop. So, it was, you know, you wrote some code, the compiler did some AI, and then it spit back out some code that, you know, like, you just ran.


And so, that never sat well with me. I always said, "Well, I don't really see how this is going to be practical," because people can't just run random code that you basically got off the internet. And so, what really excited me about GitHub Copilot was the fact that it was in the editor loop. I was like, "Oh, my God."


Corey: It had the context. It was right there. You didn't have to go tabbing to something else.


Jean: Exactly.


Corey: Oh, yeah. I'm in the same boat. I think it is basically—I've seen the future unfolding before my eyes.


Jean: Yeah. It was the autocomplete thing. And to me, that was the missing piece. Because in your editor, you always read your code before you go off and—you know, like, you read your code, whoever code reviews your code reads your code. There's always at least, you know, two pairs of eyes, at least theoretically, reading your code.


So, that was one thing that was jaw-dropping to me. That was the revelation of Copilot. And then the other thing was that it was marketed not as, "We write your code for you," but the whole Copilot marketing was that, you know, it kind of helps you with boilerplate. And to me, I had been obsessed with this idea of how can you help developers write less boilerplate for years. And so, this AI-supported boilerplate copiloting was very exciting to me.


And I saw that as very much the beginning of a new era, where, yes, there's tons of data on how we should be programming. I mean, all of Akita is based on the fact that we should be mining all the data we have about how your system and your code is operating to help you do stuff better. And so, to me, you know, Copilot is very much in that same philosophy. But our AI chatbot is, you know, just the next step along this progression. Because for us, you know, we collect all this data about your API behavior; we have been using non-AI methods to analyze this data and show it to you.


And what ChatGPT allowed us to do in less than a week was analyze this data using very powerful large language models and have this conversational interface that gives you the opportunity to check over and follow up on the question, so that what we're spitting out as Aki the dog doesn't have to be a hundred percent correct. But to me, the fact that Aki is exceedingly polite and kind of goofy—he, you know, randomly woofs and says a lot of things about how he's a dog—it's the right level of seriousness so that it's not messaging, hey, this is the end-all, be-all, the way. You know, the compiler loop never sat well with me because I just felt deeply uncomfortable that an AI was having that level of authority in a system, but a friendly dog that shows up and tells you some things that you can ask some additional questions to, no one's going to take him that seriously. But if he says something useful, you're going to listen. And so, I was really excited about the way this was set up. Because I mean, I believe that AI should be a collaborator, and it should be a collaborator that you never take with full authority. And so, the chat and the politeness covered those two parts for me, both.
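[Editor's note: Akita hasn't published Aki's internals here; the toy sketch below only illustrates the general pattern Jean describes: summarize observed API behavior into a prompt and let a chat model answer follow-up questions in a persona that invites healthy skepticism. It uses the OpenAI Python client as it existed around this 2023 recording (the ChatCompletion API), and the endpoint summary is made-up example data.]

```python
import openai  # 0.27-era client; reads the OPENAI_API_KEY environment variable

# Illustrative only: a made-up summary of observed endpoint behavior.
endpoint_summary = (
    "GET /users/{id}: 500 calls, 12 ms median latency, 3% 5xx\n"
    "POST /orders: 120 calls, 40 ms median latency, 0% 5xx"
)

messages = [
    {
        "role": "system",
        "content": (
            "You are Aki, an exceedingly polite dog who helps developers read "
            "their API metrics. Woof occasionally, and hedge when unsure."
        ),
    },
    {
        "role": "user",
        "content": f"Here are my endpoints:\n{endpoint_summary}\n"
                   "Anything I should worry about?",
    },
]

resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(resp["choices"][0]["message"]["content"])
```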


Corey: Yeah, on some level, I can't shake the feeling that it's still very early days there for Chat-Gipity—yes, that's how I pronounce it—and its brethren as far as redefining, on some level, what's possible. I think that it's in many cases being overhyped, but it's solving an awful lot of the… the boilerplate, the stuff that is challenging. A question I have, though, is that, as a former professor, a concern that I have is when students are using this. It's less to do with the fact that they're taking shortcuts that weren't available to me and wanting to make them suffer, but rather, it's, on some level, if you use it to write your English papers, for example. Okay, great, it gets the boring essay you don't want to write out of the way, but the reason you write those things is it teaches you to form a story, to tell a narrative, to structure an argument, and I think that letting the computer do those things, on some level, has the potential to weaken us across the board. Where do you stand on it, given that you see both sides of that particular snake?


Jean: So, here's a devil's advocate sort of response to it: maybe the writing [laugh] was never the important part. And, as you say, telling the story was the important part. And so, what better way to distill that out than the prompt engineering piece of it? Because if you knew that you could always get someone to flesh out your story for you, then it really comes down to, you know, I want to tell a story with these five main points. And in some way, you could see this as a playing-field leveler.


You know, I think that as a—English is actually not my first language. I spent a lot of time editing my parents' writing for their work when I was a kid. And something I always felt really strongly about was not discriminating against people because they can't form sentences or they don't have the right idioms. And I actually spent a lot of time proofreading my friends' emails when I was in grad school for the non-native English speakers. And so, one way you could see this is, look, people who are not insiders now are on the same playing field. They just have to be clear thinkers.


Corey: That is a fascinating take. I think I'm going to have to—I'm going to have to ruminate on that one. I really want to thank you for taking the time to speak with me today about what you're up to. If people want to learn more, where's the best place for them to find you?


Jean: Well, I'm always on Twitter, still [laugh]. I'm @jeanqasaur—J-E-A-N-Q-A-S-A-U-R. And there's a chat dialog on akitasoftware.com. I [laugh] personally oversee a lot of that chat, so if you ever want to find me, that is a place, you know, where all messages will get back to me somehow.


Corey: And we will, of course, put a link to that into the [show notes 00:35:01]. Thank you so much for your time. I appreciate it.


Jean: Thank you, Corey.


Corey: Jean Yang, CEO at Akita Software. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that you will then, of course, proceed to copy to the other 17 podcast tools that you use, just like you do your observability monitoring suite.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
