Best Practices in AWS Certificate Manager with Jonathan Kozolchyk

Published: July 6, 2023, 10 a.m.


Jonathan (Koz) Kozolchyk, General Manager for Certificate Services at AWS, joins Corey on Screaming in the Cloud to discuss the best practices he recommends around certificates. Jonathan walks through when and why he recommends private certs, and the use cases where he’d recommend longer or unusual expirations. Jonathan also highlights the importance of knowing who’s using what cert and why he believes in separating expiration from rotation. Corey and Jonathan also discuss their love of smart home devices as well as their security concerns around them and how they hope these concerns are addressed moving forward.


About Jonathan

Jonathan is General Manager of Certificate Services for AWS, leading the engineering, operations, and product management of AWS certificate offerings including AWS Certificate Manager (ACM), AWS Private CA, Code Signing, and Encryption in transit. Jonathan is an experienced leader of software organizations, with a focus on high availability distributed systems and PKI. Starting as an intern, he has built his career at Amazon, and has led development teams within our Consumer and AWS businesses, spanning from Fulfillment Center Software, Identity Services, Customer Protection Systems and Cryptography. Jonathan is passionate about building high performing teams, and working together to create solutions for our customers. He holds a BS in Computer Science from University of Illinois, and multiple patents for his work inventing for customers. When not at work you’ll find him with his wife and two kids or playing with hobbies that are hard to do well with limited upside, like roasting coffee.




Transcript


Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming.


My thanks as well to Sysdig for sponsoring this ridiculous podcast.



Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. As I record this, we are about a week and a half from re:Inforce in Anaheim, California. I am not attending, not out of any moral reason not to because I don’t believe in cloud security or conferences that Amazon has that are named after subject lines, but rather because I am going to be officiating a wedding on the other side of the world because I am an ordained minister of the Church of There Is A Problem With This Website’s Security Certificate. So today, my guest is going to be someone who’s a contributor, in many ways, to that religion, Jonathan Kozolchyk--but, you know, we all call him Koz--is the general manager for Certificate Services at AWS. Koz, thank you for joining me.



Koz: Happy to be here, Corey.



Corey: So, one of the nice things about ACM historically--the managed service that handles certificates from AWS--is that for anything public-facing, it’s free--which is always nice, you should not be doing upcharges for security--but you also don’t let people have the private portion of the cert. You control all of the endpoints that terminate SSL. Whereas when I terminate SSL myself, it terminates on the floor because I’ve dropped things here and there, which means that suddenly the world of people exposing things they shouldn’t or expiry concerns just largely seemed to melt away. What was the reason that Amazon looked around at the landscape and said, “Ah, we’re going to launch our own certificate service, but bear with me here, we’re not going to charge people money for it.” It seems a little bit out of character.



Koz: Well, Amazon itself has been battling with certificates for years, long before even AWS was a thing, and we learned that you have to automate. And even that’s not enough; you have to inspect and you have to audit, you need a controlled loop. And we learned that you need a closed loop to truly manage it and make sure that you don’t have outages. And so, when we built ACM, we built it saying, we need to provide that same functionality to our customers, that certificates should not be the thing that makes them go out. Is that we need to keep them available and we need to minimize the sharp edges customers have to deal with.



Corey: I somewhat recently caught some flak on one of the Twitter replacement social media sites for complaining about the user experience of expired SSL certs. Because on the one hand, if I go to my bank’s website, and the response is that instead, the server is sneakyhackerman.com, it has the exact same alert and failure mode as, holy crap, this certificate reached its expiry period 20 minutes ago. And from my perspective, one of those is a lot more serious than the other. What also I wind up encountering is not just when I’m doing banking, but when I’m trying to read some random blog on how to solve a technical problem. I’m not exactly putting personal information into the thing. It feels like that was a missed opportunity, agree or disagree?



Koz: Well, I wouldn’t categorize it as a missed opportunity. I think one of the things you have to think about with security is you have to keep it simple so that everyone, whether they’re a technologist or not, can abide by the rules and be safe. And so, it’s much easier to say to somebody, “There’s something wrong. Period. Stop.” versus saying there are degrees of wrongness. Now, that said, boy, do I wish we had originally built PKI and TLS such that you could submit multiple certificates to somebody, in a connection for example, so that you could always say, you know, my certificates can expire, but I’ve got two, and they’re off by six months, for example. Or do something so that you don’t have to fail closed because the certificate expired.



Corey: It feels like people don’t tend to think about what failure modes are going to look like. Because, pfhh, as an expired certificate? What kind of irresponsible buffoon would do such a thing? But I’ve worked in enough companies where you have historically, the wildcard cert because individual certs cost money, once upon a time. So, you wound up getting the one certificate that could work on all of the stuff that ends in the same domain.



And that was great, but then whenever it expired, you had to go through and find all the places that you put it and you always miss some, so things would break for a while and the corporate response was, “Ugh, that was awful. Instead of a one-year certificate, let’s get a five-year or a ten-year certificate this time.” And that doesn’t make the problem better; it makes it absolutely worse because now it proliferates forever. Everyone who knows where that thing lives is now long gone by the time it hits again. Counterintuitively, it seems the industry has largely been moving toward short-lived certs. Let’s Encrypt, for example, winds up rotating every 90 days, by my estimation. ACM is a year, if memory serves.



Koz: So, ACM certs are 13 months, and we start rotating them around the 11th month. And Let’s Encrypt offers you 90-day certs, but they don’t necessarily require you to rotate every 90 days; they expire in 90 days. My tip for everybody is divorce expiration from rotation. So, if your cert is a 90-day cert, rotate it at 45 days. If your cert is a year cert, give yourself a couple of months before expiration to start the rotation. And then you can alarm on it on your own timeline when something fails, and you still have time to fix it.
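As an illustration of the "divorce expiration from rotation" tip, here is a minimal Python sketch. The function names (`rotate_at`, `status`) and the default halfway fraction are assumptions made for this example, not anything ACM or Let's Encrypt exposes; the idea is only that the rotation deadline is computed separately from the expiry date, so an alarm can fire while there is still time to fix things.

```python
from datetime import datetime, timedelta, timezone


def rotate_at(not_before: datetime, not_after: datetime, fraction: float = 0.5) -> datetime:
    """Return when rotation should begin: a fixed fraction of the cert lifetime.

    fraction=0.5 reproduces the "90-day cert, rotate at 45 days" rule of thumb.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return not_before + (not_after - not_before) * fraction


def status(now: datetime, not_before: datetime, not_after: datetime,
           fraction: float = 0.5) -> str:
    """Classify a certificate so alarms fire on the rotation date, not the expiry date."""
    if now >= not_after:
        return "expired"
    if now >= rotate_at(not_before, not_after, fraction):
        # Still valid, but automation should already have replaced it.
        return "rotation-overdue"
    return "ok"


if __name__ == "__main__":
    issued = datetime(2023, 1, 1, tzinfo=timezone.utc)
    expires = issued + timedelta(days=90)
    print(rotate_at(issued, expires))
    print(status(issued + timedelta(days=50), issued, expires))
```

A year-long cert with "a couple of months" of slack would use a larger fraction, for example 10/12, rather than the halfway point.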



Corey: This makes a lot of sense in--you know, the second time because then you start remembering, okay, everywhere I use this cert, I need to start having alarms and alerts. And people are bad at these things. What ACM has done super well is that it removes that entire human from the loop because you control all of the endpoints. You folks have the ability to rotate it however often you’d like. You could have picked arbitrary timelines of huge amounts of time or small amounts of time and it would have been just fine.



I mean, you log into an EC2 instance role and I believe the credentials get passed out of either a 6 or a 12-hour validity window, and they’re consistently rotating on the back end and it’s completely invisible to the customer. Was there ever thought given to what that timeline should be, what that experience should be? Or did you just, like, throw a dart at a wall? Like, “Yeah, 13 months feels about right. We’re going to go with that.” And never revisited it. I have a guess which--



Koz: [laugh].



Corey: Side of that it was. Did you think at all about what you were doing at the time, or--yeah.



Koz: So, I will admit, this happened just before I got there. I got to ACM after--



Corey: Ah, blame the predecessor. Always a good call.



Koz: --the launch. It’s a God-given right to blame your predecessor.



Corey: Oh, absolutely. It’s their entire job.



Koz: I think they did a smart job here. What they did was they took the longest lifetime cert that was then allowed, at 13 months, knowing that we were going to automate the rotation and basically giving us as much time as possible to do it, right, without having to worry about scaling issues or having to rotate overly frequently. You know, there are customers who while I don’t--I strongly disagree with [pinning 00:07:35], for example, but there are customers out there who don’t like certs to change very often. I don’t recommend pinning at all, but I understand these cases are out there, and changing it once every year can be easier on customers than changing it every 20 minutes, for example. If I were to pick an ideal rotation time, it’d probably be under ten days because an OCSP response is good for ten days and if you rotate before, then I never have to update an OCSP response, for example. But changing that often would play havoc with many systems because of just the sheer frequency you’re rotating what is otherwise a perfectly valid certificate.



Corey: It is computationally expensive to generate certificates at scale, I would imagine.



Koz: It starts to be a problem. You’re definitely putting a lot of load on the HSMs at that point, [laugh] when you’re generating. You know, when you have millions of certs out in deployment, you’re generating quite a few at a time.



Corey: There is an aspect of your service that used to be part of ACM and now it’s its own service--which I think is probably the right move because it was confusing for a lot of customers--Amazon looks around and sees who can we compete with next, it feels like sometimes. And it seemed like you were squarely focused on competing against your most desperate of all enemies, my crappy USB key where I used to keep the private CA I used at any given job--at the time; I did not keep it after I left, to be very clear--for whatever I’m signing things for certificates for internal use. You’re, like, “Ah, we can have your crappy USB key as a service.” And sure enough, you wound up rolling that out. It seems like adoption has been relatively brisk on that, just because I see it in almost every client account I work with.



Koz: Yeah. So, you’re talking about the private CA offering which is--



Corey: I--that’s right. Private CA was the new service name. Yes, it used to be a private certificate authority was an aspect of ACM, and now you’re--mmm, we’re just going to move that off.



Koz: And we split it out because like you said customers got confused. They thought they had to only use it with ACM. They didn’t understand it was a full standalone service. And it was built as a standalone service; it was not built as part of ACM. You know, before we built it, we talked to customers, and I remember meeting with people running fairly large startups, saying, “Yes, please run this for me. I don’t know why, but I’ve got this piece of paper in my sock drawer that one of my security engineers gave me and said, ‘if something goes wrong with our CA, you and two other people have to give me this piece of paper.’” And others were like, “Oh, you have a piece of paper? I have a USB stick in my sock drawer.” And like, this is what, you know, the startup world was running their CAs from sock drawers as far as I can tell.



Corey: Yeah. A piece of paper? Someone wrote out the key by hand? That sounds like hell on earth.



Koz: [sigh]. It was a sharding technique where you needed, you know, three of five or something like that to--



Corey: Oh, they, uh, Shamir’s Secret Sharing Service.



Koz: Yes.



Corey: The SSSS. Yeah.



Koz: Yes. You know, and we looked at it. And the other alternative was people would use open-source or free certificate authorities, but without any of the security you’d want, like HSM backing, for example, because that gets really expensive. And so yeah, we did what our customers wanted: we built this service. We’ve been very happy with the growth it’s taken and, like you said, we love the places we’ve seen it. It’s gone into all kinds of different things, from the traditional enterprise use cases to IoT use cases. At one point, there’s a company that tracks sheep and every collar has one of our certs in it. And so, I am active in the sheep-tracking industry.



Corey: I am certain that some wit is going to comment on this. “Oh, there’s a company out there that tracks sheep. Yeah, it’s called Apple,” or Facebook, or whatever crappy… whatever axe someone has to grind against any particular big company. But you’re talking actual sheep as in baa, smell bad, count them when going to sleep?



Koz: Yes. Actual sheep.



Corey: Excellent, excellent.



Koz: The certs are in drones, they’re in smart homes, so they’re everywhere now.



Corey: That is something I want to ask you about because I found that as a competition going on between your service, ACM because you won’t give me the private keys for reasons that we already talked about, and Let’s Encrypt. It feels like you two are both competing to not take my money, which is, you know, an odd sort of competition. You’re not actually competing, you’re both working for a secure internet in different ways, but I wind up getting certificates made automatically for me for all of my internal stuff using Let’s Encrypt, and with publicly resolvable domain names. Why would someone want a private CA instead of an option that, okay, yeah, we’re only using it internally, but there is public validity to the certificate?



Koz: Sure. And just because I have to nitpick, I wouldn’t say we’re competing with them. I personally love Let’s Encrypt; I use them at home, too. Amazon supports them financially; we give them resources. I think they’re great. I think--you know, as long as you’re getting certs I’m happy. The world is encrypted and I--people use private CA because fundamentally, before you get to the encryption, you need secure identity. And a certificate provides identity. And so, Let’s Encrypt is great if you have a publicly accessible DNS endpoint that you can prove you own and get a certificate for and you’re willing to update it within their 90-day windows. Let’s use the sheep example. The sheep don’t have publicly valid DNS endpoints and so--



Corey: Or to be very direct with you, they also tend to not have terrific operational practices around updating their own certificates.



Koz: Right. Same with drones, same with internal corporate. You may not want your DNS exposed to the internet, your internal sites. And so, you use a private certificate where you own both sides of the connection, right, where you can say--because you can put the CA in the trust store and then that gets you out of having to be compliant with the CA/Browser Forum and the WebTrust rules. A lot of the CA/Browser Forum dictates what a public certificate can and can’t do and the rules around that, and those are built very much around the idea of a browser connecting to a client and protecting that user.



Corey: And most people are not banking on a sheep.



Koz: Most people are not banking on a sheep, yes. But if you have, for example, a database that requires a restart to pick up a new cert, you’re not going to want to redo that every 90 days. You’re probably going to be fine with a five-year certificate on that because you want to minimize your downtime. Same goes with a lot of these IoT devices, right? You may want a thousand-year cert or a hundred-year cert or cert that doesn’t expire because this is a cert that happens at--that is generated at creation for the device. And it’s at birth, the machine is manufactured and it gets a certificate and you want it to live for the life of that device.



Or you have super-secret-project.internal.mycompany.com and you don’t want a publicly visible cert for that because you’re not ready to launch it, and so you’ll start with a private cert. Really, my advice to customers is, if you own both pieces of the connection, you know, if you have an API that gets called by a client you own, you’re almost always better off with a private certificate and managing that trust store yourself because then you are subject not to other people’s rules, but the rules that fit the security model and the threat assessment you’ve done.



Corey: For the publication system for my newsletter, when I was building it out, I wanted to use client certificates as a way of authenticating that it was me. Because I only have a small number of devices that need to talk to this thing; other people don’t, so how do I submit things into my queue and manage it? And back in those ancient days, the API Gateways didn’t support TLS authentication. Now, they do. I would redo it a bunch of different ways. They did support API key as an authentication mechanism, but the documentation back then was so terrible, or I was so new to this stuff, I didn’t realize what it was and introduced it myself from first principles where there’s a hard-coded UUID, and as long as there’s the right header with that UUID, I accept it, otherwise drop it on the floor. Which… there are probably better ways to do that.



Koz: Sure. Certificates are, you know, a very popular way to handle that situation because they provide that secure identity, right? You can be assured that the thing connecting to you can prove it is who they say they are. And that’s a great use of a private CA.
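For readers who want to see what client-certificate authentication against a private CA looks like in practice, here is a minimal sketch using Python's standard `ssl` module. The function name and the file-path parameters are hypothetical; the paths would point at your own private CA bundle and server key pair. This is one way to do it under those assumptions, not how API Gateway or ACM implements it.

```python
import ssl
from typing import Optional


def make_mtls_server_context(ca_file: Optional[str] = None,
                             cert_file: Optional[str] = None,
                             key_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that rejects clients without a valid cert.

    ca_file   -- PEM bundle of the private CA that signs client certs (hypothetical path)
    cert_file -- this server's own certificate chain (hypothetical path)
    key_file  -- this server's private key (hypothetical path)
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

Because only the private CA is loaded as a trust anchor, a certificate from any public CA is rejected, which is the "you own both sides of the connection" property discussed above.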



Corey: Changing gears slightly. As we record this, we are about two weeks before re:Inforce, but I will be off doing my own thing on that day. Anything interesting and exciting coming out of your group that’s going to be announced, with the proviso, of course, that this will not air until after re:Inforce.



Koz: Yes. So, we are going to be pre-announcing the launch of a connector for Active Directory. So, you will be able to tie your private CA instance to your Active Directory tree and use private CA to issue certificates for use by Active Directory for all of your Windows hosts for the users in that Active Directory tree.



Corey: It has been many years since I touched Windows in anger, but in 2003 or so, I was a mediocre Small Business Windows Server Admin. Doesn’t Active Directory have a private CA built into it by default for whenever you’re creating a new directory?



Koz: It does.



Corey: Is that one of the FSMO roles? I’m trying to remember offhand.



Koz: What’s a Fimal?



Corey: FSMO. F-S-M-O. There are--I forget, it’s some trivia question that people love to haze each other with in Microsoft interviews. “What are the seven FSMO roles?” At least back then. And have to be moved before you decommission a domain controller or you’re going to have tears before bedtime.



Koz: Ah. Yeah, so Microsoft provides a certificate authority for use with Active Directory. They’ve had it for years and they had to provide it because back then nobody had a certificate authority, but AD needed one. The difference here is we manage it for you. And it’s backed by HSMs. We ensure that the keys are kept secure. It’s a serverless connection to your Active Directory tree, you don’t have to run any software of ours on your hosts. We take care of all of it.



And it’s been the top request from customers for years now. It’s been quite [laugh] a bit of effort to build it, but we think customers are going to love it because they’re going to get all the security and best practices from private CA that they’re used to and they can decommission their on-prem certificate authority and not have to go through the hassle of running it.



Corey: A big area where I see a lot of private CA work has been in the realm of desktops for corporate environments because when you can pass out your custom trusted root or trusted CA to all of the various nodes you have and can control them, it becomes a lot easier. I always tended to shy away from it, just because in small businesses like the one that I own, I don’t want to play corporate IT guy more than I absolutely have to.



Koz: Yeah. Trust store management is always a painful part of PKI. As if there weren’t enough painful things in PKI. Trust store management is yet another one. Thankfully, in the large enterprises, there is good tooling out there to help you manage it for the corporate desktops and things like that.



And with private CA, you can also, if you already have an offline root that is in all of your trust stores in your enterprise, you can cross-sign the root that we give you from private CA into that hierarchy. And so, then you don’t have to distribute a new trust store out if you don’t want to.



Corey: This is a tricky release and I’m very glad I’m taking the week off when it’s getting announced because there are two reactions that are going to happen to any snarking I can do about this. The first is no one knows what the hell this is and doesn’t have any context for the rest, and the other folks are going to be, “Yes, shut up clown. This is going to change my workflow in amazing ways. I’ll deal with your nonsense later. I want to do this.” And I feel like one of those constituencies is very much your target market and the other isn’t. Which is fine. No service that AWS offers--except the bill--is for every customer, but every service is for someone.



Koz: That’s right. We’ve heard from a lot of our customers, especially as they--you know, the large international ones, right, they find themselves running separate Active Directory CAs in different countries because they have different regulatory requirements and separations that they want to do. They are chomping at the bit to get this functionality because we make it so easy to run a private CA in these different regions. There’s certainly going to be that segment at re:Inforce, that’s just happy certificates happen in the background and they don’t think anything about where they come from and this won’t resonate with them, but I assure you, for every one of them, they have a colleague somewhere else in the building that is going to do a happy dance when this launches because there’s a great deal of customer heavy-lifting and just sharp edges that we’re taking away from them. And we’ll manage it for them, and they’re going to love it.


[midroll 0:21:08]



Corey: One thing that I have seen the industry shift to that I love is the Let’s Encrypt model, where the certificate expires after 90 days. And I love that window because it is a quarter, which means yes, you can do the crappy thing and have a calendar reminder to renew the thing. It’s not something you have to do every week, so you will still do it, but you’re also not going to love it. It’s just enough friction to inspire people to automate these things. And that I think is the real win.



There’s a bunch of things like Certbot, I believe the protocol is called ACME A-C-M-E, always in caps, which usually means an acronym or someone has their caps lock key pressed--which is of course cruise control for cool. But that entire idea of being able to have a back-and-forth authentication pass and renew certificates on a schedule, it’s transformative.



Koz: I agree. ACM, even Amazon before ACM, we’ve always believed that automation is the way out of a lot of this pain. As you said earlier, moving from a one-year cert to a five-year cert doesn’t buy you anything other than you lose even more institutional knowledge when your cert expires. You know, I think that the move to further automation is great. I think ACME is a great first step.



One of the things we’ve learned is that we really do need a closed loop of monitoring to go with certificate issuance. So, at Amazon, for example, every cert that we issue, we also track and the endpoints emit metrics that tell us what cert they’re using. And it’s not what’s on disk, it’s what’s actually in the endpoint and what they’re serving from memory. And we know because we control every cert issued within the company, every cert that’s in use, and if we see a cert in use that, for example, isn’t the latest one we issued, we can send an alert to the team that’s running it. Or if we’ve issued a cert and we don’t see it in use, we see the old ones still in use, we can send them an alert, they can alarm and they can see that, oh, we need to do something because our automation failed in this case.



And so, I think ACME is great. I think the push Let’s Encrypt did to say, “We’re going to give you a free certificate, but it’s going to be short-lived so you have to automate,” that’s a powerful carrot and stick combination they have going, and I think for many customers Certbot’s enough. But you’ll see even with ACM where we manage it for our customers, we have that closed loop internally as well to make sure that the cert when we issue a new cert to our client, you know, to the partner team, that it does get picked up and it does get loaded. Because issuing you a cert isn’t enough; we have to make sure that you’re actually using the new certificate.
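The closed loop Koz describes, checking what an endpoint is actually serving rather than what is on disk, can be sketched roughly as follows in Python. This is not Amazon's internal tooling; the function names and the "ok"/"stale" labels are invented for the example, and it assumes you have recorded the SHA-256 fingerprint of the most recently issued certificate somewhere.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_served_cert(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Return the DER bytes of the leaf cert the endpoint serves right now.

    Uses the default verifying context, so an already-expired or untrusted
    cert raises ssl.SSLError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def check_rollout(served_der: bytes, latest_issued_fp: str) -> str:
    """Compare what is being served against the latest cert that was issued."""
    if fingerprint(served_der) == latest_issued_fp:
        return "ok"
    # A newer cert was issued but the endpoint has not loaded it: alert the owner.
    return "stale"
```

A monitor would call `fetch_served_cert` on a schedule and alert the owning team whenever `check_rollout` returns "stale" for longer than the expected deployment window.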



Corey: I also have learned as a result of this, for example, that AWS Certificate Manager--Amazon Certificate Manager, the ACM, the certificate thingy that you run, so many names, so many acronyms. It’s great--but it has a limit--by default--of 2500 certificates. And I know this because I smacked into it. Why? I wasn’t sitting there clicking and adding that many certificates, but I had a delightful step function pattern called ‘The Lambda invokes itself.’ And you can exhaust an awful lot of resources that way because I am bad at programming. That is why for safety, I always recommend that you iterate development-wise in an account that is not production, and preferably one that belongs to someone else.



Koz: [laugh]. We do have limits on cert issuance.



Corey: You have limits on everything in AWS. As it should because it turns out that wherever there’s not a limit, A, free database just dropped, and B, things get hammered to death. You have to harden these things. And it’s one of those things that’s obvious once you’ve operated at a certain point of scale, but until you do, it just feels arbitrary and capricious. It’s one of those things where I think Amazon is still--and all the cloud companies who do this--are misunderstood.



Koz: Yeah. So, in the case of the ACM limits, we look at them fairly regularly. Right now, they’re high enough that most of our customers, vast majority, never come close to hitting it. And the ones that do tend to go way over.



Corey: And it’s been a mistake, as in my case as well. This was not a complaint, incidentally. It was like, well, I want to wind up having more waste and more ridiculous nonsense. It was not my concern.



Koz: No no no, but we do, for those customers who have not mistake use cases but actual use cases where they need more, we’re happy to work with their account teams and with the customer and we can up those limits.



Corey: I’ve always found that limit increases, with remarkably few exceptions, the process is, “Explain to you what your use case is here.” And I feel like that is a screen for, first, are you doing something horrifying for which there’s a better solution? And two, it almost feels like it’s a bit of a customer research approach where this is fine for most customers. What are you folks doing over there and is there a use case we haven’t accounted for in how we use the service?



Koz: I always find we learn something when we look at the [P100 00:26:05] accounts that use the most certificates, and how they’re operating.



Corey: Every time I think I’ve seen it all on AWS, I just talk to one more customer, and it’s back to school I go.



Koz: Yep. And I thank them for that education.



Corey: Oh, yeah. That is the best part of working with customers and honestly being privileged enough to work with some of these things and talk to the people who are building really neat stuff. I’m just kibitzing from the sideline most of the time.



Koz: Yeah.



Corey: So, one last topic I want to get into before we call it a show. You and I have been talking a fair bit, out of school, for lack of a better term, around a couple of shared interests. The one more germane to this is home automation, which is always great because especially in a married situation, at least as I am and I know you are as well, there’s one partner who is really into home automation and the other partner finds himself living in a haunted house.



Koz: [laugh]. I knew I had won that battle when my wife was on a work trip and she was in a hotel and she was talking to me on the phone and she realized she had to get out of bed to turn the lights off because she didn’t have our Alexa Good Night routine available to her to turn all the lights off and let her go to bed. And so, she is my core customer when I do the home automation stuff. And definitely make sure my use cases and my automations work for her. But yeah, I’m… I love that space.



Coincidentally, it overlaps with my work life quite a bit because identity in smart home is a challenge. We’re really excited about the Matter standard. For those listening who aren’t sure what that is, it’s a new end-all be-all smart home standard for defining devices in a protocol-independent way that lets your hubs talk to devices without needing drivers from each company to interact with them. And one of the things I love about it is every device needs a certificate to identify it. And so, private CA has been a great partner with Matter, you know, it goes well with it.



In fact, we\\u2019re one of the leading certificate authorities for Matter devices. Customers love the pricing and the way they can get started without talking to anybody. So yeah, I\\u2019m excited to see, you know, as a smart home junkie and as a PKI guy, I\\u2019m excited to see Matter take off. Right now I have a huge amalgamation of smart home devices at home and seeing them all go to Matter will be wonderful.



Corey: Oh, it\\u2019s fantastic. I am a little worried about aspects of this, though, where you have things that get access to the internet and then act as a bridge. So suddenly, like, I have an IoT subnet with some controls on it for obvious reasons, and honestly, one of the things I despise the most in this world has been the rise of smart TVs because I just want you to be a big dumb screen. \\u201cWell, how are you going to watch your movies?\\u201d \\u201cWith the Apple TV I\\u2019ve plugged into the thing. I just want you to be a screen. That\\u2019s it.\\u201d So, I live a bit in fear of the day where these things find alternate ways to talk to the internet and, you know, report on what I\\u2019m watching.



Koz: Yeah, I think Matter is going to help a lot with this because it\\u2019s focused on local control. And so, you\\u2019ll have to trust your hub, whether that\\u2019s your TV or your Echo device or what have you, but they all communicate securely amongst themselves. They use certificates for identification, and they\\u2019re building into Matter a robust revocation mechanism. You know, in my case at home, my TV\\u2019s not connected to the internet because I use my Fire TV to talk to it, similar to your Apple TV situation. I want a device I control, not my TV, doing it. I\\u2019m happy with the big dumb screen.



And I think, you know, what you\\u2019re going to end up doing is saying there\\u2019s a device out there you\\u2019ll trust maybe more than others and say, \\u201cThat\\u2019s what I\\u2019m going to use as my hub for my Matter devices and that\\u2019s what will speak to the internet,\\u201d and otherwise my Matter devices will talk directly to my hub.



Corey: Yeah, there\\u2019s very much a spectrum of trust. There\\u2019s the, this is a Linux distribution on a computer that I installed myself and vetted and wound up contributing to at one point on the one end of the spectrum, and the other end of the spectrum of things you trust the absolute least in this world, which are, of course, printers. And most things fall somewhere in between.



Koz: Yes, right now, it is a Wild West of rebranded white-label applications, right? You have all kinds of companies spitting out reference designs as products and white labeling the control app for it. And so, your phone starts collecting these smart home applications to control each one of these things because you buy different switches from different people. I\\u2019m looking forward to Matter collapsing that all down to having one application and one control model for all of the smart home devices.



Corey: Wemo explicitly stated that they\\u2019re not going to be pursuing this because it doesn\\u2019t let them differentiate the experience. Read as, cash grab. I also found out that Wemo\\u2014which is, of course, a Belkin subsidiary\\u2014had a critical vulnerability in some of the light switches it offered, including the one built into the wall in this room\\u2014until a week ago\\u2014where they\\u2019re not going to be releasing a patch for it because those are end-of-life. Really? Because I log into the Wemo app and the only way I would have known this has been the fact that it\\u2019s been a suspiciously long time since there was a firmware update available for it. But that\\u2019s it. Like, the only way I found this out was via a security advisory, at which point that got ripped out of the wall and replaced with something that isn\\u2019t, you know, horrifying. But man did that bother me.



Koz: Yeah. I think this is still an open issue for the smart home world.



Corey: Every company wants a moat of some sort, but I don\\u2019t want 15 different apps to manage this stuff. You turned me on to Home Assistant, which is an open-source, home control automation system and, on some level, the interface is very clearly built by a bunch of open-source people\\u2014good for them; they could benefit from a graphic designer or three to\\u2014or user experience person to tie it all together, but once you wrap your head around it, it works really well, where I have automations that let me do different things. They even have an Apple Watch app [without its 00:32:14] complications on it. So, I can tap the thing and turn on the lights in my office to different levels if I don\\u2019t want to talk to the robot that runs my house. And because my daughter has started getting very deeply absorbed into some YouTube videos from time to time, after the third time I asked her what\\u2014I call her name, I tap a different one and the internet dies to her iPad specifically, and I wait about 30 to 45 seconds, and she\\u2019ll find me immediately.



Koz: That\\u2019s an amazing automation. I love Home Assistant. It\\u2019s certainly more technical than I could give to my parents, for example, right now. I think things like Matter are going to bring a lot of that functionality to the easier-to-use hubs. And I think Home Assistant will get better over time as well.



I think the only way to deal with these devices that are going to end-of-life and stop getting support is to have them be local control only, and so then it\\u2019s your hub that keeps getting support and that\\u2019s what talks to the internet. And so, you don\\u2019t\\u2014you know, if there\\u2019s a vulnerability in the TCP stack, for example, in your light switch, but your light switch only talks to the hub and isn\\u2019t allowed to talk to anything else, how severe is that? I don\\u2019t think it\\u2019s so bad. Certainly, I wall off all of my IoT devices so that they don\\u2019t talk to the rest of my network, but now you\\u2019re getting a fairly complicated networking\\u2026 mojo that listeners to your podcast, I\\u2019m sure, are capable of, but many people aren\\u2019t.



Corey: I had something that did something very similar and then I had to remove a lot of those restrictions to try to diagnose a phantom issue that, it appears, was an unreported bug in the wireless AP when you use its second ethernet port as a bridge, where things would intermittently not be able to cross VLANs when passing through that. As in, the initial host key exchange for SSH would work and then it would stall and resets on both sides and it was a disaster. It was, what is going on here? And the answer was it was haunted. So, a small architecture change later, and the problem has not recurred. I need to reapply those restrictions.



Koz: I mean, these are the kinds of things that just make me want to live in a shack in the woods, right? Like, I don\\u2019t know how you manage something like that. Like, these are just pain points all over. I think over time, they\\u2019ll get better, but until then, that shack in the woods with not even running water sounds pretty appealing.



Corey: Yeah, at some level, with smart lights, for example, one of the best approaches that all the manufacturers I\\u2019ve seen have taken is that it still works exactly as you would expect when you hit the light switch on the wall, because that\\u2019s something that you really need to make work or, it turns out, those of us who don\\u2019t live alone will not be allowed to smart home things anymore.



Koz: Exactly. I don\\u2019t have any smart bulbs in my house. They\\u2019re all smart switches because I don\\u2019t want to have to put tape over something and say, \\u201cDon\\u2019t hit that switch.\\u201d And then watch one of my family members pull the tape off and hit the switch anyways.



Corey: I have floor lamps with smart bulbs in them, but I wind up treating them all as one device. And I mean, I\\u2019ve taken the switch out from the root because it\\u2019s, like, too many things to wind up slicing and dicing. But yeah, there\\u2019s a scaling problem because right now a lot of this stuff, because Matter is not quite there, all winds up using either Zigbee\\u2014which is fine; I have no problem with that; it feels like it\\u2019s becoming Matter quickly\\u2014or WiFi. And there is an upper bound to how many devices you want or can have on some fairly limited frequency.



Koz: Yeah. I think this is still something that needs to be resolved. You know, I\\u2019ve got hundreds of devices in my house. Thankfully, most of them are not WiFi or Zigbee. But I think we\\u2019re going to see this evolve over time and I\\u2019m excited for it.



Corey: I was talking to someone where I was explaining that, well, how this stuff works. Like, \\u201cWell, how many devices could you possibly have on your home network?\\u201d And at the time it was about 70 or 80. And they just stared at me for the longest time. I mean, it used to be that I could name all the computers in my house. I can no longer do that.



Koz: Sure. Well, I mean, every light switch ends up being a computer.



Corey: And that\\u2019s the weirdest thing is that it\\u2019s, I\\u2019m used to computers being a thing that requires maintenance and care and feeding and security patches and\\u2014yes, relevant to your work\\u2014an SSL certificate. It\\u2019s like, so what does all of that fancy wizardry do? Well, when it receives a signal, it completes a circuit. The end. And it\\u2019s, are we really better off for some of these things? There are days we wonder.



Koz: Well, my light bill, my electric bill, is definitely better off having these smart switches because nobody in my house seems to know how to turn a light switch off. And so, having the house do it itself helps quite a bit.



Corey: To be very clear, I would skewer you if you worked on an AWS service that actually charged money for anything for what you just said about the complaining about light bills and optimizing light bills and the rest\\u2014



Koz: [laugh].



Corey: \\u2014but I\\u2019ve never had to optimize your service\\u2019s certificate bill beca\\u2014after you\\u2019ve spun off the one thing that charges\\u2014because you can\\u2019t cost optimize free, as it turns out, and I\\u2019ve yet to find a way to the one optimization possible where now you start paying customers money. I\\u2019m sure there\\u2019s a way to do that somewhere but damned if I can find it.



Koz: Well, if you find a way to optimize free, please let me know and I\\u2019ll share it with all of our customers.



Corey: [laugh]. Isn\\u2019t that the truth? I really want to thank you for taking the time to speak with me today. If people want to learn more, where\\u2019s the best place for them to find you?



Koz: I can give you the standard AWS answer.



Corey: Yeah, www.aws.com. Yeah.



Koz: Well, I would have said koz@amazon.com. I\\u2019m always happy to talk about certs and PKI. I find myself less active on social media lately. You can find me, I guess, on Twitter as @seakoz and on Bluesky as [kozolchyk.com 00:38:03].



Corey: And we will put links to all of that in the [show notes 00:38:06]. Thank you so much for being so generous with your time. I appreciate it.



Koz: Always happy, Corey.



Corey: Jonathan Kozolchyk, or Koz as we all call him, general manager for Certificate Services at AWS. I\\u2019m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you\\u2019ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you\\u2019ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that then will fail to post because your podcast platform of choice has an expired security certificate.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

'