DOP328: Five lessons from principal engineers on building reliable services

Published: Dec. 8, 2019, midnight

In this session, five Amazon principal engineers share hard-learned lessons from their experiences building reliable services at Amazon. Join Andrew Certain, Becky Weiss, Colm MacCarthaigh, David Yanacek, and Marc Brooker as they each share a personal story that highlights a current Amazon best practice. The engineers discuss how Amazon uses timeouts, how we think about back-offs and retries, our approach to taking dependencies, how we measure performance, and how we use shuffle sharding.