Getting started with AWS: Cloud demystification 101
Getting started with Amazon Web Services (AWS) can be daunting, to say the least. So much power. So much complexity. Trying to get your bearings at the beginning can feel like being lost, well, in a cloud.
✈️ The three Cs. In fully exhausting this metaphor, we look to the U.S. Navy flight manual. When lost, the manual instructs pilots to climb, conserve, and confess. So let’s climb to 30,000 feet in hopes of getting a better vantage point from above it all. We’ll conserve time, our most precious fuel resource, by only touching the surface on a wide array of technologies. And let’s confess that when it comes to AWS, we’re all a little lost.
Why the Cloud?
Back in the day, there was a massive barrier to entry for running hosted software like data-driven websites and application backends. It took significant financial and people resources to deal with racking and stacking and powering and configuring and securing and managing physical servers. Procurement, redundancy, capacity to scale, etc. all required your own physical space, computing resources, suitcases of money, lots of guesswork, and countless manual steps by human beings.
Few of these concerns have gone away. The matrix is still powered by servers, storage, databases, networking, and applications. But, but, but … the responsibility of ownership has been delegated to cloud providers, the cost has been distributed over an economy of scale, and the manual efforts have been largely automated into repeatable, programmable systems. You just provision and use what you need.
The benefits are many, but we’ll highlight a few:
- Iterate with ease. With flexible infrastructure that can be spun up and torn down with ease, experimentation is inexpensive and low risk.
- On-demand. Entire environments can be spun up using instruction sets and dynamically in response to real-time demand.
- Repeatable. Your technology infrastructure can be defined and deployed via software-defined recipes.
- Distributed. You can be global in minutes and ensure resources are kept close to customers.
- Pay as you go in real-time. Add and shed capacity to match real-time demand instead of forecasting and investing in things you might not need.
- Massive economies of scale spread across millions of accounts for things like purchasing volume and R&D costs.
- Agility. Work is freed from the constraints of capacity.
- Focus. You need to minimize the amount of time spent on non-product differentiators.
- Security. You know, as much as is possible. ¯\_(ツ)_/¯
Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) have emerged as the Big 3 of cloud service providers. Like with anything, they all have strengths and weaknesses. We’ll dig into those key differences in future posts, but one difference is clear. Amazon has more market share than the other two combined, so that’s where we’re choosing to begin our cloud explorations series. To put the magnitude of the economics alone into context, AWS hauled in nearly double the revenue of McDonald’s in 2019! Read that again and let it sink in.
The Big Picture
Part of what is so damn intimidating about AWS is all of the new vocabulary and the confusingly similar array of options – there are currently over 90 different services!! EC2, EBS, EFS, ELB, ALB, NLB, ECS, ECR, EKS … 🤯 It’s so very easy to get lost in the acronyms and complexities and totally miss the bigger picture of how and why it all fits together. For the remainder of this exploration, we’ll go from high-level concepts to some of the specific for-instances that fall into each of the following buckets:
- ⌨️ Access. Options for interacting with AWS.
- 🌎 Geography. Where the resources live and how that can impact performance, redundancy, privacy concerns, etc.
- 🧮 Computing. On-demand access to machines and computation time.
- 💾 File storage. Options for storing flat files.
- 🔑 Database services. Relational databases and key-value stores.
- 💬 Networking. How your systems talk and expose themselves to the outside.
- ⚡️ Caching. Improve performance of web applications and move static content closer to end-users.
- 📈 Scaling and load balancing. Dynamically smart sizing resources and distributing traffic to match demand and availability.
- 👀 Monitoring. System monitoring, alerting, and response.
- 🤖 Automated cloud formation. Software-defined virtual networks.
- 🐳 Container orchestration. Deployment, management, scaling, and networking of containers.
- 👮🏼♀️ Security. Authentication, authorization, and shared responsibility.
- 💵 Pricing. Models and calculations.
- 📋 Certification. Learning paths and resume badges.
Accessing Amazon Web Services
From the perspective of a developer, there are three ways to interact with AWS.
- AWS CLI. Command-line interface for interacting with AWS services. Can be run on Linux, macOS, and Windows. As close to the metal as you can get.
- AWS Management Console. This is a web-based, graphical management interface that sits on top of the same APIs that can be called from the command line, programmatic scripts, SDKs, etc.
- AWS SDKs. Language-specific libraries that wrap those same APIs so they can be called from your own application code.
The cloud abstracts away a lot of detail, but we still have to be thoughtful about where our resources will live. Amazon logically and physically separates infrastructure into Regions, which are in turn composed of Availability Zones (AZs).
- Regions. A region is one of 22 physical locations across the globe (as of this writing). All regions are designed to be isolated from one another for purposes of fault tolerance and stability. You select regions based on factors like data residency requirements (GDPR, privacy, etc.), proximity to customers (the closer, the better), service availability (not all services are available in all regions), and cost (the price is not the same everywhere). Examples include us-east-1 in Northern Virginia and us-west-2 in Oregon.
- Availability Zones (AZ). An Availability Zone is a cluster of data centers, all connected via encrypted, low-latency links. A region is made up of two or more AZs, each with its own redundant power, networking, and connectivity. They support applications that are more available, fault-tolerant, and scalable than would be possible via a single data center. AZs are represented via their region and a letter identifier like us-east-1a.
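To make the naming convention concrete, here is a minimal sketch that splits an AZ name back into its region and letter. The regular expression is an assumption based on the examples above (names shaped like us-east-1a), and the helper names are invented for illustration; it is not an official parser.

```python
import re

# Assumed shape: a region name such as "us-east-1" followed by one lowercase letter.
AZ_PATTERN = re.compile(r"^(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+)(?P<zone>[a-z])$")


def split_availability_zone(az_name):
    """Return (region, zone_letter) for a name like 'us-east-1a'."""
    match = AZ_PATTERN.match(az_name)
    if match is None:
        raise ValueError(f"not a recognized AZ name: {az_name!r}")
    return match.group("region"), match.group("zone")


def group_by_region(az_names):
    """Group zone letters under the region they belong to."""
    grouped = {}
    for name in az_names:
        region, zone = split_availability_zone(name)
        grouped.setdefault(region, []).append(zone)
    return grouped


print(group_by_region(["us-east-1a", "us-east-1b", "us-west-2a"]))
```

Spreading instances across the letters within one region is what buys you the multi-zone fault tolerance described above.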
Within the context of geography, it also makes sense to mention reaching distant customers. Since you’re choosing where your infrastructure lives, inevitably some customers will be further away and would otherwise deal with increased latency. This is solved via the use of edge locations and point of presence caching with Amazon’s content distribution network. We’ll dive deeper into this in the section below on caching.
On-demand computing is at the core of the cloud. Nearly every app you touch and site you access is running on or communicating with a virtual machine instance running Linux or Windows. Anything you can run on a physical server sitting next to you can be done in the cloud – application servers, web servers, database servers, game servers, mail servers, media servers, etc. Amazon’s offering is called Elastic Compute Cloud (EC2) and it offers many benefits.
- Elastic. It’s in the name – build up or shed capacity (and cost) on demand.
- Configurable. Compute capacity, memory, storage, etc. options can all be tuned to workloads, with sizing and capacity that best meet needs and cost objectives. There is a wide selection of instance types optimized for different use cases.
- Controllable. Instances can be spun up or down based on explicit command, automation, or triggered conditions.
- Integrated. Instances are tightly integrated with the AWS stack.
- Unopinionated. You can boot machine images on top of bare metal, prepare your own images for re-use, or leverage a marketplace of prepared images.
AWS Lambda. While a major topic of its own, it also makes sense to talk about serverless in the context of computing. With Lambda, you can run code without explicitly setting up servers. It allows you to execute code when certain events trigger. There is support for many languages and you only have to pay for actual compute time. An example use case would be generating thumbnails and crops whenever a user uploads a photo – this is not a service you need to have actively running 24-7; it’s an isolated function that needs to execute on-demand.
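As a rough sketch of the thumbnail use case, a Python Lambda function is just a handler that receives the triggering event. The example below assumes the event follows the S3 upload notification shape (a list of Records, each carrying a bucket name and object key); the `thumbnails/` prefix is a made-up convention, and the actual image resizing is left out so the sketch stays self-contained.

```python
from urllib.parse import unquote_plus

THUMBNAIL_PREFIX = "thumbnails/"  # hypothetical output location, not an AWS convention


def lambda_handler(event, context):
    """Entry point invoked once per upload notification."""
    planned = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(THUMBNAIL_PREFIX):
            continue  # ignore our own output so the function does not retrigger itself
        # A real function would download the image, resize it, and upload the
        # result here. This sketch only reports what it would write.
        planned.append(
            {"bucket": bucket, "source": key, "thumbnail": THUMBNAIL_PREFIX + key}
        )
    return {"planned": planned}
```

Nothing here runs until an upload happens, which is the whole point: there is no server idling between photos.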
Where oh where to store all of your stuff. AWS provides three broad options for file storage – virtual disks (EBS and EFS), logical buckets (S3), and archival (S3 Glacier).
- Elastic Block Store (EBS). Raw, unformatted block storage devices that you can attach to a server and use for data files, databases, or plain old file systems. The drives are networked to your server, rather than being physically connected. The drives are programmatically scalable and are automatically replicated within their Availability Zone, but a volume is generally attached to a single EC2 instance in that same AZ. Solid-state (SSD) and standard disk drives (HDD) are available.
- Elastic File System (EFS). Cloud native, elastic file system for Linux that grows and shrinks as you add or remove files. Designed for larger amounts of data than can be stored on a single EC2 instance (like large analytic workloads). It’s more expensive than EBS, but does allow you to mount the file system across multiple Availability Zones and instances within a region.
- Simple Storage Service (S3) is the hard drive of the Internet. It’s a flat object store (no directories), with objects stored in named buckets and identified by string-based keys. The storage is unlimited, with individual objects up to 5TB. It’s fast and designed for eleven 9s (99.999999999%) of durability. Access can be controlled at the bucket and object level. It can be used for virtually any storage need including backups, media hosting, software delivery, data files, data lakes, and data pools. It’s even possible to use it as a content serving backbone of a website, serving up HTML, CSS, and images directly from S3 without threading through Apache or IIS. It has a complex cost structure but is generally the least expensive option for data storage alone.
- S3 Glacier. Inexpensive cold storage for long-term backup. Configurable with lifecycle management so that, for instance, S3 objects could be transitioned to Glacier 60 days after creation and then deleted after 1 year. Some primary use cases are media asset workflows, regulatory archiving, scientific data storage, and digital preservation of other infrequently accessed objects.
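The 60-day and 1-year lifecycle example can be written down as plain data. The sketch below builds a rule in the shape that, to my understanding, the S3 lifecycle configuration API expects (Rules, Transitions, Expiration); the rule ID and function name are invented, and nothing is sent to AWS. Note that the day counts are measured from object creation.

```python
import json


def build_lifecycle_configuration(archive_after_days=60, delete_after_days=365):
    """Describe 'move to Glacier, then expire' as a lifecycle rule."""
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": ""},     # empty prefix: applies to every object
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


print(json.dumps(build_lifecycle_configuration(), indent=2))
```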
You can roll your own database setup on an EC2 instance or you can use one of Amazon’s managed database services to benefit from easier scalability, auto-patching, snapshots, multi-AZ replication, and automatic host replacement. There is support for both NoSQL and traditional relational databases.
- Relational Database Service (RDS) supports the following enterprise-class relational database engines: Amazon Aurora, PostgreSQL, MariaDB, Oracle, MySQL, and Microsoft SQL Server.
- Aurora is a cloud native RDS engine. It’s a MySQL- and PostgreSQL-compatible implementation designed to run optimally in the cloud – by Amazon’s own figures, up to 5x faster than standard MySQL and 3x faster than standard PostgreSQL.
- DynamoDB is a managed NoSQL database service that works based on key-value or document storage. Items are read and written through a simple HTTPS API with get and put operations.
AWS also offers a handful of purpose-built database services for more case-driven requirements (ML, data warehousing, graphs, etc.).
Compute services are how we process the bits. Storage and DB services are how the bits are persisted and retrieved. For our purposes, networking will refer to how the outside world sees your infrastructure and how your internal resources communicate with one another; in other words, how the bits flow into and through your systems.
- Virtual Private Cloud (VPC). Your software-defined, logically isolated section of the AWS cloud. It’s the defined virtual network where you launch and run AWS resources. You control this environment by selecting your own IP address range, creating subnets, configuring routing tables, and setting up network gateways. You define your contract with the outside world by defining public-facing subnets for things like web servers and private-facing subnets for things like database and application servers. For additional layers of security, you can use security groups and network access control lists to control access to your EC2 instances.
- Route 53. Amazon’s Domain Name System (DNS) web service. Used for domain name registration, internet traffic routing to resources for your domain, and resource health checks.
- Internet gateway is a logical connection between your VPC and the Internet. There’s only one per VPC and without it, resources in the VPC cannot be accessed from the Internet. It performs network address translation (NAT) for instances that have been assigned public IP addresses. Because it’s a logical connection only (not a physical device), there’s no concern for availability risks or bandwidth constraints on your network traffic.
- Amazon API Gateway. Centralized service that sits between clients and backend resources, used for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs. Supports containerized and serverless workloads. In a centralized manner, can handle traffic management, CORS support, authorization and access control, throttling, monitoring, and API versioning.
- AWS Firewall Manager. Security management service used to centrally configure and manage firewall rules across your AWS accounts and applications.
- Amazon Simple Notification Service (SNS). Fully managed pub/sub messaging for distributed and serverless applications. Enables decoupling of microservices, distributed systems, and serverless apps via push-based, many-to-many messaging. Can also be used to send notifications to end users by way of mobile push, SMS, and email.
- AWS Direct Connect. Allows for provisioning a dedicated (physical fiber) network connection between AWS and your premises. This is what you would use to establish a hybrid cloud between your corporate data center and an Amazon VPC. It provides private connectivity and consistent network performance.
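To illustrate the VPC bullet above, carving an address range into public and private subnets is ordinary CIDR arithmetic. This is a minimal sketch using Python’s standard `ipaddress` module; the 10.0.0.0/16 range and the subnet names are arbitrary choices for illustration, and no AWS resources are created.

```python
import ipaddress


def plan_subnets(vpc_cidr, names, new_prefix=24):
    """Assign one equally sized, non-overlapping subnet to each name, in order."""
    vpc = ipaddress.ip_network(vpc_cidr)
    plan = dict(zip(names, vpc.subnets(new_prefix=new_prefix)))
    if len(plan) < len(names):
        raise ValueError("address range too small (or names repeated)")
    return {name: str(block) for name, block in plan.items()}


def subnet_for(address, plan):
    """Return the name of the planned subnet containing the address, if any."""
    ip = ipaddress.ip_address(address)
    for name, cidr in plan.items():
        if ip in ipaddress.ip_network(cidr):
            return name
    return None


plan = plan_subnets("10.0.0.0/16", ["public-a", "public-b", "private-a", "private-b"])
print(plan)
print(subnet_for("10.0.2.15", plan))
```

Placing the "a" and "b" subnets in different Availability Zones is a common way to pair this layout with the multi-zone fault tolerance discussed earlier.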
Distributed applications can realize massive performance gains by taking steps to reduce data access time and connection latency.
- Amazon ElastiCache. Fully managed Redis or Memcached in-memory data store that helps with scaling data-intensive apps. The performance gains come from allowing your apps to retrieve data from pre-fetched in-memory data stores instead of relying exclusively on slower disk-based databases and redundant query execution time.
- Amazon CloudFront. Using a content distribution network (CDN) like Amazon CloudFront allows web content (static HTML, images, CSS, etc.) to be globally replicated across servers that are closer to your users. This point of presence cache reduces latency and improves performance for all, but especially for distant customers on the edge.
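The in-memory caching idea boils down to a pattern often called cache-aside: check the cache first, fall back to the slow store on a miss, and remember the answer for a while. The sketch below uses a plain dictionary as a stand-in for Redis or Memcached; the class name and the TTL default are assumptions for illustration.

```python
import time


class CacheAside:
    """Serve reads from memory, loading from the slow store only on a miss."""

    def __init__(self, load_from_database, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database round trip
        value = self._load(key)  # cache miss or expired entry
        self._entries[key] = (now + self._ttl, value)
        return value


def slow_lookup(user_id):
    # Stand-in for a real query against a disk-based database.
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_lookup, ttl_seconds=30)
print(cache.get(42))  # loaded from the "database"
print(cache.get(42))  # served from memory
```

The time-to-live is the knob that trades freshness for speed: longer values mean fewer database hits but staler data.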
Scaling and Load Balancing
Dynamic resource allocation and traffic distribution are where the cloud investment really begins to pay significant dividends in terms of scaling to meet demand and improving fault tolerance in the most cost effective ways possible.
- EC2 Auto Scaling allows you to automatically add or remove EC2 instances as needed to match near real-time demand – scaling out for spikes and scaling in during off-peak. At its simplest, you specify minimum, maximum, and desired instance counts, optionally on a time-based schedule. When used with CloudWatch (discussed below), auto scaling can be based on metrics like cluster-average CPU utilization and can even be used to automatically replace unhealthy instances.
- AWS Auto Scaling. Confusingly named, but more expansive than the prior. Monitors apps and automatically adjusts capacity for EC2 Auto Scaling groups in addition to other AWS resources like Spot Fleets, Elastic Container Service, DynamoDB, and Aurora.
- Elastic Load Balancing (ELB) automatically distributes traffic across multiple instances. It will only send traffic to instances ready to receive traffic, based on health checks. ELB comes primarily via two flavors of load balancing – application request level and network connection level:
- Application Load Balancer (ALB). Best suited for load balancing of HTTP and HTTPS traffic. Works behind a single domain and can fork requests off to specific instances based on the type of request with the ability to route based on things like headers, request methods, paths, and query strings.
- Network Load Balancer (NLB). Best suited for load balancing of TCP, UDP, and TLS traffic where extreme performance is required (millions of requests per second). Optimized to handle sudden traffic spikes. This performance comes at the cost of not having access to the Layer 7 advanced routing options provided by the ALB.
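To give a feel for metric-based scaling within min/max bounds, here is a deliberately simplified proportional rule: resize the group so the average metric lands near a target, then clamp to the configured limits. This is an illustrative assumption about how such a calculation could work, not the algorithm AWS actually runs.

```python
import math


def desired_capacity(current_instances, current_metric, target_metric, minimum, maximum):
    """Propose an instance count that would bring the average metric near the target."""
    if current_instances < 1 or target_metric <= 0:
        raise ValueError("need at least one instance and a positive target")
    if minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    # If 4 instances average 90% CPU and the target is 60%, the same load
    # spread over 6 instances would average roughly 60%.
    proposed = math.ceil(current_instances * current_metric / target_metric)
    return max(minimum, min(maximum, proposed))


print(desired_capacity(4, 90, 60, minimum=2, maximum=12))   # scale out for a spike
print(desired_capacity(4, 20, 60, minimum=2, maximum=12))   # scale in during off-peak
print(desired_capacity(10, 100, 50, minimum=2, maximum=12)) # capped by the maximum
```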
CloudWatch serves as your eyes, ears, and in some cases, remote hands, in the cloud. It collects measurements (disk IO, memory utilization, CPU utilization), events, log files, and then raises alarms based on specified triggers (comparing against thresholds). Alarms can notify teams, trigger programmatic responses to deal with what was detected, add capacity, and shed capacity.
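The threshold comparison behind an alarm can be sketched in a few lines. This toy version assumes an alarm fires only when every one of the most recent datapoints breaches the threshold, which filters out one-off blips; the state names mirror the ones CloudWatch uses, but the function itself is hypothetical.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3):
    """Classify a metric series as OK, ALARM, or INSUFFICIENT_DATA."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Require a sustained breach, not a single spike.
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


cpu_percent = [50, 85, 90, 95]
print(alarm_state(cpu_percent, threshold=80))
```

In practice the ALARM transition is what would notify a team or kick off a scaling action.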
Automated Cloud Formation
Repeatability is one of the core tenets of great cloud architecture. Instead of taking manual steps every time you need to fire up a new resource, it’s now possible to define your infrastructure with software-based provisioning.
AWS CloudFormation allows you to define your infrastructure in code, using YAML or JSON. You can set up test, training, prod, disaster recovery environments, etc. all on demand. You can do A/B testing, blue-green roll-outs, and build your entire infrastructure all with script invocation. You can literally model your entire infrastructure and all of your application resources with a text file.
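As a taste of what "infrastructure in a text file" looks like, the sketch below assembles a very small template as a Python dictionary and prints it as JSON. The logical ID `AssetsBucket` and the description are made up; the top-level keys and the `AWS::S3::Bucket` resource type follow the template format as I understand it, and nothing is deployed.

```python
import json


def build_template(bucket_logical_id="AssetsBucket"):
    """Return a minimal template describing a single S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal illustrative stack with a single S3 bucket",
        "Resources": {
            # The logical ID is only a label inside the template.
            bucket_logical_id: {"Type": "AWS::S3::Bucket"}
        },
        "Outputs": {
            # Ref on a bucket resolves to the generated bucket name.
            "BucketName": {"Value": {"Ref": bucket_logical_id}}
        },
    }


print(json.dumps(build_template(), indent=2))
```

Because the template is just data, the same file can stand up test, training, and production copies of an environment.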
Containers are all the rage and rightfully so. They’ve made it easier to develop, deploy, and run applications – allowing developers to wrap an application and all of its dependencies into a self-contained package, ensuring consistency across environments and release cycles. Amazon has support for both industry heavyweights – Docker and Kubernetes.
- Elastic Container Service (ECS). Amazon’s “Docker as a service”. Fully managed container orchestration service that allows you to run containerized Docker applications on AWS.
- Elastic Container Registry (ECR). Fully-managed, ECS integrated, Docker container registry for storing, managing, and deploying Docker container images.
- Elastic Kubernetes Service (EKS). Amazon’s “Kubernetes as a service” that allows you to run Kubernetes on AWS without needing to install and operate your own Kubernetes clusters.
- Fargate. AWS technology for running containers. Supports orchestration via ECS or EKS without having to manage the underlying EC2 instances. Fargate is considered serverless in that it allows you to think about and pay for resources in terms of applications rather than provisioning and managing servers.
Shared responsibility model. We don’t get to ignore security concerns just because we’ve delegated most of our infrastructure to the cloud. Amazon is responsible for securing its foundational services (compute, database, storage, and network) and global infrastructure (regions, AZs, and Edge Locations), but Amazon doesn’t know your business. You’re still on the hook for securing customer data, applications, access management, instance operating systems, firewall rules, and client/server-side encryption. Managed services (like Aurora) blur these lines a bit since Amazon becomes responsible for securing the servers, while you’re left to only worry about the code and data.
AWS Identity and Access Management (IAM) is used to securely control access to AWS resources. This includes both authentication (who is entering the account) and authorization (what she can do once there). It’s very granular, with least privilege as the goal.
- IAM user. A person or application that interacts with AWS.
- Group. Collection of users that share the same permissions.
- IAM policy. Authorization attributes, associated with a user or group, to determine access to things like the CLI, S3 buckets, etc.
- Role. Temporary privileges that an entity can assume. IAM users, applications, and services can assume IAM roles. Roles use an IAM policy for permissions.
- Root user. The initial user created when setting up your AWS account. It has full privilege to do anything and that can’t be changed. After creating your account, you should never log in with the root user. All operations need to be done by other named entities. You should delete root user access keys and create a separate IAM user with admin access.
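A policy is a JSON document made of statements, each one allowing or denying a set of actions on a set of resources. The toy evaluator below captures only two rules I’m confident of: nothing is allowed unless a statement allows it, and an explicit Deny always wins. The bucket name is hypothetical, and real IAM evaluation considers far more (principals, conditions, multiple policy types), so treat this as a teaching sketch.

```python
from fnmatch import fnmatchcase

# Example policy document; "example-bucket" is a hypothetical bucket name.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-bucket/secret/*",
        },
    ],
}


def _as_list(value):
    return [value] if isinstance(value, str) else list(value)


def is_allowed(policy, action, resource):
    """Default deny; a matching Allow grants access; a matching Deny overrides it."""
    allowed = False
    for statement in policy["Statement"]:
        action_match = any(fnmatchcase(action, p) for p in _as_list(statement["Action"]))
        resource_match = any(fnmatchcase(resource, p) for p in _as_list(statement["Resource"]))
        if not (action_match and resource_match):
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::example-bucket/photo.jpg"))
print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::example-bucket/secret/plan.txt"))
```

Starting from "deny everything" and adding narrow Allow statements is how least privilege plays out in practice.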
Assessment and Defense. AWS provides a handful of tools to help with threat identification and mitigation:
- Amazon Inspector. Automated security assessment as a service. Runs across the top of your AWS account, checking for best practices and vulnerabilities – produces a detailed list of security findings and remediation recommendations.
- AWS Shield. Managed DDoS protection service for most common network layer attacks with always-on detection and mitigation. Comes in standard (included) and advanced ($) flavors.
Pricing still feels a lot more complex than it needs to be, and in-depth specifics are out of scope for this post, but in general you:
- Pay as you go and generally only for what you use
- Save when you reserve instances (up to 75% over equivalent on-demand capacity)
- Pay less by using more with automatic volume-based discounts
Compute. You’re charged per hour or second, and the price varies by instance size and type:
- On-Demand Instances. For short-term, unpredictable workloads. Charged by time used.
- Dedicated Hosts. Physical server dedicated to you.
- Reserved Instances. For applications with steady state usage. Discounted for 1 to 3-year commitments.
- Spot Instances. For applications with flexible start and end times, or urgent computing needs for large amounts of capacity. This utilizes spare AWS capacity at up to a 90% discount.
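A quick back-of-the-envelope comparison shows why the purchase model matters. The hourly rate below is entirely hypothetical, and the discounts are the "up to" ceilings quoted above (75% for reserved, 90% for spot), so real bills will differ.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12
ON_DEMAND_RATE = 0.10  # hypothetical price in USD per instance-hour


def monthly_cost(hourly_rate, discount=0.0, hours=HOURS_PER_MONTH):
    """Cost of one instance running for the given hours at a fractional discount."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(hourly_rate * (1.0 - discount) * hours, 2)


scenarios = {
    "on-demand": monthly_cost(ON_DEMAND_RATE),
    "reserved (best case, 75% off)": monthly_cost(ON_DEMAND_RATE, discount=0.75),
    "spot (best case, 90% off)": monthly_cost(ON_DEMAND_RATE, discount=0.90),
}
for name, cost in scenarios.items():
    print(f"{name}: ${cost:.2f} per month")
```

The catch is in the fine print: reserved pricing assumes a 1 to 3-year commitment, and spot capacity suits workloads with flexible start and end times.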
Storage. You’re charged based on the amount of storage used (per GB), the region, storage class, number and type of requests, and amount of data transferred out of the region.
Included Services. Amazon VPC, Elastic Beanstalk, Auto Scaling, CloudFormation, and IAM all come with no additional charge.
Cost Estimation and Management. With all of the flexibility and granularity in pricing models, comes a great deal of complexity in understanding and predicting costs. AWS provides a handful of tools to help.
- AWS Simple Monthly Calculator. Helps estimate your monthly bill with a per-service breakdown.
- AWS Cost Explorer. Visualize, understand, and manage your AWS costs and usage over time.
- Trusted Advisor. A service that aims to help reduce cost, increase performance, and improve security. Scans for resources that are over-provisioned, underutilized, abandoned, etc.
AWS offers two different learning and certification paths depending on your goals.
- Role-based. These all start with the foundational Cloud Practitioner certification (6 months experience), but then diverge into associate (1 year experience) and professional (2 years experience) tracks for Architect, Operations, and Developer certifications.
- Specialty. Domain-specific certifications for Advanced Networking, Security, Machine Learning, Data Analytics, Database, and Alexa Skills.
Lowering the Barrier to Entry
Even with increased awareness of what’s possible, it can still be a little overwhelming on where to start. And frankly, when one of the goals of all of this is reducing undifferentiated work, unnecessary time and brain energy spent on infrastructure concerns might not be capital well spent. Fortunately, Amazon has two options for much simpler bootstrapping. You could consider these options training wheels, but depending on your use case and given their ability to scale, they might be all you ever need.
Amazon Lightsail. We’ll write a lot more about this, but for beginners, prototypers, and folks with less complex needs, it’s worth considering Amazon Lightsail. It’s built on EC2 technology and offers access to virtual servers, storage, databases, and networking with the following advantages:
- Far less complexity with stripped down and streamlined management UI, but still backed by world-class infrastructure
- Lower, more predictable pricing
- AWS-provided migration paths for when/if you outgrow it
It shines at quickly spinning up simple web apps, websites (like this blog!), and dev/test environments. And if you’re new to the AWS world, it’s an empowering learning tool 💪.
Elastic Beanstalk is another more approachable option, also built on EC2. It simplifies managing, deploying, and scaling web apps and services. It’s more application-centric than Lightsail and effectively allows you to deploy without managing infrastructure. You upload your code and Elastic Beanstalk uses CloudFormation to create a normal EC2 instance, install an execution environment on the machine, and deploy your application for you. It handles capacity provisioning, load balancing, auto scaling, and application health monitoring. It can automatically provision resources like load balancers, auto scaling groups, S3, and relational database services.
- Included. There’s no extra cost to use the service – pay only for the underlying AWS resources needed to store and run your applications.
- ⏫ developer productivity. You can spin up an application with 0 infrastructure or resource configuration effort.
- Grows with you. It can automatically scale your application up (to millions of users) and down based on need (with adjustable settings).
- Managed. The underlying platform will be kept up-to-date by Amazon with the latest patches and updates.
- Controllable-if-desired. You can still look under the hood and take full control over any of the underlying resources (ex: changing EC2 instance type).
- Wide language support. Supports Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on servers including Apache, Nginx, Passenger, and IIS.
Both options are great for decreasing undifferentiated work, increasing developer productivity, and shortening the path from idea to application. As a matter of architectural principle, I always prefer to reach for the simplest answer that solves the problem and both of these solutions firmly check that box.
Above the Clouds
When putting all of these pieces together, the number of architecture permutations could be infinite. That’s both the power and the complexity. Before wrapping up, let’s channel that into one illustrative, 30,000-foot for-instance that supports routing, network address translation, multi-zone fault tolerance, content distribution, load balancing, auto scaling, multiple instance types, caching, managed databases, data replication, and monitoring – all delivered via a software-defined virtual network.
I hope this has been helpful. I know the web is full of resources talking about AWS, but it’s all incredibly fragmented. My goal was to put together a single, curated AWS overview that could serve as a conceptual getting started guide for developers, managers, and executives hoping to get a more holistic understanding of how all of this tremendous power and flexibility fits together. If you got this far, don’t hesitate to reach out with feedback or subscribe to our newsletter so you don’t miss out on future guides like this one.