AWS/EC2

From Christoph's Personal Wiki

Elastic Compute Cloud (EC2)

SEE: Amazon EC2 FAQs

Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute capacity in the cloud. EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change.

Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. EC2 provides developers with the tools to build failure-resilient applications and isolate themselves from common failure scenarios.

  • EC2 provision types:
    • On-Demand: Allows you to pay a fixed rate by the hour with no commitment.
    • Reserved
      • Amazon EC2 Reserved Instances allow you to reserve Amazon EC2 computing capacity for 1 or 3 years, in exchange for a significant discount (up to 75%) compared to On-Demand instance pricing. Payment options are All Upfront, Partial Upfront, or No Upfront; paying more up-front yields a larger discount.
    • Spot
      • Amazon EC2 Spot instances allow you to bid on spare Amazon EC2 computing capacity. Since Spot instances are often available at a discount compared to On-Demand pricing, you can significantly reduce the cost of running your applications, grow your application's compute capacity and throughput for the same budget, and enable new types of cloud computing applications.
      • Useful for applications that have flexible start and end times and are only feasible at very low compute prices.
      • Example use cases: genomics, pharmaceutical companies, etc.
      • If you terminate the instance, you pay for the full hour in which it was terminated.
      • If Amazon terminates it, you are not charged for the partial hour in which it was terminated.
      • EC2 Spot prices
    • Dedicated Hosts: Physical EC2 server dedicated for your use.
      • Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licences.
      • Useful for regulatory requirements that may not support multi-tenant virtualization (e.g., governments)
      • Can be purchased On-Demand (hourly)
      • Reserved Dedicated Hosts (from 1 to 3 years) have up to 70% discount.
  • EC2 Instance Types:
    • D for Density (e.g., d2)
    • I for IOPS (e.g., i2)
    • R for RAM
    • T for general purpose (e.g., t2.micro)
    • M for Main choice (for general purpose apps)
    • C for Compute
    • G for Graphics
    • F for FPGA
    • P for general purpose GPU (think "pics")
    • X for Extreme memory
    • mnemonic: DIRTMCGFPX (or, "Dr Mc GIFT PX" => "Doctor McGift Pix" => gives out free pictures)
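
The Spot billing rule in the list above (a partial final hour is charged only when you terminate the instance yourself) can be sketched as a small function. This is a minimal illustration of the hourly model described in these notes, not AWS's billing logic; the function name and the `"user"`/`"aws"` labels are made up for the example.

```python
def spot_billed_hours(runtime_seconds: int, terminated_by: str) -> int:
    """Billable hours for one Spot instance under the hourly rule above.

    terminated_by is a hypothetical label: "user" or "aws".
    """
    if terminated_by not in ("user", "aws"):
        raise ValueError("terminated_by must be 'user' or 'aws'")
    full_hours, remainder = divmod(runtime_seconds, 3600)
    if remainder == 0:
        # The instance stopped exactly on an hour boundary.
        return full_hours
    # A partial final hour is charged only if the user ended the instance.
    return full_hours + 1 if terminated_by == "user" else full_hours


print(spot_billed_hours(5400, "user"))  # 90 minutes, ended by you
print(spot_billed_hours(5400, "aws"))   # 90 minutes, ended by Amazon
```
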

EC2 Instance Types

Family  Speciality                     Use case
D2      Dense storage                  File servers, data warehousing, Hadoop
R4      Memory optimized               Memory-intensive apps/DBs
M4      General purpose                Application servers
C4      Compute optimized              CPU-intensive apps/DBs
G2      Graphics intensive             Video encoding, 3D application streaming
I2      High-speed storage             NoSQL DBs, data warehousing, etc.
F1      Field Programmable Gate Array  Hardware acceleration for your code
T2      Lowest cost, general purpose   Web servers, small DBs, etc.
P2      Graphics/general purpose GPU   Machine learning, Bitcoin mining, etc.
X1      Memory optimized               SAP HANA/Apache Spark, etc.
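
As a quick aid for the family letters above, the following sketch maps an instance type string such as `t2.micro` to the speciality listed in the table. The dictionary only repeats the entries from these notes; the function name is hypothetical and newer families are deliberately not covered.

```python
# Family letter -> speciality, copied from the table above.
INSTANCE_FAMILIES = {
    "D": "Dense storage",
    "R": "Memory optimized",
    "M": "General purpose",
    "C": "Compute optimized",
    "G": "Graphics intensive",
    "I": "High-speed storage",
    "F": "Field Programmable Gate Array",
    "T": "Lowest cost, general purpose",
    "P": "Graphics/general purpose GPU",
    "X": "Memory optimized",
}


def describe_instance_type(instance_type: str) -> str:
    """Return e.g. 'T2: Lowest cost, general purpose' for 't2.micro'."""
    family = instance_type.split(".", 1)[0]  # "t2.micro" -> "t2"
    letter = family[:1].upper()
    if letter not in INSTANCE_FAMILIES:
        raise KeyError(f"unknown instance family: {instance_type!r}")
    return f"{family.upper()}: {INSTANCE_FAMILIES[letter]}"


print(describe_instance_type("t2.micro"))
```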


$ curl http://169.254.169.254/latest/meta-data/public-ipv4 # returns the EC2 instance's public IPv4

NOTE: AMIs are regional. Making an Amazon Machine Image (AMI) public does not make it available in other regions; it must be copied to each region where it is needed.

EC2 offers two types of storage: EBS volumes and Instance Store. EBS is persistent: if an EC2 instance with an attached EBS volume is stopped, no data is lost. Instance Store is ephemeral: if the EC2 instance is stopped or terminated, all data on it is lost.

  • EC2 status checks:
    • System Status Checks: These checks monitor the AWS systems required to use this instance and ensure they are functioning properly.
      These checks verify that your instance is reachable. They test that Amazon is able to get network packets to your instance.
      If these checks fail, there may be an issue with the infrastructure hosting your instance (such as AWS power, networking, or software systems). You may need to restart or replace the instance, wait for Amazon systems to resolve the issue, or seek technical support.
      These checks do not validate that your operating system and applications are accepting traffic.
    • Instance Status Checks: These checks monitor your software and network configuration for this instance.
      These checks verify that your instance's operating system is accepting traffic.
      If these checks fail, you may need to reboot your instance or make modifications to your operating system configuration.
  • EC2 monitoring (via CloudWatch):
    • Default (free): every 5 minutes
    • Detailed (not free): every 1 minute
  • Security Groups
    • These are "virtual firewalls"
    • All inbound traffic is blocked by default
    • All outbound traffic is allowed by default
    • An instance can have one or more security groups associated with it.
    • Changes to security groups take effect immediately.
    • Security groups are stateful (i.e., if you create an inbound rule allowing traffic in, the response traffic is automatically allowed back out again). NOTE: Network ACLs are stateless.
    • You can have any number of EC2 instances within a security group.
    • You can have multiple security groups attached to EC2 instances.
    • You cannot block specific IP addresses using security groups; instead use Network Access Control Lists.
    • You can specify ALLOW rules, but not DENY rules.
  • EC2 metadata: data about your instance that you can use to configure or manage the running instance.
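
The security group rules above (inbound blocked by default, ALLOW rules only, several groups per instance) can be modelled with a toy evaluator. This is a simplified sketch, not the AWS implementation: the class and function names are hypothetical, rules match a single port only, and the stateful handling of return traffic is not modelled.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    protocol: str  # e.g. "tcp"
    port: int      # single port, for simplicity
    cidr: str      # source range, e.g. "0.0.0.0/0"


@dataclass
class SecurityGroup:
    name: str
    inbound: list = field(default_factory=list)  # no rules: nothing gets in

    def allows_inbound(self, protocol: str, port: int, source_ip: str) -> bool:
        # There are only ALLOW rules, so traffic passes if any rule matches.
        return any(
            rule.protocol == protocol
            and rule.port == port
            and ip_address(source_ip) in ip_network(rule.cidr)
            for rule in self.inbound
        )


def instance_allows_inbound(groups, protocol, port, source_ip) -> bool:
    """An instance with several groups accepts what any one group allows."""
    return any(g.allows_inbound(protocol, port, source_ip) for g in groups)


web = SecurityGroup("web", [AllowRule("tcp", 80, "0.0.0.0/0")])
ssh = SecurityGroup("ssh", [AllowRule("tcp", 22, "203.0.113.0/24")])
print(instance_allows_inbound([web, ssh], "tcp", 22, "198.51.100.7"))
```

Because the model has no DENY rule, there is no way to express "block this one address", which is why the notes point to Network Access Control Lists for that case.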

EC2 metadata

SEE: for details
$ curl http://169.254.169.254/latest/meta-data/
ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
hostname
instance-action
instance-id
instance-type
local-hostname
local-ipv4
mac
metrics/
network/
placement/
product-codes
profile
public-hostname
public-ipv4
public-keys/
reservation-id
security-groups
  • Get the public IP of your EC2 instance (from within the instance):
$ curl http://169.254.169.254/latest/meta-data/public-ipv4
52.36.123.34
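
The same lookup can be done from Python with only the standard library. This is a minimal sketch that mirrors the plain `curl` request above; the function names are made up for the example, and an instance configured to require session tokens (IMDSv2) would reject this unauthenticated GET.

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(path: str) -> str:
    """Build the metadata URL for a key such as 'public-ipv4'."""
    return METADATA_BASE + path.lstrip("/")


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value; only works from within an EC2 instance."""
    with urlopen(metadata_url(path), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


print(metadata_url("public-ipv4"))
```

From within an instance, `fetch_metadata("public-ipv4")` would return the same value as the `curl` command; elsewhere the request simply times out or fails.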

Placement Groups

A placement group is a logical grouping of instances within a single Availability Zone. Using placement groups enables applications to participate in a low-latency, 10 Gbps network. Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both.

Use cases: Hadoop cluster, Cassandra, etc.

  • A placement group cannot span multiple Availability Zones.
  • The name you specify for a placement group must be unique within your AWS account.
  • Only certain types of instances can be launched in a placement group (Compute Optimized, GPU, Memory Optimized, Storage Optimized).
  • AWS recommends homogeneous instances (i.e., instances of the same size and family) with placement groups.
  • You cannot merge placement groups.
  • You cannot move an existing instance into a placement group. Instead, create an AMI from your existing instance, and then launch a new instance from the AMI into a placement group.
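
Two of the constraints above (a single Availability Zone, and the recommendation to use homogeneous instances) are easy to check before launching. The sketch below is purely illustrative: the `Instance` record and `check_placement_group` function are hypothetical and do not call any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    instance_type: str
    availability_zone: str


def check_placement_group(instances):
    """Return (ok, messages) for a planned placement group.

    ok is False only for the hard rule (one Availability Zone);
    mixed instance types produce a message but are still allowed.
    """
    zones = sorted({i.availability_zone for i in instances})
    types = sorted({i.instance_type for i in instances})
    messages = []
    ok = len(zones) <= 1
    if not ok:
        messages.append("spans multiple Availability Zones: " + ", ".join(zones))
    if len(types) > 1:
        messages.append("mixed instance types: " + ", ".join(types))
    return ok, messages


plan = [
    Instance("i-aaa", "c4.large", "us-west-2a"),
    Instance("i-bbb", "c4.xlarge", "us-west-2a"),
]
print(check_placement_group(plan))
```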

Elastic Block Storage (EBS)

SEE: Amazon EBS FAQs

Amazon Elastic Block Storage (EBS) allows you to create storage volumes and attach them to EC2 instances. Once attached, you can create a file system on top of these volumes, run a database, or use them in any other way you would use a block device. EBS volumes are placed in a specific Availability Zone (AZ), where they are automatically replicated to protect you from the failure of a single component.

  • EBS Volume types (see: for details)
    • General purpose SSD (gp2):
      General purpose SSD volume that balances price and performance for a wide variety of transactional workloads
      Ratio of 3 IOPS per GiB, up to 10,000 IOPS. Volumes under 1 TiB can burst to 3,000 IOPS for extended periods of time.
    • Provisioned IOPS SSD (io1):
      Highest-performance SSD volume designed for mission-critical applications
      Designed for I/O intensive applications, such as large relational or NoSQL databases.
      Use when you need more than 10,000 IOPS or 160 MiB/s of throughput per volume.
      Can provision up to 20,000 IOPS per volume.
    • Throughput Optimized HDD (st1):
      Low cost HDD volume designed for frequently accessed, throughput-intensive workloads
      Magnetic (spinning disk) HDD
      Useful for: Big Data, Data warehouses, log processing, etc. Sequential data
      Cannot be a boot volume
    • Cold HDD (sc1):
      Lowest cost HDD volume designed for less frequently accessed workloads
      Useful for: File server, etc.
      Cannot be a boot volume
    • Magnetic (standard):
      Infrequently accessed storage
      Lowest cost per gigabyte of all EBS volume types that are bootable.
      Magnetic volumes are ideal for workloads where data is accessed infrequently, and applications where the lowest storage cost is important.
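
The gp2 figures above lend themselves to a quick calculation. The sketch below uses the 3 IOPS per GiB ratio, the 10,000 IOPS cap, and the 3,000 IOPS burst from these notes. The 100 IOPS floor for very small volumes is an assumption not stated above, and the limits are the ones quoted in this page rather than current AWS values.

```python
GP2_IOPS_PER_GIB = 3
GP2_MAX_IOPS = 10_000   # cap quoted in these notes
GP2_BURST_IOPS = 3_000  # burst ceiling quoted in these notes
GP2_MIN_IOPS = 100      # assumption: floor for very small volumes


def gp2_baseline_iops(size_gib: int) -> int:
    """Sustained IOPS for a gp2 volume of the given size."""
    return max(GP2_MIN_IOPS, min(GP2_IOPS_PER_GIB * size_gib, GP2_MAX_IOPS))


def gp2_peak_iops(size_gib: int) -> int:
    """Best-case IOPS: the burst only helps while the baseline is below it."""
    return max(gp2_baseline_iops(size_gib), GP2_BURST_IOPS)


for size in (10, 500, 2000, 5000):
    print(size, gp2_baseline_iops(size), gp2_peak_iops(size))
```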

NOTE: You cannot attach a single EBS volume to multiple EC2 instances; use EFS instead.

In order to enable encryption at rest using EC2 and Elastic Block Store, one needs to configure encryption when creating the EBS volume.

  • RAID (Redundant Array of Independent Disks)
    • RAID 0 - Striped, no redundancy, good performance
    • RAID 1 - Mirrored, redundancy
    • RAID 5 - Good for reads, bad for writes. AWS does not recommend using RAID 5 on EBS.
    • RAID 10 (RAID 1 + RAID 0) - Striped & mirrored. Good redundancy, good performance.
    • On AWS, use either RAID 0 or RAID 10. Useful for increasing disk I/O.
    • How to take a snapshot of a RAID array?
      • Problem: A snapshot excludes data held in cache by applications and the OS. This tends not to matter on a single volume, but with multiple volumes in a RAID array it can be a problem because of the interdependencies of the array.
      • Solution: Take an application consistent snapshot (i.e., stop the application from writing to disk and flush all caches to the disk).
      • Solution#1: Freeze the file system
      • Solution#2: Unmount the RAID array
      • Solution#3: Shut down the associated EC2 instance, take a snapshot, then start up the instance again.
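
To make the trade-off between the RAID levels above concrete, the following sketch computes usable capacity for an array of equally sized EBS volumes. It covers only RAID 0, 1, and 10 and uses the textbook capacity formulas; the function name is made up for the example.

```python
def raid_usable_gib(level: str, volume_count: int, volume_size_gib: int) -> int:
    """Usable capacity of an array built from equally sized volumes."""
    if volume_count < 2:
        raise ValueError("a RAID array needs at least 2 volumes")
    if level == "0":
        # Striping: all capacity is usable, but there is no redundancy.
        return volume_count * volume_size_gib
    if level == "1":
        # Mirroring: every volume holds the same data.
        return volume_size_gib
    if level == "10":
        # Striped mirrors: half of the raw capacity is usable.
        if volume_count < 4 or volume_count % 2:
            raise ValueError("RAID 10 needs an even number of volumes, at least 4")
        return volume_count * volume_size_gib // 2
    raise ValueError(f"unsupported RAID level: {level!r}")


print(raid_usable_gib("0", 4, 100), raid_usable_gib("10", 4, 100))
```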

Elastic File System (EFS)

SEE: Amazon EFS

Amazon Elastic File System (EFS) is a file storage service for EC2 instances. EFS is easy to use and provides a simple interface that allows you to create and configure file systems quickly and easily. With EFS, storage capacity is elastic, growing and shrinking automatically as you add and remove files, so your applications have the storage they need, when they need it.

  • EFS Features:
    • File-based storage (a shared file system, unlike the block-based storage of EBS)
    • Supports the Network File System version 4 (NFSv4) protocol
    • You only pay for the storage you use (no pre-provisioning required) => $0.30/GB
    • Can scale up to petabytes
    • Can support thousands of concurrent NFS connections
    • Data is stored across multiple AZs within a region
    • Read after write consistency

Mounting the EFS
  • Create a new directory on your EC2 instance:
$ sudo mkdir /efs
  • Mount your file system using the DNS name. The following command looks up your EC2 instance's AZ using the EC2 instance metadata URI 169.254.169.254, then mounts the file system using the DNS name for that AZ (note: replace fs-00000000 with your actual EFS ID):
$ sudo mount -t nfs4 $(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone).fs-00000000.efs.us-west-2.amazonaws.com:/ /efs
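
The DNS name used in the mount command follows the pattern `<availability-zone>.<file-system-id>.efs.<region>.amazonaws.com`. The sketch below only assembles that name and the matching command as an argument list; it does not run anything, the function names are hypothetical, and `fs-00000000` remains a placeholder for a real EFS ID.

```python
def efs_dns_name(availability_zone: str, file_system_id: str, region: str) -> str:
    """Per-AZ DNS name in the form used by the mount command above."""
    return f"{availability_zone}.{file_system_id}.efs.{region}.amazonaws.com"


def efs_mount_command(availability_zone: str, file_system_id: str,
                      region: str, mount_point: str = "/efs") -> list:
    """Argument list equivalent to the shell command above."""
    target = efs_dns_name(availability_zone, file_system_id, region) + ":/"
    return ["sudo", "mount", "-t", "nfs4", target, mount_point]


print(" ".join(efs_mount_command("us-west-2a", "fs-00000000", "us-west-2")))
```

The list form could be handed to `subprocess.run` on an instance where the mount point already exists; the Availability Zone would come from the instance metadata, as in the shell version.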

Amazon Machine Image (AMI)

An Amazon Machine Image (AMI) provides the information required to launch a virtual server in the Cloud. You specify an AMI when you launch an instance, and you can launch as many instances from the AMI as you need. You can also launch instances from as many different AMIs as you need.

An AMI consists of three different components:

  • A template for the root volume for the instance (e.g., an operating system, an application server, and applications)
  • Launch permissions that control which AWS accounts can use the AMI to launch instances.
  • A block device mapping, which specifies the volumes to attach to the instance when it is launched.

You can create your own AMIs (based on snapshots you have taken of your EC2 instances). Your AMIs are private by default. However, you can also share them with other AWS accounts or make them public. If you do make them public, make sure to follow the AWS Hardening and Clean-up Requirements and the Guidelines for Shared Linux AMIs.

AMIs (like snapshots) are stored in S3.

AMIs are regional. You can only launch an AMI from the region in which it is stored. However, you can copy AMIs to other regions using the console, CLI, or the EC2 API.

  • Select your AMI based on:
    • Region
    • Operating system
    • Architecture (32- or 64-bit)
    • Launch permissions
    • Storage for the root device (Root Device Volume)
      • Instance store (aka ephemeral storage)
      • EBS backed volumes

All AMIs are categorized as either backed by Amazon EBS or backed by instance store.

  • AMI types:
    • EBS
      The root device for an instance launched from this AMI is an EBS volume created from an EBS snapshot.
      Instances can be stopped (and rebooted/terminated)
      If you stop the instance, you will not lose your data
    • Instance Store (aka ephemeral storage)
      The root device for an instance launched from this AMI is an instance store volume created from a template stored in S3.
      You cannot stop an instance built from an instance store. You can only reboot or terminate it.
      If the underlying host fails, you will lose your data

By default, root volumes on either AMI type will be deleted on termination. However, with EBS volumes, you can tell AWS to keep the root device volume.
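
The differences between the two AMI types can be summarised as a small lookup table. This is only a restatement of the notes above in code form; the key names and the `can_perform` helper are made up for illustration.

```python
# Behaviour per root device type, as described in the notes above.
ROOT_DEVICE_BEHAVIOUR = {
    "ebs": {
        "actions": {"stop", "reboot", "terminate"},
        "data_survives_stop": True,
        "can_keep_root_on_termination": True,
    },
    "instance-store": {
        "actions": {"reboot", "terminate"},  # cannot be stopped
        "data_survives_stop": False,
        "can_keep_root_on_termination": False,
    },
}


def can_perform(root_device_type: str, action: str) -> bool:
    """True if an instance with this root device type supports the action."""
    return action in ROOT_DEVICE_BEHAVIOUR[root_device_type]["actions"]


print(can_perform("ebs", "stop"), can_perform("instance-store", "stop"))
```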

Auto Scaling

SEE: AWS Auto Scaling

Auto Scaling helps you maintain application availability and allows you to scale your Amazon EC2 capacity up or down automatically according to conditions you define. You can use Auto Scaling to help ensure that you are running your desired number of Amazon EC2 instances. Auto Scaling can also automatically increase the number of Amazon EC2 instances during demand spikes to maintain performance and decrease capacity during lulls to reduce costs. Auto Scaling is well suited both to applications that have stable demand patterns and to those that experience hourly, daily, or weekly variability in usage.

Auto Scaling pricing: Auto Scaling is enabled by Amazon CloudWatch and carries no additional fees. Amazon EC2 and Amazon CloudWatch service fees apply and are billed separately. Partial hours are billed as full hours.
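
The idea of scaling "according to conditions you define" and the rule that partial hours are billed as full hours can both be sketched in a few lines. This is a toy threshold policy, not how Auto Scaling itself is implemented: the function names, the one-instance step, and the thresholds are all assumptions for illustration.

```python
def next_desired_capacity(current: int, metric: float,
                          scale_out_above: float, scale_in_below: float,
                          min_size: int, max_size: int) -> int:
    """Add or remove one instance based on a metric, within group limits."""
    if metric > scale_out_above:
        target = current + 1
    elif metric < scale_in_below:
        target = current - 1
    else:
        target = current
    # Never leave the group's configured minimum/maximum size.
    return max(min_size, min(target, max_size))


def billed_instance_hours(runtime_seconds: int) -> int:
    """Partial hours are billed as full hours (rounds up)."""
    return (runtime_seconds + 3599) // 3600


# Average CPU at 85% with a 70% scale-out threshold: grow from 2 instances.
print(next_desired_capacity(2, 85.0, 70.0, 30.0, min_size=1, max_size=4))
print(billed_instance_hours(3601))
```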

Launch Configurations and Auto Scaling Groups