AWS/EC2

From Christoph's Personal Wiki
Revision as of 00:30, 9 February 2017 by Christoph (Talk | contribs) (Elastic Block Storage (EBS))

Elastic Compute Cloud (EC2)

SEE: Amazon EC2 FAQs

Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute capacity in the cloud. EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity (both up and down) as your computing requirements change.

Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. EC2 provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.

  • EC2 provision types:
    • On-Demand: Allows you to pay a fixed rate by the hour with no commitment.
    • Reserved
      • Amazon EC2 Reserved Instances allow you to reserve Amazon EC2 computing capacity for 1 or 3 years, in exchange for a significant discount (up to 75%) compared to On-Demand instance pricing. The largest discounts require paying for the entire reserved period up-front; partial and no up-front payment options are also available.
    • Spot
      • Amazon EC2 Spot instances allow you to bid on spare Amazon EC2 computing capacity. Since Spot instances are often available at a discount compared to On-Demand pricing, you can significantly reduce the cost of running your applications, grow your application's compute capacity and throughput for the same budget, and enable new types of cloud computing applications.
      • Useful for applications that have flexible start and end times and are only feasible at very low compute prices.
      • Example use cases: genomics, pharmaceutical companies, etc.
      • If you terminate the instance, you pay for the partial hour in which it was terminated.
      • If Amazon terminates it, you do not pay for the partial hour in which it was terminated.
      • EC2 Spot prices
    • Dedicated Hosts: Physical EC2 server dedicated for your use.
      • Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses.
      • Useful for regulatory requirements that may not support multi-tenant virtualization (e.g., governments)
      • Can be purchased On-Demand (hourly)
      • Can also be reserved (for 1 or 3 years) at a discount of up to 70% compared to On-Demand pricing.
  • EC2 Instance Types:
    • D for Density (e.g., d2)
    • I for IOPS (e.g., i2)
    • R for RAM
    • T for general purpose (e.g., t2.micro)
    • M for Main choice (for general purpose apps)
    • C for Compute
    • G for Graphics
    • F for FPGA
    • P for general purpose GPU (think "pics")
    • X for Extreme memory
    • mnemonic: DIRTMCGFPX (or, "Dr Mc GIFT PX" => "Doctor McGift Pix" => gives out free pictures)
EC2 Instance Types

Family  Speciality                     Use case
------  -----------------------------  ----------------------------------------
D2      Dense storage                  file servers, data warehousing, Hadoop
R4      Memory optimized               memory intensive apps/DBs
M4      General purpose                application servers
C4      Compute optimized              CPU intensive apps/DBs
G2      Graphics intensive             video encoding, 3D application streaming
I2      High-speed storage             NoSQL DBs, data warehousing, etc.
F1      Field Programmable Gate Array  hardware acceleration for your code
T2      Lowest cost, general purpose   web servers, small DBs, etc.
P2      Graphics/general purpose GPU   machine learning, Bitcoin mining, etc.
X1      Memory optimized               SAP HANA/Apache Spark, etc.
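
The family letters above can be turned into a small lookup. The following is only a sketch: ec2_family_specialty is a hypothetical helper name, and it only knows the families listed in the table above, keyed on the first letter of the instance type.

```shell
# Hypothetical helper: map an instance type such as "t2.micro" to the
# specialty listed for its family letter in the table above.
ec2_family_specialty() {
    case "$1" in
        d*) echo "Dense storage" ;;
        r*) echo "Memory optimized" ;;
        m*) echo "General purpose" ;;
        c*) echo "Compute optimized" ;;
        g*) echo "Graphics intensive" ;;
        i*) echo "High-speed storage" ;;
        f*) echo "Field Programmable Gate Array" ;;
        t*) echo "Lowest cost, general purpose" ;;
        p*) echo "Graphics/general purpose GPU" ;;
        x*) echo "Memory optimized" ;;
        # Anything else is outside the table; report it and fail.
        *)  echo "unknown family: $1" >&2; return 1 ;;
    esac
}
```

For example, running ec2_family_specialty t2.micro prints "Lowest cost, general purpose". Families introduced after this table was written would need their own entries.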


$ curl http://169.254.169.254/latest/meta-data/public-ipv4 # returns the EC2 instance's public IPv4
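
The same endpoint serves other keys besides public-ipv4. A minimal sketch of a reusable wrapper follows; ec2_metadata and METADATA_URL are names made up for this example, and the 2-second timeout is an arbitrary choice.

```shell
# Base URL of the instance metadata service (same address as the command above).
METADATA_URL="http://169.254.169.254/latest/meta-data"

# Hypothetical helper: fetch one metadata key, e.g. "instance-id".
# --silent hides the progress meter; --max-time stops the call from hanging
# when the script is run somewhere other than an EC2 instance.
ec2_metadata() {
    curl --silent --max-time 2 "$METADATA_URL/$1"
}
```

On an instance, ec2_metadata instance-id, ec2_metadata local-ipv4, and ec2_metadata placement/availability-zone should return the instance ID, private IPv4 address, and Availability Zone respectively.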

NOTE: AMIs are regional. If one makes an Amazon Machine Image (AMI) public, it is available only in the region where it was created; to use it in another region, it must first be copied there.
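
One way to make an AMI available in another region is to copy it there with the AWS CLI. The wrapper below is a sketch: copy_ami_to_region is a hypothetical name, and it assumes the AWS CLI is installed and configured with credentials.

```shell
# Hypothetical wrapper around "aws ec2 copy-image": copies an AMI from one
# region to another. Requires the AWS CLI and valid credentials.
copy_ami_to_region() {
    source_region=$1   # region where the AMI currently lives
    source_ami_id=$2   # ID of the AMI to copy
    target_region=$3   # region that should receive the copy
    ami_name=$4        # name for the new AMI in the target region

    aws ec2 copy-image \
        --region "$target_region" \
        --source-region "$source_region" \
        --source-image-id "$source_ami_id" \
        --name "$ami_name"
}
```

A call might look like copy_ami_to_region us-east-1 ami-12345678 us-west-2 my-ami, where the AMI ID and name are placeholders. The copy gets a new AMI ID and does not inherit the source's launch permissions, so it has to be made public again if that is wanted.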

With EC2 you can have 2 types of storage: EBS and Instance Store. EBS is persistent: if an EC2 instance with an attached EBS volume is stopped, no data is lost. Instance Store is ephemeral: if the EC2 instance is stopped, all data on it is lost.

  • EC2 status checks:
    • System Status Checks: These checks monitor the AWS systems required to use this instance and ensure they are functioning properly.
      These checks verify that your instance is reachable. They test that Amazon is able to get network packets to your instance.
      If these checks fail, there may be an issue with the infrastructure hosting your instance (such as AWS power, networking, or software systems). You may need to restart or replace the instance, wait for Amazon systems to resolve the issue, or seek technical support.
      These checks do not validate that your operating system and applications are accepting traffic.
    • Instance Status Checks: These checks monitor your software and network configuration for this instance.
      These checks verify that your instance's operating system is accepting traffic.
      If these checks fail, you may need to reboot your instance or make modifications to your operating system configuration.
  • EC2 monitoring (via CloudWatch):
    • Default (free): every 5 minutes
    • Detailed (not free): every 1 minute
  • Security Groups
    • These are "virtual firewalls"
    • All inbound traffic is blocked by default
    • All outbound traffic is allowed by default
    • An instance can have one or more security groups associated with it.
    • Changes to security groups take effect immediately.
    • Security groups are stateful (i.e., if you create an inbound rule allowing traffic in, that traffic is automatically allowed back out again). NOTE: ACLs are stateless.
    • You can have any number of EC2 instances within a security group.
    • You can have multiple security groups attached to EC2 instances.
    • You cannot block specific IP addresses using security groups; instead use Network Access Control Lists.
    • You can specify ALLOW rules, but not DENY rules.
  • EC2 metadata: data about your instance that you can use to configure or manage the running instance.
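
The status checks, monitoring levels, and security group rules in the list above map onto AWS CLI calls roughly as sketched below. The function names are hypothetical, and the AWS CLI with configured credentials is assumed.

```shell
# Print the system status check and instance status check results
# (in that order) for one instance.
instance_status_checks() {
    aws ec2 describe-instance-status \
        --instance-ids "$1" \
        --query 'InstanceStatuses[0].[SystemStatus.Status,InstanceStatus.Status]' \
        --output text
}

# Switch an instance from 5-minute to 1-minute (detailed) CloudWatch monitoring.
enable_detailed_monitoring() {
    aws ec2 monitor-instances --instance-ids "$1"
}

# Add an ALLOW rule for inbound TCP traffic on one port from one CIDR range.
# There is no matching "deny" call: security groups only hold ALLOW rules.
allow_inbound_tcp() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port "$2" \
        --cidr "$3"
}
```

For instance, allow_inbound_tcp sg-0abc 22 203.0.113.0/24 would open SSH to a single address range (the group ID and range here are placeholders). Because security groups are stateful, no separate outbound rule is needed for the replies.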

Elastic Block Store (EBS)

SEE: Amazon EBS FAQs

Amazon Elastic Block Store (EBS) allows you to create storage volumes and attach them to EC2 instances. Once attached, you can create a file system on top of these volumes, run a database, or use them in any other way you would use a block device. EBS volumes are placed in a specific Availability Zone (AZ), where they are automatically replicated to protect you from the failure of a single component.

  • EBS volume types:
    • General purpose SSD (gp2):
      General purpose SSD volume that balances price and performance for a wide variety of transactional workloads
      Baseline of 3 IOPS per GiB, up to 10,000 IOPS; volumes under 1 TiB can burst up to 3,000 IOPS for extended periods of time.
    • Provisioned IOPS SSD (io1):
      Highest-performance SSD volume designed for mission-critical applications
      Designed for I/O intensive applications, such as large relational or NoSQL databases.
      Use when you need more than 10,000 IOPS or 160 MiB/s of throughput per volume.
      Can provision up to 20,000 IOPS per volume.
    • Throughput Optimized HDD (st1):
      Low cost HDD volume designed for frequently accessed, throughput-intensive workloads
      Magnetic (spinning disk) HDD
      Useful for: Big Data, Data warehouses, log processing, etc. Sequential data
      Cannot be a boot volume
    • Cold HDD (sc1):
      Lowest cost HDD volume designed for less frequently accessed workloads
      Useful for: File server, etc.
      Cannot be a boot volume
    • Magnetic (standard):
      Infrequently accessed storage
      Lowest cost per gigabyte of all EBS volume types that are bootable.
      Magnetic volumes are ideal for workloads where data is accessed infrequently, and applications where the lowest storage cost is important.
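
The gp2 figures in the list above can be worked through as arithmetic. This is a sketch under stated assumptions: the function names are made up, the 100 IOPS floor is gp2's documented minimum rather than something stated above, and bursting is treated as applying whenever the baseline is below 3,000 IOPS (that is, for volumes under roughly 1 TiB).

```shell
# Baseline IOPS for a gp2 volume of the given size in GiB:
# 3 IOPS per GiB, with a floor and a cap.
gp2_baseline_iops() {
    size_gib=$1
    iops=$((size_gib * 3))
    # Floor: gp2 volumes get at least 100 IOPS (assumption, not from the list above).
    if [ "$iops" -lt 100 ]; then iops=100; fi
    # Cap: the 10,000 IOPS per-volume maximum quoted above.
    if [ "$iops" -gt 10000 ]; then iops=10000; fi
    echo "$iops"
}

# Peak IOPS: volumes whose baseline is below 3,000 can burst up to 3,000;
# larger volumes simply run at their baseline.
gp2_peak_iops() {
    baseline=$(gp2_baseline_iops "$1")
    if [ "$baseline" -lt 3000 ]; then
        echo 3000
    else
        echo "$baseline"
    fi
}
```

For a 100 GiB volume, gp2_baseline_iops 100 prints 300 and gp2_peak_iops 100 prints 3000; for a 2,000 GiB volume both print 6000.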

NOTE: One cannot attach a single EBS volume to multiple EC2 instances; use EFS instead.

In order to enable encryption at rest using EC2 and Elastic Block Store, one needs to configure encryption when creating the EBS volume.
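
With the AWS CLI, encryption is requested through a flag at volume creation time. A minimal sketch, assuming the AWS CLI is configured; create_encrypted_volume is a hypothetical name and gp2 is just an example volume type:

```shell
# Hypothetical wrapper: create an encrypted gp2 volume.
#   $1 = Availability Zone (the volume can only attach to instances in this AZ)
#   $2 = size in GiB
# With --encrypted and no --kms-key-id, the account's default EBS key is used.
create_encrypted_volume() {
    aws ec2 create-volume \
        --availability-zone "$1" \
        --size "$2" \
        --volume-type gp2 \
        --encrypted
}
```

For example, create_encrypted_volume us-east-1a 100 would request a 100 GiB encrypted volume in us-east-1a.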

  • RAID (Redundant Array of Independent Disks)
    • RAID 0 - Striped, no redundancy, good performance
    • RAID 1 - Mirrored, redundancy
    • RAID 5 - Good for reads, bad for writes. AWS recommends against using RAID 5 on EBS.
    • RAID 10 (RAID 1 + RAID 0) - Striped & mirrored. Good redundancy, good performance.
    • On AWS, use either RAID 0 or RAID 10. Useful for increasing disk I/O.
    • How to take a snapshot of a RAID array?
      • Problem: A snapshot excludes data held in cache by applications and the OS. This tends not to matter on a single volume; with multiple volumes in a RAID array, however, it can be a problem because of the interdependencies of the array.
      • Solution: Take an application-consistent snapshot (i.e., stop the application from writing to disk and flush all caches to disk), using one of the following:
      • Solution #1: Freeze the file system.
      • Solution #2: Unmount the RAID array.
      • Solution #3: Shut down the associated EC2 instance, take a snapshot, then start the instance again.
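
The first option above (freezing the file system) could look roughly like the sketch below. It assumes a Linux instance with fsfreeze from util-linux, root privileges, and a configured AWS CLI; snapshot_raid is a hypothetical name, and the volume IDs of the array members have to be supplied by the caller.

```shell
# Hypothetical helper: freeze the file system on a RAID array, start a
# snapshot of every member volume, then unfreeze.
#   $1      = mount point of the array (e.g. /mnt/raid)
#   $2...   = EBS volume IDs that make up the array
snapshot_raid() {
    mount_point=$1
    shift

    # Flush pending writes and block new ones; give up if this fails.
    fsfreeze --freeze "$mount_point" || return 1

    rc=0
    for volume_id in "$@"; do
        # The snapshot captures the volume as of the moment the call is made.
        aws ec2 create-snapshot \
            --volume-id "$volume_id" \
            --description "RAID member snapshot of $mount_point" || rc=1
    done

    # Always unfreeze, even if one of the snapshot calls failed.
    fsfreeze --unfreeze "$mount_point" || rc=1
    return "$rc"
}
```

A call might look like snapshot_raid /mnt/raid vol-0aaa vol-0bbb (placeholder IDs). The file system stays frozen only while the snapshot requests are issued, not while the snapshots complete in the background.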