Google Cloud Platform

Latest revision as of 00:23, 27 July 2021

Google Cloud Platform (GCP) is a cloud computing service by Google that offers hosting on the same supporting infrastructure that Google uses internally for end-user products like Gmail, Google Search, Maps, and YouTube.

Overview

Google Cloud Platform (GCP) high-level view:
- Compute
  - App Engine (PaaS)
  - Kubernetes Engine (Hybrid; see Kubernetes)
  - Compute Engine (IaaS)
  - Cloud Functions
- Storage
  - BigTable
  - Cloud Storage
  - Cloud SQL
  - Cloud Spanner
  - Cloud Datastore
- Networking
  - VPCs
  - Load Balancers
  - Cloud DNS
  - Cloud CDN
- Stackdriver
  - Monitoring
  - Logging
- Big Data
  - BigQuery
  - Pub/Sub
  - Dataflow
  - Dataproc
  - Datalab
- Artificial Intelligence
  - Natural Language API
  - Vision API
  - Speech API
  - Translate API
  - Machine Learning

Examples

Google Compute Engine – IaaS service providing virtual machines, similar to Amazon EC2.
Google App Engine – PaaS service for directly hosting applications, similar to AWS Elastic Beanstalk.
BigTable – managed NoSQL wide-column database service, accessed through the HBase API. Similar to Apache HBase.
BigQuery – fully managed columnar data warehouse. Similar to Amazon Redshift.
Google Cloud Functions – FaaS service that runs functions in response to events without the developer managing any resources, similar to AWS Lambda or IBM OpenWhisk.

Cloud Regions and Zones

Region
- A region is a specific geographical location where you can run your resources
- It is a collection of zones
- Regional resources are available to resources in any zone in the region
- New regions are added frequently
Zone
- Zones are isolated physical locations within a region
- Zonal resources are only available in that zone
- Machines in different zones do not share a single point of failure

An effective disaster recovery plan would have assets deployed across multiple zones, or even different regions.

Standards, regulations, and certifications

SSAE16
ISO 27001
ISO 27017
ISO 27018
PCI
HIPAA
Complete list

GCP Certifications

SEE: GCP Training

Main

Cloud Resource Hierarchy

Provides a hierarchy of ownership
- Identity and Access Management (IAM)
Provides "attach" points and inheritance for access control and organization policies
Hierarchy overview
- Organization (not applicable to individual accounts)
- Projects
- Resources

Projects

Core organizational component of GCP
Controls access to resources (who has access to what)
Projects are where you create, enable, and use all GCP services
- Per project basis
- Permissions
- Billing
- APIs
- Etc.
Projects have three identifying attributes:
1. Project Name (user-friendly name)
2. Project ID (aka Application ID; must be unique across GCP)
3. Project Number (used in various places for identifying resources that belong to specific projects. For example, service account access names)

Identity and Access Management (IAM)

Who can do what on which resource
- Members (who) are granted permissions and roles (what) to GCP services (resource) using the principle of least privilege
IAM -> Policy -> Roles + Identities
See predefined roles

Members (the "who")
- Can be either a person or a service account
- People via:
  - Google account
  - Google group (e.g., dev.team@thecompany.com)
  - G Suite Domain
  - Cloud Identity (organization domain that is not a Google domain/account)
- Service account
  - Special type of Google account that belongs to your application, not an end-user
  - Does not use usernames/passwords; uses encryption keys
  - Identity for carrying out server-to-server interactions in a project (e.g., an application on a local server writing data to Cloud Storage)
  - Identified by an email address:
    - <project_number>@developer.gserviceaccount.com
    - <project_id>@developer.gserviceaccount.com
  - Three types of service accounts:
    - User-created (custom)
    - Built-in (Compute Engine and App Engine default service accounts)
    - Google APIs service accounts (runs internal Google processes on your behalf)
  - Application access
  - Used to authenticate from one service to another:
    - Programs running within Compute Engine instances can automatically acquire access tokens with credentials
    - Tokens are used to access any service API in your project and any other services that have granted access to that service account
    - Convenient when not accessing user data

Roles (the "what")
- A collection of permissions to give access to a given resource
- Permissions are represented as: <service>.<resource>.<verb> (e.g., compute.instances.delete)
- Permissions vs. roles
  - Users are not directly assigned permissions, but are assigned roles, which contain a collection of permissions:

      Role                 List of permissions

                           compute.instances.delete
                           compute.instances.get
compute.instanceAdmin ---> compute.instances.list
                           compute.instances.setMachineType
                           compute.instances.start
                           compute.instances.stop
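
As a minimal sketch of the relationship shown in the diagram above, a role can be modelled as a named set of permission strings. The ROLES table and the parse_permission and has_permission helpers are hypothetical illustrations, not part of any Google client library; the permission names are the ones listed above.

```python
from typing import Iterable, NamedTuple


class Permission(NamedTuple):
    service: str
    resource: str
    verb: str


def parse_permission(text: str) -> Permission:
    """Split a '<service>.<resource>.<verb>' string into its three parts."""
    parts = text.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a <service>.<resource>.<verb> string: {text!r}")
    return Permission(*parts)


# Hypothetical in-memory role table; contents copied from the diagram above.
ROLES = {
    "compute.instanceAdmin": {
        "compute.instances.delete",
        "compute.instances.get",
        "compute.instances.list",
        "compute.instances.setMachineType",
        "compute.instances.start",
        "compute.instances.stop",
    },
}


def has_permission(granted_roles: Iterable[str], permission: str) -> bool:
    """True if any role granted to the member contains the permission."""
    return any(permission in ROLES.get(role, set()) for role in granted_roles)
```

A member is never checked against a permission directly; the check always goes through the roles the member holds, which mirrors how access is granted in practice.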

Cloud IAM objects

Organization (created by Google Sales)
- Organization Owners are established at creation (note: always have more than one organization owner, for security purposes).
- Organization Owner assigns the Organization Administrator role from the G Suite Admin Console (Admin is a separate product).
- Organization Administrators manage GCP from the Cloud Console.
Folders
- Additional grouping mechanism and isolation boundaries between projects (e.g., different departments or teams).
- Folders allow delegation of administration rights.
Projects
Members
Roles
Resources
Products
G Suite Super Admins (are the only Organization Owners)
- Administers a Google-hosted domain
- Creates users, groups
- Controls user membership in groups

Resource manager roles

Organization
- Admin: full control over all resources
- Viewer: view access to all resources
Folder
- Admin: full control over folders
- Creator: browse hierarchy and create folders
- Viewer: view folders and projects below a resource
Project
- Creator: create new projects (automatic owner) and migrate new projects into organization
- Deleter: delete projects

Google Cloud Directory Sync (GCDS)

Synchronizes G Suite accounts to match the user data in existing LDAP or MS Active Directory
- Syncs groups and memberships, not content or settings
- Supports sophisticated rules for custom mapping of users, groups, non-employee contacts, user profiles, aliases, and exceptions
One-way synchronization from LDAP to directory
- Administer in LDAP, then periodically update to G Suite
Runs as a utility in your server environment

Cloud IAM best practices

Principle of least privilege
- Always apply the minimal access level required.
Use groups
- If group membership is secure, assign roles to groups and let the G Suite Admins handle membership.
- Always maintain an alternate.
- For high-risk areas, assign roles to individuals directly and forego the convenience of group assignment.
Control who can change policies and group memberships
Audit policy changes
- Audit logs record project-level permission changes.
- Additional levels are being added all the time.

Primitive vs. Predefined (aka curated) vs. Custom Roles

Primitive Roles
- Historically available GCP roles before Cloud IAM was implemented
- Applied at Project-level
- Broad roles:
1. Viewer: read only actions that preserve state (i.e., cannot make changes)
2. Editor: same as above + can modify state (e.g., deploy applications, modify code, configure services)
3. Owner: same as above + can manage access to project and all project resources (e.g., invite/remove members and delete projects) + can set up project billing
4. Billing administrator: manage billing + add/remove administrators
A project can have multiple owners, editors, viewers, and billing administrators
When to choose Primitive Roles
- When the GCP service does not provide a predefined role
- When you only need broad permissions for a project
- When you want to allow a member to modify permissions for a project
- When you work in a small team where the team members do not need granular permissions

Predefined ("Curated") Roles
- Provides much more granular access (e.g., prevent unwanted access to other resources)
- Granted at the resource-level
- Example: App Engine Admin (full access to only App Engine resources)
- Multiple predefined roles can be given to individual users

Custom Roles
- Can only be used at the project or organization levels (they cannot be used at the folder level)
- If you want to give custom permissions to a Compute Engine VM, use a service account

IAM Policy

A collection of statements that define who has what type of access
A full list of roles granted to a member for a resource
IAM Policy hierarchy
- Resource access is organized hierarchically, from the Organization down to the Resource(s)
- Organization -> Project -> Resource(s) — parent/child format
- Each child has exactly one parent
- Children inherit parent roles
- Parent policies overrule restrictive child policies
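
The inheritance rules above can be sketched as a walk up the hierarchy that collects every role granted along the way. This is a simplified model under stated assumptions: the Node class, the example names, and the effective_roles helper are hypothetical, and real policy evaluation involves more than a plain union of roles.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """One level of the hierarchy (organization, project, or resource)."""
    name: str
    parent: Optional["Node"] = None
    # member -> roles granted directly at this level
    bindings: Dict[str, Set[str]] = field(default_factory=dict)


def effective_roles(node: Node, member: str) -> Set[str]:
    """Union of the roles granted at this node and at every ancestor."""
    roles: Set[str] = set()
    current: Optional[Node] = node
    while current is not None:
        roles |= current.bindings.get(member, set())
        current = current.parent
    return roles


# Hypothetical hierarchy: the organization grants Editor, the project only Viewer.
org = Node("example-org", bindings={"alice@example.com": {"roles/editor"}})
project = Node("example-project", parent=org,
               bindings={"alice@example.com": {"roles/viewer"}})
bucket = Node("example-bucket", parent=project)
```

Because the result is a union, the more restrictive Viewer grant on the project cannot take away the Editor role inherited from the organization, which is what "parent policies overrule restrictive child policies" means here.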

Cloud Identity-Aware Proxy (Cloud IAP)

Enforce access control policies for applications and resources:
- Identity-based access control
- Central authorization layer for applications accessed by HTTPS
Cloud IAM policy is applied after authentication

Interacting with GCP

There are four methods of interacting with GCP:
1. Cloud console (web user interface)
2. Cloud Shell and Cloud SDK (command-line interface)
3. Cloud Console Mobile App (for Android or iOS)
4. RESTful API (for custom applications)

Google Cloud SDK

Command line interface (CLI) tools for managing resources and applications on GCP
Includes:
- gcloud — many common GCP tasks
- gsutil — interact with Cloud Storage
- bq — interact with data in BigQuery
Can also be installed locally as a Docker image or run from within Cloud Shell (via the UI)

Cloud Shell

Interactive web-based shell environment for GCP, accessed from a web console
Easy to manage resources without having to install the Google Cloud SDK locally.
Includes:
- A temporary Compute Engine virtual machine/instance
- CLI access to the instance from a web browser
- 5 GB of persistent disk storage
- Pre-installed Google Cloud SDK and other tools
- Language support for: Python, Go, Node.js, PHP, Ruby, and Java
- Web preview functionality (especially useful for App Engine)
- Built-in authorization for access to GCP projects and resources

Limitations
- 1 hour time out for inactivity
  - Machine will terminate/self-delete
  - $HOME directory contents will be preserved for a new session
- Direct interactive use only
  - Not for running high computational/network workloads
  - If in violation of GCP terms of use, session can be terminated without notice
- For long periods of inactivity, home disk may be recycled (with advance notice via email)
  - If you need longer inactive period, consider either locally installed SDK or use Cloud Storage for long-term storage

RESTful APIs

"Intended for software developers" ~ Google
Programmatic access to GCP resources
- Typically uses JSON as an interchange format
- Uses OAuth 2.0 for authentication and authorization
Enabled via the GCP Console
Most APIs have daily quotas, which can be increased upon request (to Google Support)
You can experiment via APIs Explorer
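
To illustrate the JSON-over-HTTPS and OAuth 2.0 points above, here is a minimal sketch that prepares (but does not send) a request to list Compute Engine instances. The helper name, project ID, zone, and token value are placeholders chosen for illustration; the access token is assumed to have been obtained separately through an OAuth 2.0 flow.

```python
import urllib.request


def build_list_instances_request(project: str, zone: str,
                                 access_token: str) -> urllib.request.Request:
    """Prepare a GET request for the instances in one zone of a project."""
    url = (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project}/zones/{zone}/instances"
    )
    request = urllib.request.Request(url, method="GET")
    # The OAuth 2.0 access token is passed as a bearer token.
    request.add_header("Authorization", f"Bearer {access_token}")
    # Responses are JSON documents.
    request.add_header("Accept", "application/json")
    return request


# Placeholder values; nothing is sent over the network here.
example = build_list_instances_request("my-project-123456", "us-west1-b", "aa00...")
```

Passing the prepared request to urllib.request.urlopen and decoding the body with the json module would perform the actual call, provided the API has been enabled for the project.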

APIs Explorer

An interactive tool that lets you easily try Google APIs using a browser
With the APIs Explorer, you can:
- Browse quickly through available APIs and versions
- See methods available for each API and what parameters they support, along with inline documentation
- Execute requests for any method and see responses in real time
- Easily make authenticated and authorized API calls

Google Compute

There are multiple Compute options in GCP for hosting your applications, where "option" is the method of hosting:

Google Compute Engine (GCE)
Google Container Engine (deprecated name; renamed Google Kubernetes Engine)
Google Kubernetes Engine (GKE)
Google App Engine (GAE)
Google Cloud Functions

In the above list, each option is ordered from "highly customizable" (GCE) to "highly managed" (Google Cloud Functions).

See: Choosing the right compute option in GCP: a decision tree

Each option can take advantage of the rest of the GCP services. E.g.,

Storage
Networking
Big Data
Security

Google Compute Engine (GCE)


Infrastructure as a Service (IaaS)
Virtual Machines (VMs), aka "instances"
Per-second billing; sustained use discounts
High throughput to storage at no extra cost
Custom machine types: Only pay for the hardware you need/use

Storage for VMs
- Persistent disks (either standard or SSD)
- Any data saved to scratch space (local SSD) will not be saved when the VM is terminated

Preemptible VM

Connecting to a VM

$ gcloud compute --project "my-project-123456" ssh --zone "us-west1-b" "my-vm"

The above command will create SSH keys (stored in ~/.ssh by default). After that, you can use the private key to SSH into your VM:

$ ssh -i ~/.ssh/google_compute_engine username@x.x.x.x

Compute Engine Metadata

Project metadata - visible to all VMs in a project
VM metadata - private to the VM:
- Instance hostname
- External IP address
- SSH keys
- Project ID
- Service account information and token
Metadata is stored as key-value pairs
- Directories containing key-value pairs
- Can return: specific value for a key, all keys in a directory, or a recursive list of keys

Query metadata

Console, Cloud Shell, or API
From VM, using wget or curl
- The project metadata URL is: http://metadata.google.internal/computeMetadata/v1/project/
- The instance metadata URL is: http://metadata.google.internal/computeMetadata/v1/instance/
Using gcloud:
- Project: gcloud compute project-info describe
- Instance: gcloud compute instances describe <INSTANCE>

Each URL returns a set of entries that can be appended to the URL. Project settings contain info (e.g., project ID). Instance settings contain info on disks, hostname, machine type, etc.

Detect when metadata has changed

Detect using the wait_for_change parameter:

$ curl "http://metadata/computeMetadata/v1/instance/tags?wait_for_change=true" -H "Metadata-Flavor: Google"

Note: add the timeout_sec parameter to make the request return after n seconds even if the value has not changed.

Entity tags - HTTP ETag (used for web cache validation)
- If metadata ETag differs from local version, then the latest value is returned immediately
- Instances can use ETags to validate if they have the latest value
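
As a rough sketch of how the query parameters above fit together, the helper below prepares a metadata request without sending it (the metadata server is only reachable from inside a VM). The function name and the METADATA_ROOT constant are illustrative assumptions; wait_for_change, timeout_sec, and last_etag are the query parameters, and Metadata-Flavor is the required header.

```python
import urllib.parse
import urllib.request
from typing import Optional

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1/"


def build_metadata_request(path: str,
                           wait_for_change: bool = False,
                           timeout_sec: Optional[int] = None,
                           last_etag: Optional[str] = None) -> urllib.request.Request:
    """Prepare a metadata server request for the given key or directory."""
    params = {}
    if wait_for_change:
        # Block until the value changes instead of returning immediately.
        params["wait_for_change"] = "true"
    if timeout_sec is not None:
        # Return after this many seconds even if nothing changed.
        params["timeout_sec"] = str(timeout_sec)
    if last_etag is not None:
        # Return immediately if the server's ETag differs from this one.
        params["last_etag"] = last_etag
    url = METADATA_ROOT + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url)
    # Requests without this header are rejected by the metadata server.
    request.add_header("Metadata-Flavor", "Google")
    return request
```

From inside a VM, passing the result to urllib.request.urlopen in a loop gives a simple change watcher: each call blocks until the value changes or the timeout elapses.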

Handle maintenance event

Retrieve scheduling options (availability policies)
- /scheduling directory
- maintenance-event attribute
- Notifies when a maintenance event is about to occur
- Value changes 60 seconds before a transparent maintenance event starts
Query periodically to trigger application code prior to a transparent maintenance event
- Example: backup, logging, save state

Custom metadata

One can also set one's own values so that one can use them in code on the VM.

$ curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
{"access_token":"aa00...","expires_in":3599,"token_type":"Bearer"}

$ curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/project/"
attributes/
numeric-project-id
project-id

Preemptible VMs

Preemptible VMs are highly affordable, short-lived compute instances suitable for batch jobs and fault-tolerant workloads. Preemptible VMs offer the same machine types and options as regular compute instances and last for up to 24 hours. If your applications are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Google Compute Engine costs significantly.

Managed Instance Groups

Deploys identical instances, based on an instance template
Instance group can be resized
Manager ensures all instances are in "RUNNING" state
Typically used with autoscaler
Can be single zone or regional
In the UI, the instance template dialog looks and works exactly like creating an instance, except that it records the choices so it can repeat them.

Autoscaling

Available as part of the Compute Engine API
Used to automatically scale number of instances in a managed instance group based on workload
- Helps reduce costs by terminating instances when not required
Create one autoscaler per managed instance group
Autoscaler can be used with zone-based managed instance groups or regional managed instance groups
Autoscaler is fast (typically ~1 minute moving window)

Autoscaling policies

Policies determine behaviour

Policy options:

Average CPU utilization
- If average usage of total vCPU cores in instance group exceeds target, autoscaler adds more instances
HTTP load balancing serving capacity (defined in the backend service)
- Maximum CPU utilization
- Maximum requests per second/instance
Stackdriver standard and custom metrics

Multiple policies:

Autoscaler allows multiple policies (up to 5), but only for a single managed instance group.
Autoscaler handles multiple policies by calculating the recommended number of VMs for each policy and picking the policy that leaves the largest number of VMs in the group
- Ensures enough VMs to handle application workloads and allows you to scale apps that have multiple possible bottlenecks.
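
The "largest recommendation wins" rule can be sketched as follows. The proportional sizing formula in recommended_size is an assumption made for illustration and is not the autoscaler's documented algorithm; the function names and metric values are likewise hypothetical.

```python
import math
from typing import Dict, Tuple


def recommended_size(current_size: int, observed: float, target: float) -> int:
    """Instances needed to bring a per-instance metric back to its target.

    Assumes load spreads evenly, so the size scales with observed / target.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return max(1, math.ceil(current_size * observed / target))


def autoscaler_decision(current_size: int,
                        policies: Dict[str, Tuple[float, float]],
                        max_replicas: int) -> Tuple[str, int]:
    """Pick the policy recommending the most VMs, capped at max_replicas.

    policies maps a policy name to (observed value, target value).
    """
    if not policies:
        raise ValueError("at least one policy is required")
    recommendations = {
        name: recommended_size(current_size, observed, target)
        for name, (observed, target) in policies.items()
    }
    winner = max(recommendations, key=recommendations.get)
    return winner, min(recommendations[winner], max_replicas)
```

With ten instances, a CPU policy at 0.5 observed against a 0.75 target would allow shrinking, but a requests-per-second policy at 150 against a target of 100 asks for more instances, so the group grows; the policy with the tightest bottleneck decides.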

Example: Enable autoscaling for a managed instance group using CPU utilization:

$ gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
    --max-num-replicas 20 \
    --target-cpu-utilization 0.75 \
    --cool-down-period 90

Note: the "cool down period" is the number of seconds the autoscaler should wait after a VM has started before the autoscaler starts collecting information from it.

Configuring an autoscaler

Create instance template (including startup, shutdown scripts) to automate tasks that must take place while instance is unattended:
- Software installation
- Software startup and shutdown
- Log archiving
Create managed instance group
Create autoscaler
Optionally, define multiple policies for autoscalers

Google App Engine (GAE)

Platform as a Service (PaaS)
Developers can focus on writing code, while GCP handles the rest
Build scalable web applications and mobile backends
Your code is run as a binary called a "runtime"
It is a managed service
- You never touch the underlying infrastructure
- Deployment, maintenance, and scalability are handled for you
- Reduces operational overhead
There are two environments available in GAE:
1. Standard:
  - Simpler to use
  - Finer-grained autoscaling
  - Free daily quota
  - Supports specific versions of Python, PHP, Go, and Java
2. Flexible:
  - (natively) supports Java 8, Servlet 3.1, Jetty 8, Python 2.7/3.5, Node.js, Ruby, PHP, .NET core, and Go (plus other runtimes, if using a custom Docker image)

App Engine Standard Environment

Your Standard App Engine Environment applications run in a "sandbox" and have the following constraints:
- No writing to local files (must write to a DB instead for persistent data)
- All requests your application receives have a 60-second timeout
- Limits on third-party software

Example App Engine Standard environment workflow (e.g., for your web app)
1. Develop and test the web app locally
2. Use the SDK to deploy to App Engine (Project -> App Engine -> App Servers -> Application Instances)
3. App Engine automatically scales and reliably serves your web application
  - App Engine can access a variety of services using dedicated APIs (e.g., NoSQL, Memcache, task queues, scheduled tasks, search, logs, etc.)

App Engine Flexible Environment

Build and deploy containerized apps with a click
No sandbox constraints
Can access App Engine resources
Your apps run inside Docker containers on (managed) Google Compute Engine VMs.
- Their health is monitored and automatically healed
- You choose the geographical region
- Critical backward-compatible updates to the VM's OS are automatically applied

Comparison of App Engine environments
	Standard Environment	Flexible Environment
Instance startup:	Milliseconds	Minutes
SSH access	No	Yes (not by default)
Write to local disk	No	Yes (scratch space; writes are ephemeral)
Support for 3rd-party binaries	No	Yes
Network access	Via App Engine Services	Yes
Pricing model	After free daily quota, pay per instance class, with automatic shutdown	Pay for resource allocation per hour; no automatic shutdown

Google Cloud Functions

Serverless environment for building and connecting cloud services
Event-driven
- Function executes when "triggered" by a cloud-based event (a Cloud Storage change, a Cloud Pub/Sub message, or an HTTP call)
- Simple, single-purpose functions
Write function code in either Node.js or Python 3
Example: A file is uploaded to Cloud Storage (event), function executes in response to event (trigger)
Easier and less expensive than provisioning a server to watch for events
Not available in all Regions. As of January 2019, available in:
- us-central1 (Iowa, USA)
- us-east1 (South Carolina, USA)
- europe-west1 (Belgium)
- asia-northeast1 (Tokyo, Japan)
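
A minimal sketch of the Cloud Storage example above, written for the Python 3 runtime. It assumes the background-function signature (an event dictionary plus a context object); the function name, the fake_context helper, and the bucket and object names are hypothetical and exist only so the function can be exercised locally.

```python
from types import SimpleNamespace


def on_file_uploaded(event, context):
    """Background function: runs when an object is uploaded to a bucket."""
    bucket = event["bucket"]
    name = event["name"]
    message = f"Processing gs://{bucket}/{name} (event {context.event_id})"
    print(message)
    return message


def fake_context(event_id="1234567890"):
    """Stand-in for the context object the platform would supply."""
    return SimpleNamespace(
        event_id=event_id,
        event_type="google.storage.object.finalize",
    )


# Local invocation with a hand-built event payload.
on_file_uploaded({"bucket": "my-bucket", "name": "foobar.txt"}, fake_context())
```

The function stays small and single-purpose: it reacts to one kind of event and holds no state between invocations.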

Comparing Compute Options

Comparison of Google Compute options
	Compute Engine	Kubernetes Engine	App Engine Flexible	App Engine Standard	Cloud Functions
Service model	IaaS	Hybrid	PaaS	PaaS	Serverless
Use cases	General computing workloads	Container-based workloads	Web and mobile apps; container-based workloads	Web and mobile apps	Ephemeral functions responding to events
	Toward managed infrastructure <--------------------> Toward dynamic infrastructure

Storage

Every application needs to store data, whether it is business data, media to be streamed, or sensor data from devices. Consider the "three Vs" (3Vs):

Variety: how similarly structured or variable the data is;
Velocity: how fast the data arrives; and
Volatility: how long the data retains value and, therefore, needs to be accessible.

Cloud Storage

Cloud Storage is binary, large-object storage

High performance, Internet-scale
Data encryption at rest
Data encryption in transit by default from Google to endpoint

Bucket attributes
- Globally unique name
- Storage class
- Location (region or multi-region)
- IAM policies or Access Control Lists (ACLs)
- Objects are immutable (turn on versioning for updating a "file" and keeping a history of changes)
- Object lifecycle management rules (e.g., delete objects older than x-number of days; keep only the 3 most recent versions of an object {if versioning has been enabled on the bucket})
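
The two lifecycle rules mentioned in the list above could be expressed as a JSON lifecycle configuration along these lines. This is a sketch: the lifecycle_config helper and the chosen values are illustrative, and combining both rules in one configuration is an assumption about how they would be used together.

```python
import json


def lifecycle_config(max_age_days: int, versions_to_keep: int) -> dict:
    """Build a lifecycle configuration with an age rule and a version rule."""
    return {
        "rule": [
            {
                # Delete objects older than max_age_days.
                "action": {"type": "Delete"},
                "condition": {"age": max_age_days},
            },
            {
                # Delete noncurrent versions once versions_to_keep newer
                # versions exist (only meaningful with versioning enabled).
                "action": {"type": "Delete"},
                "condition": {"isLive": False,
                              "numNewerVersions": versions_to_keep},
            },
        ]
    }


print(json.dumps(lifecycle_config(30, 3), indent=2))
```

Saved to a file, the printed JSON can be applied to a bucket with gsutil lifecycle set.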

Cloud Storage Classes
	Multi-regional	Regional	Nearline	Coldline
Intended for data that is:	Most frequently accessed	Accessed frequently within a region	Accessed less than once a month	Accessed less than once a year
Feature	Geo-redundant	Regional	Backup	Archived or DR
Availability SLA	99.95%	99.90%	99.00%	99.00%
Durability	99.999999999% (11 nines)	99.999999999% (11 nines)	99.999999999% (11 nines)	99.999999999% (11 nines)
Duration	Hot data	Hot data	30-day minimum	90-day minimum
Access APIs	Consistent APIs
Access time	Millisecond access
Use cases	Content storage and delivery	In-region analytics; transcoding	Long-tail content; backups	Archiving; disaster recovery
Storage price	$$$$	$$$	$$	$
Retrieval price	none	none	$	$$
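
The "intended for" row of the table above can be read as a simple decision rule. The sketch below encodes it under the assumption that access frequency and geo-redundancy are the only inputs; the function name is hypothetical, and a real choice would also weigh minimum storage durations and retrieval prices.

```python
def choose_storage_class(accesses_per_year: float,
                         geo_redundant: bool = False) -> str:
    """Pick a storage class from expected access frequency."""
    if accesses_per_year < 1:
        # Accessed less than once a year: archiving, disaster recovery.
        return "COLDLINE"
    if accesses_per_year < 12:
        # Accessed less than once a month: backups, long-tail content.
        return "NEARLINE"
    # Hot data: choose by whether geo-redundancy is needed.
    return "MULTI_REGIONAL" if geo_redundant else "REGIONAL"
```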

There are several ways to bring data into Cloud Storage:

Online transfer: self-managed copies using CLI or drag-and-drop
Storage Transfer Service: scheduled, managed batch transfers
Transfer Appliance: rackable appliances to securely ship your data

Cloud Storage works with other GCP services:

Compute Engine: VM startup scripts, images, and general object storage
App Engine: object storage, logs, and Datastore backups
Cloud SQL: import/export tables
BigQuery: import/export tables

Signed URLs

"Valet key" access to buckets and objects via ticket:
- Ticket is a cryptographically signed URL
- Time-limited
- Operations specified in ticket: HTTP GET, PUT, DELETE (not POST)
- Program can decide when or whether to give out signed URL
- Any user with URL can invoke permitted operations
Example using private account key and gsutil

$ gsutil signurl -d 10m /path/to/privatekey.p12 gs://bucket/object

Example of a signed URL:

http://mybucket.storage.googleapis.com/foobar.txt?\
GoogleAccessId=1234567890123@developer.gserviceaccount.com&\
Expires=1552087696&\
Signature=ATlz98...asASDF345ASDF%3D

Query parameters:
- GoogleAccessId: email address of the service account
- Expires: when the signature expires (in Unix epoch format)
- Signature: a cryptographic signature of the composite string described below
The string that is digitally signed must contain:
- HTTP verb (GET)
- Expiration
- Canonical resource (/bucket/object)
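
As a sketch of how these pieces combine, the helpers below build the string to be signed and assemble the final URL from an already-computed signature. The newline-separated layout with empty Content-MD5 and Content-Type fields is assumed from the legacy (V2) signing scheme. The signing step itself (signing with the service account's private key and base64-encoding the result) needs a cryptography library and is deliberately left out. The function names and the path-style URL form are illustrative assumptions.

```python
from urllib.parse import urlencode


def string_to_sign(verb: str, expires: int, bucket: str, obj: str,
                   content_md5: str = "", content_type: str = "") -> str:
    """Compose the newline-separated string that gets signed."""
    canonical_resource = f"/{bucket}/{obj}"
    return "\n".join(
        [verb, content_md5, content_type, str(expires), canonical_resource]
    )


def signed_url(bucket: str, obj: str, access_id: str, expires: int,
               signature_b64: str) -> str:
    """Attach the three query parameters to the object URL."""
    query = urlencode({
        "GoogleAccessId": access_id,
        "Expires": expires,
        # Base64 signature computed elsewhere; urlencode escapes it.
        "Signature": signature_b64,
    })
    return f"https://storage.googleapis.com/{bucket}/{obj}?{query}"
```

Because the verb, expiration, and resource are all part of the signed string, changing any of them in the URL invalidates the signature.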

Cloud BigTable

Cloud BigTable is a fully managed NoSQL, wide-column database service for terabyte-scale applications.

Accessed using the HBase API
Native compatibility with big data, Hadoop ecosystems
Managed, scalable storage
Data encryption in-flight and at rest
Control access with IAM
BigTable drives major applications, such as Google Search, Google Analytics, and Gmail
Learns and adjusts to access patterns
BigTable scales UP well; Datastore scales DOWN well.
If you need any of the following, consider using BigTable:
- Storing > 1 TB of structured data;
- Very high volume of writes;
- Read/write latency < 10 milliseconds and strong consistency; and/or
- HBase API compatible.

BigTable access patterns

Application API
Data can be read from and written to Cloud BigTable through a data service layer, like Managed VMs, the HBase REST Server, or a Java Server using the HBase client. Typically, this will be to serve data to applications, dashboards, and data services.
Streaming
Data can be streamed in (written event-by-event) through a variety of popular stream processing frameworks, like Cloud Dataflow Streaming, Spark Streaming, and Storm.
Batch Processing
Data can be read from and written to Cloud BigTable through batch processes, like Hadoop MapReduce, Dataflow, or Spark. Often, summarized or newly calculated data is written back to Cloud BigTable or to a downstream database.

Cloud SQL

Cloud SQL is a managed RDBMS.

Offers MySQL and PostgreSQL (beta) databases as a service (DBaaS)
Automatic replication
Managed backups (automatic or scheduled)
Vertical scaling (read and write)
Horizontal scaling (read)
Google security (network firewalls and encryption)

Use cases

App Engine
Cloud SQL can be used with App Engine, using standard drivers.

You can configure a Cloud SQL instance to follow an App Engine application.
Compute Engine
Compute Engine instances can be authorized to access Cloud SQL instances using an external IP address.

Cloud SQL instance can be configured with a preferred zone.
External services
Cloud SQL can be used with external applications and clients.

Standard tools can be used to administer databases.

External read replicas can be configured.

Cloud Spanner

A horizontally scalable RDBMS (can scale to larger database sizes than Cloud SQL)
Transactional consistency at global scale
Managed instances with high availability
SQL queries (ANSI 2011 with extensions)
Automatic replication
Sharding
Use cases include financial applications and inventory applications

Cloud Datastore

Cloud Datastore is a horizontally scalable NoSQL DB.

Designed for application backends (databases can span Compute Engine and App Engine)
Scales automatically
Handles sharding and replication
Supports transactions that affect multiple database rows (unlike Cloud BigTable)
Allows for SQL-like queries
Includes a free daily quota (for storage, reads, writes, deletes, and small operations)

Comparing storage options

	Cloud Datastore	BigTable	Cloud Storage	Cloud SQL	Cloud Spanner	BigQuery¹
Type	NoSQL document	NoSQL wide column	Blobstore	Relational SQL for OLTP²	Relational SQL for OLTP²	Relational SQL for OLAP³
Access metaphor	Persistent Hashmap	Key-values, HBase API	Like files in a file system	Relational database	Globally scalable RDBMS	Relational
Read	filter objects on property	scan rows	Must copy to local disk	SELECT rows	transactional reads/writes	SELECT rows
Write	put object	put row	One file	INSERT row	transactional reads/writes	Batch/stream
Update granularity	Attribute	Row	An object (a "file")	Field	SQL, Schemas ACID transactions Strong consistency High availability	Field
Transactions	Yes	Single-row	No	Yes	Yes	No
Complex queries	No	No	No	Yes	Yes	Yes
Capacity	Terabytes+	Petabytes+	Petabytes+	Terabytes	Petabytes	Petabytes+
Unit size	1 MB/entry	~10 MB/cell ~100 MB/row	5 TB/object	Determined by DB engine	10,240 MiB/row	10 MB/row
Best for	Getting started, App Engine apps	"Flat" data, heavy read/write, events, analytical data	Structured and unstructured binary or object data	Web frameworks, existing apps	Large-scale database apps (> ~2 TB)	Interactive querying, offline analytics
Usage	Structured data from App Engine apps	No-ops, high throughput, scalable, flattened data	Store blobs	No-ops SQL database	SQL, Schemas ACID transactions Strong consistency High availability	Interactive SQL* querying fully managed warehouse
Use cases	Getting started, App Engine apps	AdTech, financial, and IoT data	Images, large media files, backups	User credentials, customer orders	Whenever high I/O, global consistency is needed	Data warehousing

¹Sits on the edge of storage and data processing
²Online Transaction Processing (OLTP)
³Online Analytical Processing (OLAP)

See the Google storage decision tree for a graphical version of the above table.

	Cloud Spanner	Relational DB	Non-relational DB
Schema	Yes	Yes	No
SQL	Yes	Yes	No
Consistency	Strong	Strong	Eventual
Availability	High	Failover	High
Scalability	Horizontal	Vertical	Horizontal
Replication	Automatic	Configurable	Configurable

Networking

Virtual Private Cloud (VPC)

Official Google Cloud VPC Product Page

Each VPC network is contained in a GCP project.
You can provision GCP resources, connect them to each other, and isolate them from one another.
Google Cloud VPC networks are global; subnets are regional (and subnets can span the zones that make up the region).
You can have resources in different zones on the same subnet.
You can dynamically increase the size of a subnet in a custom network by expanding the range of IP addresses allocated to it (without any workload shutdown or downtime).
Use your VPC route table to forward traffic from one instance to another within the same network, even across subnets (and zones), without requiring external IP addresses.
VPCs give you a global distributed firewall.
You can define firewall rules in terms of metadata tags on VMs (e.g., tag all of your web servers {VMs} with "web" and write a firewall rule stating that traffic on ports 80 and/or 443 is allowed into all VMs with the "web" tag, no matter what their IP address happens to be).
VPCs belong to GCP projects; if you wish to establish connections between VPCs, you can use VPC peering.
- If you want to use the full power of IAM to control who and what in one project can interact with a VPC in another project, use shared VPCs.
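
The tag-based firewall rule described above (allow ports 80 and 443 into every VM tagged "web") can be modelled in a few lines. This is a simplified sketch: the FirewallRule and Instance classes, the is_allowed helper, and the example names and addresses are hypothetical, and real rules also carry direction, priority, and source ranges.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class FirewallRule:
    name: str
    target_tags: FrozenSet[str]
    allowed_tcp_ports: FrozenSet[int]


@dataclass(frozen=True)
class Instance:
    name: str
    internal_ip: str
    tags: FrozenSet[str]


def is_allowed(rule: FirewallRule, instance: Instance, port: int) -> bool:
    """The rule applies if the VM carries a target tag and the port is listed."""
    return bool(rule.target_tags & instance.tags) and port in rule.allowed_tcp_ports


allow_web = FirewallRule("allow-web", frozenset({"web"}), frozenset({80, 443}))
web_vm = Instance("web-1", "10.0.0.2", frozenset({"web"}))
db_vm = Instance("db-1", "10.0.0.3", frozenset({"db"}))
```

Note that is_allowed never looks at internal_ip: the match depends only on tags, so the rule keeps working when a VM's address changes.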

Shared VPC

Share GCP VPC networks across projects in your Cloud organization using Shared VPC

Shared VPC allows:

Creation of a VPC network of RFC1918 IP spaces that associated projects can use
Project admins to create VMs in the shared VPC network spaces
Network and security admins to create VPNs and firewall rules, usable by the projects in the VPC network
Policies to be applied and enforced easily across a Cloud organization

VPC Network Peering

VPC Network Peering provides the following advantages over using external IP addresses or VPNs to connect networks:

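Network latency: connectivity that uses only internal addresses has lower latency than connectivity over external addresses
Network security: service owners do not need to expose their services to the public Internet
Network cost: traffic between peered networks uses internal IP addresses and so avoids the egress pricing charged for traffic over external IP addresses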

Cloud Load Balancers

With global Cloud Load Balancing, your application presents a single front-end to the world.

Users get a single, global anycast IP address.
Traffic goes over the Google backbone from the closest point-of-presence to the user.
Backends are selected based on load.
Only healthy backends receive traffic.
No pre-warming is required.
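
The two backend rules above (only healthy backends receive traffic, and selection is based on load) can be illustrated with a short sketch. The `Backend` type and `pick_backend` function are hypothetical, and the real service also weighs factors such as proximity to the user, which this sketch ignores.

```python
from typing import NamedTuple, Optional


class Backend(NamedTuple):
    name: str
    healthy: bool
    load: float  # fraction of capacity in use, 0.0 to 1.0


def pick_backend(backends: list) -> Optional[Backend]:
    """Pick the least-loaded healthy backend, or None if none are healthy."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda b: b.load)


pool = [
    Backend("us-west1-a", healthy=True, load=0.80),
    Backend("us-west1-b", healthy=False, load=0.10),  # skipped: failing health checks
    Backend("us-west1-c", healthy=True, load=0.35),
]
print(pick_backend(pool).name)  # us-west1-c
```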

Cloud Load Balancing Options
Global HTTP(S)
- Layer 7 load balancing based on load
- Can route different URLs to different backends
- Distributes HTTP(S) traffic among groups of instances based on proximity to the user, the requested URL, or both
Global SSL Proxy
- Layer 4 load balancing of non-HTTPS SSL traffic based on load
- Supported on specific port numbers
- Distributes SSL traffic among groups of instances based on proximity to the user
Global TCP Proxy
- Layer 4 load balancing of non-SSL TCP traffic
- Supported on specific port numbers
- Distributes TCP traffic among groups of instances based on proximity to the user
Regional
- Load balancing of any traffic (TCP, UDP)
- Supported on any port number
- Distributes traffic among a pool of instances within a region; can balance any kind of TCP/UDP traffic
Regional (internal)
- Load balancing of traffic inside a VPC
- Used for the internal tiers of multi-tier applications
- Distributes traffic from GCP VMs to a group of instances in the same region

Network Load Balancing

Target Pools

A Target Pool resource defines a group of instances that should receive incoming traffic from forwarding rules.

Target pools can only be used with forwarding rules that handle TCP and UDP traffic.
You must create a target pool before you can use it with a forwarding rule.
Each project can have up to 50 target pools.
A target pool can have only one health check.
- Network load balancing only supports httpHealthChecks.
Instances can be in different zones but must be in the same region; add to pool at creation or use Instance Groups.

Internal Load Balancing

Internal load balancing allows you to:

Load balance TCP/UDP traffic using a private frontend IP.
Load balance across instances in a region.
Configure health checking for your backends.
Get the benefits of a fully managed load balancing service that scales as you need to handle client traffic.

Cloud DNS

Cloud DNS is highly available and scalable

100% SLA (only GCP service that offers this)
Create managed zones, then add, edit, delete DNS records.
Programmatically manage zones and records using RESTful API or CLI.
Lookup that translates symbolic names to IP addresses
High-performance DNS lookup for your users
Cost-effective for massive updates (millions of records)
Manage DNS records through API or Console
Request routed to the nearest location, reducing latency
Use cases:
- DNS resolver for your company's users w/o managing your own servers
- DNS propagation of company DNS records

Cloud DNS managed zones

An abstraction that manages all DNS records for a single domain name
One project may have multiple managed zones
Must enable the Cloud DNS API in GCP Console first
gcloud dns managed-zones ...
Managed zones:
- Permission controls at project level
- Monitor propagation of changes to DNS name servers (docs)

Cloud Content Delivery Network (CDN)

Uses Google's globally distributed edge caches to cache content close to your users.
Or, you can use CDN Interconnect if you would prefer to use a different (non-GCP) CDN.

Cloud VPN

Securely connects your on-premises network to your GCP VPC network
Traffic travelling between the two networks is protected as it travels over the Internet:
- Encrypted by one VPN gateway
- Decrypted by the other VPN gateway
99.9% SLA
Supports site-to-site VPN
Supports:
- Static routes
- Dynamic routes (Cloud Router)
Supports IKEv1 and IKEv2, using a shared secret
Uses Encapsulating Security Payload (ESP) in tunnel-mode with authentication

Cloud Router

Provides BGP routing
- Dynamically discovers and advertises routes
Supports graceful restart
Supports ECMP
Primary/backup tunnels for failover
- MED
- AS Path length
- AS Prepend

Cloud Interconnect

Enterprise-grade connection to GCP
Connect through a service provider's network
Provides dedicated bandwidth (50Mbps - 10Gbps)
Provides access to private (e.g., RFC1918) network addresses
Enables easy hybrid Cloud deployment
Does not require the use of and management of hardware VPN devices

External Peering

Direct Peering

Connect to GCP through Google POPs
Provides N x 10G transport circuits for private Cloud traffic
BGP direct connect between your network and Google's network at Edge Network locations
Autonomous System numbers (AS) are exchanged via IXPs and some private facilities
Technical, commercial, and legal requirements

Resource Manager

Official website

Cloud Resource Manager allows you to hierarchically manage resources by project, folder, and organization.

Billing and Resource Monitoring

Organization contains all billing accounts
Project is associated with one billing account
Project accumulates consumption of all resources
A resource belongs to one, and only one, project
Resource consumption is measured on:
- Rate of use/time
- Number of items
- Feature use

Resource hierarchy

Global
- Images
- Snapshots
- Networks
Regional
- External IP addresses
Zonal
- Instances
- Disks

Project quotas

All resources are subject to project quotas (or limits)

Quotas typically fall into one of three categories:
1. How many resources you can create per project;
2. How quickly you can make API requests in a project (rate limits); and
3. Some quotas are per region
Quota examples:
- 5 networks per project
- 300 admin requests per minute (e.g., Cloud Spanner)
- 24 CPUs region/project
Most quotas can be increased through a self-service form or a support ticket
- IAM & admin -> Quotas

Labels

A utility for organizing GCP resources
- Labels are key-value pairs
- Attached to resources (e.g., VMs, disk, snapshots, images)
- Can be created/applied via the Console, gcloud, or API
Example uses of labels:
- Search and list all resources (inventory)
- Filter resources (e.g., separate production from test)
- Labels used in scripts
Label specification
- A label is a key-value pair
- Label keys and non-empty label values can contain lowercase letters, digits, and hyphens; must start with a letter; and must end with a letter or digit. The regular expression is: [a-z]([-a-z0-9]*[a-z0-9])?
- The maximum length of label keys and values is 63 characters.
- There can be a maximum of 64 labels per resource.
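
The label specification above can be checked locally before calling the API. The sketch below is an illustration only: the function names are hypothetical, and the pattern treats the trailing group as optional so that a single-letter key is accepted, which is an assumption consistent with the prose rule (start with a letter, end with a letter or digit).

```python
import re

# Pattern from the label specification, with the trailing group made optional
# (assumption) so that one-character keys such as "a" are accepted.
LABEL_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
MAX_LENGTH = 63   # maximum length of a key or value
MAX_LABELS = 64   # maximum number of labels per resource


def is_valid_key(key: str) -> bool:
    return len(key) <= MAX_LENGTH and LABEL_RE.fullmatch(key) is not None


def is_valid_value(value: str) -> bool:
    # Empty values are allowed; non-empty values follow the same pattern as keys.
    return value == "" or is_valid_key(value)


def validate_labels(labels: dict) -> list:
    """Return a list of problems; an empty list means the labels are valid."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not is_valid_key(key):
            problems.append(f"invalid key: {key!r}")
        if not is_valid_value(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems


print(validate_labels({"team": "area51", "env": "prod"}))  # []
print(validate_labels({"Team": "area51"}))                 # ["invalid key: 'Team'"]
```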

Stackdriver

Official website

Overview:

Integrated monitoring, logging, diagnostics
Manages across platforms
- GCP and AWS
- Dynamic discovery of GCP with smart defaults
- Open source agents and integrations
Access to powerful data and analytics tools
Collaborations with third-party software

Stackdriver provides services for:

Monitoring
- Platform, system, and application metrics
- Uptime/health checks
- Dashboards and alerts
Logging
- Platform, system, and application logs
- Log search, view, filter, and export
- Log-based metrics
- Export logs to BigQuery, Cloud Storage, and Cloud Pub/Sub
Debugger
- Debug applications
Error reporting
- Analyzes and aggregates the errors in your Cloud apps and notifies you when new errors are detected
Trace
- Latency reporting and sampling
- Per-URL latency and statistics

Monitoring

Install monitoring agent:

$ curl -O https://repo.stackdriver.com/stack-install.sh
$ sudo bash stack-install.sh --write-gcm

# To install the Stackdriver monitoring agent:
$ curl -sSO https://dl.google.com/cloudagents/install-monitoring-agent.sh
$ sudo bash install-monitoring-agent.sh

# To install the Stackdriver logging agent:
$ curl -sSO https://dl.google.com/cloudagents/install-logging-agent.sh
$ sudo bash install-logging-agent.sh

Group metrics:
- Aggregate metrics across a set of machines
  - Dynamically defined
  - Useful for high-change environments
- Separate production from development
- Filter Kubernetes Engine data by name and custom tags for cluster

Custom metrics (e.g., in Python)

Alerts — best practices
- Alert on symptoms, not causes (e.g., monitor failing queries, not database down)
- Use multiple channels
  - Avoid a single point-of-failure in your alert strategy
  - Multiple notification channels (e.g., email, SMS, Slack, etc.)
- Customize alerts to audience needs
  - Use descriptions to tell them what actions to take, what resources to examine.
- Avoid noise
  - Adjust monitoring so alerts are actionable, not dismissible.

Logging

Platform, system, and application logs
- API to write to logs
- 30-day retention + option to transfer to Cloud Storage
Log search/view/filter
Log-based metrics
Monitoring alerts can be set on log events
Data can be exported to BigQuery

Viewing and exporting
- Configure sinks for export
- Stackdriver logs must have access to the resource
  - Cloud Storage: storage and archiving
  - BigQuery: analysis
  - Cloud Pub/Sub: software integration
- Export process
  - As new log entries arrive, they are exported to sinks.
  - Existing log entries, at the time the sink is created, are not exported.
  - Cloud Storage entries are batched and sent out (~hourly).
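
The export rule above (only entries that arrive after a sink is created are exported) is easy to get wrong, so here is a minimal in-memory sketch of it. The `LogStream` and `Sink` classes are hypothetical stand-ins, not Stackdriver APIs, and the hourly batching for Cloud Storage is not modelled.

```python
class Sink:
    """Hypothetical export destination that records what it receives."""

    def __init__(self, name: str):
        self.name = name
        self.exported = []


class LogStream:
    """Hypothetical log store that forwards new entries to attached sinks."""

    def __init__(self):
        self.entries = []
        self.sinks = []

    def add_sink(self, sink: Sink) -> None:
        # Existing entries are deliberately NOT back-filled into the new sink.
        self.sinks.append(sink)

    def write(self, entry: str) -> None:
        self.entries.append(entry)
        for sink in self.sinks:
            sink.exported.append(entry)


stream = LogStream()
stream.write("entry written before the sink exists")

archive = Sink("archive")
stream.add_sink(archive)
stream.write("entry written after the sink exists")

print(len(stream.entries))  # 2
print(archive.exported)     # ['entry written after the sink exists']
```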

Exporting logs
- Retain data longer by exporting to different storage classes of Cloud Storage
- Search and analyze logs with BigQuery
- Advanced visualization with Cloud Datalab
- Stream logs to application or endpoint with Cloud Pub/Sub

Error Reporting

Aggregate and display errors for running Cloud services

Error notifications
Error dashboards
Supported languages: Python, Java, JavaScript, Ruby, C#, PHP, and Go

Available for:

App Engine Standard
App Engine Flexible (beta)
Compute Engine (beta)
AWS EC2 (beta)
[Not available for] Kubernetes Engine

Tracing

Tracing system:

Displays data in near real-time
Latency reporting
Per-URL latency sampling

Collects latency data on:

App Engine
Google HTTPS load balancers
Applications instrumented with the Stackdriver Trace SDKs

Debugging

Inspect an application without stopping it or slowing it down significantly
Can be used with App Engine Standard or Flexible, Compute Engine, or Kubernetes Engine
Supported languages: Python, Java, and Go
Debug snapshots:
- Capture call stack and local variables of a running application
Debug log-points:
- Inject logging into a service without stopping it

Big Data

In the very near future, every company will be a data company, as making the fastest and best use of data is a critical source of competitive advantage.

Google Cloud's big data services are fully managed and scalable.

BigQuery: Analytics database; stream data at 100,000 rows per second
Cloud Pub/Sub: Scalable and flexible enterprise messaging
Cloud Dataproc: Managed Hadoop MapReduce, Spark, Pig, and Hive service
Cloud Dataflow: Stream and batch processing; unified and simplified pipelines
Cloud Datalab: Interactive data exploration

BigQuery

Official website

BigQuery is a fast, highly scalable, cost-effective, and fully managed Cloud data warehouse for analytics, with built-in machine learning.

Provides near real-time interactive analysis of massive datasets (hundreds of TBs) using SQL syntax (SQL 2011)
Instead of using a dynamic pipeline (like Cloud Dataflow), use BigQuery when you need to explore a vast dataset with ad hoc SQL queries
No cluster maintenance required
Load data from Cloud Storage or Cloud Datastore or stream it into BigQuery at up to 100,000 rows per second
In addition to SQL queries, you can read/write data in BigQuery via Cloud Dataflow, Hadoop, and Spark
Compute and storage are separated with a terabit network in between
You only pay for storage and processing use
- You pay for your data storage separately from queries
Automatic discount for long-term data storage
- When the age of your data reaches 90 days in BigQuery, Google will automatically drop the price of storage.
Free monthly quotas
99.9% SLA

Google's infrastructure is global and so is BigQuery. BigQuery lets you specify the region where your data will be kept. For example, if you want to keep data in Europe, you do not have to set up a cluster in Europe. Simply specify "EU" as the location when you create your dataset. US and Asia locations are also available.

Cloud Pub/Sub

Official website

Cloud Pub/Sub allows you to ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.

Scalable, reliable messaging
"Pub" => Publishers; "Sub" => Subscribers
Supports many-to-many asynchronous messaging
Application components make push/pull subscriptions to topics
Includes support for offline consumers
Designed to provide at-least-once delivery at low latency (i.e., it is possible for some messages to be delivered more than once; write your code to handle such situations)
Building block for data ingestion in Dataflow, IoT, Marketing Analytics, etc.
It is the foundation for Dataflow streaming
Useful for push notifications for cloud-based apps
Connect apps across GCP (e.g., push/pull between Compute Engine and App Engine)
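
Because delivery is at least once, a subscriber should tolerate duplicates. One common approach is to make the handler idempotent by remembering message IDs that were already processed. The sketch below is an assumption-laden illustration: `IdempotentHandler` is a hypothetical class, it does not use the Pub/Sub client library, and the in-memory set would need to be a durable store in a real deployment.

```python
class IdempotentHandler:
    """Wraps a processing function so that redelivered messages are ignored."""

    def __init__(self, process):
        self._process = process
        self._seen = set()  # in-memory only; a real system needs durable storage

    def handle(self, message_id: str, payload: str) -> bool:
        """Process the payload once per message ID; return False for duplicates."""
        if message_id in self._seen:
            return False
        self._process(payload)
        # Record the ID only after processing succeeds, so a failure is retried.
        self._seen.add(message_id)
        return True


processed = []
handler = IdempotentHandler(processed.append)

print(handler.handle("msg-1", "order created"))  # True
print(handler.handle("msg-1", "order created"))  # False: duplicate delivery
print(handler.handle("msg-2", "order shipped"))  # True
print(processed)  # ['order created', 'order shipped']
```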

Cloud Dataproc

Official website

Cloud Dataproc is a managed, Cloud-native Apache Hadoop & Apache Spark service.

A fast, easy, managed way to run Hadoop and Spark/Hive/Pig on GCP
Create clusters in 90 seconds or less (on average)
Scale clusters up and down, even when jobs are running
Easily migrate on-premises Hadoop jobs to the Cloud
Use Spark SQL and Spark Machine Learning libraries (MLlib) to run classification algorithms
Save money with preemptible instances
The rate for pricing is based on the hour, but Dataproc is billed by the second (one minute minimum)
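
The billing rule above (hourly rate, charged by the second, one-minute minimum) works out as in the sketch below. The function name and the hourly rate of 3.60 are hypothetical values chosen to keep the arithmetic easy to follow; they are not actual Dataproc prices.

```python
def billed_cost(runtime_seconds: int, hourly_rate: float) -> float:
    """Cost of a run billed per second with a one-minute minimum."""
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    billable_seconds = max(runtime_seconds, 60)  # one-minute minimum
    return hourly_rate * billable_seconds / 3600


HOURLY_RATE = 3.60  # hypothetical rate, currency units per hour

print(f"{billed_cost(30, HOURLY_RATE):.4f}")    # 0.0600 (30 s is billed as 60 s)
print(f"{billed_cost(90, HOURLY_RATE):.4f}")    # 0.0900
print(f"{billed_cost(3600, HOURLY_RATE):.4f}")  # 3.6000
```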

The MapReduce model means that one function (traditionally called the "Map" function) runs in parallel across a massive dataset to produce intermediate results. Another function (the "Reduce" function) builds a final result set, based on all those intermediate results.
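
As a small illustration of the Map and Reduce roles described above, here is a word count on a single machine. It is a sketch of the programming model only, not of Hadoop or Dataproc: the function names are hypothetical and a thread pool stands in for a cluster of workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(line: str) -> list:
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(pairs) -> dict:
    """Reduce step: sum the intermediate counts for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines: list) -> dict:
    with ThreadPoolExecutor() as pool:
        # Each line is mapped independently, so the map calls can run in parallel.
        mapped = pool.map(map_words, lines)
        return reduce_counts(pair for chunk in mapped for pair in chunk)


print(word_count(["the cat", "The dog"]))  # {'the': 2, 'cat': 1, 'dog': 1}
```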

Cloud Dataflow

Official website

Cloud Dataflow is a simplified stream and batch data processing service, with equal reliability and expressiveness.

Use Cloud Dataproc when you have a dataset of known size or when you want to manage your cluster size yourself. If your data is ingested in real-time or is of an unpredictable size or rate, use Cloud Dataflow.

Offers managed data pipelines
Useful for fraud detection, financial services, IoT analytics, healthcare, logistics, clickstream, Point-of-Sale (PoS) and segmentation analysis in retail
Extract/Transform/Load (ETL) pipelines to move, filter, enrich, and shape data
Data analysis: batch computation and continuous computation using streaming
Processes data using Compute Engine instances
- Clusters are sized for you
- Automated scaling; no instance provisioning required
Write code once and get batch and streaming (transform-based programming model)
Orchestration: create pipelines that coordinate services, including external services
Integrates with GCP services: Cloud Storage, Cloud Pub/Sub, BigQuery, and BigTable
- Open source Python and Java SDKs

Cloud Datalab

Official website

An interactive tool for data exploration, analysis, visualization, and machine learning.

Interactive tool for large-scale data exploration, transformation, analysis, and visualization
Integrated and open source (built on Jupyter)
Only pay for the resources you use (no charge for using Datalab itself)
Analyze data in BigQuery, Compute Engine, and Cloud Storage using Python, SQL, and JavaScript
Easily deploy models to BigQuery
Visualize your data with Google Charts and matplotlib

Cloud Machine Learning

The Google Cloud Machine Learning Platform provides modern machine learning services with pre-trained models and a platform to generate your own tailored models.

Open source tool to build and run neural network models
- Wide platform support: CPU, GPU, or TPU; mobile, server, or Cloud
Fully managed machine learning service
- Familiar notebook-based developer experience
- Optimized for Google's infrastructure; integrates with BigQuery and Cloud Storage
Pre-trained machine learning models built by Google
- Speech: stream results in real-time; detects 80 languages
- Vision: Identify objects, landmarks, text, and content
- Translate: Language translation, including detection
- Natural language: structure and meaning of text

Why use the Cloud Machine Learning Platform?

For structured data
- Classification and regression
- Recommendation
- Anomaly detection
For unstructured data
- Image and video analytics
- Text analytics

Cloud Vision API

Official website

Analyze images with a simple REST API.

Logo detection, label detection, etc.
With the Cloud Vision API, you can:
- Gain insight from images
- Detect inappropriate content
- Analyze sentiment
- Extract text

Cloud Speech-to-Text API

Cloud Speech-to-Text official website

Can return text in real-time
Highly accurate, even in noisy environments
Access from any device
As of March 2019, it recognizes over 120 languages and variants

Cloud Natural Language API

Uses ML models to reveal structure and meaning of text
- It can do syntax analysis (breaking down sentences into tokens, identifying nouns, verbs, adjectives, and other parts of speech, and figuring out the relationships among the words).
Extract information about items mentioned in text documents, news articles, and blog posts

Cloud Translation API

Official website

Dynamically translate between languages.

Translate arbitrary strings between thousands of language pairs
Programmatically detect a document's language
Support for dozens of languages

Cloud Video Intelligence API

Official website

Search and discover your media content with Cloud Video Intelligence.

Annotate the contents of videos
Detect scene changes
Flag inappropriate content
Support for a variety of video formats

Tools

Cloud Endpoints

Official website

Develop, deploy, and manage APIs on any Google Cloud backend.

Distributed API management
Expose your API using a RESTful interface
Control access and validate calls with JSON Web Tokens and Google API keys
- Identify web / mobile users with Auth0 and Firebase Authentication
Generate client libraries

Supported platforms

Runtime environment
- App Engine Flexible Environment
- Kubernetes Engine
- Compute Engine
Clients
- Android
- iOS
- JavaScript

Apigee Edge

A platform for making APIs available to your customers and partners
Helps you secure and monetize APIs
Contains analytics, monetization, and a developer portal

Cloud Source Repositories

Official website

Fully featured Git repositories hosted on GCP.

Private Git remote on GCP
- IAM roles: owner, editor, viewer
- Stackdriver integration
- Console source code browser
Create repo

$ gcloud init
$ gcloud source repos create <REPO_NAME>

Can be a mirror of a hosted GitHub or BitBucket repo

Deployment Manager

Official website

Create and manage Cloud resources with simple templates (written in YAML).

It is an infrastructure management tool
Provides repeatable deployments
It is declarative, not imperative
- A declarative approach allows the user to specify what the configuration should be and lets the system figure out the steps to take
- An imperative approach requires the user to define the steps to take to create and configure resources
Besides YAML, you can also use Python or Jinja2 templates
Deployment Manager is available at no additional charge to Cloud Platform customers.
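
The declarative idea can be illustrated with a tiny planner: the user supplies only the desired state, and the tool derives the steps by comparing it with the current state. This is a generic sketch under stated assumptions, not how Deployment Manager is implemented; the `plan` function and the resource names are hypothetical.

```python
def plan(desired: dict, current: dict) -> list:
    """Derive the actions needed to move from the current to the desired state."""
    actions = []
    for name, config in desired.items():
        if name not in current:
            actions.append(("create", name))
        elif current[name] != config:
            actions.append(("update", name))
    for name in current:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state: what the configuration file would declare (hypothetical resources).
desired = {
    "vm-1": {"machineType": "f1-micro"},
    "fw-web": {"ports": [80, 443]},
}
# Current state: what already exists in the project.
current = {
    "vm-1": {"machineType": "n1-standard-1"},
    "old-disk": {"sizeGb": 10},
}

print(plan(desired, current))
# [('update', 'vm-1'), ('create', 'fw-web'), ('delete', 'old-disk')]
```

With an imperative tool, the user would have to write those three steps by hand and in the right order.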

Creating a Deployment Configuration

Creating a configuration
- *.yaml file defines the basic configuration
- Include import at the top of the YAML file to expand to full-featured templates written in Python or Jinja2
- Program configuration is bidirectional and interactive: receives data like machine-type and returns data like ip-address
Use "preview" to validate configuration before using it:

$ gcloud deployment-manager deployments update my-deploy --config <CONFIG_FILE>.yaml --preview

Template details

10 MB limit (neither the original configuration nor the expanded one may exceed it)
Python or Jinja2 templates cannot make any network or system calls (they will automatically be rejected)
Templates can be nested
- Isolate specific functions into meaningful files
- Create reusable assets
- Example: a separate template for firewall rules
Templates have properties
Templates can use environment variables
Supports the startup script and metadata capabilities
Deployments can be updated (uses GCP API)
- Add resources: default policy is acquire or create as needed
- Remove resources: default policy is to delete the resource

SEE: Step-by-step guide.

Cloud Marketplace

Official website

Formerly called "Cloud Launcher"
A solution marketplace containing pre-packaged, ready-to-deploy solutions (deployed via Deployment Manager)
- Some offered by Google
- Others by third-party vendors
Separate fees:
- license fees for software
- image usage fees
Image usage fee vs. separate license is up to vendor
Google updates images, but not running instances
Cloud technology partners

Command Line Interface (CLI)

The Google Cloud SDK is a set of tools that you can use to manage resources and applications hosted on the Google Cloud Platform (GCP). These include the gcloud, gsutil, and bq command line tools. The gcloud command-line tool is downloaded along with the Cloud SDK.

gcloud reference guide

Configuration and services

$ gcloud components list

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  Components                                                 │
├───────────────┬──────────────────────────────────────────────────────┬──────────────────────────┬───────────┤
│     Status    │                         Name                         │            ID            │    Size   │
├───────────────┼──────────────────────────────────────────────────────┼──────────────────────────┼───────────┤
│ Not Installed │ App Engine Go Extensions                             │ app-engine-go            │  56.6 MiB │
│ Not Installed │ Cloud Bigtable Command Line Tool                     │ cbt                      │   6.4 MiB │
│ Not Installed │ Cloud Bigtable Emulator                              │ bigtable                 │   5.6 MiB │
│ Not Installed │ Cloud Datalab Command Line Tool                      │ datalab                  │   < 1 MiB │
│ Not Installed │ Cloud Datastore Emulator                             │ cloud-datastore-emulator │  17.7 MiB │
│ Not Installed │ Cloud Datastore Emulator (Legacy)                    │ gcd-emulator             │  38.1 MiB │
│ Not Installed │ Cloud Firestore Emulator                             │ cloud-firestore-emulator │  27.5 MiB │
│ Not Installed │ Cloud Pub/Sub Emulator                               │ pubsub-emulator          │  33.4 MiB │
│ Not Installed │ Cloud SQL Proxy                                      │ cloud_sql_proxy          │   3.8 MiB │
│ Not Installed │ Emulator Reverse Proxy                               │ emulator-reverse-proxy   │  14.5 MiB │
│ Not Installed │ Google Cloud Build Local Builder                     │ cloud-build-local        │   6.0 MiB │
│ Not Installed │ Google Container Registry's Docker credential helper │ docker-credential-gcr    │   1.8 MiB │
│ Not Installed │ gcloud Alpha Commands                                │ alpha                    │   < 1 MiB │
│ Not Installed │ gcloud Beta Commands                                 │ beta                     │   < 1 MiB │
│ Not Installed │ gcloud app Java Extensions                           │ app-engine-java          │ 107.5 MiB │
│ Not Installed │ gcloud app PHP Extensions                            │ app-engine-php           │           │
│ Not Installed │ gcloud app Python Extensions                         │ app-engine-python        │   6.2 MiB │
│ Not Installed │ gcloud app Python Extensions (Extra Libraries)       │ app-engine-python-extras │  28.5 MiB │
│ Not Installed │ kubectl                                              │ kubectl                  │   < 1 MiB │
│ Installed     │ BigQuery Command Line Tool                           │ bq                       │   < 1 MiB │
│ Installed     │ Cloud SDK Core Libraries                             │ core                     │   9.1 MiB │
│ Installed     │ Cloud Storage Command Line Tool                      │ gsutil                   │   3.5 MiB │
└───────────────┴──────────────────────────────────────────────────────┴──────────────────────────┴───────────┘

To install or remove components at your current SDK version [228.0.0], run:

$ gcloud components install COMPONENT_ID
$ gcloud components remove COMPONENT_ID

To update your SDK installation to the latest version [228.0.0], run:

$ gcloud components update

Initialize gcloud:

$ gcloud init

Get a list of GCP authorized accounts:

$ gcloud auth list

Set the active account:

$ gcloud config set account <account>

Get current account:

$ gcloud config get-value account

Get current gcloud configuration:

$ gcloud config list

[compute]
region = us-west1
zone = us-west1-a
[core]
account = someone@somewhere.com
disable_usage_reporting = True
project = my-project-223521

Your active configuration is: [default]

Get a list of all configurations:

$ gcloud config configurations list

NAME     IS_ACTIVE  ACCOUNT                PROJECT            DEFAULT_ZONE  DEFAULT_REGION
default  True       someone@somewhere.com  my-project-223521  us-west1-a    us-west1

Get a list of all (enabled) services:

$ gcloud services list

NAME                                    TITLE
bigquery-json.googleapis.com            BigQuery API
cloudapis.googleapis.com                Google Cloud APIs
clouddebugger.googleapis.com            Stackdriver Debugger API
cloudtrace.googleapis.com               Stackdriver Trace API
compute.googleapis.com                  Compute Engine API
container.googleapis.com                Kubernetes Engine API
containerregistry.googleapis.com        Container Registry API
datastore.googleapis.com                Cloud Datastore API
dns.googleapis.com                      Google Cloud DNS API
logging.googleapis.com                  Stackdriver Logging API
monitoring.googleapis.com               Stackdriver Monitoring API
oslogin.googleapis.com                  Cloud OS Login API
pubsub.googleapis.com                   Cloud Pub/Sub API
servicemanagement.googleapis.com        Service Management API
serviceusage.googleapis.com             Service Usage API
sql-component.googleapis.com            Cloud SQL
stackdriver.googleapis.com              Stackdriver API
stackdriverprovisioning.googleapis.com  Stackdriver Provisioning Service
storage-api.googleapis.com              Google Cloud Storage JSON API
storage-component.googleapis.com        Google Cloud Storage

$ gcloud services list --enabled --sort-by="NAME"
$ gcloud services list --available --sort-by="NAME"

Creating a project

Get a list of billing accounts:

$ gcloud beta billing accounts list

ACCOUNT_ID            NAME                OPEN  MASTER_ACCOUNT_ID
000000-000000-000000  My Billing Account  True

Create a project:

$ gcloud projects create dev-project-01 --name="dev-project-01" \
    --labels=team=area51

Link the above project to a billing account:

$ gcloud beta billing projects link dev-project-01 \
    --billing-account=000000-000000-000000

Switch between projects:

$ gcloud config set project ${PROJECT_NAME}

Get project-wide metadata (including project quotas):

$ gcloud compute project-info describe  # current project
#~OR~ specific project:
$ gcloud compute project-info describe --project ${PROJECT_NAME}

Managing multiple SDK configurations

Note: When you install the SDK, it will set up a default configuration and ask you to assign a project to it (and a default region).

Create a new configuration, activate, and switch between configurations:

$ gcloud config configurations create dev
$ gcloud config configurations list
$ gcloud config list
$ gcloud config configurations activate default
$ gcloud config set project dev-project-01
$ gcloud config set account someone@somewhere.com

Compute Engine

Creating a VM/instance

Use default values:

$ gcloud compute instances create "dev-server" --zone us-west1-a

Use customised values:

$ gcloud compute instances create "dev-server" \
    --project=my-project-123456 \
    --zone=us-west1-a \
    --machine-type=f1-micro \
    --subnet=default \
    --network-tier=PREMIUM \
    --maintenance-policy=MIGRATE \
    --service-account=00000000000-compute@developer.gserviceaccount.com \
    --scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append \
    --image=centos-7-v20181210 \
    --image-project=centos-cloud \
    --boot-disk-size=10GB \
    --boot-disk-type=pd-standard \
    --boot-disk-device-name=dev-server

Another example of creating a VM:

$ gcloud config list
$ gcloud compute zones list | grep us-west
$ gcloud config set compute/zone us-west1-a
$ gcloud compute images list --filter="debian"
$ gcloud compute instances create "my-vm-2" \
    --machine-type "n1-standard-1" \
    --image-project "debian-cloud" \
    --image "debian-9-stretch-v20190213" \
    --subnet "default"

Connecting to a VM (via SSH)

Google-managed:

$ gcloud compute instances list
$ gcloud compute ssh xtof@dev-server
$ gcloud compute ssh xtof@dev-server --dry-run  # see the actual command

Using your own SSH key:

$ ssh-keygen -t rsa -f my-ssh-key -C xtof
$ echo "xtof:$(cat my-ssh-key.pub)" > gcp_keys.txt
$ gcloud compute instances add-metadata dev-server --metadata-from-file ssh-keys=gcp_keys.txt

Snapshots

$ gcloud compute snapshots list
$ gcloud compute disks list
$ gcloud compute disks snapshot dev-server
$ gcloud compute snapshots delete <snapshot_name>

Images

Show public and private images (from which we can create instances):

$ gcloud compute images list

NAME                 PROJECT        FAMILY     DEPRECATED   STATUS
centos-6-v20181210   centos-cloud   centos-6                READY
centos-7-v20181210   centos-cloud   centos-7                READY
...

Setting firewall rules

Get a list of current firewall rules (project-wide):

$ gcloud compute firewall-rules list 
NAME                    NETWORK  DIRECTION  PRIORITY  ALLOW                         DENY  DISABLED
default-allow-icmp      default  INGRESS    65534     icmp                                False
default-allow-internal  default  INGRESS    65534     tcp:0-65535,udp:0-65535,icmp        False
default-allow-rdp       default  INGRESS    65534     tcp:3389                            False
default-allow-ssh       default  INGRESS    65534     tcp:22                              False

Set firewall rules (e.g., allow HTTP/HTTPS traffic):

$ gcloud compute firewall-rules create default-allow-http \
    --project=my-project-123456 \
    --direction=INGRESS --priority=1000 --network=default \
    --action=ALLOW --rules=tcp:80 --source-ranges=0.0.0.0/0 \
    --target-tags=http-server

$ gcloud compute firewall-rules create default-allow-https \
    --project=my-project-123456 \
    --direction=INGRESS --priority=1000 --network=default \
    --action=ALLOW --rules=tcp:443 --source-ranges=0.0.0.0/0 \
    --target-tags=https-server

List updated firewall rules:

$ gcloud compute firewall-rules list 
NAME                    NETWORK  DIRECTION  PRIORITY  ALLOW                         DENY  DISABLED
default-allow-http      default  INGRESS    1000      tcp:80                              False
default-allow-https     default  INGRESS    1000      tcp:443                             False
...

Create an instance using the above firewall rules (HTTP/HTTPS):

$ gcloud compute instances create "dev-server" --zone us-west1-a \
    --tags=http-server,https-server

Deleting a VM/instance

Get a list of instances:

$ gcloud compute instances list
NAME        ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP    STATUS
dev-server  us-west1-a  f1-micro                   10.138.0.2   35.230.26.217  RUNNING

Delete the above instance:

$ gcloud compute instances delete "dev-server" --zone "us-west1-a"

Kubernetes

SEE: The Kubernetes main article.

Managing a GKE cluster

Create a basic Kubernetes cluster:

$ gcloud container clusters create my-k8s-cluster --zone us-west1-a --num-nodes 2

Create a Kubernetes cluster (with more options defined):

$ gcloud beta container --project "gcp-k8s-123456" clusters create "xtof-gcp-k8s" \
   --zone "us-west1-a" \
   --username "admin" \
   --cluster-version "1.11.5-gke.5" \
   --machine-type "n1-standard-1" \
   --image-type "COS" \
   --disk-type "pd-standard" \
   --disk-size "100" \
   --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
   --num-nodes "3" \
   --enable-stackdriver-kubernetes \
   --no-enable-ip-alias \
   --network "projects/gcp-k8s-123456/global/networks/default" \
   --subnetwork "projects/gcp-k8s-123456/regions/us-west1/subnetworks/default" \
   --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard,Istio \
   --istio-config auth=NONE \
   --enable-autoupgrade \
   --enable-autorepair

Get the Kubernetes credentials:

$ gcloud container clusters get-credentials xtof-gcp-k8s --zone us-west1-a --project gcp-k8s-123456

Resize a Kubernetes cluster:

$ gcloud container clusters resize xtof-gcp-k8s --num-nodes=1 --zone=us-west1-a

Delete the cluster:

$ gcloud container clusters delete --project "gcp-k8s-123456" "xtof-gcp-k8s" --zone "us-west1-a"

Google Container Registry (GCR)

SEE: The Container Registry Quick Start guide for details.
SEE: Docker for more details.

Overview

Docker container images
Public and private container storage
Fast, scalable retrieval and deployment
Billed for storage and egress, not per image
Works with open and 3rd party continuous delivery systems
IAM roles
ACLs for access control

Details

Configure (local) Docker to use the gcloud command-line tool as a credential helper:

$ gcloud auth configure-docker

Note: The above command will add the following settings to your (local) Docker config file (located at ${HOME}/.docker/config.json):

{
  "credHelpers": {
    "gcr.io": "gcloud", 
    "us.gcr.io": "gcloud", 
    "eu.gcr.io": "gcloud", 
    "asia.gcr.io": "gcloud", 
    "staging-k8s.gcr.io": "gcloud", 
    "marketplace.gcr.io": "gcloud"
  }
}
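
To confirm that the helper was registered, a quick check along the following lines can be used. This is only a minimal sketch: the function name has_gcr_cred_helper is made up for illustration, and the pattern assumes the "gcr.io": "gcloud" entry sits on a single line, as in the listing above.

```shell
# Hypothetical helper (not part of gcloud or Docker): succeed if the given
# Docker config file maps gcr.io to the gcloud credential helper.
# Assumes the entry is written on one line, as gcloud writes it.
has_gcr_cred_helper() {
  config_file="${1:-${HOME}/.docker/config.json}"
  grep -Eq '"gcr\.io"[[:space:]]*:[[:space:]]*"gcloud"' "${config_file}"
}

# Example usage:
#   has_gcr_cred_helper && echo "gcr.io credential helper is configured"
```

The leading double quote in the pattern keeps regional hosts such as us.gcr.io from being counted as a match for gcr.io.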

Tag the image with a registry name:

$ docker tag ${IMAGE_NAME} gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${IMAGE_TAG}

Push the image to Container Registry:

$ docker push gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${IMAGE_TAG}
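
The tag and push commands must use exactly the same image reference. One way to avoid typos is to build the reference once and reuse it. The sketch below assumes the gcr.io host used above; gcr_image_ref is a hypothetical helper name, and the values in the usage comment are placeholders.

```shell
# Hypothetical helper: print the full Container Registry image reference.
# Aborts with an error message if any of the three arguments is empty or missing.
gcr_image_ref() {
  project_id="${1:?project ID required}"
  image_name="${2:?image name required}"
  image_tag="${3:?image tag required}"
  printf 'gcr.io/%s/%s:%s\n' "${project_id}" "${image_name}" "${image_tag}"
}

# Example usage (placeholder values):
#   IMAGE_REF="$(gcr_image_ref my-project-123456 my-app v1)"
#   docker tag my-app "${IMAGE_REF}" && docker push "${IMAGE_REF}"
```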

List images in the Container Registry:

$ gcloud container images list
#~OR~
$ gcloud container images list --repository=gcr.io/${PROJECT_ID}
#~OR~
$ gcloud container images list --repository=gcr.io/${PROJECT_ID} --filter "name:${IMAGE_NAME}"

Pull the image from Container Registry:

$ docker pull gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${IMAGE_TAG}

Cleanup (delete):

$ gcloud container images delete gcr.io/${PROJECT_ID}/${IMAGE_NAME}:${IMAGE_TAG} --force-delete-tags

Deployment Manager

"Deployment Manager is an infrastructure deployment service that automates the creation and management of Google Cloud Platform resources for you. Write flexible template and configuration files and use them to create deployments that have a variety of Cloud Platform services, such as Google Cloud Storage, Google Compute Engine, and Google Cloud SQL, configured to work together". source

Example:

$ gcloud deployment-manager deployments create my-deployment --config my-deployment.yml
$ gcloud deployment-manager deployments update my-deployment --config my-deployment.yml
$ gcloud deployment-manager deployments describe my-deployment

Miscellaneous

$ gcloud config set project <project-name>
$ gcloud config set compute/zone us-west1-a
$ gcloud config unset compute/zone
$ gcloud iam service-accounts list \
    --filter='displayName:"Compute Engine default service account"' \
    --format='value(email)'
$ gcloud iam service-accounts list --format=json | \
    jq -r '.[] | select(.email | startswith("my-project@")) | .email'
my-project@my-project-123456.iam.gserviceaccount.com

$ gcloud compute networks subnets list

NAME     REGION                   NETWORK  RANGE
default  us-west2                 default  10.168.0.0/20
default  asia-northeast1          default  10.146.0.0/20
default  us-west1                 default  10.138.0.0/20
default  southamerica-east1       default  10.158.0.0/20
default  europe-west4             default  10.164.0.0/20
default  asia-east1               default  10.140.0.0/20
default  europe-north1            default  10.166.0.0/20
default  asia-southeast1          default  10.148.0.0/20
default  us-east4                 default  10.150.0.0/20
default  europe-west1             default  10.132.0.0/20
default  europe-west2             default  10.154.0.0/20
default  europe-west3             default  10.156.0.0/20
default  australia-southeast1     default  10.152.0.0/20
default  asia-south1              default  10.160.0.0/20
default  us-east1                 default  10.142.0.0/20
default  us-central1              default  10.128.0.0/20
default  asia-east2               default  10.170.0.0/20
default  northamerica-northeast1  default  10.162.0.0/20

$ gcloud projects create example-foo-bar-1 --name="Happy project" \
    --labels=type=happy

$ gcloud compute forwarding-rules list \
    --filter='name:"my-app-forwarding-rules"' \
    --format='value(IPAddress)'
x.x.x.x

$ gcloud pubsub topics publish myTopic --message '{"name":"bob"}'
$ gcloud functions logs read

Cloud Storage

Storage Classes
Storage Class	Name for APIs and gsutil
Multi-Regional Storage	`multi_regional`
Regional Storage	`regional`
Nearline Storage	`nearline`
Coldline Storage	`coldline`

See the Cloud Storage documentation on storage classes for details.

Create a bucket:

$ PROJECT_NAME=my-project
$ REGION=us-west1
$ STORAGE_CLASS=regional
$ BUCKET_NAME=xtof-test

# Basic (using defaults):
$ gsutil mb gs://${BUCKET_NAME}

# Advanced (override defaults):
$ gsutil mb -p ${PROJECT_NAME} -c ${STORAGE_CLASS} -l ${REGION} gs://${BUCKET_NAME}

# Use Cloud Shell variables:
$ gsutil mb -l US gs://${DEVSHELL_PROJECT_ID} # <- creates a bucket with a globally unique name based on your project ID

# Set the ACL of an object in your bucket:
$ gsutil acl ch -u allUsers:R gs://${DEVSHELL_PROJECT_ID}/foobar.png

Note: All buckets (and their objects) are private by default.

Upload an object to the above bucket:

$ gsutil cp Pictures/foobar.jpg gs://${BUCKET_NAME}

Move an object (file) from one bucket to another:

$ gsutil mv gs://${SOURCE_BUCKET}/${FILE_NAME} gs://${DESTINATION_BUCKET}/

List the contents of a bucket:

$ gsutil ls gs://${BUCKET_NAME}     # basic info
$ gsutil ls -l gs://${BUCKET_NAME}  # extended info

Identity and Access Management

Get the IAM roles and rules for a given bucket (note: these are the default ones):

$ gsutil iam get gs://${BUCKET_NAME}

{
  "bindings": [
    {
      "members": [
        "projectEditor:my-project-123456", 
        "projectOwner:my-project-123456"
      ], 
      "role": "roles/storage.legacyBucketOwner"
    }, 
    {
      "members": [
        "projectViewer:my-project-123456"
      ], 
      "role": "roles/storage.legacyBucketReader"
    }
  ], 
  "etag": "CAE="
}

Lifecycle Management

Find all objects in a given bucket older than 2 days (i.e., when they were uploaded to the bucket or last modified) and convert them from "regional" to "nearline" storage class:

$ cat << EOF > lifecycle.json
{
  "lifecycle": {
    "rule": [
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "NEARLINE"
        },
        "condition": {
          "age": 2,
          "matchesStorageClass": [
            "REGIONAL"
          ]
        }
      }
    ]
  }
}
EOF

$ gsutil lifecycle set lifecycle.json gs://${BUCKET_NAME}/
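
A hand-written heredoc is easy to get wrong (a missing brace or comma), so it can help to check the file before applying it. The following is a minimal sketch, assuming python3 is installed; is_valid_json is a hypothetical helper name, and it checks JSON syntax only, not whether the lifecycle rules themselves make sense.

```shell
# Hypothetical helper: succeed only if the given file parses as JSON.
# Uses Python's standard json.tool module (python3 is assumed to be installed).
is_valid_json() {
  python3 -m json.tool "${1}" > /dev/null 2>&1
}

# Example usage:
#   is_valid_json lifecycle.json && gsutil lifecycle set lifecycle.json gs://${BUCKET_NAME}/
```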

Signed-URLs

First, create a service account with just enough privileges to access Cloud Storage, then create and download a key for it (key.json in the example below).

$ gsutil cp test.txt gs://xtof-sandbox/
$ gsutil signurl -d 3m key.json gs://xtof-sandbox/test.txt

The above returns a signed URL (it will look something like https://storage.googleapis.com/xtof-sandbox/test.txt?x-goog-signature=23asd...) that you can send to users. The URL is valid for only 3 minutes; after that, requests fail with an "ExpiredToken" error.

Access control

$ gsutil cp foobar.txt gs://$BUCKET_NAME_1/

Get the default access list that has been assigned to foobar.txt:

$ gsutil acl get gs://$BUCKET_NAME_1/foobar.txt  > acl.txt
$ cat acl.txt
[
  {
    "entity": "project-owners-1045887948991",
    "projectTeam": {
      "projectNumber": "1045887948991",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1045887948991",
    "projectTeam": {
      "projectNumber": "1045887948991",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1045887948991",
    "projectTeam": {
      "projectNumber": "1045887948991",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "email": "storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "entity": "user-storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "role": "OWNER"
  }
]

Set the access list to private and verify the results:

$ gsutil acl set private gs://$BUCKET_NAME_1/foobar.txt
$ gsutil acl get gs://$BUCKET_NAME_1/foobar.txt > acl2.txt
$ cat acl2.txt
[
  {
    "email": "storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "entity": "user-storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "role": "OWNER"
  }
]

Update the access list to make the file publicly readable:

$ gsutil acl ch -u AllUsers:R gs://$BUCKET_NAME_1/foobar.txt
$ gsutil acl get gs://$BUCKET_NAME_1/foobar.txt > acl3.txt
$ cat acl3.txt
[
  {
    "email": "storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "entity": "user-storecore@sandbox-gcp-e2518d857e18d36a.iam.gserviceaccount.com",
    "role": "OWNER"
  },
  {
    "entity": "allUsers",
    "role": "READER"
  }
]

Customer-supplied encryption keys (CSEK)

Generate a CSEK key (an AES-256 base-64 key):

$ python3 -c 'import base64, os; print(base64.b64encode(os.urandom(32)).decode())'
1G8l2isrv/QO5zJveoNCN5PeuGHgHBDUzHUBzgiOSUc=
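
The same kind of key can also be produced without Python. The sketch below assumes the head and base64 utilities (e.g., from GNU coreutils) and a readable /dev/urandom; both function names are hypothetical. The second function decodes the key again to confirm that it really holds 32 bytes (256 bits).

```shell
# Hypothetical helper: print 32 random bytes (an AES-256 key), base64-encoded.
generate_csek() {
  head -c 32 /dev/urandom | base64
}

# Hypothetical helper: print how many raw bytes a base64 string decodes to.
csek_key_length() {
  printf '%s' "${1}" | base64 --decode | wc -c | tr -d ' '
}

# Example usage:
#   KEY="$(generate_csek)"
#   [ "$(csek_key_length "${KEY}")" = "32" ] && echo "${KEY}"
```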

Edit ~/.boto and set the key in the [GSUtil] section:

encryption_key=1G8l2isrv/QO5zJveoNCN5PeuGHgHBDUzHUBzgiOSUc=

Note: If the ~/.boto file is empty, generate it with:

$ gsutil config -n

Rewrite an existing object so that it is encrypted with the key configured in ~/.boto:

$ gsutil rewrite -k gs://$BUCKET_NAME_1/foobar2.txt

Enable lifecycle management

$ gsutil lifecycle get gs://$BUCKET_NAME_1
gs://storecore-35635/ has no lifecycle configuration.

$ vi life.json
{
  "rule":
  [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 31}
    }
  ]
}

$ gsutil lifecycle set life.json gs://$BUCKET_NAME_1
Setting lifecycle configuration on gs://storecore-35635/...

$ gsutil lifecycle get gs://$BUCKET_NAME_1
{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 31}}]}

Enable versioning

$ gsutil versioning get gs://$BUCKET_NAME_1
gs://storecore-35635: Suspended

$ gsutil versioning set on gs://$BUCKET_NAME_1
Enabling versioning for gs://storecore-35635/...

$ gsutil versioning get gs://$BUCKET_NAME_1
gs://storecore-35635: Enabled

Upload a file, delete some lines from the original file, upload again, repeat.

List all versions of the file:

$ gsutil ls -a gs://$BUCKET_NAME_1/foobar.txt
gs://storecore-35635/foobar.txt#1552432005059570
gs://storecore-35635/foobar.txt#1552432853742567
gs://storecore-35635/foobar.txt#1552432873281759
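
Each line of the listing ends in "#" followed by the generation number of that version. As a minimal sketch (list_generations is a hypothetical helper name), the generation numbers can be pulled out of the listing like this:

```shell
# Hypothetical helper: read `gsutil ls -a` output on stdin and print only the
# generation number that follows the '#' in each object URL.
list_generations() {
  while IFS= read -r url; do
    case "${url}" in
      *'#'*) printf '%s\n' "${url##*\#}" ;;
    esac
  done
}

# Example usage:
#   gsutil ls -a gs://$BUCKET_NAME_1/foobar.txt | list_generations
#
# A specific version can then be restored by copying it over the live object:
#   gsutil cp "gs://$BUCKET_NAME_1/foobar.txt#1552432005059570" gs://$BUCKET_NAME_1/foobar.txt
```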

Synchronize a directory to a bucket

Make a nested directory structure so that you can examine what happens when it is recursively copied to a bucket.

Run the following commands:

 $ mkdir firstlevel
 $ mkdir ./firstlevel/secondlevel
 $ cp foobar.txt firstlevel
 $ cp foobar.txt firstlevel/secondlevel

Sync the firstlevel directory on the VM with your bucket:

$ gsutil rsync -r ./firstlevel gs://$BUCKET_NAME_1/firstlevel

Verify that versioning was enabled:

$ gsutil versioning get gs://$BUCKET_NAME_1

Print the contents of an object in a bucket:

$ gsutil cat gs://$BUCKET_NAME_2/$FILE_NAME

BigQuery

$ bq query "select string_field_10 as request, count(*) as requestcount from logdata.accesslog group by request order by requestcount desc"
+----------------------------------------+--------------+
|                request                 | requestcount |
+----------------------------------------+--------------+
| GET /store HTTP/1.0                    |       337293 |
| GET /index.html HTTP/1.0               |       336193 |
| GET /products HTTP/1.0                 |       280937 |
| GET /services HTTP/1.0                 |       169090 |
| GET /products/desserttoppings HTTP/1.0 |        56580 |
| GET /products/floorwaxes HTTP/1.0      |        56451 |
| GET /careers HTTP/1.0                  |        56412 |
| GET /services/turnipwinding HTTP/1.0   |        56401 |
| GET /services/spacetravel HTTP/1.0     |        56176 |
| GET /favicon.ico HTTP/1.0              |        55845 |
+----------------------------------------+--------------+

Cloud VPN

This section shows how to set up a VPN between two subnets in different regions.

Create VPC networks

VPC#1
- Name: vpn-network-1
- Subnet creation mode: Custom
- Name: subnet-a
- Region: us-east1
- IP address range: 10.5.4.0/24
VPC#2
- Name: vpn-network-2
- Subnet creation mode: Custom
- Name: subnet-b
- Region: europe-west1
- IP address range: 10.1.3.0/24
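
The settings above are written as they appear in the Cloud Console. A rough command-line equivalent might look like the sketch below; create_vpn_network is a hypothetical wrapper name, and the sketch assumes gcloud is already authenticated and pointed at the right project.

```shell
# Hypothetical wrapper: create a custom-mode VPC network with a single subnet.
# Arguments: network name, subnet name, region, IP address range (CIDR).
create_vpn_network() {
  network="$1"; subnet="$2"; region="$3"; range="$4"
  gcloud compute networks create "${network}" --subnet-mode custom &&
  gcloud compute networks subnets create "${subnet}" \
      --network "${network}" \
      --region "${region}" \
      --range "${range}"
}

# Example usage (values taken from the lists above):
#   create_vpn_network vpn-network-1 subnet-a us-east1 10.5.4.0/24
#   create_vpn_network vpn-network-2 subnet-b europe-west1 10.1.3.0/24
```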

Create test VMs

One in each region (VM#1: us-east1; VM#2: europe-west1)

Create firewall rules

VPC network -> Firewall rules

Allow traffic to vpn-network-1
- Name: allow-icmp-ssh-network-1
- Network: vpn-network-1
- Targets: All instances in the network
- Source filter: IP ranges
- Source IP ranges: 0.0.0.0/0
- Protocols and ports: Specified protocols and ports
  - tcp -> 22
  - other -> icmp

Allow traffic to vpn-network-2
- Name: allow-icmp-ssh-network-2
- Network: vpn-network-2
- Targets: All instances in the network
- Source filter: IP ranges
- Source IP ranges: 0.0.0.0/0
- Protocols and ports: Specified protocols and ports
  - tcp -> 22
  - other -> icmp
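
The same two rules can probably be created from the command line as well. The following is a sketch under the assumption that gcloud is configured for the project that owns both networks; allow_icmp_ssh is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: allow SSH (tcp:22) and ICMP from any source address.
# No target tags are given, so the rule applies to all instances in the network.
# Arguments: firewall rule name, network name.
allow_icmp_ssh() {
  rule_name="$1"; network="$2"
  gcloud compute firewall-rules create "${rule_name}" \
      --network "${network}" \
      --direction INGRESS \
      --action ALLOW \
      --rules tcp:22,icmp \
      --source-ranges 0.0.0.0/0
}

# Example usage (values taken from the lists above):
#   allow_icmp_ssh allow-icmp-ssh-network-1 vpn-network-1
#   allow_icmp_ssh allow-icmp-ssh-network-2 vpn-network-2
```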

Create and prepare the VPN gateways

Create two VPN gateways, one in each region. Create forwarding rules for Encapsulating Security Payload (ESP), UDP:500, and UDP:4500 for each gateway.

Create vpn-1 gateway:

$ gcloud compute target-vpn-gateways \
    create vpn-1 \
    --network vpn-network-1  \
    --region us-east1

Create vpn-2 gateway:

$ gcloud compute target-vpn-gateways \
    create vpn-2 \
    --network vpn-network-2  \
    --region europe-west1

Reserve a static IP for each network:

$ gcloud compute addresses create --region us-east1 vpn-1-static-ip
$ gcloud compute addresses list
$ export STATIC_IP_VPN_1=<IP address for vpn-1>

$ gcloud compute addresses create --region europe-west1 vpn-2-static-ip
$ gcloud compute addresses list
$ export STATIC_IP_VPN_2=<IP address for vpn-2>
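
Instead of copying the addresses by hand from the list output, they can be read directly into the variables. This is a minimal sketch; get_static_ip is a hypothetical wrapper name around gcloud compute addresses describe.

```shell
# Hypothetical wrapper: print the IP address of a reserved regional address.
# Arguments: address name, region.
get_static_ip() {
  gcloud compute addresses describe "$1" --region "$2" --format='value(address)'
}

# Example usage:
#   export STATIC_IP_VPN_1="$(get_static_ip vpn-1-static-ip us-east1)"
#   export STATIC_IP_VPN_2="$(get_static_ip vpn-2-static-ip europe-west1)"
```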

Create forwarding rules for both VPN gateways

The forwarding rules forward traffic arriving on the external IP to the VPN gateway, which connects the two together. Create three forwarding rules per gateway, one for each protocol the VPN needs.

Create ESP forwarding rule for vpn-1:

$ gcloud compute \
    forwarding-rules create vpn-1-esp \
    --region us-east1  \
    --ip-protocol ESP  \
    --address ${STATIC_IP_VPN_1} \
    --target-vpn-gateway vpn-1

Create ESP forwarding rule for vpn-2:

$ gcloud compute \
    forwarding-rules create vpn-2-esp \
    --region europe-west1  \
    --ip-protocol ESP  \
    --address ${STATIC_IP_VPN_2} \
    --target-vpn-gateway vpn-2

Create UDP500 forwarding for vpn-1:

$ gcloud compute \
    forwarding-rules create vpn-1-udp500  \
    --region us-east1 \
    --ip-protocol UDP \
    --ports 500 \
    --address ${STATIC_IP_VPN_1} \
    --target-vpn-gateway vpn-1

Create UDP500 forwarding for vpn-2:

$ gcloud compute \
    forwarding-rules create vpn-2-udp500  \
    --region europe-west1 \
    --ip-protocol UDP \
    --ports 500 \
    --address ${STATIC_IP_VPN_2} \
    --target-vpn-gateway vpn-2

Create UDP4500 forwarding for vpn-1:

$ gcloud compute \
    forwarding-rules create vpn-1-udp4500  \
    --region us-east1 \
    --ip-protocol UDP --ports 4500 \
    --address ${STATIC_IP_VPN_1} \
    --target-vpn-gateway vpn-1

Create UDP4500 forwarding for vpn-2:

$ gcloud compute \
    forwarding-rules create vpn-2-udp4500  \
    --region europe-west1 \
    --ip-protocol UDP --ports 4500 \
    --address ${STATIC_IP_VPN_2} \
    --target-vpn-gateway vpn-2
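
The six commands above differ only in gateway, region, address, and protocol, so they could also be wrapped in a small function. The sketch below is one possible way to do that; create_vpn_forwarding_rules is a hypothetical name, and the rule names follow the naming used above.

```shell
# Hypothetical wrapper: create the ESP, UDP:500, and UDP:4500 forwarding rules
# for one VPN gateway. Arguments: gateway name, region, reserved static IP.
create_vpn_forwarding_rules() {
  gateway="$1"; region="$2"; address="$3"
  # ESP carries no port number, so no --ports flag is passed here.
  gcloud compute forwarding-rules create "${gateway}-esp" \
      --region "${region}" --ip-protocol ESP \
      --address "${address}" --target-vpn-gateway "${gateway}" || return 1
  for port in 500 4500; do
    gcloud compute forwarding-rules create "${gateway}-udp${port}" \
        --region "${region}" --ip-protocol UDP --ports "${port}" \
        --address "${address}" --target-vpn-gateway "${gateway}" || return 1
  done
}

# Example usage:
#   create_vpn_forwarding_rules vpn-1 us-east1 "${STATIC_IP_VPN_1}"
#   create_vpn_forwarding_rules vpn-2 europe-west1 "${STATIC_IP_VPN_2}"
```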

List forwarding rules (created above):

$ gcloud compute forwarding-rules list
NAME           REGION        IP_ADDRESS      IP_PROTOCOL  TARGET
vpn-2-esp      europe-west1  34.76.244.229   ESP          europe-west1/targetVpnGateways/vpn-2
vpn-2-udp4500  europe-west1  34.76.244.229   UDP          europe-west1/targetVpnGateways/vpn-2
vpn-2-udp500   europe-west1  34.76.244.229   UDP          europe-west1/targetVpnGateways/vpn-2
vpn-1-esp      us-east1      35.237.123.102  ESP          us-east1/targetVpnGateways/vpn-1
vpn-1-udp4500  us-east1      35.237.123.102  UDP          us-east1/targetVpnGateways/vpn-1
vpn-1-udp500   us-east1      35.237.123.102  UDP          us-east1/targetVpnGateways/vpn-1

List VPN gateways:

$ gcloud compute target-vpn-gateways list
NAME   NETWORK        REGION
vpn-2  vpn-network-2  europe-west1
vpn-1  vpn-network-1  us-east1

Create tunnels

Create the tunnels between the VPN gateways. After the tunnels exist, create a static route to enable traffic to be forwarded into the tunnel. If this is successful, you can ping a local VM in one location on its internal IP from a VM in a different location.

Create the tunnel for traffic from Network-1 to Network-2:

$ gcloud compute \
    vpn-tunnels create tunnel1to2  \
    --peer-address ${STATIC_IP_VPN_2} \
    --region us-east1 \
    --ike-version 2 \
    --shared-secret gcprocks \
    --target-vpn-gateway vpn-1 \
    --local-traffic-selector 0.0.0.0/0 \
    --remote-traffic-selector 0.0.0.0/0

Create the tunnel for traffic from Network-2 to Network-1:

$ gcloud compute \
    vpn-tunnels create tunnel2to1 \
    --peer-address ${STATIC_IP_VPN_1} \
    --region europe-west1 \
    --ike-version 2 \
    --shared-secret gcprocks \
    --target-vpn-gateway vpn-2 \
    --local-traffic-selector 0.0.0.0/0 \
    --remote-traffic-selector 0.0.0.0/0

List VPN tunnels (created above):

$ gcloud compute vpn-tunnels list
NAME        REGION        GATEWAY  PEER_ADDRESS
tunnel2to1  europe-west1  vpn-2    35.237.123.102
tunnel1to2  us-east1      vpn-1    34.76.244.229

Create static routes

Create a static route from Network-1 to Network-2:

$ gcloud compute  \
    routes create route1to2  \
    --network vpn-network-1 \
    --next-hop-vpn-tunnel tunnel1to2 \
    --next-hop-vpn-tunnel-region us-east1 \
    --destination-range 10.1.3.0/24

Create a static route from Network-2 to Network-1:

$ gcloud compute  \
    routes create route2to1  \
    --network vpn-network-2 \
    --next-hop-vpn-tunnel tunnel2to1 \
    --next-hop-vpn-tunnel-region europe-west1 \
    --destination-range 10.5.4.0/24

List static routes (created above):

$ gcloud compute routes list
NAME        NETWORK        DEST_RANGE     NEXT_HOP                            PRIORITY
...
route1to2   vpn-network-1  10.1.3.0/24    us-east1/vpnTunnels/tunnel1to2      1000
route2to1   vpn-network-2  10.5.4.0/24    europe-west1/vpnTunnels/tunnel2to1  1000

Verify VPN connectivity

SSH into VM1 and check that you can ping the internal IP of VM2, and vice versa.

SSH into instance/VM

SEE: gcloud compute ssh for details.

To SSH into "example-instance" in zone us-west2-a, run:

$ gcloud compute ssh example-instance --zone=us-west2-a

You can also run a command on the virtual machine. For example, to get a snapshot of the guest's process tree, run:

$ gcloud compute ssh example-instance --zone=us-west2-a --command="ps -ejH"

If you are using the Google Container-Optimized virtual machine image, you can SSH into one of your containers with:

$ gcloud compute ssh example-instance --zone=us-west2-a --container=CONTAINER

You can limit how long the SSH key remains valid. For example, to allow a key to be used through 2019:

$ gcloud compute ssh example-instance --zone=us-west2-a --ssh-key-expiration="2020-01-01T00:00:00Z"

Or alternatively, allow access for the next two minutes:

$ gcloud compute ssh example-instance --zone=us-west2-a --ssh-key-expire-after=2m

Get billing information

$ gcloud alpha billing accounts list
ACCOUNT_ID            NAME              OPEN   MASTER_ACCOUNT_ID
000000-000000-000000  Blue Env - Xtof   True   0A0A0A-0B0B0B-0C0C0C

$ gcloud alpha billing projects list --billing-account=000000-000000-000000
PROJECT_ID     BILLING_ACCOUNT_ID    BILLING_ENABLED
blue-123456    AAAAAA-BBBBBB-CCCCCC  True
green-654321   DDDDDD-EEEEEE-FFFFFF  False

GCP vs. AWS

Note: All of the following are as of February 2017.

Compute
- Compute Engine vs. EC2
- App Engine vs. Elastic Beanstalk
- Container Engine vs. EC2 Container Service (ECS)
- Container Registry vs. ECR
- Cloud Functions vs. Lambda
Identity & Security
- Cloud IAM vs. IAM
- Cloud Resource Manager vs. n/a
- Cloud Security Scanner vs. Inspector
- Cloud Platform Security vs. n/a
Networking
- Cloud Virtual Network vs. VPC
- Cloud Load Balancing vs. ELB
- Cloud CDN vs. CloudFront
- Cloud Interconnect vs. Direct Connect
- Cloud DNS vs. Route53
Storage and Databases
- Cloud Storage vs. S3
- Cloud Bigtable vs. DynamoDB
- Cloud Datastore vs. SimpleDB
- Cloud SQL vs. RDS
- Persistent Disk vs. EBS
Big Data
- BigQuery vs. Redshift
- Cloud Dataflow vs. EMR
- Cloud Dataproc vs. EMR
- Cloud Datalab vs. n/a
- Cloud Pub/Sub vs. Kinesis
- Genomics vs. n/a
Machine Learning
- Cloud Machine Learning vs. Machine Learning
- Vision API vs. Rekognition
- Speech API vs. Polly
- Natural Language API vs. Lex
- Translation API vs. n/a
- Jobs API vs. n/a

Compute Services (GCP vs. AWS):
- Infrastructure as a Service (IaaS): Compute Engine vs. EC2
- Platform as a Service (PaaS): App Engine vs. Elastic Beanstalk
- Containers as a Service: Container Engine vs. EC2 Container Service (ECS)

Compute IaaS comparison
Feature	Amazon EC2	Compute Engine
Virtual machines	Instances	Instances
Machine images	Amazon Machine Image (AMI)	Image
Temporary virtual machines	Spot instances	Preemptible VMs
Firewall	Security groups	Compute Engine firewall rules
Automatic instance scaling	Auto Scaling	Compute Engine autoscaler
Local attached disk	Ephemeral disk	Local SSD
VM import	Supported formats: RAW, OVA, VMDK, VHD	Supported formats: AMI, RAW, VirtualBox
Deployment locality	Zonal	Zonal

Networking services comparison
	Networking	Load Balancing	CDN	On-premises connection	DNS
AWS	VPC	ELB	CloudFront	Direct Connect	Route53
GCP	Cloud Virtual Network¹	Cloud Load Balancing²	Cloud CDN	Cloud Interconnect	Cloud DNS

¹GCP allows for 802.1Q tagging (aka VLAN tagging). AWS does not.
²GCP allows for cross-region load balancing. AWS does not.

Storage services comparison
	Object	Block	Cold	File
AWS	S3	EBS¹	Glacier	EFS
GCP	Cloud Storage	Compute Engine Persistent Disks²	Cloud Storage Nearline	ZFS/Avere

¹An EBS volume can be attached to only one EC2 instance at a time. Can attach up to 40 disk volumes to a Linux instance. Available in only one region by default.
²GCP Persistent Disks in read-only mode can be attached to multiple instances simultaneously. Can attach up to 128 disk volumes. Snapshots are global and can be used in any region without additional operations or charges.

Database services comparison
	RDBMS	NoSQL (key-value)	NoSQL (indexed)
AWS	RDS	DynamoDB	DynamoDB
GCP	Cloud SQL¹	Cloud Bigtable²	Cloud Datastore

¹MySQL only.
²100 MB maximum item size. Does not support secondary indexes.

Big Data services comparison
	Streaming data ingestion	Streaming data processing	Batch data processing	Analytics
AWS	Kinesis	Kinesis	EMR	Redshift
GCP	Cloud Pub/Sub	Cloud Dataflow	Cloud Dataflow / Cloud Dataproc	BigQuery

Cloud Pub/Sub: GCP's offering for data streaming and message queuing. It allows for secure communication between applications and can also serve as a decoupling mechanism (a good way to scale).
Dataflow: GCP's managed service for batch and streaming data processing. Apache Beam under the hood.
Dataproc: GCP's offering for data processing using Apache Hadoop and Apache Spark. It is a massively parallel data processing and transformation engine. Supported services: MapReduce, Apache Hive, Apache Pig, Apache Spark, Spark SQL, PySpark, and support for parallel jobs with YARN.
BigQuery: GCP's fully managed, massive data warehousing and analytics engine, allowing for data analytics using SQL.

Application services comparison
	Messaging
AWS	SNS
GCP	Cloud Pub/Sub

Cloud Pub/Sub (publisher/subscriber)

Management services comparison
	Monitoring	Deployment (IaC)
AWS	CloudWatch	CloudFormation
GCP	Stackdriver	Deployment Manager

External links

@@ Line 222: / Line 222: @@
 * Custom Roles
-** Can only be used at the project or organization levels (the cannot be used at the folder level)
+** Can only be used at the project or organization levels (they cannot be used at the folder level)
 ** If you want to give custom permissions to a Compute Engine VM, use a service account
@@ Line 334: / Line 334: @@
 * Storage for VMs
 ** Persistent disks (either standard or SSD)
-** Any data save to scratch space (local SSD) will not be saved when the VM is terminated
+** Any data saved to scratch space (local SSD) will not be saved when the VM is terminated
 * Preemtible VM
@@ Line 480: / Line 480: @@
 ** Limits on third-party software
-* Example App Engine ''Standard'' environment workflow (e.g., for you web app)
+* Example App Engine ''Standard'' environment workflow (e.g., for your web app)
 *# Develop and test the web app locally
 *# Use the SDK to deploy to App Engine (Project -> App Engine -> App Servers -> Application Instances)
-*# App Engine automatically scales and reliable serves your web application
+*# App Engine automatically scales and reliably serves your web application
 *#* App Engine can access a variety of services using dedicated APIs (e.g., NoSQL, Memcache, task queues, scheduled tasks, search, logs, etc.)
@@ Line 1,627: / Line 1,627: @@
 * Set the active account:
   $ gcloud config set account <account>
+* Get ''current'' account:
+ $ gcloud config get-value account
 * Get current gcloud configuration:
@@ Line 2,509: / Line 2,512: @@
 SSH into VM1 and check that you can ping the ''internal'' IP of VM2, and ''vice versa''.
+===SSH into instance/VM===
+SEE: [https://cloud.google.com/sdk/gcloud/reference/compute/ssh gcloud compute ssh] for details.
+* To SSH into "<code>example-instance</code>" in zone <code>us-west2-a</code>, run:
+ $ gcloud compute ssh example-instance --zone=us-west2-a
+* You can also run a command on the virtual machine. For example, to get a snapshot of the guest's process tree, run:
+ $ gcloud compute ssh example-instance --zone=us-west2-a --command="ps -ejH"
+* If you are using the Google Container-Optimized virtual machine image, you can SSH into one of your containers with:
+ $ gcloud compute ssh example-instance --zone=us-west2-a --container=CONTAINER
+* You can limit the allowed time to ssh. For example, to allow a key to be used through 2019:
+ $ gcloud compute ssh example-instance --zone=us-west2-a --ssh-key-expiration="2020-01-01T00:00:00:00Z"
+* Or alternatively, allow access for the next two minutes:
+ $ gcloud compute ssh example-instance --zone=us-west2-a --ssh-key-expire-after=2m
+===Get billing information===
+<pre>
+$ gcloud alpha billing accounts list
+ACCOUNT_ID            NAME              OPEN   MASTER_ACCOUNT_ID
+000000-000000-000000  Blue Env - Xtof   True   0A0A0A-0B0B0B-0C0C0C
+$ gcloud alpha billing projects list --billing-account=000000-000000-000000
+PROJECT_ID     BILLING_ACCOUNT_ID    BILLING_ENABLED
+blue-123456    AAAAAA-BBBBBB-CCCCCC  True
+green-654321   DDDDDD-EEEEEE-FFFFFF  False
+</pre>
 ==GCP vs. AWS==
@@ Line 2,711: / Line 2,745: @@
 ==External links==
 * [https://cloud.google.com/ Official website]
+* [https://cloud.google.com/docs/compare/aws#service_comparisons GCP ''vs.'' AWS]
 [[Category:Technical and Specialized Skills]]
