Google Cloud Platform
Google Cloud Platform (GCP) is a cloud computing service by Google that offers hosting on the same supporting infrastructure that Google uses internally for end-user products like Gmail, Google Search, Maps, and YouTube.
Elements
- Google Compute Engine – IaaS service providing virtual machines similar to Amazon EC2.
- Google App Engine – PaaS service for directly hosting applications similar to AWS Elastic Beanstalk.
- Cloud Bigtable – managed wide-column NoSQL database service. Similar to Apache HBase.
- BigQuery – fully managed columnar data warehouse for SQL analytics. Similar to Amazon Redshift.
- Google Cloud Functions – FaaS service that runs functions in response to events without the developer managing servers. Similar to AWS Lambda or IBM OpenWhisk.
Details
- Connecting to a VM
$ gcloud compute --project "my-project-123456" ssh --zone "us-west1-b" "my-vm"
The above command creates SSH keys if none exist yet (stored in ~/.ssh by default). After that, you can use the private key to SSH into your VM directly:
$ ssh -i ~/.ssh/google_compute_engine username@x.x.x.x
- Compute Engine Metadata
- The project metadata URL is:
http://metadata.google.internal/computeMetadata/v1/project/
- The instance metadata URL is:
http://metadata.google.internal/computeMetadata/v1/instance/
Each URL returns a list of entries that can be appended to the URL to drill down. Project metadata contains project-wide information (e.g., the project ID). Instance metadata contains information on disks, hostname, machine type, etc.
You can also set your own custom values (listed under attributes/) and read them from code running on the VM.
$ curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
{"access_token":"aa00...","expires_in":3599,"token_type":"Bearer"}
$ curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/project/"
attributes/
numeric-project-id
project-id
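The metadata lookups above can be wrapped in small shell helpers. The following is a minimal sketch, assuming bash and curl are available on the VM. The function names metadata_get and extract_access_token are hypothetical, and the sed-based extraction only suits the compact one-line token response shown above (a real JSON parser would be more robust).

```shell
#!/usr/bin/env bash
# Base URL of the Compute Engine metadata server (reachable only from inside a VM).
METADATA_BASE="http://metadata.google.internal/computeMetadata/v1"

# metadata_get PATH: fetch one metadata entry, e.g. "project/project-id".
# The Metadata-Flavor header is sent on every request, as in the curl calls above.
metadata_get() {
  curl -s -f -H "Metadata-Flavor: Google" "${METADATA_BASE}/$1"
}

# extract_access_token: read a token response on stdin and print only the
# access_token value. Assumes compact single-line JSON (hypothetical helper).
extract_access_token() {
  sed -n 's/.*"access_token":"\([^"]*\)".*/\1/p'
}
```

On a VM, `metadata_get project/project-id` would print the project ID, and `metadata_get instance/service-accounts/default/token | extract_access_token` would print just the bearer token.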
- Preemptible VMs
Preemptible VMs are low-cost, short-lived compute instances suitable for batch jobs and fault-tolerant workloads. They offer the same machine types and options as regular compute instances, but Compute Engine can stop (preempt) them at any time, and they last for at most 24 hours. If your applications are fault-tolerant and can withstand instance preemptions, preemptible instances can reduce your Compute Engine costs significantly.
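As a rough illustration of working with preemptible VMs, the sketch below creates one with the --preemptible flag and checks, from inside the VM, whether the instance has been preempted by reading the instance/preempted metadata entry. The function names, the VM name, and the n1-standard-1 machine type are assumptions chosen for the example.

```shell
#!/usr/bin/env bash
# create_preemptible_vm NAME ZONE
# Create a preemptible instance; the machine type is an arbitrary example.
create_preemptible_vm() {
  local name="$1" zone="$2"
  gcloud compute instances create "$name" \
    --zone "$zone" \
    --machine-type "n1-standard-1" \
    --preemptible
}

# was_preempted: run on the VM itself. Succeeds (exit status 0) when the
# metadata server reports TRUE for instance/preempted, fails otherwise.
was_preempted() {
  local state
  state="$(curl -s -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted")"
  [ "$state" = "TRUE" ]
}
```

A batch job could call `was_preempted` periodically (or from a shutdown script) and checkpoint its work before the instance is stopped.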
GCP vs. AWS
Note: All of the following are as of February 2017.
- Compute
- Compute Engine vs. EC2
- App Engine vs. Elastic Beanstalk
- Container Engine vs. EC2 Container Service (ECS)
- Container Registry vs. ECR
- Cloud Functions vs. Lambda
- Identity & Security
- Cloud IAM vs. IAM
- Cloud Resource Manager vs. n/a
- Cloud Security Scanner vs. Inspector
- Cloud Platform Security vs. n/a
- Networking
- Cloud Virtual Network vs. VPC
- Cloud Load Balancing vs. ELB
- Cloud CDN vs. CloudFront
- Cloud Interconnect vs. Direct Connect
- Cloud DNS vs. Route53
- Storage and Databases
- Cloud Storage vs. S3
- Cloud Bigtable vs. DynamoDB
- Cloud Datastore vs. SimpleDB
- Cloud SQL vs. RDS
- Persistent Disk vs. EBS
- Big Data
- BigQuery vs. Redshift
- Cloud Dataflow vs. EMR
- Cloud Dataproc vs. EMR
- Cloud Datalab vs. n/a
- Cloud Pub/Sub vs. Kinesis
- Genomics vs. n/a
- Machine Learning
- Cloud Machine Learning vs. Machine Learning
- Vision API vs. Rekognition
- Speech API vs. Polly
- Natural Language API vs. Lex
- Translation API vs. n/a
- Jobs API vs. n/a
- Compute Services (GCP vs. AWS):
- Infrastructure as a Service (IaaS): Compute Engine vs. EC2
- Platform as a Service (PaaS): App Engine vs. Elastic Beanstalk
- Containers as a Service: Container Engine vs. EC2 Container Service (ECS)
Compute IaaS comparison | ||
---|---|---|
Feature | Amazon EC2 | Compute Engine |
Virtual machines | Instances | Instances |
Machine images | Amazon Machine Image (AMI) | Image |
Temporary virtual machines | Spot instances | Preemptible VMs |
Firewall | Security groups | Compute Engine firewall rules |
Automatic instance scaling | Auto Scaling | Compute Engine autoscaler |
Local attached disk | Ephemeral disk | Local SSD |
VM import | Supported formats: RAW, OVA, VMDK, VHD | Supported formats: AMI, RAW, VirtualBox |
Deployment locality | Zonal | Zonal |
Networking services comparison | |||||
---|---|---|---|---|---|
Provider | Networking | Load Balancing | CDN | On-premises connection | DNS |
AWS | VPC | ELB | CloudFront | Direct Connect | Route53 |
GCP | Cloud Virtual Network [1] | Cloud Load Balancing [2] | Cloud CDN | Cloud Interconnect | Cloud DNS |
[1] GCP allows for 802.1q tagging (aka VLAN tagging). AWS does not.
[2] GCP allows for cross-region load balancing. AWS does not.
Storage services comparison | ||||
---|---|---|---|---|
Provider | Object | Block | Cold | File |
AWS | S3 | EBS [1] | Glacier | EFS |
GCP | Cloud Storage | Compute Engine Persistent Disks [2] | Cloud Storage Nearline | ZFS/Avere |
[1] An EBS volume can be attached to only one EC2 instance at a time. Up to 40 disk volumes can be attached to a Linux instance. A volume is available in only one region by default.
[2] GCP Persistent Disks in read-only mode can be attached to multiple instances simultaneously. Up to 128 disk volumes can be attached to an instance. Snapshots are global and can be used in any region without additional operations or charges.
Database services comparison | |||
---|---|---|---|
Provider | RDBMS | NoSQL (key-value) | NoSQL (indexed) |
AWS | RDS | DynamoDB | DynamoDB |
GCP | Cloud SQL [1] | Cloud Bigtable [2] | Cloud Datastore |
[1] MySQL only.
[2] 100 MB maximum item size. Does not support secondary indexes.
Big Data services comparison | ||||
---|---|---|---|---|
Provider | Streaming data ingestion | Streaming data processing | Batch data processing | Analytics |
AWS | Kinesis | Kinesis | EMR | Redshift |
GCP | Cloud Pub/Sub | Cloud Dataflow | Cloud Dataflow / Cloud Dataproc | BigQuery |
- Cloud Pub/Sub
- GCP's offering for data streaming and message queuing. It allows for secure communication between applications and can also decouple them from one another (a good way to scale).
- Dataflow
- GCP's managed service for batch and streaming data processing. It uses Apache Beam under the hood.
- Dataproc
- GCP's offering for data processing using Apache Hadoop and Apache Spark. It is a massively parallel data processing and transformation engine.
- Supported services: MapReduce, Apache Hive, Apache Pig, Apache Spark, Spark SQL, PySpark, and support for parallel jobs with YARN.
- BigQuery
- GCP's fully managed, large-scale data warehousing and analytics engine, allowing for data analytics using SQL.
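To illustrate the SQL-based analytics that BigQuery offers, here is a minimal sketch using the bq tool from the Cloud SDK. It assumes standard SQL and uses the public sample table bigquery-public-data.samples.shakespeare as example data; the function name bq_top_words is hypothetical.

```shell
#!/usr/bin/env bash
# bq_top_words [LIMIT]: print the LIMIT most frequent words (default 10) in
# the public Shakespeare sample table. The dataset is an assumed example.
bq_top_words() {
  local limit="${1:-10}"
  bq query --use_legacy_sql=false \
    "SELECT word, SUM(word_count) AS total
     FROM \`bigquery-public-data.samples.shakespeare\`
     GROUP BY word
     ORDER BY total DESC
     LIMIT ${limit}"
}
```

Calling `bq_top_words 5` would run the query in the currently configured project and print the result as a table.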
Application services comparison | |
---|---|
Provider | Messaging |
AWS | SNS |
GCP | Cloud Pub/Sub |
- Cloud Pub/Sub (publisher/subscriber)
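The publisher/subscriber pattern can be tried out from the command line. The following is a minimal sketch, assuming the gcloud pubsub command group is available in the installed SDK; the function names are hypothetical, and topic and subscription names are supplied by the caller.

```shell
#!/usr/bin/env bash
# pubsub_setup TOPIC SUBSCRIPTION
# Create a topic and a pull subscription attached to it.
pubsub_setup() {
  local topic="$1" subscription="$2"
  gcloud pubsub topics create "$topic" &&
    gcloud pubsub subscriptions create "$subscription" --topic "$topic"
}

# pubsub_send TOPIC MESSAGE: publish one message (the publisher side).
pubsub_send() {
  gcloud pubsub topics publish "$1" --message "$2"
}

# pubsub_receive SUBSCRIPTION: pull and acknowledge one message
# (the subscriber side).
pubsub_receive() {
  gcloud pubsub subscriptions pull "$1" --auto-ack --limit 1
}
```

For example, `pubsub_setup jobs jobs-worker`, then `pubsub_send jobs "resize image 42"`, then `pubsub_receive jobs-worker`. The publisher never needs to know which subscribers exist, which is the decoupling the service provides.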
Management services comparison | ||
---|---|---|
Provider | Monitoring | Deployment |
AWS | CloudWatch | CloudFormation |
GCP | Stackdriver | Deployment Manager |
Command Line Interface (CLI)
The Google Cloud SDK is a set of tools that you can use to manage resources and applications hosted on the Google Cloud Platform (GCP). These include the gcloud, gsutil, and bq command line tools. The gcloud command-line tool is downloaded along with the Cloud SDK.
$ gcloud components list
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                 Components                                                  │
├───────────────┬──────────────────────────────────────────────────────┬──────────────────────────┬───────────┤
│     Status    │                         Name                         │            ID            │    Size   │
├───────────────┼──────────────────────────────────────────────────────┼──────────────────────────┼───────────┤
│ Not Installed │ App Engine Go Extensions                             │ app-engine-go            │  56.6 MiB │
│ Not Installed │ Cloud Bigtable Command Line Tool                     │ cbt                      │   6.4 MiB │
│ Not Installed │ Cloud Bigtable Emulator                              │ bigtable                 │   5.6 MiB │
│ Not Installed │ Cloud Datalab Command Line Tool                      │ datalab                  │   < 1 MiB │
│ Not Installed │ Cloud Datastore Emulator                             │ cloud-datastore-emulator │  17.7 MiB │
│ Not Installed │ Cloud Datastore Emulator (Legacy)                    │ gcd-emulator             │  38.1 MiB │
│ Not Installed │ Cloud Firestore Emulator                             │ cloud-firestore-emulator │  27.5 MiB │
│ Not Installed │ Cloud Pub/Sub Emulator                               │ pubsub-emulator          │  33.4 MiB │
│ Not Installed │ Cloud SQL Proxy                                      │ cloud_sql_proxy          │   3.8 MiB │
│ Not Installed │ Emulator Reverse Proxy                               │ emulator-reverse-proxy   │  14.5 MiB │
│ Not Installed │ Google Cloud Build Local Builder                     │ cloud-build-local        │   6.0 MiB │
│ Not Installed │ Google Container Registry's Docker credential helper │ docker-credential-gcr    │   1.8 MiB │
│ Not Installed │ gcloud Alpha Commands                                │ alpha                    │   < 1 MiB │
│ Not Installed │ gcloud Beta Commands                                 │ beta                     │   < 1 MiB │
│ Not Installed │ gcloud app Java Extensions                           │ app-engine-java          │ 107.5 MiB │
│ Not Installed │ gcloud app PHP Extensions                            │ app-engine-php           │           │
│ Not Installed │ gcloud app Python Extensions                         │ app-engine-python        │   6.2 MiB │
│ Not Installed │ gcloud app Python Extensions (Extra Libraries)       │ app-engine-python-extras │  28.5 MiB │
│ Not Installed │ kubectl                                              │ kubectl                  │   < 1 MiB │
│ Installed     │ BigQuery Command Line Tool                           │ bq                       │   < 1 MiB │
│ Installed     │ Cloud SDK Core Libraries                             │ core                     │   9.1 MiB │
│ Installed     │ Cloud Storage Command Line Tool                      │ gsutil                   │   3.5 MiB │
└───────────────┴──────────────────────────────────────────────────────┴──────────────────────────┴───────────┘
- To install or remove components at your current SDK version [228.0.0], run:
$ gcloud components install COMPONENT_ID
$ gcloud components remove COMPONENT_ID
- To update your SDK installation to the latest version [228.0.0], run:
$ gcloud components update
- Initialize gcloud:
$ gcloud init
- Get current gcloud configuration:
$ gcloud config list
[compute]
region = us-west1
zone = us-west1-a
[core]
account = someone@somewhere.com
disable_usage_reporting = True
project = my-project-223521
Your active configuration is: [default]
- Get a list of all configurations:
$ gcloud config configurations list
NAME     IS_ACTIVE  ACCOUNT                PROJECT            DEFAULT_ZONE  DEFAULT_REGION
default  True       someone@somewhere.com  my-project-223521  us-west1-a    us-west1
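Additional named configurations make it easy to switch between projects. The sketch below is one possible way to script that; the function names are hypothetical, and it relies on gcloud activating a configuration when it is created (its default behavior).

```shell
#!/usr/bin/env bash
# new_gcloud_config NAME PROJECT ZONE
# Create a named configuration (gcloud activates it on creation), then
# store a project and a default zone in it.
new_gcloud_config() {
  local name="$1" project="$2" zone="$3"
  gcloud config configurations create "$name" &&
    gcloud config set project "$project" &&
    gcloud config set compute/zone "$zone"
}

# use_gcloud_config NAME: switch to an existing configuration.
use_gcloud_config() {
  gcloud config configurations activate "$1"
}
```

For example, `new_gcloud_config staging my-project-223521 us-west1-a` followed later by `use_gcloud_config default` to switch back.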
- Get a list of all (enabled) services:
$ gcloud services list
NAME                                    TITLE
bigquery-json.googleapis.com            BigQuery API
cloudapis.googleapis.com                Google Cloud APIs
clouddebugger.googleapis.com            Stackdriver Debugger API
cloudtrace.googleapis.com               Stackdriver Trace API
compute.googleapis.com                  Compute Engine API
container.googleapis.com                Kubernetes Engine API
containerregistry.googleapis.com        Container Registry API
datastore.googleapis.com                Cloud Datastore API
dns.googleapis.com                      Google Cloud DNS API
logging.googleapis.com                  Stackdriver Logging API
monitoring.googleapis.com               Stackdriver Monitoring API
oslogin.googleapis.com                  Cloud OS Login API
pubsub.googleapis.com                   Cloud Pub/Sub API
servicemanagement.googleapis.com        Service Management API
serviceusage.googleapis.com             Service Usage API
sql-component.googleapis.com            Cloud SQL
stackdriver.googleapis.com              Stackdriver API
stackdriverprovisioning.googleapis.com  Stackdriver Provisioning Service
storage-api.googleapis.com              Google Cloud Storage JSON API
storage-component.googleapis.com        Google Cloud Storage
$ gcloud services list --enabled --sort-by="NAME"
$ gcloud services list --available --sort-by="NAME"
- Create a GKE cluster
- Create the Kubernetes cluster:
$ gcloud beta container --project "gcp-k8s-223521" clusters create "xtof-gcp-k8s" \
  --zone "us-west1-a" \
  --username "admin" \
  --cluster-version "1.11.5-gke.5" \
  --machine-type "n1-standard-1" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes \
"https://www.googleapis.com/auth/devstorage.read_only",\
"https://www.googleapis.com/auth/logging.write",\
"https://www.googleapis.com/auth/monitoring",\
"https://www.googleapis.com/auth/servicecontrol",\
"https://www.googleapis.com/auth/service.management.readonly",\
"https://www.googleapis.com/auth/trace.append" \
  --num-nodes "3" \
  --enable-stackdriver-kubernetes \
  --no-enable-ip-alias \
  --network "projects/gcp-k8s-223521/global/networks/default" \
  --subnetwork "projects/gcp-k8s-223521/regions/us-west1/subnetworks/default" \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing,KubernetesDashboard,Istio \
  --istio-config auth=NONE \
  --enable-autoupgrade \
  --enable-autorepair
- Get the Kubernetes credentials:
$ gcloud container clusters get-credentials xtof-gcp-k8s --zone us-west1-a --project gcp-k8s-223521
- Delete the cluster:
$ gcloud container clusters delete --project "gcp-k8s-223521" "xtof-gcp-k8s" --zone "us-west1-a"
- Miscellaneous
$ gcloud config set project <project-name>
$ gcloud config set compute/zone us-west1-a
$ gcloud config unset compute/zone
$ gcloud iam service-accounts list \
  --filter='displayName:"Compute Engine default service account"' \
  --format='value(email)'
$ gcloud compute networks subnets list
NAME     REGION                   NETWORK  RANGE
default  us-west2                 default  10.168.0.0/20
default  asia-northeast1          default  10.146.0.0/20
default  us-west1                 default  10.138.0.0/20
default  southamerica-east1       default  10.158.0.0/20
default  europe-west4             default  10.164.0.0/20
default  asia-east1               default  10.140.0.0/20
default  europe-north1            default  10.166.0.0/20
default  asia-southeast1          default  10.148.0.0/20
default  us-east4                 default  10.150.0.0/20
default  europe-west1             default  10.132.0.0/20
default  europe-west2             default  10.154.0.0/20
default  europe-west3             default  10.156.0.0/20
default  australia-southeast1     default  10.152.0.0/20
default  asia-south1              default  10.160.0.0/20
default  us-east1                 default  10.142.0.0/20
default  us-central1              default  10.128.0.0/20
default  asia-east2               default  10.170.0.0/20
default  northamerica-northeast1  default  10.162.0.0/20
- Cloud Storage
- Create a bucket in the current project:
$ gsutil mb -l us-west1 gs://${BUCKET_NAME}
$ gsutil mb -p ${PROJECT_NAME} -c regional -l ${PROJECT_REGION} gs://${BUCKET_NAME}
- Upload an object to the above bucket:
$ gsutil cp Pictures/foobar.jpg gs://${BUCKET_NAME}
- List the contents of a bucket:
$ gsutil ls gs://${BUCKET_NAME}     # basic
$ gsutil ls -l gs://${BUCKET_NAME}  # extended
- Get the IAM roles and rules for a given bucket (note: these are the default ones):
$ gsutil iam get gs://${BUCKET_NAME}
{
  "bindings": [
    {
      "members": [
        "projectEditor:my-project-123456",
        "projectOwner:my-project-123456"
      ],
      "role": "roles/storage.legacyBucketOwner"
    },
    {
      "members": [
        "projectViewer:my-project-123456"
      ],
      "role": "roles/storage.legacyBucketReader"
    }
  ],
  "etag": "CAE="
}
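Bindings like the ones above can be changed incrementally with gsutil iam ch rather than by editing the whole policy. The following is a minimal sketch; the function names and the choice of the objectViewer role are assumptions made for the example.

```shell
#!/usr/bin/env bash
# grant_object_viewer MEMBER BUCKET
# Add an objectViewer binding for MEMBER on BUCKET.
# MEMBER uses gsutil's "type:id" form, e.g. user:jane@example.com.
grant_object_viewer() {
  local member="$1" bucket="$2"
  gsutil iam ch "${member}:objectViewer" "gs://${bucket}"
}

# revoke_object_viewer MEMBER BUCKET: remove the same binding again (-d).
revoke_object_viewer() {
  local member="$1" bucket="$2"
  gsutil iam ch -d "${member}:objectViewer" "gs://${bucket}"
}
```

After `grant_object_viewer user:jane@example.com "${BUCKET_NAME}"`, running `gsutil iam get gs://${BUCKET_NAME}` again should list an additional binding for roles/storage.objectViewer.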