Difference between revisions of "AWS/Machine Learning"

Revision as of 23:56, 14 March 2017

This article covers Amazon Web Services (AWS) Machine Learning (ML).

Machine Learning concepts

What is Machine Learning (ML)?

The basic concept of ML is to have computers or machines learn their behaviour from data instead of being explicitly programmed for every case.
Machines can analyze large and complex datasets and identify patterns to create models, which are then used to predict outcomes.
Over time, these models can take into account new datasets and improve the accuracy of the predictions.

Examples of where ML is being used

Recommendations when checking out on an e-commerce site (e.g., purchases on Amazon.com)
Spam detection in email
Any kind of image, speech, or text recognition
Weather forecasts
Search engines

What is Amazon ML?

Amazon ML is supervised ML: it learns from examples or historical data.
An Amazon ML Model requires your dataset to have both the features and the target for each observation/record.
A feature is an attribute of a record used to identify patterns; typically, there will be multiple features.
A target is the outcome that the patterns are linked to and is the value the ML algorithm is going to predict.
This link between features and target is what the model uses to predict outcomes.
Example: {Go to the grocery store} {on Monday} (features) => Buy milk (target)
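
The feature/target distinction can be illustrated with a minimal sketch. The records, the column names (`activity`, `day`, `bought_milk`) and the helper `split_features_target` are hypothetical and exist only for this illustration; they are not part of Amazon ML.

```python
# Hypothetical historical records: each observation carries its features
# plus the known outcome (the target) that a supervised model learns from.
records = [
    {"activity": "grocery store", "day": "Monday", "bought_milk": 1},
    {"activity": "grocery store", "day": "Friday", "bought_milk": 0},
    {"activity": "gym", "day": "Monday", "bought_milk": 0},
]

TARGET = "bought_milk"  # assumed name of the target column


def split_features_target(record, target=TARGET):
    """Return (features, target_value) for one observation."""
    features = {key: value for key, value in record.items() if key != target}
    return features, record[target]


features, target_value = split_features_target(records[0])
print(features)      # the attributes used to find patterns
print(target_value)  # the outcome the model is asked to predict
```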

Why do ML on AWS?

Simplifies the whole process
No coding required for creating models
Identifies the best ML algorithm to run based on the input data
Easily integrates into other AWS services for data retrieval
Deploy within minutes
Full access via APIs
Scalable

Amazon ML pricing (as of March 2017)

Data analysis and model building: $0.42/hour
Batch predictions: $0.10 per 1,000 predictions, rounded up to the next 1,000
Real-time predictions: $0.0001 per prediction, rounded up to the nearest penny (plus an hourly capacity reservation charge, incurred only while the endpoint is active)
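
As a rough illustration of how the March 2017 rates listed above combine, the following sketch estimates a bill. The function names are hypothetical, and the hourly capacity reservation charge is left out because its rate is not given here.

```python
from decimal import Decimal, ROUND_CEILING

# Rates as listed above (March 2017).
MODEL_BUILDING_PER_HOUR = Decimal("0.42")
BATCH_PER_1000 = Decimal("0.10")
REALTIME_PER_PREDICTION = Decimal("0.0001")


def model_building_cost(hours):
    """Data analysis and model building, billed per hour."""
    return MODEL_BUILDING_PER_HOUR * Decimal(str(hours))


def batch_cost(num_predictions):
    """Batch predictions, rounded up to the next block of 1,000."""
    blocks = -(-num_predictions // 1000)  # integer ceiling division
    return BATCH_PER_1000 * blocks


def realtime_cost(num_predictions):
    """Real-time predictions, rounded up to the nearest cent.

    Excludes the hourly capacity reservation charge (rate not listed above).
    """
    raw = REALTIME_PER_PREDICTION * num_predictions
    return raw.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


print(model_building_cost(2))  # two hours of model building
print(batch_cost(1500))        # billed as 2,000 predictions
print(realtime_cost(12345))    # 1.2345 rounded up to the cent
```

`Decimal` is used instead of floats so that the rounding rules are applied to exact amounts.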

Amazon ML workflow

1. Create a data source
- S3 (e.g., upload a CSV file to S3)
- RDS and Redshift (e.g., run a SQL query on a Redshift cluster and load the results directly into ML)
2. Identify the feature and target columns
- Select whether the file has a header row
- Select the correct field data types (possible types: binary, categorical, numeric, text)
- Select the target that needs to be predicted
- Select a Row ID, if the data has one
3. Train a model with a part of the dataset (generally 70%)
- By default, Amazon ML takes 70% of your data and uses it to train the model
- It also automatically selects the type of ML model to build, based on the data type of the target in the schema
  - Binary target => binary model
  - Numeric target => regression model
  - Categorical target => multi-class model
4. Evaluate the model by running the remaining data (the other 30% by default) through it
5. Fine-tune the model
6. Use the model for predictions
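
The train/evaluate steps of the workflow above can be sketched locally in plain Python. This is only an illustration of a 70/30 hold-out split, not a call into Amazon ML: the helper names `split_train_eval` and `accuracy` are hypothetical, and a simple sequential split is assumed.

```python
def split_train_eval(rows, train_percent=70):
    """Split rows sequentially: the first train_percent% train the model,
    the remainder is held back for evaluation."""
    cut = len(rows) * train_percent // 100  # integer arithmetic, no float rounding
    return rows[:cut], rows[cut:]


def accuracy(predicted, actual):
    """Fraction of held-out observations predicted correctly."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


rows = list(range(10))  # stand-in for ten observations
train_rows, eval_rows = split_train_eval(rows)
print(len(train_rows), len(eval_rows))

# Hypothetical predictions for three held-out observations vs. their true targets.
print(accuracy([1, 0, 1], [1, 0, 0]))
```

Holding back data the model has never seen is what makes the evaluation step meaningful: scoring the model on its own training rows would overstate its accuracy.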

Types of ML models available

Binary
- The target/prediction value is a 0 or 1
- Best used when the prediction is a Boolean or one of two possible outcomes (e.g., true/false, yes/no, green apple/red apple)
- Examples:
  - Does an email match the spam criteria?
  - Will someone respond to a marketing email?
  - Does a purchase on a credit card seem fraudulent?
Multi-class
- The target/prediction is from a set of values
- Best used for predicting categories or types
- Examples:
  - What is the next product a user will purchase, based on their purchase history?
  - Film recommendations
Regression
- The target/prediction is a numeric value
- Best used for predicting scores
- Examples:
  - How many millimetres of rain can we expect?
  - Traffic delays
  - How many goals will my soccer team score?
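
To make the three model types concrete, here is a small heuristic that guesses a model type from sample target values. It is a sketch only: `suggest_model_type` is a hypothetical helper, and Amazon ML itself decides from the data type declared for the target in the schema, not by inspecting values this way.

```python
def suggest_model_type(target_values):
    """Guess which model type fits a column of target values.

    Heuristic for illustration only:
      - only 0/1 (or True/False) values -> "binary"
      - only numbers                    -> "regression"
      - anything else                   -> "multi-class"
    """
    values = set(target_values)
    if not values:
        raise ValueError("need at least one target value")
    if values <= {0, 1}:
        return "binary"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "regression"
    return "multi-class"


print(suggest_model_type([0, 1, 1, 0]))                   # spam / not spam
print(suggest_model_type([12.5, 0.0, 3.2]))               # millimetres of rain
print(suggest_model_type(["comedy", "drama", "horror"]))  # film category
```

The heuristic cannot tell a numeric quantity from numeric category codes (for example, product IDs), which is one reason the field data types are selected explicitly when the data source is created.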

External links

AWS Machine Learning
