[Machine Learning Diary] Day 4 Feature Engineering On GCP
Feature engineering is an important step in the machine learning process. To do it well, we need both domain knowledge and knowledge of the algorithm we plan to use.
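To make the idea concrete, here is a minimal sketch of a domain-driven feature: turning a raw timestamp into inputs a model can use more easily. The function name `time_features` and the sample timestamp are hypothetical and chosen only for illustration; they do not come from the lab.

```python
from datetime import datetime


def time_features(timestamp: str) -> dict:
    """Derive simple model inputs from a raw ISO-8601 timestamp string."""
    dt = datetime.fromisoformat(timestamp)
    return {
        "hour_of_day": dt.hour,
        "day_of_week": dt.weekday(),       # Monday = 0, Sunday = 6
        "is_weekend": dt.weekday() >= 5,   # Saturday or Sunday
    }


# Hypothetical sample value, not taken from any dataset in the lab.
features = time_features("2018-06-02 08:30:00")
print(features)
```

Knowing that behaviour differs between weekdays and weekends is the domain knowledge; encoding it as an explicit column is the feature engineering.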
Example:
I will do feature engineering on Google Cloud Platform (GCP).
Step 1: Go To Dashboard
Activate Cloud Shell (see the picture above) -> Start Cloud Shell
Since we will use Datalab, I have attached GCP's explanation of it below so you can have a look.
Step 2: Set Connection
To start Datalab, we need to open Cloud Shell and type this command:
datalab create dataengvm --zone us-central1-a
When the prompt comes up, just type Y and press Enter.
Once you see the message about changing the port, we can move on to the next step.
Step 3: Preview On Port
Change the port to 8081, then choose Preview on port 8081.
Step 4: Download Dataset
Use this code to download the example dataset:
%bash
git clone https://github.com/GoogleCloudPlatform/training-data-analyst
cd training-data-analyst
After typing the code, either press Run or press Shift + Enter.
After you refresh the previous page, you should be able to see the downloaded folder.
Then let's go to the feateng folder and open feateng.ipynb.
This file is on GitHub, so you can download it and have a look.
The notebook explains things really well, so it is worth downloading and playing with.
Feature Engineering
In this notebook, you will learn how to incorporate feature engineering into your pipeline.
- Working with feature columns
- Adding feature crosses in TensorFlow
- Reading data from BigQuery
- Creating datasets using Dataflow
- Using a wide-and-deep model
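Of the topics listed above, the feature cross is the easiest to show in isolation. The notebook does this with TensorFlow; the plain-Python sketch below is not the notebook's code, only an illustration of the idea under stated assumptions. The bucket boundaries, the sample coordinates, and the helper names `bucketize` and `cross` are all hypothetical.

```python
import bisect
import hashlib


def bucketize(value: float, boundaries: list) -> int:
    """Return the index of the bucket that value falls into.

    boundaries must be sorted in ascending order; n boundaries give n + 1 buckets.
    """
    return bisect.bisect_right(boundaries, value)


def cross(bucket_a: int, bucket_b: int, hash_bucket_size: int) -> int:
    """Hash a pair of bucket ids into one of hash_bucket_size categories."""
    key = f"{bucket_a}_x_{bucket_b}".encode("utf-8")
    digest = hashlib.md5(key).hexdigest()
    return int(digest, 16) % hash_bucket_size


# Hypothetical boundaries and coordinates, chosen only for illustration.
LAT_BOUNDARIES = [40.6, 40.7, 40.8]
LON_BOUNDARIES = [-74.0, -73.9, -73.8]
HASH_BUCKET_SIZE = 100

lat_bucket = bucketize(40.75, LAT_BOUNDARIES)
lon_bucket = bucketize(-73.95, LON_BOUNDARIES)
crossed = cross(lat_bucket, lon_bucket, HASH_BUCKET_SIZE)

# One-hot encode the crossed feature so a linear ("wide") model can
# learn a separate weight for each latitude/longitude cell.
one_hot = [0] * HASH_BUCKET_SIZE
one_hot[crossed] = 1

print(lat_bucket, lon_bucket, crossed)
```

Crossing the two bucketized columns lets the model treat each grid cell as its own category, something neither latitude nor longitude can express alone. Hashing keeps the number of categories fixed, at the cost of occasional collisions between cells.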
Conclusion:
This is a lab on Coursera from Google Cloud Platform. If you would like to learn machine learning and GCP together, I highly recommend having a look at it.
This course is part of a Coursera series, and after you finish it you can share it on LinkedIn.