Lab 2 Create ML Workspace in Azure

Ksama Arora

Wrangle Data with Python in Azure ML - Lab 2

Step 1: Select workspace and launch Azure ML Studio

Step 2: Go to Notebooks and create a new notebook. A pre-deployed compute instance should already be running. When prompted, authenticate to the compute.

Step 3: Load a Dataframe

import pandas as pd
my_dataframe = pd.read_csv("https://raw.githubusercontent.com/pluralsight-cloud/DP-100-Designing-and-Implementing-a-Data-Science-Solution-on-Azure/main/MedicalClaimSummary.csv")
my_dataframe.head(1000)
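
Before wrangling, it can help to count the missing values in each column so you know what Steps 4 and 5 will change. The sketch below uses a small hypothetical DataFrame in place of the lab CSV. The "Amount" column and all row values are made up for illustration; only the two status column names come from this lab.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lab data; values are invented for illustration.
sample = pd.DataFrame({
    "Payment Status": ["Paid", np.nan, "Denied"],
    "Claim Network Status": ["In Network", "Out of Network", np.nan],
    "Amount": [120.0, np.nan, 75.5],  # hypothetical numeric column
})

# isna() marks each missing cell as True; sum() counts them per column.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Running the same `isna().sum()` call on my_dataframe in the notebook should show which columns contain NaN values.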

Step 4: Wrangle - Replace Missing Strings (replace NaN values)

my_dataframe.fillna(value={"Payment Status": "Unknown"}, inplace=True)
my_dataframe.fillna(value={"Claim Network Status": "Unknown"}, inplace=True)
my_dataframe.head(1000)
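
As a possible alternative, the dictionary passed to fillna can name several columns at once, so both replacements can happen in a single call. This is a minimal sketch on hypothetical sample data (the "Amount" column and the row values are assumptions). It returns a new DataFrame instead of using inplace=True.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the two status column names come from the lab.
sample = pd.DataFrame({
    "Payment Status": ["Paid", np.nan],
    "Claim Network Status": [np.nan, "In Network"],
    "Amount": [np.nan, 50.0],  # hypothetical numeric column
})

# Each dictionary key is a column name; each value replaces NaN in that column only.
filled = sample.fillna(value={
    "Payment Status": "Unknown",
    "Claim Network Status": "Unknown",
})
print(filled)
```

Columns that are not listed in the dictionary, such as "Amount" here, keep their NaN values.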

Step 5: Wrangle - Delete Rows with Any Missing Values

my_dataframe.dropna(inplace=True)
my_dataframe.head(1000)
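
Because dropna removes every row that still has at least one NaN, it may be worth checking how many rows were dropped. The following sketch shows one way to do that on hypothetical sample data (the "Amount" column and the values are assumptions).

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the middle row has a missing "Amount".
sample = pd.DataFrame({
    "Payment Status": ["Paid", "Unknown", "Denied"],
    "Amount": [120.0, np.nan, 75.5],
})

rows_before = len(sample)
cleaned = sample.dropna()  # default how="any": drop a row if any cell is NaN
rows_removed = rows_before - len(cleaned)
print(f"Removed {rows_removed} of {rows_before} rows")
```

Note that the surviving rows keep their original index labels, so the index can have gaps after this step.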

Step 6: Wrangle - Remove Duplicate Rows

my_dataframe.drop_duplicates(inplace=True)
my_dataframe.sort_index(inplace=True)
my_dataframe.head(1000)
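
To see what drop_duplicates does, the sketch below counts fully duplicated rows first and then removes them. The sample data and the "Claim ID" column are hypothetical. The reset_index call at the end is an optional extra that is not part of the lab steps.

```python
import pandas as pd

# Hypothetical sample: the third row repeats the first row exactly.
sample = pd.DataFrame({
    "Claim ID": [1, 2, 1, 3],
    "Amount": [10.0, 20.0, 10.0, 30.0],
})

# duplicated() flags rows that repeat an earlier row in every column.
duplicate_count = int(sample.duplicated().sum())

# drop_duplicates() keeps the first occurrence and preserves index labels.
deduplicated = sample.drop_duplicates()

# Optional: renumber the rows 0..n-1 and discard the old index labels.
renumbered = deduplicated.reset_index(drop=True)
print(duplicate_count, deduplicated.index.tolist(), renumbered.index.tolist())
```

By default only rows that match in every column count as duplicates; the subset parameter of drop_duplicates can narrow the comparison to specific columns.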

Step 7: Save the transformed data (refresh the file tree to see the new CSV file)

my_dataframe.to_csv("WrangledData.csv")
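
By default, to_csv also writes the DataFrame index as the first, unnamed column. If that column is not wanted in the saved file, passing index=False is one option. The sketch below compares both behaviors using hypothetical sample data and a temporary directory, so it does not touch the lab's output file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample with a gap in the index, as left behind by dropped rows.
sample = pd.DataFrame(
    {"Payment Status": ["Paid", "Unknown"], "Amount": [120.0, 75.5]},
    index=[0, 2],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "WrangledData.csv")

    sample.to_csv(path)               # default: the index is written too
    with_index = pd.read_csv(path)

    sample.to_csv(path, index=False)  # only the data columns are written
    without_index = pd.read_csv(path)

print(list(with_index.columns))
print(list(without_index.columns))
```

Whether to keep the index depends on how the file will be used later; the lab's original call keeps it.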