Ksama Arora

Creating a datastore for an Azure Blob Storage container:

Important note:

In SDK v2, the AzureBlobDatastore class is used to register a new blob storage container as a datastore. The register_azure_blob_container method belongs to SDK v1, not SDK v2.
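As a minimal sketch of the SDK v2 approach, the snippet below builds an AzureBlobDatastore entity and registers it through an MLClient. The function names and every argument value (datastore name, storage account, container, key) are hypothetical placeholders, and the example assumes account-key authentication and that the azure-ai-ml package is installed.

```python
def build_blob_datastore(name, account_name, container_name, account_key):
    """Build (but do not register) an SDK v2 blob datastore entity."""
    # Imported lazily so this module loads even where azure-ai-ml is absent.
    from azure.ai.ml.entities import AccountKeyConfiguration, AzureBlobDatastore

    return AzureBlobDatastore(
        name=name,
        description="Datastore pointing to a blob container",
        account_name=account_name,
        container_name=container_name,
        # Assumption: account-key authentication; other credential types exist.
        credentials=AccountKeyConfiguration(account_key=account_key),
    )


def register_blob_datastore(ml_client, datastore):
    """Register the datastore in the workspace that ml_client points to."""
    return ml_client.create_or_update(datastore)
```

With an authenticated MLClient called ml_client, usage would look like register_blob_datastore(ml_client, build_blob_datastore("blob_example", "<storage-account>", "<container>", "<account-key>")), where all four values are placeholders to replace with your own.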

Creating a Data Asset and Reading Data

When you create a data asset in Azure Machine Learning and point to a file or folder stored on your local device, a copy of the file or folder is uploaded to the default datastore (workspaceblobstore). The uploaded data can be found in the LocalUpload folder. This ensures that you can access the data from the Azure Machine Learning workspace, even if the original file or folder on your local device becomes unavailable.


1. URI File Data Asset

A URI file data asset points to a specific file (e.g., CSV, JSON). Azure ML stores only the path to the file, which means you can point to any type of file.
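A URI file data asset could be defined with the SDK v2 roughly as follows. This is a sketch, not a definitive recipe: the helper names, the asset name, and the file path are hypothetical, and it assumes the azure-ai-ml package is installed. The datastore_uri helper only formats the azureml:// style of path that Azure ML uses for files inside a registered datastore.

```python
def datastore_uri(datastore_name, relative_path):
    """Format a path to a file or folder inside a registered datastore."""
    return f"azureml://datastores/{datastore_name}/paths/{relative_path.lstrip('/')}"


def make_uri_file_asset(name, path, description=""):
    """Build a Data entity of type uri_file that points at `path`."""
    # Imported lazily so this module loads even where azure-ai-ml is absent.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data

    return Data(
        path=path,  # a local path, a datastore URI, or a public URL
        type=AssetTypes.URI_FILE,
        name=name,
        description=description,
    )


def register_data_asset(ml_client, data_asset):
    """Create or update the data asset in the workspace."""
    return ml_client.data.create_or_update(data_asset)
```

For example, make_uri_file_asset("diabetes-file", datastore_uri("workspaceblobstore", "data/diabetes.csv")) would describe a file already in the default datastore, whereas passing a local path would cause the file to be uploaded on registration.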

2. URI Folder Data Asset

A URI folder data asset points to a specific folder. It works similarly to the URI file data asset and supports the same paths.
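When a URI folder data asset is passed to a job as an input, the training script receives a path to the folder and has to find the files itself. The sketch below shows one way to do that using only the standard library. The --input_data argument name and the assumption that the folder contains CSV files with a header row are illustrative choices, not part of the original notes.

```python
import argparse
import csv
from pathlib import Path


def read_csv_rows(folder):
    """Return the data rows of every CSV file directly inside `folder`."""
    rows = []
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            # Each row becomes a dict keyed by the file's header row.
            rows.extend(csv.DictReader(handle))
    return rows


def main(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical argument name; the job definition decides what is passed.
    parser.add_argument("--input_data", required=True, help="path to the folder")
    args = parser.parse_args(argv)

    rows = read_csv_rows(args.input_data)
    print(f"Read {len(rows)} rows from {args.input_data}")
    return rows
```

The asset itself is defined like a URI file asset, except that the Data entity uses type=AssetTypes.URI_FOLDER and the path points to a folder rather than a single file.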

3. MLTable Data Asset

An MLTable data asset points to tabular data. You specify a schema definition that describes how to read the data. It is ideal when the schema of your data is complex or changes frequently, because you only need to update the schema in one location instead of everywhere the data is read.
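The schema definition lives in a YAML file named MLTable, stored in the same folder as the data. The sketch below writes a small MLTable file and then builds the matching data asset. The helper names, the file pattern, and the chosen read_delimited settings are assumptions for illustration, and the asset helper assumes the azure-ai-ml package is installed.

```python
from pathlib import Path

# Minimal MLTable definition: which files to read and how to parse them.
MLTABLE_TEMPLATE = """\
type: mltable
paths:
  - pattern: {pattern}
transformations:
  - read_delimited:
      delimiter: '{delimiter}'
      encoding: utf8
      header: all_files_same_headers
"""


def write_mltable_file(folder, pattern="./*.csv", delimiter=","):
    """Write an MLTable file into `folder` and return its path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    mltable_path = folder / "MLTable"
    mltable_path.write_text(
        MLTABLE_TEMPLATE.format(pattern=pattern, delimiter=delimiter)
    )
    return mltable_path


def make_mltable_asset(name, folder, description=""):
    """Build a Data entity of type mltable for the folder holding MLTable."""
    # Imported lazily so this module loads even where azure-ai-ml is absent.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data

    return Data(
        path=str(folder),
        type=AssetTypes.MLTABLE,
        name=name,
        description=description,
    )
```

Because the parsing rules sit in the MLTable file, a schema change means editing that one file. To read the table, the mltable package offers mltable.load(path) followed by to_pandas_dataframe(), assuming that package is installed in the environment.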

Important note:

To load data into an Azure ML table for ML training in the fewest steps possible, use multiple .txt files that contain data with a proper schema.