Init and add all required packages

Lillian Violet 2024-09-04 10:15:43 +02:00
commit 782aba19ba
53 changed files with 21896 additions and 0 deletions

@@ -0,0 +1,118 @@
# Azure ML Lesson 2 Lab
## 1. Set environment variables
1. Run VS Code in an Azure ML remote instance as shown before.
2. Press `File > Open Folder` and navigate to `azuremlpythonsdk-v2/` to open the exercise.
**IMPORTANT** Relative paths are assumed to be initialized from the `azuremlpythonsdk-v2` folder.
Open the file `initialize_constants.py`; there are three variables that should be updated:
- AZURE_WORKSPACE_NAME
- AZURE_RESOURCE_GROUP
- AZURE_SUBSCRIPTION_ID
Open your workspace at `https://ml.azure.com`. At the top right, select the workspace name, then copy the workspace name, the subscription ID, and the resource group name.
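The three values are plain strings, for example (illustrative values only, yours will differ):

```python
# Illustrative values only; copy the real ones from ml.azure.com
AZURE_WORKSPACE_NAME = "ws-example"
AZURE_RESOURCE_GROUP = "rg-example"
AZURE_SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
```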
## 2. Load a workspace
Open the file `ml_client.py` and understand how an ML client object is loaded or created. In this lab, the workspace was already created. Fill in the names of the variables from `initialize_constants.py`.
When finished, run this file and check that it runs without errors.
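If you get stuck, the general shape of the client creation is sketched below (not the exact lab code; the constant names come from `initialize_constants.py`):

```python
# Sketch: create an MLClient for an existing workspace
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

from initialize_constants import (
    AZURE_RESOURCE_GROUP,
    AZURE_SUBSCRIPTION_ID,
    AZURE_WORKSPACE_NAME,
)

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=AZURE_SUBSCRIPTION_ID,
    resource_group_name=AZURE_RESOURCE_GROUP,
    workspace_name=AZURE_WORKSPACE_NAME,
)
```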
## 3. Load a Compute Cluster
Open the file `compute_aml.py` and understand how a compute cluster is loaded or created. In this lab, the compute cluster was already created, but some variables, marked with `XXXX`, still need to be filled in.
When finished, run this file and check that it runs without errors.
What would happen if the compute cluster is not present?
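The usual pattern is get-or-create, sketched below with illustrative names (`ml_client` comes from the previous step); the `get` call raises if the cluster does not exist, and the `except` branch then creates one:

```python
# Sketch: load the compute cluster if it exists, otherwise create it
from azure.ai.ml.entities import AmlCompute

try:
    cpu_cluster = ml_client.compute.get("aml-compute")
except Exception:
    cpu_cluster = ml_client.compute.begin_create_or_update(
        AmlCompute(
            name="aml-compute",
            size="STANDARD_DS2_V2",
            min_instances=0,
            max_instances=1,
        )
    )
```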
## 4. Create a tabular dataset
Open the file `data_tabular.py`; several gaps, marked with `XXXX`, should be filled:
1. `ml_client = XXXXX()`
Hint: look into previous files.
2. How can you get the names of the datasets already registered in `if name_dataset not in [XXXXX for env in ml_client.data.list()]`?
Hint: Try to get one object from the class [Data](https://learn.microsoft.com/en-us/python/api/azure-ai-ml/azure.ai.ml.entities.data?view=azure-python) and check its attributes.
3. What should the `path` parameter be in `path=XXXXX`?
4. What input should you pass in `ml_client.data.create_or_update(XXXXX)`?
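For orientation, registering a local file as a URI data asset follows this general pattern (a sketch with illustrative names, not the lab solution — map them to the variables already defined in `data_tabular.py`):

```python
# Sketch: register a local CSV as a URI file data asset
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

my_data = Data(
    path="./some/local/file.csv",  # illustrative path
    type=AssetTypes.URI_FILE,
    name="my-dataset",
)
ml_client.data.create_or_update(my_data)
```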
When finished, run this file and check that it runs without errors.
## 5. Create and register an environment
Open the file `environment.py`; several gaps, marked with `XXXX`, should be filled:
1. `ml_client = XXXXX()`
Hint: look into previous files.
2. Which class should be used to register the environment?
Hint: Take a look [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-manage-environments-v2?tabs=python)
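The general pattern looks like this (a sketch with illustrative names; the conda file name is part of the exercise, so it is elided here):

```python
# Sketch: build and register an environment from a conda file
from azure.ai.ml.entities import Environment

env_docker_image = Environment(
    name="my-environment",
    conda_file="dependencies/<conda-file>.yml",  # fill in the real file name
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest",
)
ml_client.environments.create_or_update(env_docker_image)
```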
When finished, run this file and check that it runs without errors.
## 6. Train a model from a tabular dataset using a remote compute
Open the file `azml_01_experiment_remote_compute.py`; several gaps, marked with `XXXX`, should be filled:
1. `ml_client = XXXX()`
Hint: look into previous files.
2. Complete the `latest_version_dataset` definition.
Hint: Take a look [here](https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-azure-ml-in-a-day#deploy-the-model-to-the-endpoint)
3. Complete the `Input` part.
Hint: Take a look [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-read-write-data-v2?tabs=python)
4. Complete the `command` part.
Hint: Take a look [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-read-write-data-v2?tabs=python)
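Putting the hints together, a remote training job generally has this shape (a sketch with illustrative names; note the dataset version is pinned explicitly, since `@latest` doesn't work with dataset paths):

```python
# Sketch: a command job that consumes a registered data asset
from azure.ai.ml import Input, command
from azure.ai.ml.constants import AssetTypes

job = command(
    inputs=dict(
        data=Input(
            type=AssetTypes.URI_FILE,
            path="azureml:my-dataset:1",  # azureml:<name>:<version>
        ),
    ),
    code="./my_training_folder",
    command="python train.py --data ${{inputs.data}}",
    environment="my-environment@latest",
    compute="aml-compute",
)
ml_client.jobs.create_or_update(job)
```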
When finished, run this file and check that it runs without errors.
## 7. Tune hyperparameters using a remote compute
Open the file `azml_02_hyperparameters_tuning.py`; several gaps, marked with `XXXX`, should be filled. The hyperparameter search should be defined over the following space:
- `learning_rate`: one of the values 0.01, 0.1, 1.0
- `n_estimators`: one of the values 10, 100
Hint: Use the previous file as a template.
Hint: For the `Hyperdrive settings` format, look [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-sweep-in-pipeline)
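The sweep wrapper generally looks like this (a sketch; `job` is a command job like the one from the previous exercise, extended with `learning_rate` and `n_estimators` inputs):

```python
# Sketch: wrap a command job in a grid-sampling sweep
from azure.ai.ml.sweep import Choice

job_for_sweep = job(
    learning_rate=Choice(values=[0.01, 0.1, 1.0]),
    n_estimators=Choice(values=[10, 100]),
)
sweep_job = job_for_sweep.sweep(
    compute="aml-compute",
    sampling_algorithm="grid",
    primary_metric="AUC",  # must match a metric logged by the script
    goal="Maximize",
    max_total_trials=6,
    max_concurrent_trials=2,
)
ml_client.create_or_update(sweep_job)
```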
Open the file `diabetes_hyperdrive/diabetes_training.py`; several gaps, marked with `XXXX`, should be filled. A Gradient Boosting classification model should be trained, and the AUC and accuracy on the test set should be computed.
Hint: Use as a template the file `data/diabetes_training.py`.
When finished, run this file and check that it runs without errors.
## 8. Create a real-time inferencing service
Open the file `azml_03_realtime_inference.py`; several gaps, marked with `XXXX`, should be filled.
Hint: Take a look [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-mlflow-models?tabs=fromjob%2Cmir%2Csdk)
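At a high level, a managed online endpoint plus a deployment is created like this (a sketch with illustrative names; the model version is pinned explicitly, since `@latest` doesn't work with model paths):

```python
# Sketch: create an endpoint, deploy a model, route all traffic to it
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint

endpoint = ManagedOnlineEndpoint(name="my-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="my-endpoint",
    model="azureml:my-model:1",  # azureml:<name>:<version>
    instance_type="Standard_DS2_v2",
    instance_count=1,
)
ml_client.begin_create_or_update(deployment).result()

endpoint.traffic = {"blue": 100}
ml_client.begin_create_or_update(endpoint).result()
```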
When finished, run this file and check that it runs without errors.
## 9. Test the inference service
Open the file `azml_04_test_inference.py`; several gaps, marked with `XXXX`, should be filled.
Hint: Check [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-safely-rollout-online-endpoints?view=azureml-api-2&tabs=python)
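Scoring against the deployed endpoint boils down to one call (a sketch with illustrative endpoint and deployment names; the request file is already provided in the repository):

```python
# Sketch: send a JSON request file to the deployed endpoint
output = ml_client.online_endpoints.invoke(
    endpoint_name="my-endpoint",
    deployment_name="blue",
    request_file="./diabetes_test_inference/request.json",
)
print(output)
```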

azml_01_experiment_remote_compute.py

@@ -0,0 +1,70 @@
"""
Script to train a model from a tabular dataset using a remote compute
Based on:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn
"""
from azure.ai.ml import Input, command
from azure.ai.ml.constants import AssetTypes
from compute_aml import create_or_load_aml
from data_tabular import create_tabular_dataset, name_dataset
from environment import custom_env_name
from initialize_constants import AML_COMPUTE_NAME
from ml_client import create_or_load_ml_client
experiment_name = "mslearn-train-diabetes"
experiment_folder = "./diabetes_training"
script_name = "diabetes_training.py"
registered_model_name = "diabetes_model"
def main():
# 1. Create or Load a ML client
ml_client = XXXX()
# 2. Create compute resources
create_or_load_aml()
# 3. Create and register a File Dataset
create_tabular_dataset()
latest_version_dataset = next(
dataset.latest_version
for dataset in ml_client.data.XXXX
if dataset.name == name_dataset
)
# 4. Run Job
job = command(
inputs=dict(
script_name=script_name,
data=Input(
type=AssetTypes.URI_FILE,
# @latest doesn't work with dataset paths
path=XXXX,
),
registered_model_name=registered_model_name,
),
code=experiment_folder,
command=(
"python ${{inputs.script_name}}"
+ " --data XXXX"
+ " --registered_model_name XXXX"
),
environment=f"{custom_env_name}@latest",
compute=AML_COMPUTE_NAME,
experiment_name=experiment_name,
display_name=experiment_name,
)
# submit the command
returned_job = ml_client.jobs.create_or_update(job)
# stream the output and wait until the job is finished
ml_client.jobs.stream(returned_job.name)
# refresh the latest status of the job after streaming
returned_job = ml_client.jobs.get(name=returned_job.name)
if __name__ == "__main__":
main()

azml_02_hyperparameters_tuning.py

@@ -0,0 +1,113 @@
"""
Script to tune hyperparameters
Based on:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn
"""
from azure.ai.ml import Input, command
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model
from azure.ai.ml.sweep import Choice
from compute_aml import create_or_load_aml
from data_tabular import create_tabular_dataset, name_dataset
from environment import create_docker_environment, custom_env_name
from initialize_constants import AML_COMPUTE_NAME
from ml_client import create_or_load_ml_client
experiment_folder = "diabetes_hyperdrive"
experiment_name = "mslearn-diabetes-hyperdrive"
script_name = "diabetes_training.py"
registered_model_name = "diabetes_model_hyper"
best_model_name = "best_diabetes_model"
def main():
# 1. Create or Load a ML client
ml_client = XXXX()
# 2. Create compute resources
XXXX()
# 3. Create and register a File Dataset
XXXX()
latest_version_dataset = XXXX()
# 4. Environment
environment_names = [env.name for XXXX in ml_client.environments.list()]
if custom_env_name not in environment_names:
create_docker_environment()
# 5. Run Job
job_for_sweep = command(
inputs=dict(
script_name=script_name,
data=Input(
type=AssetTypes.URI_FILE,
# @latest doesn't work with dataset paths
path=f"azureml:{name_dataset}:{latest_version_dataset}",
),
registered_model_name=registered_model_name,
learning_rate=XXXX(values= XXXX),
n_estimators=XXXX(values=XXXX),
),
code=experiment_folder,
command=(
"python XXXX"
+ " --data XXXX"
+ " --registered_model_name XXXX"
+ " --learning_rate XXXX"
+ " --n_estimators XXXX"
),
environment=XXXX,
compute=AML_COMPUTE_NAME,
experiment_name=experiment_name,
display_name=experiment_name,
)
# Configure hyperdrive settings
sweep_job = job_for_sweep.XXXX(
compute=AML_COMPUTE_NAME,
sampling_algorithm="grid",
primary_metric="AUC",
goal="Maximize",
max_total_trials=6,
max_concurrent_trials=2,
)
# submit the command
returned_sweep_job = ml_client.create_or_update(sweep_job)
# stream the output and wait until the job is finished
ml_client.jobs.stream(returned_sweep_job.name)
# refresh the latest status of the job after streaming
returned_sweep_job = ml_client.jobs.get(name=returned_sweep_job.name)
# Find and register the best model
if returned_sweep_job.status == "Completed":
# First let us get the run which gave us the best result
best_run = returned_sweep_job.properties["best_child_run_id"]
# lets get the model from this run
model = Model(
# the script stores the model as the given name
path=(
f"azureml://jobs/{best_run}/outputs/artifacts/paths/"
+ f"{registered_model_name}/"
),
name=best_model_name,
type="mlflow_model",
)
# Register best model; this must happen inside the
# "Completed" branch, since `model` is only defined there
print(f"Registering Model {best_model_name}")
ml_client.models.XXXX(model=model)
else:
print(
f"Sweep job status: {returned_sweep_job.status}. \
Please wait until it completes"
)
if __name__ == "__main__":
main()

azml_03_realtime_inference.py

@@ -0,0 +1,49 @@
"""
Script to create a real-time inferencing service
Based on:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-mlflow-models
"""
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azml_02_hyperparameters_tuning import best_model_name
from initialize_constants import AZURE_WORKSPACE_NAME, VM_SIZE
from ml_client import create_or_load_ml_client
online_endpoint_name = ("srv-" + AZURE_WORKSPACE_NAME).lower()
def main():
# 1. Create or Load a ML client
ml_client = XXXX()
# 2. Create an endpoint
print(f"Creating endpoint {online_endpoint_name}")
endpoint = XXXX(
name=online_endpoint_name,
auth_mode="key",
)
# Method `result()` should be added to wait until completion
ml_client.online_endpoints.XXXX(endpoint).result()
# 3. Create a deployment
best_model_latest_version = XXXX
blue_deployment = XXXX(
name=online_endpoint_name,
endpoint_name=online_endpoint_name,
# @latest doesn't work with model paths
model=XXXX,
instance_type=VM_SIZE,
instance_count=1,
)
# Route all the traffic to this deployment
# Method `result()` should be added to wait until completion
ml_client.begin_create_or_update(blue_deployment).result()
endpoint.traffic = {online_endpoint_name: 100}
ml_client.begin_create_or_update(endpoint).result()
if __name__ == "__main__":
main()

azml_04_test_inference.py

@@ -0,0 +1,23 @@
"""
Script to use real-time inferencing with online endpoints
"""
from azml_03_realtime_inference import online_endpoint_name
from ml_client import create_or_load_ml_client
def main():
# 1. Create or Load a ML client
ml_client = XXXX()
# 2. Get predictions
output = ml_client.online_endpoints.XXXX(
endpoint_name=XXXX,
deployment_name=online_endpoint_name,
request_file="./diabetes_test_inference/request.json",
)
print(output)
if __name__ == "__main__":
main()

compute_aml.py

@@ -0,0 +1,63 @@
"""
Script to initialize an Azure Machine Learning compute cluster (aml)
"""
from azure.ai.ml.entities import AmlCompute
from initialize_constants import AML_COMPUTE_NAME, MAX_NODES, MIN_NODES, VM_SIZE
from ml_client import create_or_load_ml_client
def create_or_load_aml(
cpu_compute_target=AML_COMPUTE_NAME,
vm_size=VM_SIZE,
min_nodes=MIN_NODES,
max_nodes=MAX_NODES,
):
"""Create or load an Azure Machine Learning compute cluster (aml) in a
given Workspace.
Args:
cpu_compute_target: Name of the compute resource
vm_size: Virtual machine size, VM_SIZE is used as default,
for example STANDARD_D2_V2. Set to STANDARD_NC6 to get a GPU
min_nodes: Minimal number of nodes, MIN_NODES is used as default.
max_nodes: Maximal number of nodes, MAX_NODES is used as default.
Returns:
The created or loaded AmlCompute cluster.
"""
# Create or Load a Workspace
ml_client = create_or_load_ml_client()
try:
# let's see if the compute target already exists
cpu_cluster = ml_client.compute.get(XXXXX)
print(
f"You already have a cluster named {XXXXX},",
"we'll reuse it.",
)
except Exception:
print("Creating a new cpu compute target...")
cpu_cluster = AmlCompute(
name=cpu_compute_target,
# Azure ML Compute is the on-demand VM service
type="amlcompute",
# VM Family
size=vm_size,
# Minimum running nodes when there is no job running
min_instances=min_nodes,
# Nodes in cluster
max_instances=max_nodes,
# How many seconds the node will keep running after job termination
idle_time_before_scale_down=180,
# Dedicated or LowPriority.
# The latter is cheaper but there is a chance of job termination
tier="Dedicated",
)
# Now, we pass the object to MLClient's create_or_update method
cpu_cluster = ml_client.compute.begin_create_or_update(XXXXX)
return cpu_cluster
if __name__ == "__main__":
create_or_load_aml()

File diff suppressed because it is too large

data_tabular.py

@@ -0,0 +1,31 @@
"""
Script to create and register a file as a URI data asset
"""
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from ml_client import create_or_load_ml_client
name_dataset = "diabetes-dataset"
data_folder = "./data/diabetes.csv"
def create_tabular_dataset():
# 1. Create or Load a ML client
ml_client = XXXXX()
# 2. Add files
if name_dataset not in [XXXXX for env in ml_client.data.list()]:
tab_data_set = Data(
path=XXXXX,
type=AssetTypes.URI_FILE,
name=name_dataset,
)
ml_client.data.create_or_update(XXXXX)
else:
print("Dataset already registered.")
if __name__ == "__main__":
create_tabular_dataset()

@@ -0,0 +1,11 @@
name: model-env
dependencies:
- python=3.8
- scikit-learn
- pandas
- numpy
- matplotlib
- pip
- pip:
- mlflow
- azureml-mlflow

diabetes_hyperdrive/diabetes_training.py

@@ -0,0 +1,123 @@
# Import libraries
import argparse
import os
import mlflow
import mlflow.sklearn
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
def main():
"""Main function of the script."""
# Input and output arguments
# Get script arguments
parser = XXXX()
# Input dataset
parser.add_argument(
"XXXX",
type=str,
help="path to input data",
)
# Model name
parser.add_argument("XXXX", type=str, help="model name")
# Hyperparameters
parser.add_argument(
"XXXX",
type=float,
dest="learning_rate",
default=0.1,
help="learning rate",
)
parser.add_argument(
"XXXX",
type=int,
dest="n_estimators",
default=100,
help="number of estimators",
)
# Add arguments to args collection
args = parser.parse_args()
print(" ".join(f"{k}={v}" for k, v in vars(args).items()))
# Start Logging
mlflow.XXXX()
# enable autologging
mlflow.XXXX()
# load the diabetes data (passed as an input dataset)
print("input data:", args.data)
diabetes = pd.read_csv(args.data)
# Separate features and labels
X, y = (
diabetes[
[
"Pregnancies",
"PlasmaGlucose",
"DiastolicBloodPressure",
"TricepsThickness",
"SerumInsulin",
"BMI",
"DiabetesPedigree",
"Age",
]
].values,
diabetes["Diabetic"].values,
)
# Split data into training set and test set
X_train, X_test, y_train, y_test = XXXX(
X, y, test_size=0.30, random_state=0
)
# Train a Gradient Boosting classification model
# with the specified hyperparameters
print("Training a classification model")
model = XXXX(
learning_rate=XXXX, n_estimators=XXXX
).fit(X_train, y_train)
# calculate accuracy
y_hat = model.XXXX(X_test)
accuracy = np.average(y_hat == y_test)
print("Accuracy:", accuracy)
mlflow.log_metric("Accuracy", float(accuracy))
# calculate AUC
y_scores = model.XXXX(X_test)
auc = roc_auc_score(y_test, y_scores[:, 1])
print("AUC: " + str(auc))
mlflow.log_metric("AUC", float(auc))
# Registering the model to the workspace
print("Registering the model via MLFlow")
mlflow.XXXX(
sk_model=model,
registered_model_name=args.registered_model_name,
artifact_path=args.registered_model_name,
)
# Saving the model to a file
mlflow.sklearn.save_model(
sk_model=model,
path=os.path.join(args.registered_model_name, "trained_model"),
)
# Stop Logging
mlflow.XXXX()
if __name__ == "__main__":
main()

diabetes_test_inference/request.json

@@ -0,0 +1,4 @@
{"input_data": [
[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22],
[0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45]
]}

diabetes_training/diabetes_training.py

@@ -0,0 +1,115 @@
# Import libraries
import argparse
import os
import matplotlib.pyplot as plt
import mlflow
import mlflow.sklearn
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
def main():
"""Main function of the script."""
# Input and output arguments
# Get script arguments
parser = argparse.ArgumentParser()
parser.add_argument(
"--data",
type=str,
help="path to input data",
)
parser.add_argument("--registered_model_name", type=str, help="model name")
args = parser.parse_args()
print(" ".join(f"{k}={v}" for k, v in vars(args).items()))
# Start Logging
mlflow.start_run()
# enable autologging
mlflow.sklearn.autolog()
# load the diabetes data (passed as an input dataset)
print("input data:", args.data)
diabetes = pd.read_csv(args.data)
mlflow.log_metric("num_samples", diabetes.shape[0])
mlflow.log_metric("num_features", diabetes.shape[1] - 1)
# Separate features and labels
X, y = (
diabetes[
[
"Pregnancies",
"PlasmaGlucose",
"DiastolicBloodPressure",
"TricepsThickness",
"SerumInsulin",
"BMI",
"DiabetesPedigree",
"Age",
]
].values,
diabetes["Diabetic"].values,
)
# Split data into training set and test set
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.30, random_state=0
)
# Train a decision tree model
print("Training a decision tree model")
model = DecisionTreeClassifier().fit(X_train, y_train)
# calculate accuracy
y_hat = model.predict(X_test)
accuracy = np.average(y_hat == y_test)
print("Accuracy:", accuracy)
mlflow.log_metric("Accuracy", float(accuracy))
# calculate AUC
y_scores = model.predict_proba(X_test)
auc = roc_auc_score(y_test, y_scores[:, 1])
print("AUC: " + str(auc))
mlflow.log_metric("AUC", float(auc))
# plot ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_scores[:, 1])
fig = plt.figure(figsize=(6, 4))
# Plot the diagonal 50% line
plt.plot([0, 1], [0, 1], "k--")
# Plot the FPR and TPR achieved by our model
plt.plot(fpr, tpr)
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
fig.savefig("ROC.png")
mlflow.log_artifact("ROC.png")
plt.show()
# Registering the model to the workspace
print("Registering the model via MLFlow")
mlflow.sklearn.log_model(
sk_model=model,
registered_model_name=args.registered_model_name,
artifact_path=args.registered_model_name,
)
# Saving the model to a file
mlflow.sklearn.save_model(
sk_model=model,
path=os.path.join(args.registered_model_name, "trained_model"),
)
# Stop Logging
mlflow.end_run()
if __name__ == "__main__":
main()

environment.py

@@ -0,0 +1,33 @@
"""
Script to create and register an environment including scikit-learn
"""
import os
from azure.ai.ml.entities import Environment
from ml_client import create_or_load_ml_client
dependencies_dir = "./dependencies"
custom_env_name = "custom-scikit-learn"
def create_docker_environment():
# 1. Create or Load a ML client
ml_client = XXXXX()
# 2. Create a Python environment for the experiment
env_docker_image = XXXXX(
name=custom_env_name,
conda_file=os.path.join(dependencies_dir, "XXXXX"),
image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest",
)
ml_client.environments.create_or_update(env_docker_image)
print(
f"Environment with name {env_docker_image.name} is registered to the workspace,",
f"the environment version is {env_docker_image.version}"
)
if __name__ == "__main__":
create_docker_environment()

initialize_constants.py

@@ -0,0 +1,23 @@
"""
Script to initialize global constants
"""
import os
# Global constants can be set via environmental variables
# Remove default values in production
AZURE_RESOURCE_GROUP = os.getenv("AZURE_RESOURCE_GROUP", "itvitae-azure-ml")
AZURE_SUBSCRIPTION_ID = os.getenv(
"AZURE_SUBSCRIPTION_ID", "34faeead-244d-4ae8-8194-1eeaaffaf5be"
)
AZURE_WORKSPACE_NAME = os.getenv(
"AZURE_WORKSPACE_NAME",
"ws-kevin-heimbach",
)
AZURE_LOCATION = os.getenv("AZURE_LOCATION", "westeurope")
# Choose names for your clusters
AML_COMPUTE_NAME = os.getenv("AML_COMPUTE_NAME", "aml-compute")
# General Servers Characteristics
VM_SIZE = os.getenv("VM_SIZE", "STANDARD_DS2_V2")
MIN_NODES = int(os.getenv("MIN_NODES", 0))
MAX_NODES = int(os.getenv("MAX_NODES", 1))
AGENT_COUNT = int(os.getenv("AGENT_COUNT", 2))

ml_client.py

@@ -0,0 +1,46 @@
"""
Script to initialize MLClient object
"""
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from initialize_constants import (
AZURE_RESOURCE_GROUP,
AZURE_SUBSCRIPTION_ID,
AZURE_WORKSPACE_NAME,
)
def create_or_load_ml_client():
"""Create or load an Azure ML Client based on env variables.
Args:
None since information is taken from global constants
defined in initialize_constants.py.
Returns:
An MLClient handle to the workspace.
"""
try:
credential = DefaultAzureCredential()
# Check if given credential can get token successfully.
credential.get_token("https://management.azure.com/.default")
except Exception as ex:
# Fall back to InteractiveBrowserCredential
# in case DefaultAzureCredential not working
print(ex)
credential = InteractiveBrowserCredential()
# Get a handle to the workspace.
# You can find the info on the workspace tab on ml.azure.com
ml_client = MLClient(
credential=credential,
subscription_id=XXXXX,
resource_group_name=XXXXX,
workspace_name=XXXXX,
)
return ml_client
if __name__ == "__main__":
ml_client = create_or_load_ml_client()
print(ml_client)

@@ -0,0 +1,37 @@
[flake8]
ignore = E203, W503
max-line-length = 99
max-complexity = 18
select = B,C,E,F,W,T4
[isort]
multi_line_output=3
include_trailing_comma=True
force_grid_wrap=0
use_parentheses=True
ensure_newline_before_comments=True
line_length=99
[mypy]
files=refactor,tests
ignore_missing_imports=True
[coverage:run]
source = refactor
[coverage:report]
exclude_lines =
# exclude pragma again
pragma: no cover
# exclude main
if __name__ == .__main__.:
[coverage:html]
directory = coverage
[coverage:xml]
output = coverage.xml
[tool:pytest]
testpaths=tests/