DP-100 Microsoft Designing and Implementing a Data Science Solution on Azure Free Practice Exam Questions (2025 Updated)

Prepare effectively for your Microsoft DP-100 Designing and Implementing a Data Science Solution on Azure certification with our extensive collection of free, high-quality practice questions. Each question is designed to mirror the actual exam format and objectives, complete with comprehensive answers and detailed explanations. Our materials are regularly updated for 2025, ensuring you have the most current resources to build confidence and succeed on your first attempt.

Page: 2 / 3
Total 476 questions

You need to select a feature extraction method.

Which method should you use?

A.

Mutual information

B.

Mood’s median test

C.

Kendall correlation

D.

Permutation Feature Importance
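For readers unfamiliar with the first option, the following is a minimal sketch of how mutual information between a discrete feature and a label can be computed. It uses only the Python standard library; the function name and the toy data are hypothetical, and it is not meant to indicate which option the exam expects.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of the feature
    py = Counter(ys)            # marginal counts of the label
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical toy data: one feature copies the label, the other is unrelated.
label       = [0, 0, 1, 1, 0, 0, 1, 1]
informative = [0, 0, 1, 1, 0, 0, 1, 1]
unrelated   = [0, 1, 0, 1, 0, 1, 0, 1]

mi_informative = mutual_information(informative, label)
mi_unrelated = mutual_information(unrelated, label)
print(mi_informative, mi_unrelated)
```

A feature that fully determines a balanced binary label scores log(2) nats, while a statistically independent feature scores zero.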

You need to implement early stopping criteria as stated in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

You need to configure the Edit Metadata module so that the structures of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.
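Outside the Studio modules, the idea of quantifying outliers can be sketched in a few lines. The example below assumes Tukey's 1.5 × IQR rule, which is the rule box plots commonly use to flag points; the ages are made-up values and the helper name is hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical Age column with one implausible entry.
ages = [23, 25, 27, 29, 31, 33, 35, 37, 39, 120]
outliers = iqr_outliers(ages)
print(len(outliers), outliers)
```

Counting the flagged values before removing them gives the "quantify" half of the requirement; a box plot of the same column gives the visual half.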

Your team is building a data engineering and data science development environment.

The environment must support the following requirements:

support Python and Scala

compose data storage, movement, and processing services into automated data pipelines

the same tool should be used for the orchestration of both data engineering and data science

support workload isolation and interactive workloads

enable scaling across a cluster of machines

You need to create the environment.

What should you do?

A.

Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.

B.

Build the environment in Azure Databricks and use Azure Data Factory for orchestration.

C.

Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.

D.

Build the environment in Azure Databricks and use Azure Container Instances for orchestration.

You are developing a machine learning model.

You must run inference with the machine learning model for testing.

You need to use a minimal cost compute target.

Which two compute targets should you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A.

Local web service

B.

Remote VM

C.

Azure Databricks

D.

Azure Machine Learning Kubernetes

E.

Azure Container Instances

You download a .csv file from a notebook in an Azure Machine Learning workspace to a data/sample.csv folder on a compute instance. The file contains 10,000 records. You must generate the summary statistics for the data in the file. The statistics must include the following for each numerical column:

• number of non-empty values

• average value

• standard deviation

• minimum and maximum values

• 25th, 50th, and 75th percentiles

You need to complete the Python code that will generate the summary statistics.

Which code segments should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
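The statistics listed above are the ones that pandas reports by default from `DataFrame.describe()` for numeric columns. To show what each figure means without depending on pandas, here is a standard-library sketch; the `describe` helper and the sample column are hypothetical and are not the exam's answer code.

```python
import statistics

def describe(values):
    """Summary statistics for one numeric column; None marks an empty cell."""
    present = [v for v in values if v is not None]
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "count": len(present),               # number of non-empty values
        "mean": statistics.mean(present),    # average value
        "std": statistics.stdev(present),    # sample standard deviation
        "min": min(present),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(present),
    }

# Hypothetical column with one missing value.
column = [10, 20, None, 30, 40, 50]
summary = describe(column)
print(summary)
```

The sketch uses the sample standard deviation and linearly interpolated percentiles, which are also the pandas defaults.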

You manage an Azure AI Foundry project.

You deploy a large language model from the model catalog.

You need to manually evaluate the model, collect the statistics, and be able to review the results later.

You have an Azure Machine Learning workspace named Workspace1. Workspace1 has a registered MLflow model named model1 with PyFunc flavor. You plan to deploy model1 to an online endpoint named endpoint1 without egress connectivity by using the Azure Machine Learning Python SDK v2. You have the following code:

You need to add a parameter to the ManagedOnlineDeployment object to ensure the model deploys successfully.

Solution: Add the code_path parameter.

Does the solution meet the goal?

A.

Yes

B.

No

You create a multi-class image classification deep learning model that uses a set of labeled images. You create a script file named train.py that uses the PyTorch 1.3 framework to train the model.

You must run the script by using an estimator. The code must not require any additional Python libraries to be installed in the environment for the estimator. The time required for model training must be minimized.

You need to define the estimator that will be used to run the script.

Which estimator type should you use?

A.

TensorFlow

B.

PyTorch

C.

SKLearn

D.

Estimator

You have an Azure Machine Learning workspace.

You plan to tune a model hyperparameter when you train the model.

You need to define a search space that returns a normally distributed value.

Which parameter should you use?

A.

QUniform

B.

LogUniform

C.

Uniform

D.

QLogNormal
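To make the four option names concrete, the sketch below samples from each distribution using only the standard library. The formulas reflect the definitions as commonly documented for Azure Machine Learning hyperparameter search spaces (for example, QLogNormal as `round(exp(normal(mu, sigma)) / q) * q`); treat them as an assumption rather than the SDK's implementation, and the function names here are hypothetical stand-ins.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def uniform(low, high):
    """Value drawn uniformly between low and high."""
    return rng.uniform(low, high)

def quniform(low, high, q):
    """Uniform draw rounded to the nearest multiple of q."""
    return round(rng.uniform(low, high) / q) * q

def loguniform(low, high):
    """exp of a uniform draw, so the logarithm of the result is uniform."""
    return math.exp(rng.uniform(low, high))

def qlognormal(mu, sigma, q):
    """exp of a normal draw, rounded to the nearest multiple of q."""
    return round(math.exp(rng.gauss(mu, sigma)) / q) * q

samples = {
    "uniform": [uniform(0, 10) for _ in range(100)],
    "quniform": [quniform(0, 100, 5) for _ in range(100)],
    "loguniform": [loguniform(-3, 0) for _ in range(100)],
    "qlognormal": [qlognormal(3, 1, 5) for _ in range(100)],
}
print({name: values[:3] for name, values in samples.items()})
```

Only the last function draws from a normal distribution internally; the other three start from a uniform draw.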

You use Azure Machine Learning designer to create a real-time service endpoint. You have a single Azure Machine Learning service compute resource. You train the model and prepare the real-time pipeline for deployment. You need to publish the inference pipeline as a web service. Which compute type should you use?

A.

HDInsight

B.

Azure Databricks

C.

Azure Kubernetes Service

D.

the existing Machine Learning Compute resource

E.

a new Machine Learning Compute resource

You are creating a new Azure Machine Learning pipeline using the designer.

The pipeline must train a model using data in a comma-separated values (CSV) file that is published on a website. You have not created a dataset for this file.

You need to ingest the data from the CSV file into the designer pipeline using the minimal administrative effort.

Which module should you add to the pipeline in Designer?

A.

Convert to CSV

B.

Enter Data Manually

C.

Import Data

D.

Dataset

You have an Azure Machine Learning workspace named Workspace1. Workspace1 has a registered MLflow model named model1 with PyFunc flavor.

You plan to deploy model1 to an online endpoint named endpoint1 without egress connectivity by using the Azure Machine Learning Python SDK v2.

You have the following code:

You need to add a parameter to the ManagedOnlineDeployment object to ensure the model deploys successfully.

Solution: Add the environment parameter.

Does the solution meet the goal?

A.

Yes

B.

No

You are building a binary classification model by using a supplied training set.

The training set is imbalanced between two classes.

You need to resolve the data imbalance.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A.

Penalize the classification

B.

Resample the data set using undersampling or oversampling.

C.

Generate synthetic samples in the minority class.

D.

Use accuracy as the evaluation metric of the model.

E.

Normalize the training feature set.
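As an illustration of the resampling idea mentioned in the options, here is a minimal sketch of random oversampling: minority-class rows are duplicated at random until every class has as many rows as the largest one. The helper name and data are hypothetical, and real projects would more likely use a dedicated library.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical imbalanced training set: eight rows of class 0, two of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [2.0], [2.1]]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Undersampling works the other way around, discarding majority rows; synthetic-sample methods create new minority rows instead of copying existing ones.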

You manage an Azure Machine Learning workspace. You design a training job that is configured with a serverless compute. The serverless compute must have a specific instance type and count.

You need to configure the serverless compute by using Azure Machine Learning Python SDK v2. What should you do?

A.

Specify the compute name by using the compute parameter of the command job

B.

Configure the tier parameter to Dedicated VM.

C.

Initialize and specify the ResourceConfiguration class

D.

Initialize the AmlCompute class with size and type specification.

A set of CSV files contains sales records. All the CSV files have the same data schema.

Each CSV file contains the sales record for a particular month and has the filename sales.csv. Each file is stored in a folder that indicates the month and year when the data was recorded. The folders are in an Azure blob container for which a datastore has been defined in an Azure Machine Learning workspace. The folders are organized in a parent folder named sales to create the following hierarchical structure:

At the end of each month, a new folder with that month’s sales file is added to the sales folder.

You plan to use the sales data to train a machine learning model based on the following requirements:

You must define a dataset that loads all of the sales data to date into a structure that can be easily converted to a dataframe.

You must be able to create experiments that use only data that was created before a specific previous month, ignoring any data that was added after that month.

You must register the minimum number of datasets possible.

You need to register the sales data as a dataset in Azure Machine Learning service workspace.

What should you do?

A.

Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset each month, replacing the existing dataset and specifying a tag named month indicating the month and year it was registered. Use this dataset for all experiments.

B.

Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv', register the dataset with the name sales_dataset and a tag named month indicating the month and year it was registered, and use this dataset for all experiments.

C.

Create a new tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset_MM-YYYY each month with appropriate MM and YYYY values for the month and year. Use the appropriate month-specific dataset for experiments.

D.

Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file. Register the dataset with the name sales_dataset each month as a new version and with a tag named month indicating the month and year it was registered. Use this dataset for all experiments, identifying the version to be used based on the month tag as necessary.
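Several options hinge on the difference between listing each file explicitly and using the wildcard path 'sales/*/sales.csv'. The sketch below shows how such a wildcard picks up every monthly folder, using the standard library's `fnmatch` as a stand-in; the folder names are hypothetical, and the datastore's own globbing rules may differ in detail.

```python
from fnmatch import fnmatchcase

# Hypothetical blob paths following the sales/mm-yyyy/sales.csv layout.
blob_paths = [
    "sales/01-2019/sales.csv",
    "sales/02-2019/sales.csv",
    "sales/03-2019/sales.csv",
    "sales/readme.txt",  # unrelated file that should not be matched
]

pattern = "sales/*/sales.csv"
matched = [path for path in blob_paths if fnmatchcase(path, pattern)]
print(matched)
```

A folder added next month would match the same pattern without any change to the path definition, whereas an explicit file list would have to be edited each time.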

Copyright © 2014-2025 Solution2Pass. All Rights Reserved