Professional-Machine-Learning-Engineer Google Professional Machine Learning Engineer Free Practice Exam Questions (2026 Updated)

Prepare effectively for your Google Professional-Machine-Learning-Engineer Google Professional Machine Learning Engineer certification with our extensive collection of free, high-quality practice questions. Each question is designed to mirror the actual exam format and objectives, complete with comprehensive answers and detailed explanations. Our materials are regularly updated for 2026, ensuring you have the most current resources to build confidence and succeed on your first attempt.

Google Professional-Machine-Learning-Engineer Premium Access Download Demo

Page: 2 / 5
Total 296 questions

Question # 26

You have trained an XGBoost model that you plan to deploy on Vertex Al for online prediction. You are now uploading your model to Vertex Al Model Registry, and you need to configure the explanation method that will serve online prediction requests to be returned with minimal latency. You also want to be alerted when feature attributions of the model meaningfully change over time. What should you do?

1 Specify sampled Shapley as the explanation method with a path count of 5.

2 Deploy the model to Vertex Al Endpoints.

3. Create a Model Monitoring job that uses prediction drift as the monitoring objective.

1 Specify Integrated Gradients as the explanation method with a path count of 5.

2 Deploy the model to Vertex Al Endpoints.

3. Create a Model Monitoring job that uses prediction drift as the monitoring objective.

1. Specify sampled Shapley as the explanation method with a path count of 50.

2. Deploy the model to Vertex Al Endpoints.

3. Create a Model Monitoring job that uses training-serving skew as the monitoring objective.

1 Specify Integrated Gradients as the explanation method with a path count of 50.

2. Deploy the model to Vertex Al Endpoints.

3 Create a Model Monitoring job that uses training-serving skew as the monitoring objective.

Explanation:

Sampled Shapley is a fast and scalable approximation of the Shapley value, which is a game-theoretic concept that measures the contribution of each feature to the model prediction. Sampled Shapley is suitable for online prediction requests, as it can return feature attributions with minimal latency. The path count parameter controls the number of samples used to estimate the Shapley value, and a lower value means faster computation. Integrated Gradients is another explanation method that computes the average gradient along the path from a baseline input to the actual input. Integrated Gradients is more accurate than Sampled Shapley, but also more computationally intensive. Therefore, it is not recommended for online prediction requests, especially with a high path count. Prediction drift is the change in the distribution of feature values or labels over time. It can affect the performance and accuracy of the model, and may require retraining or redeploying the model. Vertex AI Model Monitoring allows you to monitor prediction drift on your deployed models and endpoints, and set up alerts and notifications when the drift exceeds a certain threshold. You can specify an email address to receive the notifications, and use the information to retrigger the training pipeline and deploy an updated version of your model. This is the most direct and convenient way to achieve your goal. Training-serving skew is the difference between the data used for training the model and the data used for serving the model. It can also affect the performance and accuracy of the model, and may indicate data quality issues or model staleness. Vertex AI Model Monitoring allows you to monitor training-serving skew on your deployed models and endpoints, and set up alerts and notifications when the skew exceeds a certain threshold. However, this is not relevant to the question, as the question is about the feature attributions of the model, not the data distribution. References :

Vertex AI: Explanation methods

Vertex AI: Configuring explanations

Vertex AI: Monitoring prediction drift

Vertex AI: Monitoring training-serving skew

Question # 27

You are developing a process for training and running your custom model in production. You need to be able to show lineage for your model and predictions. What should you do?

1 Create a Vertex Al managed dataset

2 Use a Vertex Ai training pipeline to train your model

3 Generate batch predictions in Vertex Al

1 Use a Vertex Al Pipelines custom training job component to train your model

2. Generate predictions by using a Vertex Al Pipelines model batch predict component

1 Upload your dataset to BigQuery

2. Use a Vertex Al custom training job to train your model

3 Generate predictions by using Vertex Al SDK custom prediction routines

1 Use Vertex Al Experiments to train your model.

2 Register your model in Vertex Al Model Registry

3. Generate batch predictions in Vertex Al

Question # 28

You developed a custom model by using Vertex Al to predict your application ' s user churn rate You are using Vertex Al Model Monitoring for skew detection The training data stored in BigQuery contains two sets of features - demographic and behavioral You later discover that two separate models trained on each set perform better than the original model

You need to configure a new model mentioning pipeline that splits traffic among the two models You want to use the same prediction-sampling-rate and monitoring-frequency for each model You also want to minimize management effort What should you do?

Keep the training dataset as is Deploy the models to two separate endpoints and submit two Vertex Al Model Monitoring jobs with appropriately selected feature-thresholds parameters

Keep the training dataset as is Deploy both models to the same endpoint and submit a Vertex Al Model Monitoring job with a monitoring-config-from parameter that accounts for the model IDs and feature selections

Separate the training dataset into two tables based on demographic and behavioral features Deploy the models to two separate endpoints, and submit two Vertex Al Model Monitoring jobs

Separate the training dataset into two tables based on demographic and behavioral features. Deploy both models to the same endpoint and submit a Vertex Al Model Monitoring job with a monitoring-config-from parameter that accounts for the model IDs and training datasets

Question # 29

You are creating a model training pipeline to predict sentiment scores from text-based product reviews. You want to have control over how the model parameters are tuned, and you will deploy the model to an endpoint after it has been trained You will use Vertex Al Pipelines to run the pipeline You need to decide which Google Cloud pipeline components to use What components should you choose?

Question # 30

Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers1 account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?

1. Create a Pub/Sub topic for each user

2 Deploy a Cloud Function that sends a notification when your model predicts that a user ' s account balance will drop below the $25 threshold.

1. Create a Pub/Sub topic for each user

2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that

a user ' s account balance will drop below the $25 threshold

1. Build a notification system on Firebase

2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold

1 Build a notification system on Firebase

2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user ' s account balance will drop below the $25 threshold

Question # 31

You work for a delivery company. You need to design a system that stores and manages features such as parcels delivered and truck locations over time. The system must retrieve the features with low latency and feed those features into a model for online prediction. The data science team will retrieve historical data at a specific point in time for model training. You want to store the features with minimal effort. What should you do?

Store features in Bigtable as key/value data.

Store features in Vertex Al Feature Store.

Store features as a Vertex Al dataset and use those features to tram the models hosted in Vertex Al endpoints.

Store features in BigQuery timestamp partitioned tables, and use the BigQuery Storage Read API to serve the features.

Question # 32

You are developing an ML model in a Vertex Al Workbench notebook. You want to track artifacts and compare models during experimentation using different approaches. You need to rapidly and easily transition successful experiments to production as you iterate on your model implementation. What should you do?

1 Initialize the Vertex SDK with the name of your experiment Log parameters and metrics for each experiment, and attach dataset and model artifacts as inputs and outputs to each execution.

2 After a successful experiment create a Vertex Al pipeline.

1. Initialize the Vertex SDK with the name of your experiment Log parameters and metrics for each experiment, save your dataset to a Cloud Storage bucket and upload the models to Vertex Al Model Registry.

2 After a successful experiment create a Vertex Al pipeline.

1 Create a Vertex Al pipeline with parameters you want to track as arguments to your Pipeline Job Use the Metrics. Model, and Dataset artifact types from the Kubeflow Pipelines DSL as the inputs and outputs of the components in your pipeline.

2. Associate the pipeline with your experiment when you submit the job.

1 Create a Vertex Al pipeline Use the Dataset and Model artifact types from the Kubeflow Pipelines. DSL as the inputs and outputs of the components in your pipeline.

2. In your training component use the Vertex Al SDK to create an experiment run Configure the log_params and log_metrics functions to track parameters and metrics of your experiment.

Explanation:

Vertex AI is a unified platform for building and managing machine learning solutions on Google Cloud. It provides various services and tools for different stages of the machine learning lifecycle, such as data preparation, model training, deployment, monitoring, and experimentation. Vertex AI Workbench is an integrated development environment (IDE) that allows you to create and run Jupyter notebooks on Google Cloud. You can use Vertex AI Workbench to develop your ML model in Python, using libraries such as TensorFlow, PyTorch, scikit-learn, etc. You can also use the Vertex SDK, which is a Python client library for Vertex AI, to track artifacts and compare models during experimentation. You can use the aiplatform.init function to initialize the Vertex SDK with the name of your experiment. You can use the aiplatform.start_run and aiplatform.end_run functions to create and close an experiment run. You can use the aiplatform.log_params and aiplatform.log_metrics functions to log the parameters and metrics for each experiment run. You can also use the aiplatform.log_datasets and aiplatform.log_model functions to attach the dataset and model artifacts as inputs and outputs to each experiment run. These functions allow you to record and store the metadata and artifacts of your experiments, and compare them using the Vertex AI Experiments UI. After a successful experiment, you can create a Vertex AI pipeline, which is a way to automate and orchestrate your ML workflows. You can use the aiplatform.PipelineJob class to create a pipeline job, and specify the components and dependencies of your pipeline. You can also use the aiplatform.CustomContainerTrainingJob class to create a custom container training job, and use the run method to run the job as a pipeline component. You can use the aiplatform.Model.deploy method to deploy your model as a pipeline component. You can also use the aiplatform.Model.monitor method to monitor your model as a pipeline component. By creating a Vertex AI pipeline, you can rapidly and easily transition successful experiments to production, and reuse and share your ML workflows. This solution requires minimal changes to your code, and leverages the Vertex AI services and tools to streamline your ML development process. References : The answer can be verified from official Google Cloud documentation and resources related to Vertex AI, Vertex AI Workbench, Vertex SDK, and Vertex AI pipelines.

Vertex AI | Google Cloud

Vertex AI Workbench | Google Cloud

Vertex SDK for Python | Google Cloud

Vertex AI pipelines | Google Cloud

Question # 33

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

Create a Vertex Al Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

Create a Vertex Al Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

Create a Vertex Al Workbench user-managed notebook on a Dataproc Hub. and use the %%bigquery magic commands in Jupyter to query the tables.

Create a Vertex Al Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.

Question # 34

You need to deploy a scikit-learn classification model to production. The model must be able to serve requests 24/7 and you expect millions of requests per second to the production application from 8 am to 7 pm. You need to minimize the cost of deployment What should you do?

Deploy an online Vertex Al prediction endpoint Set the max replica count to 1

Deploy an online Vertex Al prediction endpoint Set the max replica count to 100

Deploy an online Vertex Al prediction endpoint with one GPU per replica Set the max replica count to 1.

Deploy an online Vertex Al prediction endpoint with one GPU per replica Set the max replica count to 100.

Explanation:

The best option for deploying a scikit-learn classification model to production is to deploy an online Vertex AI prediction endpoint and set the max replica count to 100. This option allows you to leverage the power and scalability of Google Cloud to serve requests 24/7 and handle millions of requests per second. Vertex AI is a unified platform for building and deploying machine learning solutions on Google Cloud. Vertex AI can deploy a trained scikit-learn model to an online prediction endpoint, which can provide low-latency predictions for individual instances. An online prediction endpoint consists of one or more replicas, which are copies of the model that run on virtual machines. The max replica count is a parameter that determines the maximum number of replicas that can be created for the endpoint. By setting the max replica count to 100, you can enable the endpoint to scale up to 100 replicas when the traffic increases, and scale down to zero replicas when the traffic decreases. This can help minimize the cost of deployment, as you only pay for the resources that you use. Moreover, you can use the autoscaling algorithm option to optimize the scaling behavior of the endpoint based on the latency and utilization metrics 1 .

The other options are not as good as option B, for the following reasons:

Option A: Deploying an online Vertex AI prediction endpoint and setting the max replica count to 1 would not be able to serve requests 24/7 and handle millions of requests per second. Setting the max replica count to 1 would limit the endpoint to only one replica, which can cause performance issues and service disruptions when the traffic increases. Moreover, se tting the max replica count to 1 would prevent the endpoint from scaling down to zero replicas when the traffic decreases, which can increase the cost of deployment, as you pay for the resources that you do not use 1 .

Option C: Deploying an online Vertex AI prediction endpoint with one GPU per replica and setting the max replica count to 1 would not be able to serve requests 24/7 and handle millions of requests per second, and would increase the cost of deployment. Adding a GPU to each replica would increase the computational power of the endpoint, but it would also in crease the cost of deployment, as GPUs are more expensive than CPUs. Moreover, setting the max replica count to 1 would limit the endpoint to only one replica, which can cause performance issues and service disruptions whe n the traffic increases, and prevent the endpoint from scaling down to zero replicas when the traffic decreases 1 . Furthermore, scikit-learn models do not benefit from GPUs, as scikit-learn is not optimized for GPU acceleration 2 .

Option D: Deploying an online Vertex AI prediction endpoint with one GPU per replica and setting the max replica count to 100 would be able to serve requests 24/7 and handle millions of requests per second, but it would increase the cost of deployment. Adding a GPU to each replica would increase the computational power of the endpoint, but it would also increase the cost of deployment, as GPUs are more expensive than CPUs. Setting the max replica count to 100 would enable the endpoint to scale up to 100 replicas when the traffic increases, and scale down to zero replicas when the traffic decreases, which can help minimize the cost of deployment. However , scikit-learn models do not benefit from GPUs, as scikit-learn is not optimized for GPU acceleration 2 . Therefore, using GPUs for scikit-learn models would be unnecessary and wasteful.

[References:, Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML Systems, Week 2: Serving ML Predictions, Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in production, 3.1 Deploying ML models to production, Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6: Production ML Systems, Section 6.2: Serving ML Predictions, Online prediction, Scaling online prediction, scikit-learn FAQ, ]

Question # 35

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

Normalize the data using Google Kubernetes Engine

Translate the normalization algorithm into SQL for use with BigQuery

Use the normalizer_fn argument in TensorFlow ' s Feature Column API

Normalize the data with Apache Spark using the Dataproc connector for BigQuery

Explanation:

Z-score normalization is a technique that transforms the values of a numeric variable into standardized units, such that the mean is zero and the standard deviation is one. Z-score normalization can help to compare variables with different scales and ranges, and to reduce the effect of outliers and skewness. The formula for z-score normalization is:

z = (x - mu) / sigma

where x is the original value, mu is the mean of the variable, and sigma is the standard deviation of the variable.

Dataflow is a service that allows you to create and run data processing pipelines on Google Cloud. You can use Dataflow to preprocess raw data prior to model training and prediction, such as applying z-score normalization on data stored in BigQuery. However, using Dataflow for this task may not be the most efficient option, as it involves reading and writing data from and to BigQuery, which can be time-consuming and costly. Moreover, using Dataflow requires manual intervention to update the pipeline whenever new training data is added.

A more efficient way to perform z-score normalization on data stored in BigQuery is to translate the normalization algorithm into SQL and use it with BigQuery. BigQuery is a service that allows you to analyze large-scale and complex data using SQL queries. You can use BigQuery to perform z-score normalization on your data using SQL functions such as AVG(), STDDEV_POP(), and OVER(). For example, the following SQL query can normalize the values of a column called temperature in a table called weather:

SELECT (temperature - AVG(temperature) OVER ()) / STDDEV_POP(temperature) OVER () AS normalized_temperature FROM weather;

By using SQL to perform z-score normalization on BigQuery, you can make the process more efficient by minimizing computation time and manual intervention. You can also leverage the scalability and performance of BigQuery to handle large and complex datasets. Therefore, translating the normalization algorithm into SQL for use with BigQuery is the best option for this use case.

Question # 36

You work for a telecommunications company You ' re building a model to predict which customers may fail to pay their next phone bill. The purpose of this model is to proactively offer at-risk customers assistance such as service discounts and bill deadline extensions. The data is stored in BigQuery, and the predictive features that are available for model training include

- Customer_id -Age

- Salary (measured in local currency) -Sex

-Average bill value (measured in local currency)

- Number of phone calls in the last month (integer) -Average duration of phone calls (measured in minutes)

You need to investigate and mitigate potential bias against disadvantaged groups while preserving model accuracy What should you do?

Determine whether there is a meaningful correlation between the sensitive features and the other features Train a BigQuery ML boosted trees classification model and exclude the sensitive features and any meaningfully correlated features

Train a BigQuery ML boosted trees classification model with all features Use the ml. global explain method to calculate the global attribution values for each feature of the model If the feature importance value for any of the sensitive features exceeds a threshold, discard the model and tram without this feature

Train a BigQuery ML boosted trees classification model with all features Use the ml. exflain_predict method to calculate the attribution values for each feature for each customer in a test set If for any individual customer the importance value for any feature exceeds a predefined threshold, discard the model and train the model again without this feature.

Define a fairness metric that is represented by accuracy across the sensitive features Train a BigQuery ML boosted trees classification model with all features Use the trained model to make predictions on a test set Join the data back with the sensitive features, and calculate a fairness metric to investigate whether it meets your requirements.

Explanation:

A fairness metric is a way to measure how well a machine learning model treats different groups of customers, such as by sex or age. A common fairness metric is accuracy, which is the proportion of correct predictions among all predictions. Accuracy across the sensitive features means calculating the accuracy for each group separately, and then comparing them. For example, if the model has 90% accuracy for male customers and 80% accuracy for female customers, there is a 10% accuracy gap that indicates potential bias against female customers.

To investigate and mitigate potential bias, it is important to define a fairness metric and evaluate it on a test set. A test set is a subset of the data that is not used for training the model, but only for evaluating its performance. By joining the test set predictions with the sensitive features, you can calculate the fairness metric and see if it meets your requirements. For example, you may require that the accuracy gap between any two groups is less than 5%. If the fairness metric does not meet your requirements, you may need to adjust the model or the data to reduce bias.

Option A is not the best answer because excluding the sensitive features and any meaningfully correlated features may not eliminate bias. For example, if salary is correlated with sex, and salary is also a predictive feature for the target variable, excluding both features may reduce the model accuracy and still leave some residual bias. Moreover, excluding features based on correlation may not capture the complex interactions and dependencies among the features that may affect bias.

Option B is not the best answer because using the global attribution values for each feature of the model may not reflect the individual-level impact of the features on the predictions. Global attribution values are calculated by averaging the attribution values across all the data points, and they indicate how important each feature is for the overall model performance. However, they do not show how each feature affects each customer’s prediction, which may vary depending on the values of the other features. For example, sex may have a low global attribution value, but it may have a high impact on some customers’ predictions, especially if it interacts with other features such as salary or age.

Option C is not the best answer because discarding the model and training the model again without a feature based on a single customer’s attribution value may not be a robust or scalable way to mitigate bias. Attribution values are calculated by measuring how much each feature contributes to the prediction for a given data point, and they indicate how sensitive the prediction is to the feature value. However, they do not show how the feature affects the overall fairness metric or the model accuracy. For example, sex may have a high attribution value for a customer, but it may not affect the accuracy gap between the groups. Moreover, discarding and retraining the model based on a single customer’s attribution value may not be feasible if there are many customers with high attribution values for different features.

Question # 37

You work for an online publisher that delivers news articles to over 50 million readers. You have built an AI model that recommends content for the company’s weekly newsletter. A recommendation is considered successful if the article is opened within two days of the newsletter’s published date and the user remains on the page for at least one minute.

All the information needed to compute the success metric is available in BigQuery and is updated hourly. The model is trained on eight weeks of data, on average its performance degrades below the acceptable baseline after five weeks, and training time is 12 hours. You want to ensure that the model’s performance is above the acceptable baseline while minimizing cost. How should you monitor the model to determine when retraining is necessary?

Use Vertex AI Model Monitoring to detect skew of the input features with a sample rate of 100% and a monitoring frequency of two days.

Schedule a cron job in Cloud Tasks to retrain the model every week before the newsletter is created.

Schedule a weekly query in BigQuery to compute the success metric.

Schedule a daily Dataflow job in Cloud Composer to compute the success metric.

Explanation:

The best option for monitoring the model to determine when retraining is necessary is to schedule a weekly query in BigQuery to compute the success metric. This option has the following advantages:

It allows the model performance to be evaluated regularly, based on the actual outcome of the recommendations. By computing the success metric, which is the percentage of articles that are opened within two days and read for at least one minute, you can measure how well the model is achieving its objective and compare it with the acceptable baseline.

It leverages the scalability and efficiency of BigQuery, which is a serverless, fully managed, and highly scalable data warehouse that can run complex queries over petabytes of data in seconds. By using BigQuery, you can access and analyze all the information needed to compute the success metric, such as the newsletter publication date, the article opening date, and the user reading time, without worrying about the infrastructure or the cost.

It simplifies the model monitoring and retraining workflow, as the weekly query can be scheduled and executed automatically using BigQuery’s built-in scheduling feature. You can also set up alerts or notifications to inform you when the success metric falls below the acceptable baseline, and trigger the model retraining process accordingly.

The other options are less optimal for the following reasons:

Option A: Using Vertex AI Model Monitoring to detect skew of the input features with a sample rate of 100% and a monitoring frequency of two days introduces additional complexity and overhead. This option requires setting up and managing a Vertex AI Model Monitoring service, which is a managed service that provides various tools and features for machine learning, such as training, tuning, serving, and monitoring. However, using Vertex AI Model Monitoring to detect skew of the input features may not reflect the actual performance of the model, as skew is the discrepancy between the distributions of the features in the training dataset and the serving data, which may not affect the outcome of the recommendations. Moreover, using a sample rate of 100% and a monitoring frequency of two days may incur unnecessary cost and latency, as it requires analyzing all the input features every two days, which may not be needed for the model monitoring.

Option B: Scheduling a cron job in Cloud Tasks to retrain the model every week before the newsletter is created introduces additional cost and risk. This option requires creating and running a cron job in Cloud Tasks, which is a fully managed service that allows you to schedule and execute tasks that are invoked by HTTP requests. However, using Cloud Tasks to retrain the model every week may not be optimal, as it may retrain the model more often than necessary, wasting compute resources and cost. Moreover, using Cloud Tasks to retrain the model before the newsletter is created may introduce risk, as it may deploy a new model version that has not been tested or validated, potentially affecting the quality of the recommendations.

Option D: Scheduling a daily Dataflow job in Cloud Composer to compute the success metric introduces additional complexity and cost. This option requires creating and running a Dataflow job in Cloud Composer, which is a fully managed service that runs Apache Airflow pipelines for workflow orchestration. Dataflow is a fully managed service that runs Apache Beam pipelines for data processing and transformation. However, using Dataflow and Cloud Composer to compute the success metric may not be necessary, as it may add more steps and overhead to the model monitoring process. Moreover, using Dataflow and Cloud Composer to compute the success metric daily may not be optimal, as it may compute the success metric more often than needed, consuming more compute resources and cost.

[:, [BigQuery documentation], [Vertex AI Model Monitoring documentation], [Cloud Tasks documentation], [Cloud Composer documentation], [Dataflow documentation], ]

Question # 38

You built a custom ML model using scikit-learn. Training time is taking longer than expected. You decide to migrate your model to Vertex AI Training, and you want to improve the model’s training time. What should you try out first?

Migrate your model to TensorFlow, and train it using Vertex AI Training.

Train your model in a distributed mode using multiple Compute Engine VMs.

Train your model with DLVM images on Vertex AI, and ensure that your code utilizes NumPy and SciPy internal methods whenever possible.

Train your model using Vertex AI Training with GPUs.

Explanation:

Option A is incorrect because migrating your model to TensorFlow, and training it using Vertex AI Training, is not the easiest way to improve the model’s training time. TensorFlow is a framework that allows you to create and train ML models using Python or other languages. Vertex AI Training is a service that allows you to train and optimize ML models using built-in algorithms or custom containers. However, this option requires significant code changes, as TensorFlow and scikit-learn have different APIs and functionalities. Moreover, this option does not leverage the parallelism or the scalability of the cloud, as it only uses a single instance.

Option B is incorrect because training your model in a distributed mode using multiple Compute Engine VMs, is not the most convenient way to improve the model’s training time. Compute Engine is a service that allows you to create and manage virtual machines that run on Google Cloud. You can use Compute Engine to run your scikit-learn model in a distributed mode, by using libraries such as Dask or Joblib. However, this option requires more effort and resources than option D, as it involves creating and configuring the VMs, installing and maintaining the libraries, and writing and running the distributed code.

Option C is incorrect because training your model with DLVM images on Vertex AI, and ensuring that your code utilizes NumPy and SciPy internal methods whenever possible, is not the most effective way to improve the model’s training time. DLVM (Deep Learning Virtual Machine) images are preconfigured VM images that include popular ML frameworks and tools, such as TensorFlow, PyTorch, or scikit-learn 1 . You can use DLVM images on Vertex AI to train your scikit-learn model, by using a custom container. NumPy and SciPy are libraries that provide numerical and scientific computing functionalities for Python. You can use NumPy and SciPy internal methods to optimize your scikit-learn code, as they are faster and more efficient than pure Python code 2 . However, this option does not leverage the parallelism or the scalability of the cloud, as it only uses a single instance. Moreover, thi s option may not have a significant impact on the training time, as scikit-learn already relies on NumPy and SciPy for most of its operations 3 .

Option D is correct because training your model using Vertex AI Training with GPUs, is the best way to improve the model’s training time. A GPU (Graphics Processing Unit) is a hardware acceler ator that can perform parallel computations faster than a CPU (Central Processing Unit) 4 . Vertex AI Training is a service that allows you to train and optimize ML models using built-in algorithms or custom containers. You can use Vertex AI Training with GPUs to train your scikit-learn model, by using a custom container and specify ing the accelerator type and count 5 . By using Vertex AI Training with GPUs, you can leverage the parallelism and the scalability of the cloud, and speed up the training process significantly, without changing your code.

[References:, DLVM images, NumPy and SciPy, scikit-learn dependencies, GPU overview, Vertex AI Training with GPUs, [scikit-learn overview], [TensorFlow overview], [Compute Engine overview], [Dask overview], [Joblib overview], [Vertex AI Training overview], ]

Question # 39

You work on a team that builds state-of-the-art deep learning models by using the TensorFlow framework. Your team runs multiple ML experiments each week which makes it difficult to track the experiment runs. You want a simple approach to effectively track, visualize and debug ML experiment runs on Google Cloud while minimizing any overhead code. How should you proceed?

Set up Vertex Al Experiments to track metrics and parameters Configure Vertex Al TensorBoard for visualization.

Set up a Cloud Function to write and save metrics files to a Cloud Storage Bucket Configure a Google Cloud VM to host TensorBoard locally for visualization.

Set up a Vertex Al Workbench notebook instance Use the instance to save metrics data in a Cloud Storage bucket and to host TensorBoard locally for visualization.

Set up a Cloud Function to write and save metrics files to a BigQuery table. Configure a Google Cloud VM to host TensorBoard locally for visualization.

Question # 40

You work for a retail company. You have a managed tabular dataset in Vertex Al that contains sales data from three different stores. The dataset includes several features such as store name and sale timestamp. You want to use the data to train a model that makes sales predictions for a new store that will open soon You need to split the data between the training, validation, and test sets What approach should you use to split the data?

Use Vertex Al manual split, using the store name feature to assign one store for each set.

Use Vertex Al default data split.

Use Vertex Al chronological split and specify the sales timestamp feature as the time vanable.

Use Vertex Al random split assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set.

Explanation:

The best option for splitting the data between the training, validation, and test sets, using a managed tabular dataset in Vertex AI that contains sales data from three different stores, is to use Vertex AI default data split. This option allows you to leverage the power and simplicity of Vertex AI to automatically and randomly split your data into the three sets by percentage. Vertex AI is a unified platform for building and deploying machine learning solutions on Google Cloud. Vertex AI can support various types of models, such as linear regression, logistic regression, k-means clustering, matrix factorization, and deep neural networks. Vertex AI can also provide various tools and services for data analysis, model development, model deployment, model monitoring, and model governance. A default data split is a data split method that is provided by Vertex AI, and does not require any user input or configuration. A default data split can help you split your data into the training, validation, and test sets by using a random sampling method, and assign a fixed percentage of the data to each set. A default data split can help you simplify the data split process, and works well in most cases. A training set is a subset of the data that is used to train the model, and adjust the model parameters. A training set can help you learn the relationship between the input features and the target variable, and optimize the model performance. A validation set is a subset of the data that is used to validate the model, and tune the model hyperparameters. A validation set can help you evaluate the model performance on unseen data, and avoid overfitting or underfitting. A test set is a subset of the data that is used to test the model, and provide the final evaluation metrics. A test set can help you assess the model performance on new data, and measure the generalization ability of the model. By using Vertex AI default data split, you can split your data into the training, validation, and test sets by using a random sampling method, and assign the following percentages of the data to each set 1 :

The other options are not as good as option B, for the following reasons:

Option A: Using Vertex AI manual split, using the store name feature to assign one store for each set would not allow you to split your data into representative and balanced sets, and could cause errors or poor performance. A manual split is a data split method that allows you to control how your data is split into sets, by using the ml_use label or the data filter expression. A manual split can help you customize the data split logic, and handle complex or non-standard data formats. A store name feature is a feature that indicates the name of the store where the sales data was collected. A store name feature can help you identify the source of the data, and group the data by store. However, using Vertex AI manual split, using the store name feature to assign one store for each set would not allow you to split your data into representative and balanced sets, and could cause errors or poor performance. You would need to write code, create and configure the ml_use label or the data filter expression, and assign one store for each set. Moreover, this option would not ensure that the data in each set has the same distribution and characteristics as the data in the whole dataset, which could prevent you from learning the general pattern o f the data, and cause bias or variance in the model 2 .

Option C: Using Vertex AI chronological split and specifying the sales timestamp feature as the time variable would not allow you to split your data into representative and balanced sets, and could cause errors or poor performance. A chronological split is a data split method that allows you to split your data into sets based on the order of the data. A chronological split can help you preserve the temporal dependency and sequence of the data, and avoid data leakage. A sales timestamp feature is a feature that indicates the date and time when the sales data was collected. A sales timestamp feature can help you track the changes and trends of the data over time, and capture the seasonality and cyclicality of the data. However, using Vertex AI chronological split and specifying the sales timestamp feature as the time variable would not allow you to split your data into representative and balanced sets, and could cause errors or poor performance. You would need to write code, create and configure the time variable, and split the data by the order of the time variable. Moreover, this option would not ensure that the data in each set has the same distribution and characteristics as the data in the whole dataset, which could prevent you from learning the general pattern of the data, and cause bias or variance in the model 3 .

Option D: Using Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set would not allow you to use the default data split method that is provided by Vertex AI, and could increase the complexity and cost of the data split process. A random split is a data split method that allows you to split your data into sets by using a random sampling method, and assign a custom percentage of the data to each set. A random split can help you split your data into representative and balanced sets, and avoid data leakage. However, using Vertex AI random split, assigning 70% of the rows to the training set, 10% to the validation set, and 20% to the test set would not allow you to use the default data split method that is provided by Vertex AI, and could increase the complexity and cost of the data split process. You would need to write code, create and configure the random split method, and assign the custom percentages to each set. Moreover, this option would not use the default data split method that is provided by Vertex AI, which can simplify the data split process, and works well in most cases 1 .

[References:, About data splits for AutoML models | Vertex AI | Google Cloud, Manual split for unstructured data, Mathematical split, ]

Question # 41

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually

takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

Use AI Platform to run distributed training jobs with checkpoints.

Use AI Platform to run distributed training jobs without checkpoints.

Migrate to training with Kuberflow on Google Kubernetes Engine, and use preemptible VMs with checkpoints.

Migrate to training with Kuberflow on Google Kubernetes Engine, and use preemptible VMs without checkpoints.

Question # 42

You have deployed a scikit-learn model to a Vertex Al endpoint using a custom model server. You enabled auto scaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?

Attach a GPU to the prediction nodes.

Increase the number of workers in your model server.

Schedule scaling of the nodes to match expected demand.

Increase the minReplicaCount in your DeployedModel configuration.

Explanation:

Auto scaling is a feature that allows you to automatically adjust the number of prediction nodes based on the traffic and load of your deployed model 1 . However, auto scaling depends on the CPU utilization of your prediction nodes, which is the percentage of CPU resources used by your model server 1 . If your CPU utilization is low, even during periods of high load, it means that your model server is not fully utilizing the available CPU resources, and thus au to scaling will not trigger more replicas 2 .

On e possible reason for low CPU utilization is that your model server is using a single worker process to handle prediction requests 3 . A worker process is a subprocess that runs your model code and handles prediction requests 3 . If you have only one worker process, it can only handle one request at a time, which can lead to dropped requests when the traffic is high 3 . To increase the CPU utilization and the throughput of your model server, you can increase the number of worker processes, which will allow your model server to handle multiple requests in parallel 3 .

To increase the number of workers i n your model server, you need to modify your custom model server code and use the --workers flag to specify the number of worker processes you want to use 3 . For example, if you are using a Gunicorn server, you can use the following command to start your model server with four worker processes:

gunicorn --bind :$PORT --workers 4 --threads 1 --timeout 60 main:app

By increasing the number of workers in your model server, you can increase the CPU utilization of your prediction nodes, and thus enable auto scaling to scale beyond one replica.

The other options are not suitable for your scenario, because they either do not address the root cause of low CPU utilization, such as attaching a GPU or scheduling scaling, or they do not enable auto scaling, such as increasing the minReplicaCount, which is a fixed number of nodes that will always run regardless of the traffi c 1 .

Question # 43

You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (Pll) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the Pll is not accessible by unauthorized individuals?

Stream all files to Google CloudT and then write the data to BigQuery Periodically conduct a bulk scan of the table using the DLP API.

Stream all files to Google Cloud, and write batches of the data to BigQuery While the data is being written to BigQuery conduct a bulk scan of the data using the DLP API.

Create two buckets of data Sensitive and Non-sensitive Write all data to the Non-sensitive bucket Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket

Create three buckets of data: Quarantine, Sensitive, and Non-sensitive Write all data to the Quarantine bucket.

Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-Sensitive bucket

Question # 44

Your organization wants you to compare various, widely available ML models for Gen AI use cases. The models you plan to compare are also available on Google Cloud. You have received curated internal benchmark datasets from several teams for their specific use cases and tasks. You need to submit a comprehensive report of your recommendations. You want to evaluate the models using the most efficient approach. What should you do?

Use Model Garden to deploy the candidate models to Vertex AI endpoints. Use the Gen AI Evaluation Service API to evaluate the performance of each deployed model on the internal benchmark datasets. Report the best models based on the experiments.

Stream raw data from open-source large language model leaderboards into a BigQuery dataset. Send the data to an internal Looker Studio dashboard. Evaluate the performance of each model by using open-source datasets that are similar to the internal benchmark datasets. Report the best models based on the dashboard metrics.

Review the model cards in Model Garden to evaluate each model ' s performance on open-source datasets that are similar to the internal benchmark datasets. Report the best models based on your analysis.

Download model weights from the respective provider website for each model. Write an inference script to deploy the candidate models to Vertex AI endpoints. Write an evaluation script to compare all deployed models on the internal benchmark datasets by using Vertex AI Experiments. Report the best models based on the experiments.

Question # 45

You are training a custom language model for your company using a large dataset. You plan to use the ReductionServer strategy on Vertex Al. You need to configure the worker pools of the distributed training job. What should you do?

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs Configure the third worker pool to have GPUs: and use the reduction server container image.

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs Configure the third worker pool without accelerators, and use the reductionserver container image without accelerators and choose a machine type that prioritizes bandwidth.

Configure the machines of the first two pools to have TPUs. and to use a container image where your training code runs Configure the third pool to have TPUs: and use the reductionserver container image.

Google Professional-Machine-Learning-Engineer Premium Access Download Demo

Page: 2 / 5
Total 296 questions

Summer Sale Special - Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: xmaspas7

Professional-Machine-Learning-Engineer Google Professional Machine Learning Engineer Free Practice Exam Questions (2026 Updated)

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation: