Data-Engineer-Associate Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Free Practice Exam Questions (2026 Updated)

Prepare effectively for your Amazon Web Services Data-Engineer-Associate AWS Certified Data Engineer - Associate (DEA-C01) certification with our extensive collection of free, high-quality practice questions. Each question is designed to mirror the actual exam format and objectives, complete with comprehensive answers and detailed explanations. Our materials are regularly updated for 2026, ensuring you have the most current resources to build confidence and succeed on your first attempt.

Amazon Web Services Data-Engineer-Associate Premium Access Download Demo

Page: 5 / 5
Total 302 questions

Question # 86

A company uses Amazon S3 buckets, AWS Glue tables, and Amazon Athena as components of a data lake. Recently, the company expanded its sales range to multiple new states. The company wants to introduce state names as a new partition to the existing S3 bucket, which is currently partitioned by date.

The company needs to ensure that additional partitions will not disrupt daily synchronization between the AWS Glue Data Catalog and the S3 buckets.

Which solution will meet these requirements with the LEAST operational overhead?

Use the AWS Glue API to manually update the Data Catalog.

Run an MSCK REPAIR TABLE command in Athena.

Schedule an AWS Glue crawler to periodically update the Data Catalog.

Run a REFRESH TABLE command in Athena.

Question # 87

A company stores data in a data lake that is in Amazon S3. Some data that the company stores in the data lake contains personally identifiable information (PII). Multiple user groups need to access the raw data. The company must ensure that user groups can access only the PII that they require.

Which solution will meet these requirements with the LEAST effort?

Use Amazon Athena to query the data. Set up AWS Lake Formation and create data filters to establish levels of access for the company ' s IAM roles. Assign each user to the IAM role that matches the user ' s PII access requirements.

Use Amazon QuickSight to access the data. Use column-level security features in QuickSight to limit the PII that users can retrieve from Amazon S3 by using Amazon Athena. Define QuickSight access levels based on the PII access requirements of the users.

Build a custom query builder UI that will run Athena queries in the background to access the data. Create user groups in Amazon Cognito. Assign access levels to the user groups based on the PII access requirements of the users.

Create IAM roles that have different levels of granular access. Assign the IAM roles to IAM user groups. Use an identity-based policy to assign access levels to user groups at the column level.

Explanation:

Amazon Athena is a serverless, interactive query service that enables you to analyze data in Amazon S3 using standard SQL. AWS Lake Formation is a service that helps you build, secure, and manage data lakes on AWS. You can use AWS Lake Formation to create data filters that define the level of access for different IAM roles based on the columns, rows, or tags of the data. By using Amazon Athena to query the data and AWS Lake Formation to create data filters, the company can meet the requirements of ensuring that user groups can access only the PII that they require with the least effort. The solution is to use Amazon Athena to query the data in the data lake that is in Amazon S3. Then, set up AWS Lake Formation and create data filters to establish levels of access for the company’s IAM roles. For example, a data filter can allow a user group to access only the columns that contain the PII that they need, such as name and email address, and deny access to the columns that contain the PII that they do not need, such as phone number and social security number. Finally, assign each user to the IAM role that matches the user’s PII access requirements. This way, the user groups can access the data in the data lake securely and efficiently. The other options are either not feasible or not optimal. Using Amazon QuickSight to access the data (option B) would require the company to pay for the QuickSight service and to configure the column-level security features for each user. Building a custom query builder UI that will run Athena queries in the background to access the data (option C) would require the company to develop and maintain the UI and to integrate it with Amazon Cognito. Creating IAM roles that have different levels of granular access (option D) would require the company to manage multiple IAM roles and policies and to ensure that they are aligned with the data schema. References:

Amazon Athena

AWS Lake Formation

AWS Certified Data Engineer - Associate DEA-C01 Complete Study Guide, Chapter 4: Data Analysis and Visualization, Section 4.3: Amazon Athena

Question # 88

A data engineer needs to debug an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The data engineer enabled the bookmark feature for the AWS Glue job. The data engineer has set the maximum concurrency for the AWS Glue job to 1.

The AWS Glue job is successfully writing the output to Amazon Redshift. However, the Amazon S3 files that were loaded during previous runs of the AWS Glue job are being reprocessed by subsequent runs.

What is the likely reason the AWS Glue job is reprocessing the files?

The AWS Glue job does not have the s3:GetObjectAcl permission that is required for bookmarks to work correctly.

The maximum concurrency for the AWS Glue job is set to 1.

The data engineer incorrectly specified an older version of AWS Glue for the Glue job.

The AWS Glue job does not have a required commit statement.

Question # 89

A data engineer notices slow query performance on a highly partitioned table that is in Amazon Athena. The table contains daily data for the previous 5 years, partitioned by date. The data engineer wants to improve query performance and to automate partition management. Which solution will meet these requirements?

Use an AWS Lambda function that runs daily. Configure the function to manually create new partitions in AW5 Glue for each day ' s data.

Use partition projection in Athena. Configure the table properties by using a date range from 5 years ago to the present.

Reduce the number of partitions by changing the partitioning schema from dairy to monthly granularity.

Increase the processing capacity of Athena queries by allocating more compute resources.

Question # 90

A company stores a 100 MB dataset in an Amazon S3 bucket as an Apache Parquet file. A data engineer needs to profile the data before performing data preparation steps on the data.

Which solution will meet this requirement in the MOST operationally efficient way?

Create a profile job on the dataset in AWS Glue DataBrew. Review the profile job results.

Stream the data into Amazon Managed Service for Apache Flink for SQL queries. Use the Apache Flink dashboard to profile the data.

Ingest the data into Amazon Redshift Spectrum. Use SQL queries to profile the data.

Load the data into an Amazon QuickSight dataset. Build a topic to profile the data with questions.

Question # 91

A data engineer needs to join data from multiple sources to perform a one-time analysis job. The data is stored in Amazon DynamoDB, Amazon RDS, Amazon Redshift, and Amazon S3.

Which solution will meet this requirement MOST cost-effectively?

Use an Amazon EMR provisioned cluster to read from all sources. Use Apache Spark to join the data and perform the analysis.

Copy the data from DynamoDB, Amazon RDS, and Amazon Redshift into Amazon S3. Run Amazon Athena queries directly on the S3 files.

Use Amazon Athena Federated Query to join the data from all data sources.

Use Redshift Spectrum to query data from DynamoDB, Amazon RDS, and Amazon S3 directly from Redshift.

Question # 92

A company runs a data pipeline that uses AWS Step Functions to orchestrate AWS Lambda functions and AWS Glue jobs. The Lambda functions and AWS Glue jobs require access to multiple Amazon RDS databases. The Lambda functions and AWS Glue jobs already have access to the VPC that hosts the RDS databases.

Which solution will meet these requirements in the MOST secure way?

Use the root user of the company’s AWS account to create long-term access keys for the RDS databases. Include the access keys programmatically in the Lambda functions and AWS Glue jobs. Generate new keys every 90 days.

Create an IAM role that has permissions to access the RDS databases. Create a second IAM role for the Lambda functions and AWS Glue jobs that has permissions to assume the IAM role that has access permissions for the RDS databases.

Create an IAM user that can assume IAM roles that have permissions and credentials to access the RDS databases. Assign the IAM user to each of the Lambda functions and AWS Glue jobs.

Create Java Database Connectivity (JDBC) connections between the Lambda functions and AWS Glue jobs and the RDS databases. In the connection string, include the necessary credentials.

Question # 93

A company runs an AWS Glue workflow every day to process time series data from an Amazon S3 bucket. The workflow loads the data into an Amazon Redshift Serverless table. The company observes that some of the jobs in the workflow occasionally fail.

A data engineer must receive a notification when the Redshift table does not contain the most recent data.

Which solution will meet this requirement in the MOST operationally efficient way?

Configure an Amazon EventBridge Scheduler to run an Amazon Macie job to scan the Redshift table for data freshness. Configure Macie to notify an Amazon Simple Notification Service (Amazon SNS) topic when an AWS Glue job fails.

Schedule an AWS Glue Data Quality job to check the freshness of the data. Create an Amazon EventBridge rule to notify an Amazon Simple Notification Service (Amazon SNS) topic when a data quality rule fails.

Load AWS Glue job logs to an Amazon S3 bucket. Configure an Amazon CloudWatch alarm to send a notification when the job logs in the S3 bucket contain Job.State=FAILED.

Create an Amazon CloudWatch dashboard that displays a metric named Failed AWS Glue Jobs that counts AWS Glue job failures during the previous day. Set a CloudWatch alarm to send a notification when the metric value exceeds zero.

Question # 94

A data engineer needs to analyze time-sensitive sales data. The company stores the data in an Amazon S3 bucket. The data engineer uses AWS Glue Data Catalog to access the data.

When performing the analysis, the data engineer notices that some records are missing or out of date.

What is the likely cause of these issues?

AWS Glue Data Catalog is not up to date with the latest S3 partition changes.

Incorrect IAM roles are assigned to the AWS Glue jobs.

Versioning is not enabled on the S3 bucket.

The AWS Glue job schedules overlap with one another.

Amazon Web Services Data-Engineer-Associate Premium Access Download Demo

Page: 5 / 5
Total 302 questions

Summer Sale Special - Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: xmaspas7

Data-Engineer-Associate Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Free Practice Exam Questions (2026 Updated)

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation: