ACA-BigData1 Alibaba Cloud ACA Big Data Certification Exam Free Practice Exam Questions (2025 Updated)

Prepare for the Alibaba Cloud ACA-BigData1 (ACA Big Data Certification) exam with our extensive collection of free, high-quality practice questions. Each question is designed to mirror the actual exam format and objectives, complete with comprehensive answers and detailed explanations. Our materials are regularly updated for 2025, ensuring you have the most current resources to build confidence and succeed on your first attempt.

Page: 1 / 2
Total 78 questions

Scenario: Jack is the administrator of project prj1. The project involves a large volume of sensitive data, such as bank accounts and medical records. Jack wants to properly protect the data. Which of the following statements is necessary?

A.

set ProjectACL=true;

B.

add accountprovider ram;

C.

set ProjectProtection=true;

D.

use prj1;

To ensure smooth processing of tasks in the DataWorks data development kit, you must create an AccessKey. An AccessKey is primarily used for access permission verification between various Alibaba Cloud products. An AccessKey has two parts: ____. (Number of correct answers: 2)

Score 2

A.

Access Username

B.

Access Key ID

C.

Access Key Secret

D.

Access Password

If a task node of DataWorks is deleted from the recycle bin, it can still be restored.

A.

True

B.

False

In each release of E-MapReduce, the software and software version are flexible. You can select

multiple software versions.

Score 1

A.

True

B.

False

Users can use major BI tools, such as Tableau and FineReport, to easily connect to MaxCompute projects and perform BI analysis or ad hoc queries. The quick query feature in MaxCompute, called _________, also allows you to provide services by encapsulating project table data in APIs, supporting diverse application scenarios without data migration.

Score 2

A.

Lightning

B.

MaxCompute Manager

C.

Tunnel

D.

LabelSecurity

A log table named log in MaxCompute is a partitioned table, and the partition key is dt. A new partition is created daily to store the new data of that day. Now we have one month's data, from dt='20180101' to dt='20180131', and we may use ________ to delete the data for 20180101.

A.

delete from log where dt='20180101'

B.

truncate table where dt='20180101'

C.

drop partition log (dt='20180101')

D.

alter table log drop partition(dt='20180101')
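
The partition layout described in this question (one dt partition per day for January 2018) can be sketched as follows. This is a minimal illustration in plain Python that does not connect to MaxCompute; the helper name `daily_partitions` is hypothetical and not part of any Alibaba Cloud SDK.

```python
from datetime import date, timedelta


def daily_partitions(start: date, end: date) -> list[str]:
    """Return one dt partition value (YYYYMMDD) per day, inclusive of both ends."""
    days = (end - start).days + 1
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days)]


if __name__ == "__main__":
    # Hypothetical usage: list the dt values the scenario describes.
    for dt in daily_partitions(date(2018, 1, 1), date(2018, 1, 31)):
        print(f"dt='{dt}'")
```

Each value corresponds to one partition of the log table, so removing a single day's data means targeting exactly one of these dt values.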

Distributed file systems such as GFS and Hadoop are designed to have a much larger block (or chunk) size, such as 64 MB or 128 MB. Which of the following descriptions are correct? (Number of correct answers: 4)

Score 2

A.

It reduces clients' need to interact with the master because reads and writes on the same block (or chunk) require only one initial request to the master for block location information

B.

Since a client is more likely to perform many operations on a given large block (or chunk), it can reduce network overhead by keeping a persistent TCP connection to the metadata server over an extended period of time

C.

It reduces the size of the metadata stored on the master

D.

The servers storing those blocks may become hot spots if many clients are accessing the same small

files

E.

If it is necessary to support even larger file systems, the cost of adding extra memory to the metadata server is a high price
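
As a rough illustration of the block-size premise in this question, the sketch below counts how many blocks a file splits into at different block sizes. The helper names `block_count` and `metadata_bytes`, the 1 TiB example file, and the figure of 64 bytes of master metadata per block are assumptions chosen for illustration, not values taken from the exam.

```python
MB = 1024 ** 2
TB = 1024 ** 4

# Assumed per-block metadata cost on the master, for illustration only.
ASSUMED_METADATA_BYTES_PER_BLOCK = 64


def block_count(file_size: int, block_size: int) -> int:
    """Return how many blocks are needed to store file_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division: a partially filled last block still counts as a block.
    return (file_size + block_size - 1) // block_size


def metadata_bytes(file_size: int, block_size: int) -> int:
    """Estimate master metadata for one file under the assumed per-block cost."""
    return block_count(file_size, block_size) * ASSUMED_METADATA_BYTES_PER_BLOCK


if __name__ == "__main__":
    # Compare a small local-filesystem-style block with 64 MB and 128 MB blocks.
    for size in (4 * 1024, 64 * MB, 128 * MB):
        print(size, block_count(TB, size), metadata_bytes(TB, size))
```

Under these assumptions, the number of blocks the master has to track for a given file falls in direct proportion to the block size.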

A business flow in DataWorks integrates different node task types by business type; this structure makes business code easier to develop. Which of the following descriptions of node types is INCORRECT?

Score 2

A.

A zero-load node (virtual node) is a control node that does not generate any data. It is generally used as the root node for planning the overall node workflow.

B.

An ODPS SQL task allows you to edit and maintain SQL code on the Web, and to easily run, debug, and collaborate on that code.

C.

The PyODPS node in DataWorks is integrated with the MaxCompute Python SDK. You can edit Python code on a PyODPS node in DataWorks to operate MaxCompute.

D.

The SHELL node supports standard SHELL syntax and the interactive syntax. The SHELL task can run on

the default resource group

Which of the following task types does DataWorks support?

(Number of correct answers: 4)

A.

Data Synchronization

B.

SHELL

C.

MaxCompute SQL

D.

MaxCompute MR

E.

Scala

Apache Spark, included in Alibaba E-MapReduce (EMR), is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools. Which of the following tools is not included in Spark?

Score 2

A.

Spark SQL for SQL and structured data processing

B.

MLlib for machine learning

C.

GraphX for graph processing

D.

TensorFlow for AI

AliOrg Company plans to migrate its data with virtually no downtime. It wants all data changes to the source database that occur during the migration to be continuously replicated to the target, allowing the source database to remain fully operational during the migration process. After the database migration is completed, the target database will remain synchronized with the source for as long as you choose, allowing you to switch over the database at a convenient time. Which of the following Alibaba Cloud products is the right choice for this?

Score 2

A.

Log Service

B.

DTS (Data Transmission Service)

C.

Message Service

D.

CloudMonitor

An enterprise uses Alibaba Cloud MaxCompute to store service orders, system logs and management data. Because the security levels of the data are different, multiple Alibaba Cloud accounts must be registered for data management.

Score 1

A.

True

B.

False

Which node type in DataWorks lets you edit Python code to operate on data in MaxCompute?

Score 2

A.

PyODPS

B.

ODPS MR Node

C.

ODPS Script Node

D.

SHELL node

Alibaba Cloud Elastic MapReduce (E-MapReduce) is a big data processing solution to quickly process

huge amounts of data. Based on open source Apache Hadoop and Apache Spark, E-MapReduce flexibly

manages your big data use cases such as trend analysis, data warehousing, and analysis of continuously

streaming data.

Score 1

A.

True

B.

False

If a MySQL database contains 100 tables, and Jack wants to migrate all those tables to MaxCompute using DataWorks Data Integration, the conventional method would require him to configure 100 data synchronization tasks. With the _______ feature in DataWorks, he can upload all tables at the same time.

Score 2

A.

Full-Database Migration feature

B.

Configure a MySQL Reader plug-in

C.

Configure a MySQL Writer plug-in

D.

Add data sources in Bulk Mode

Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

Score 2

A.

Ingest with Hadoop streaming

B.

Ingest using Hive

C.

Ingest with sqoop import

D.

Ingest with Pig's LOAD command

If the DataWorks (MaxCompute) tables in your request belong to two owners, Data Guard (a DataWorks component) automatically splits your request into two by table owner.

Score 1

A.

True

B.

False

You are working on a project where you need to chain together MapReduce and Hive jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?

A.

Apache HUE

B.

Apache Zookeeper

C.

Apache Oozie

D.

Apache Spark

Project is an important concept in MaxCompute. A user can create multiple projects, and each object

belongs to a certain project.

Score 1

A.

True

B.

False

DataWorks can be used to develop and configure data sync tasks. Which of the following statements

are correct? (Number of correct answers: 3)

Score 2

A.

Data source configuration in project management is required to add a data source

B.

Some of the columns in source tables can be extracted to create a mapping relationship between fields, but constants or variables cannot be added

C.

For the extraction of source data, a "where" filter clause can be used as the criterion for incremental synchronization

D.

Clean-up rules can be set to clear or preserve existing data before data is written

Copyright © 2014-2025 Solution2Pass. All Rights Reserved