Node
Each type of node is used to perform a specific data operation. For example:
- A sync node is used to synchronize data from ApsaraDB for RDS to MaxCompute.
- An ODPS SQL node is used to convert data by executing SQL statements that are supported by MaxCompute.
Each node has zero or more input tables or datasets and generates one or more output tables
or datasets.
Nodes are classified into node tasks, flow tasks, and inner nodes.
| Type | Description |
| --- | --- |
| Node task | A node task is used to perform a data operation. You can configure dependencies between a node task and other node tasks or flow tasks to form a directed acyclic graph (DAG). |
| Flow task | A flow task contains a group of inner nodes that process a workflow. We recommend that you create fewer than 10 flow tasks. Inner nodes in a flow task cannot be depended upon by other flow tasks or node tasks. You can configure dependencies between a flow task and other flow tasks or node tasks to form a DAG. Note: In DataWorks V2.0 and later, you can still find flow tasks that were created in DataWorks V1.0, but you cannot create new flow tasks. Instead, you can create workflows to perform similar operations. |
| Inner node | An inner node is a node within a flow task. It has essentially the same features as a node task. You can configure dependencies between inner nodes in a flow task by dragging and dropping them. However, you cannot configure a recurrence for inner nodes because they follow the recurrence configuration of the flow task. |

DataWorks · Product Introduction · Basic concepts · Document Version: 20200903
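The dependency rule for node tasks and flow tasks can be illustrated with a minimal Python sketch. It is not DataWorks code: the task names and the `run_order` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for whatever the scheduling system does internally. It shows only that dependencies forming a DAG yield a valid run order, whereas a cycle is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
dependencies = {
    "sync_rds_to_maxcompute": set(),
    "clean_orders_sql": {"sync_rds_to_maxcompute"},
    "daily_report_flow": {"clean_orders_sql"},
}


def run_order(deps):
    """Return one valid execution order, or raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"dependencies are not acyclic: {exc.args[1]}") from exc


order = run_order(dependencies)
print(order)
# prints ['sync_rds_to_maxcompute', 'clean_orders_sql', 'daily_report_flow']
```

Passing a mapping such as `{"a": {"b"}, "b": {"a"}}` to `run_order` raises `ValueError`, because two tasks that depend on each other cannot form a DAG.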
Instance
An instance is a snapshot of a node at a specific time point. An instance is generated every time
a node is run as scheduled by the scheduling system or manually triggered. An instance contains
information such as the time point at which the node is run, the running status of the node, and
operational logs.
Assume that Node 1 is configured to run at 02:00 every day. The scheduling system
automatically generates an instance of Node 1 at 23:30 every day. At 02:00 the next day, if the
scheduling system verifies that all the ancestor instances have run, the system automatically
runs the instance of Node 1.
Note
You can query the instance information on the Cycle Instance page of Operation Center.
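The Node 1 example can be sketched in Python as a readiness check. This is an illustration under stated assumptions, not the scheduling system's implementation: the `Instance` class, the `is_ready` function, the status values, and the dates are all hypothetical, and "run" is modeled as a single `done` status.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Instance:
    node: str
    scheduled_at: datetime            # time point at which the node should run
    status: str = "waiting"           # hypothetical states: waiting, running, done
    ancestors: list = field(default_factory=list)  # upstream Instance objects


def is_ready(instance, now):
    """True when the scheduled time has arrived and every ancestor has run."""
    time_reached = now >= instance.scheduled_at
    ancestors_done = all(a.status == "done" for a in instance.ancestors)
    return instance.status == "waiting" and time_reached and ancestors_done


# Both instances are assumed to have been generated at 23:30 the day before.
upstream = Instance("node_0", datetime(2020, 9, 3, 1, 0))
node_1 = Instance("node_1", datetime(2020, 9, 3, 2, 0), ancestors=[upstream])

two_am = datetime(2020, 9, 3, 2, 0)
blocked = is_ready(node_1, two_am)    # False: the ancestor has not run yet
upstream.status = "done"
ready = is_ready(node_1, two_am)      # True: time reached and ancestor has run
early = is_ready(node_1, datetime(2020, 9, 3, 1, 59))  # False: before 02:00
```

The sketch separates the two conditions named in the example (the configured time and the state of the ancestor instances) so that each can be checked on its own.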
Commit
You can commit nodes and workflows from the development environment to the scheduling
system. The scheduling system runs the code in the committed nodes and workflows as
configured.
Note
The scheduling system runs nodes and workflows only after you commit them.
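The commit rule can be pictured with a toy Python model. The `Scheduler` class and its methods are hypothetical and do not correspond to any DataWorks API. The sketch assumes only what the note states, namely that a node is eligible to run once it has been committed.

```python
class Scheduler:
    """Toy stand-in for a scheduling system; not the DataWorks API."""

    def __init__(self):
        self._committed = {}  # node name -> code received at commit time

    def commit(self, name, code):
        """Hand a node over from development to the scheduler."""
        self._committed[name] = code

    def run(self, name):
        """Return the code that would be executed for a committed node."""
        if name not in self._committed:
            raise LookupError(f"{name!r} has not been committed")
        return self._committed[name]


scheduler = Scheduler()
scheduler.commit("clean_orders_sql", "SELECT 1;")
executed = scheduler.run("clean_orders_sql")

try:
    scheduler.run("draft_node")       # never committed, so it cannot run
    uncommitted_ran = True
except LookupError:
    uncommitted_ran = False
```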
Script
A script stores code for data analysis. The code in a script can be used only for data query and
analysis. It cannot be committed to the scheduling system for scheduling.
Resource and function
Resources and functions are concepts in MaxCompute. For more information, see Resource and Function.
