Product Introduction · Basic concepts · DataWorks · Document Version: 20200903

Node

Each type of node is used to perform a specific data operation. For example:

A sync node is used to synchronize data from ApsaraDB for RDS to MaxCompute.
An ODPS SQL node is used to convert data by executing SQL statements that are supported by MaxCompute.

Each node has zero or more input tables or datasets and generates one or more output tables or datasets.

Nodes are classified into node tasks, flow tasks, and inner nodes.

Node task: A node task is used to perform a data operation. You can configure dependencies between a node task and other node tasks or flow tasks to form a directed acyclic graph (DAG).
Flow task: A flow task contains a group of inner nodes that process a workflow. We recommend that you create fewer than 10 flow tasks. Inner nodes in a flow task cannot be depended upon by other flow tasks or node tasks. You can configure dependencies between a flow task and other flow tasks or node tasks to form a DAG.

Note: In DataWorks V2.0 and later, you can find the flow tasks that were created in DataWorks V1.0, but you cannot create new flow tasks. Instead, you can create workflows to perform similar operations.

Inner node: An inner node is a node within a flow task. Its features are basically the same as those of a node task. You can configure dependencies between inner nodes in a flow task by performing drag-and-drop operations. However, you cannot configure a recurrence for inner nodes because they follow the recurrence configuration of the flow task.

Instance

An instance is a snapshot of a node at a specific point in time. An instance is generated every time a node is run, whether it is run as scheduled by the scheduling system or triggered manually. An instance contains information such as the time at which the node is run, the running status of the node, and operational logs.

Assume that Node 1 is configured to run at 02:00 every day. The scheduling system automatically generates an instance of Node 1 at 23:30 every day. At 02:00 the next day, if the scheduling system verifies that all the ancestor instances have been run, the system automatically runs the instance of Node 1.

Note: You can query the instance information on the Cycle Instance page of Operation Center.

Commit

You can commit nodes and workflows from the development environment to the scheduling system. The scheduling system runs the code in the committed nodes and workflows as configured.

Note: The scheduling system runs nodes and workflows only after you commit them.

Script

A script stores code for data analysis. The code in a script can be used only for data query and analysis. It cannot be committed to the scheduling system for scheduling.

Resource and function

Resources and functions are concepts in MaxCompute. For more information, see Resource and Function.
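The idea that dependencies between node tasks and flow tasks form a DAG can be illustrated with a small sketch. This is not DataWorks code or a DataWorks API: the node names and the `run_order` helper are hypothetical, and the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). It only shows how a dependency map yields a valid run order and why a cycle is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical nodes: each key maps to the set of nodes it depends on.
dependencies = {
    "sync_rds_to_maxcompute": set(),                  # sync node, no upstream
    "odps_sql_transform": {"sync_rds_to_maxcompute"}, # ODPS SQL node
    "report_flow": {"odps_sql_transform"},            # a flow task treated as one unit
}


def run_order(deps):
    """Return one valid execution order, or raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that make up the cycle.
        raise ValueError(f"dependencies do not form a DAG: {exc.args[1]}") from exc


order = run_order(dependencies)
print(order)
```

Because the inner nodes of a flow task cannot be depended upon from outside, the sketch represents the flow task as a single entry in the map rather than listing its inner nodes.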
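The Node 1 example in the Instance description (instance generated at 23:30, run at 02:00 the next day once all ancestor instances have run) can be sketched as follows. This is a simplified model under stated assumptions, not the scheduling system's implementation: the `Instance` class, the `waiting` and `success` states, the `Upstream` node, and the calendar dates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta


@dataclass
class Instance:
    node: str
    run_at: datetime                 # time at which the node is due to run
    status: str = "waiting"          # hypothetical states: waiting, success
    ancestors: list = field(default_factory=list)


def generate_instance(node, run_time, generated_on, ancestors=()):
    """Create the instance that is due on the day after `generated_on`."""
    next_day = generated_on.date() + timedelta(days=1)
    return Instance(node, datetime.combine(next_day, run_time), ancestors=list(ancestors))


def try_run(instance, now):
    """Run the instance only if it is due and every ancestor instance has run."""
    due = now >= instance.run_at
    ancestors_done = all(a.status == "success" for a in instance.ancestors)
    if due and ancestors_done:
        instance.status = "success"
    return instance.status


generated = datetime(2020, 9, 3, 23, 30)   # illustrative generation time
upstream = generate_instance("Upstream", time(1, 0), generated)
node1 = generate_instance("Node 1", time(2, 0), generated, ancestors=[upstream])

print(try_run(node1, datetime(2020, 9, 4, 2, 0)))     # waiting: ancestor has not run
print(try_run(upstream, datetime(2020, 9, 4, 1, 0)))  # success
print(try_run(node1, datetime(2020, 9, 4, 2, 0)))     # success
```

The sketch keeps the two conditions separate (the scheduled time has arrived, and the ancestors have run) because the source describes both as required before the instance is run.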
