4. Workload Distribution

This demonstrator shows how to implement two kinds of nodes: Computing Node and Main Node. With both implemented, the user can deploy as many nodes of each kind as desired and observe the behavior of a simulated AML-IP network. One node is implemented in Python and the other in C++, both to show how to instantiate each kind of node through a different API and to demonstrate that the two implementations can communicate.

4.1. Simulation

4.1.1. AML Mock

This demo does not include the actual AML Engine; a mock takes its place. The mock simulates an expensive calculation by converting a string to uppercase, waiting a random time between 1 and 5 seconds while doing so.
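The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual source: the function name mock_engine and its parameters (min_wait, max_wait, sleep) are hypothetical, and the sleep function is injectable only so the delay can be skipped when testing.

```python
import random
import time


def mock_engine(job: str, min_wait: float = 1.0, max_wait: float = 5.0,
                sleep=time.sleep) -> str:
    """Simulate an expensive calculation: wait a random time, then uppercase."""
    # A random delay in [min_wait, max_wait] seconds stands in for the real work.
    sleep(random.uniform(min_wait, max_wait))
    # The "solution" is simply the job string converted to uppercase.
    return job.upper()
```

Calling mock_engine("first_job") returns "FIRST_JOB" after a pause of 1 to 5 seconds.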

4.1.2. Main Node

This node simulates a Main Node. It does not use real AML Jobs, but strings. It is implemented in Python using the amlip_py API. There are two ways to run it, an automatic one and a manual one:

4.1.2.1. Automatic version

In this version, the Python executable expects command-line arguments. It converts each argument to a string (str) and sends it as a Job. Once all arguments have been sent, it finishes execution and destroys the Node.
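The control flow of the automatic version might look like the sketch below. It deliberately avoids the amlip_py API: send_job is a hypothetical callable that stands in for whatever call sends a Job and blocks until its solution arrives, and run_automatic is an illustrative name, not part of the demo.

```python
import sys
from typing import Callable, Iterable, List


def run_automatic(args: Iterable[object],
                  send_job: Callable[[str], str]) -> List[str]:
    """Send each argument as a job, in order, and collect the solutions."""
    solutions = []
    for arg in args:
        job = str(arg)                   # each argument is converted to str
        solutions.append(send_job(job))  # hypothetical: blocks until solved
    return solutions                     # no arguments left: caller can shut down


if __name__ == "__main__":
    # str.upper is a stand-in for a real Computing Node answering the job.
    for solution in run_automatic(sys.argv[1:], send_job=str.upper):
        print(solution)
```

Because jobs are sent one at a time, the second job is not sent until the first solution has been received, which matches the ordering of the expected output shown in the Run demo section.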

4.1.2.2. Manual version

In this version, the Python program expects keyboard input. It converts each line of input to a string (str) and sends it as a Job. When an empty string is given, it finishes execution and destroys the Node.
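A comparable sketch for the manual version follows, under the same assumptions: send_job is a hypothetical stand-in for the call that sends a Job and waits for its solution, and read_line defaults to the built-in input but can be replaced for testing. End-of-file on standard input is not handled here.

```python
from typing import Callable, List


def run_manual(send_job: Callable[[str], str],
               read_line: Callable[[], str] = input) -> List[str]:
    """Read lines until an empty one arrives, sending each as a job."""
    solutions = []
    while True:
        line = read_line()
        if line == "":
            break  # empty input: stop reading so the caller can shut down
        solutions.append(send_job(str(line)))
    return solutions
```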

4.1.3. Computing Node

This node simulates a Computing Node. It does not use real AML Jobs, but strings. It has no real AML Engine; instead, the calculation is an uppercase conversion of the string received. It is implemented in C++ using the amlip_cpp API.

To run it, one integer argument is required. This is the number of jobs the Node answers before finishing its execution and being destroyed.
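The countdown described above can be illustrated as follows. The real node is written in C++ with amlip_cpp; this Python sketch only mirrors its control flow, and the names run_computing_node, incoming_jobs and solve are hypothetical. An iterator stands in for blocking until the next job arrives from a Main Node.

```python
from typing import Callable, Iterator, List


def run_computing_node(n_jobs: int,
                       incoming_jobs: Iterator[str],
                       solve: Callable[[str], str] = str.upper) -> List[str]:
    """Answer exactly n_jobs jobs, then return."""
    answered = []
    remaining = n_jobs
    while remaining > 0:
        job = next(incoming_jobs)  # stands in for waiting on a Main Node
        answered.append(solve(job))
        remaining -= 1
        print(f"answered task. {remaining} remaining.")
    return answered
```

Note that the loop ends only when the counter reaches zero, so a node started with a count larger than the number of jobs ever sent keeps waiting.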

4.2. Installation

First of all, check that the amlip_demo_nodes sub-package is correctly installed. If it is not, please refer to Build demos.

4.3. Run demo

The demo that is presented here follows the schema of the figure below:

[Figure: workload_distribution_basic_demo.png]

4.3.1. Run Main Node

Run the following commands:

# Source colcon installation
source install/setup.bash

# To execute Main Node to send 2 jobs
python3 ./install/amlip_demo_nodes/bin/main_node.py first_job "second job"

Note that this node waits until a Computing Node is running and available in the same LAN to solve the jobs. The expected output is similar to the following:

Main Node AMLMainNode.aa.a5.47.fe ready.
Main Node AMLMainNode.aa.a5.47.fe sending task <first_job>.
# ... Waits for Computing Node
Main Node received solution from AMLComputingNode.d1.c3.86.0a for job <first_job> => <FIRST_JOB>.
Main Node AMLMainNode.aa.a5.47.fe sending task <second job>.
Main Node received solution from AMLComputingNode.d1.c3.86.0a for job <second job> => <SECOND JOB>.
Main Node AMLMainNode.aa.a5.47.fe closing.

4.3.2. Run Computing Node

Run the following commands to answer 2 jobs before closing:

# Source colcon installation
source install/setup.bash

# To execute Computing Node to answer 2 jobs
./install/amlip_demo_nodes/bin/computing_node 2

Note that this node waits until it has solved 2 different jobs. If more than one Computing Node is running, each job is solved by only one of them. The expected output is similar to the following:

Computing Node ID{AMLComputingNode.d1.c3.86.0a} computing 2 tasks.
# ... Waits for Main Node
 Received Job: <first_job>. Processing...
 Answering Solution: <FIRST_JOB>.
Computing Node ID{AMLComputingNode.d1.c3.86.0a} answered task. 1 remaining.
 Received Job: <second job>. Processing...
 Answering Solution: <SECOND JOB>.
Computing Node ID{AMLComputingNode.d1.c3.86.0a} answered task. 0 remaining.
Computing Node ID{AMLComputingNode.d1.c3.86.0a} closing.
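
The property that each job is solved by only one Computing Node can be pictured with a shared work queue. The sketch below is an analogy using Python threads, not a description of how AML-IP assigns jobs internally; the function name distribute and the worker model are assumptions made for illustration.

```python
import queue
import threading
from collections import Counter


def distribute(jobs, n_workers):
    """Return a Counter of how many times each job was solved."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)
    solved = Counter()
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = pending.get_nowait()  # only one worker obtains each job
            except queue.Empty:
                return                      # nothing left to solve
            with lock:
                solved[job] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return solved
```

With distinct job strings, every count in the result is 1 regardless of how many workers run, although which worker takes which job is not deterministic.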

4.4. Bigger scenarios

There is no limit on the number of nodes of each kind that can run in the same network. However, these nodes are not designed to close gracefully if they do not finish their tasks, so choose the number of jobs sent so that every node completes its work and closes cleanly.
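
One way to plan such a scenario is to compare totals before launching anything. The helper below is a hypothetical planning aid, not part of the demo. It rests on an inference from the behavior described earlier: a Main Node waits for a solution to every job it sends, and a Computing Node waits until it has answered its configured number of jobs, so the two totals should match.

```python
from typing import Sequence


def jobs_balance(jobs_sent_per_main: Sequence[int],
                 jobs_answered_per_computing: Sequence[int]) -> int:
    """Return jobs sent minus jobs answered; 0 means all nodes can finish."""
    return sum(jobs_sent_per_main) - sum(jobs_answered_per_computing)
```

For example, two Main Nodes sending 2 and 3 jobs are balanced by Computing Nodes started with 1 and 4, since jobs_balance([2, 3], [1, 4]) is 0. A positive result suggests some Main Node would be left waiting for solutions; a negative one suggests some Computing Node would be left waiting for jobs.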