2. TensorFlow Inference

2.1. Background

Inference is the process of using a trained model to make predictions or draw conclusions from input data, applying the knowledge and statistical relationships encoded in the model to new, unseen data. Inferring an image means passing it through a trained AI model to obtain a classification based on the patterns the model has learned.

This demo shows how to implement two types of nodes, Edge Node and Inference Node, to perform TensorFlow inference on a given image. With these two nodes implemented, the user can deploy as many nodes of each kind as desired and check the behavior of a simulated AML-IP network in operation.

The demo that is presented here follows the schema of the figure below:

../../_images/tensor_inference_demo.png
  • TensorFlow is an end-to-end machine learning platform with pre-trained models.

  • Edge Node simulates an AML-IP Edge Node. It is implemented in Python using the amlip_py API.

  • Inference Node simulates an AML-IP Inference Node. It is implemented in Python using the amlip_py API.

2.2. Prerequisites

First of all, check that the amlip_tensorflow_inference_demo sub-package is correctly installed. If it is not, please refer to Build demos.

The demo requires the following tools to be installed in the system:

sudo apt install -y  swig alsa-utils libopencv-dev
pip3 install -U pyttsx3 opencv-python
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# For changes to take effect, close and re-open your current shell.
conda create --name tf python=3.9
conda activate tf
conda install -c conda-forge cudatoolkit=11.8.0
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python3 -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

Ensure that you have TensorFlow and TensorFlow Hub installed in your Python environment before proceeding. You can install them using pip by executing the following command:

pip3 install tensorflow tensorflow-hub tensorflow-object-detection-api nvidia-cudnn-cu11==8.6.0.163 protobuf==3.20.*
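
To verify the installation, you can run a quick check such as the one below (a minimal sketch that only confirms the packages import and reports whether TensorFlow detects a GPU):

import tensorflow as tf
import tensorflow_hub as hub

# Print the installed versions and the GPUs visible to TensorFlow
# (an empty list means inference will run on CPU)
print('TensorFlow version:', tf.__version__)
print('TensorFlow Hub version:', hub.__version__)
print('GPUs visible to TensorFlow:', tf.config.list_physical_devices('GPU'))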

Additionally, the TensorFlow model needs to be downloaded from TensorFlow Hub. To obtain it, follow the steps below:

cd ~/AML-IP-ws/src/AML-IP/amlip_demo_nodes/amlip_tensorflow_inference_demo/resource/tensorflow/models/
wget -O centernet_hourglass_512x512_kpts_1.tar.gz https://tfhub.dev/tensorflow/centernet/hourglass_512x512_kpts/1?tf-hub-format=compressed
mkdir centernet_hourglass_512x512_kpts_1
tar -xvf centernet_hourglass_512x512_kpts_1.tar.gz -C ./centernet_hourglass_512x512_kpts_1
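
As an optional sanity check, the extracted model can be loaded once with TensorFlow Hub. The snippet below is a sketch that assumes the workspace path used throughout this demo:

import os
import tensorflow_hub as hub

# Path to the extracted SavedModel (assumes the workspace layout of this demo)
model_path = os.path.expanduser(
    '~/AML-IP-ws/src/AML-IP/amlip_demo_nodes/amlip_tensorflow_inference_demo/'
    'resource/tensorflow/models/centernet_hourglass_512x512_kpts_1')
detector = hub.load(model_path)
print('Model loaded correctly:', detector is not None)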

2.3. Building the demo

To build the demo, build the packages with Colcon:

colcon build --packages-up-to amlip_demo_nodes

Once the AML-IP packages are installed and built, source the workspace using the following command:

source install/setup.bash
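
If everything went well, the AML-IP Python bindings used in the rest of this demo should now be importable. A quick optional check:

# Verify that the amlip_py modules used by both demo nodes are available
from amlip_py.node.AsyncEdgeNode import AsyncEdgeNode
from amlip_py.node.AsyncInferenceNode import AsyncInferenceNode

print('amlip_py imports OK')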

2.4. Explaining the demo

In this section, we will explore and explain the demo in detail.

2.4.1. Edge Node

Edge Node serves as the entity responsible for sending the data to be inferred to the Inference Node. The Edge Node is typically located at the edge of a network or closer to the data source, such as a sensor or a device generating the data.

This is the Python code for the Edge Node application. This code can be found here.

The next block imports the Python modules required to use the AML-IP Python API.

from amlip_py.node.AsyncEdgeNode import AsyncEdgeNode, InferenceListenerLambda
from amlip_py.types.InferenceDataType import InferenceDataType

Let’s continue with the global variables. The waiter allows the node to wait until the inference is received. DOMAIN_ID isolates the execution, since only DomainParticipants with the same Domain Id are able to communicate with each other.

# Variable to wait to the inference
waiter = BooleanWaitHandler(True, False)

# Domain ID
DOMAIN_ID = 166

The definition of the inference_received function prints the details of the received inference.

def inference_received(
        inference,
        task_id,
        server_id):
    print(f'Edge Node received inference from {server_id}')
    print(f'Edge Node received inference {inference.to_string()}')
    waiter.open()

We define the main function.

def main():

First, we create an instance of AsyncEdgeNode. The constructor receives the node name, a listener (an InferenceListenerLambda object wrapping the inference_received function declared above, which is called each time an inference is received) and the domain, set to the DOMAIN_ID variable.

    node = AsyncEdgeNode(
        'AMLAsyncEdgeNode',
        listener=InferenceListenerLambda(inference_received),
        domain=DOMAIN_ID)

The next code block loads the image with cv2.imread from the specified image_path. It encodes the image size as a text prefix and the image itself in base64, then concatenates the two into the payload sent to the Inference Node.

    current_path = os.path.abspath(__file__)
    image_path = current_path.split('amlip_tensorflow_inference_demo', -1)[0]\
        + 'amlip_tensorflow_inference_demo/resource/tensorflow/models/research\
/object_detection/test_images/dog.jpg'
    img = cv2.imread(image_path)
    width = img.shape[1]
    height = img.shape[0]

    # Convert size to bytes
    str_size = str(width) + ' ' + str(height) + ' | '
    bytes_size = bytes(str_size, 'utf-8')
    # Convert image to bytes
    img_bytes = base64.b64encode(img)
    # Size + images
    img_size_bytes = bytes_size + img_bytes
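
To make the payload format explicit, the following self-contained sketch (using a dummy image, not part of the demo code) shows how the '<width> <height> | <base64 image>' payload built above can be decoded back on the receiving side:

import base64
import numpy as np

# Dummy 6x4 BGR image standing in for the one read with cv2.imread
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Encode: '<width> <height> | ' prefix followed by the base64-encoded pixels
payload = bytes(f'{img.shape[1]} {img.shape[0]} | ', 'utf-8') + base64.b64encode(img)

# Decode: split on ' | ', parse the size, rebuild the numpy array
size_str, image_str = payload.decode('utf-8').split(' | ', 1)
width, height = (int(v) for v in size_str.split())
decoded = np.frombuffer(base64.b64decode(image_str), dtype=np.uint8).reshape((height, width, 3))
assert (decoded == img).all()

Note that process_inference in the Inference Node below unpacks the two size fields into variables named in the opposite order; since the payload stores the width first, the resulting reshape is the same.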

After that, the request_inference method is called to request the inference of the image.

    task_id = node.request_inference(InferenceDataType(img_size_bytes))

Finally, the program waits for the inference solution using waiter.wait.

    waiter.wait()

Once the solution is received, the execution finishes.

2.4.2. Inference Node

The Inference Node is responsible for making the inferences or predictions on the data it receives using a TensorFlow model. The Inference Node is typically a server or a computing resource equipped with high-performance hardware optimized for executing machine learning models efficiently.

This is the Python code for the Inference Node application. This code can be found here.

The next block imports the Python modules required to use the AML-IP Python API.

from amlip_py.node.AsyncInferenceNode import AsyncInferenceNode, InferenceReplierLambda
from amlip_py.types.InferenceSolutionDataType import InferenceSolutionDataType

Let’s continue with the global variables. DOMAIN_ID isolates the execution, since only DomainParticipants with the same Domain Id are able to communicate with each other. tolerance sets a threshold: detections with a score (in percent) below it are ignored.

# Domain ID
DOMAIN_ID = 166

# Not take into account detections with less probability than tolerance
tolerance = 25

The next block loads the model downloaded from TensorFlow Hub, together with the label map used to name the detected classes, from their paths inside the demo package.

current_path = os.path.abspath(__file__)
# Initialise model
path = current_path.split('amlip_tensorflow_inference_demo', -1)[0]\
                        + 'amlip_tensorflow_inference_demo/resource/\
tensorflow/models/centernet_hourglass_512x512_kpts_1'
dataset = current_path.split('amlip_tensorflow_inference_demo', -1)[0]\
                           + 'amlip_tensorflow_inference_demo/resource/\
tensorflow/models/research/object_detection/data/mscoco_label_map.pbtxt'

print('Model Handle at TensorFlow Hub: {}'.format(path))
print('loading model...')
hub_model = hub.load(path)

The process_inference function is responsible for computing the inference when data is received. Inference is performed using the input data and the loaded model. Note that detected objects are filtered based on the specified tolerance.

def process_inference(
        inference,
        task_id,
        client_id):
    # Size | Image
    height, width = (inference.to_string().split(' | ', 1)[0]).split()
    image_str = inference.to_string().split(' | ', 1)[1]
    # Convert string to bytes
    img_bytes = base64.b64decode(image_str)
    # Convert bytes to image
    image = np.frombuffer((img_bytes), dtype=np.uint8).reshape((int(width), int(height), 3))
    string_inference = ''
    image_np = np.array(image).reshape((1, int(width), int(height), 3))
    results = hub_model(image_np)
    result = {key: value.numpy() for key, value in results.items()}
    category_index = label_map_util.create_category_index_from_labelmap(dataset,
                                                                        use_display_name=True)
    classes = (result['detection_classes'][0]).astype(int)
    scores = result['detection_scores'][0]
    for i in range(result['detection_boxes'][0].shape[0]):
        if (round(100*scores[i]) > tolerance):
            boxes = result['detection_boxes'][0]
            box = tuple(boxes[i].tolist())
            ymin, xmin, ymax, xmax = box
            string_inference = string_inference + \
                'Box [({}, {}), ({}, {})] {}: {}% \n' \
                .format(xmin, ymin, xmax, ymax, category_index[classes[i]]['name'],
                        round(100*scores[i]))
    print('Inference ready!')
    print('sending inference: ' + string_inference)
    return InferenceSolutionDataType(string_inference)

We define the main function.

def main():

We create an instance of AsyncInferenceNode. The constructor receives the name AMLInferenceNode, a listener (an InferenceReplierLambda wrapping process_inference, so that function is called to serve each inference request) and the domain, set to the DOMAIN_ID variable.

    node = AsyncInferenceNode(
        'AMLInferenceNode',
        listener=InferenceReplierLambda(process_inference),
        domain=DOMAIN_ID)

This starts the inference node. It will start listening for incoming inference requests and call the process_inference function to handle them.

    node.run()

Finally, the program waits for a SIGINT signal (Ctrl+C), then stops and closes the node.

    def handler(signum, frame):
        pass
    signal.signal(signal.SIGINT, handler)
    signal.pause()

    node.stop()

2.5. Run demo

This section describes how to run the nodes implemented in amlip_demo_nodes/amlip_tensorflow_inference_demo.

2.5.1. Run Edge Node

In the first terminal, run the Edge Node with the following command:

# Source colcon installation
source install/setup.bash

# To execute Edge Node to send an image to be inferred
cd ~/AML-IP-ws/src/AML-IP/amlip_demo_nodes/amlip_tensorflow_inference_demo/amlip_tensorflow_inference_demo
python3 edge_node_async.py

Take into account that this node will wait until there is an Inference Node running and available in the same LAN in order to process the inference. The expected output is the following:

Edge Node AMLEdgeNode.fb.d4.38.13 ready.
Edge Node AMLEdgeNode.fb.d4.38.13 sending data.

Edge Node received inference from AMLInferenceNode.b8.34.4d.a3
Edge Node received inference:
Box [(0.15590962767601013, 0.21641747653484344), (0.7388607263565063, 0.7326743006706238)] bicycle: 97%
Box [(0.16968876123428345, 0.38129815459251404), (0.403958797454834, 0.9422630071640015)] dog: 92%
Box [(0.6158109307289124, 0.13117200136184692), (0.9053990244865417, 0.2978983521461487)] truck: 53%
Box [(0.6158109307289124, 0.13117200136184692), (0.9053990244865417, 0.2978983521461487)] car: 48%
Box [(0.8892407417297363, 0.19558095932006836), (0.933372974395752, 0.2684069573879242)] potted plant: 34%
Box [(0.0753115713596344, 0.15651819109916687), (0.13415342569351196, 0.22736744582653046)] motorcycle: 32%

Edge Node AMLEdgeNode.fb.d4.38.13 closing.

2.5.2. Run Inference Node

In the second terminal, run the following command to process the inference:

# Source colcon installation
source install/setup.bash

# To execute Inference Node with pre-trained model from TensorFlow
cd ~/AML-IP-ws/src/AML-IP/amlip_demo_nodes/amlip_tensorflow_inference_demo/amlip_tensorflow_inference_demo
python3 inference_node_async.py

The execution expects an output similar to the one shown below:

2023-02-14 14:50:42.711797: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Inference Node AMLInferenceNode.b8.34.4d.a3 ready.
Model Handle at TensorFlow Hub: /home/user/AML-IP-ws/src/AML-IP/amlip_demo_nodes/amlip_tensorflow_inference_demo/resource/tensorflow/models/centernet_hourglass_512x512_kpts_1
loading model...
WARNING:absl:Importing a function (__inference_batchnorm_layer_call_and_return_conditional_losses_42408) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_batchnorm_layer_call_and_return_conditional_losses_209416) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_batchnorm_layer_call_and_return_conditional_losses_220336) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
...
WARNING:absl:Importing a function (__inference_batchnorm_layer_call_and_return_conditional_losses_55827) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
WARNING:absl:Importing a function (__inference_batchnorm_layer_call_and_return_conditional_losses_56488) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
model loaded!
Selected model:tensorflow
2023-02-14 14:51:14.165305: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:907] Skipping loop optimization for Merge node with control input: StatefulPartitionedCall/cond/then/_918/cond/Assert_2/AssertGuard/branch_executed/_1123
inference ready!
sending inference:
Box [(0.15590962767601013, 0.21641747653484344), (0.7388607263565063, 0.7326743006706238)] bicycle: 97%
Box [(0.16968876123428345, 0.38129815459251404), (0.403958797454834, 0.9422630071640015)] dog: 92%
Box [(0.6158109307289124, 0.13117200136184692), (0.9053990244865417, 0.2978983521461487)] truck: 53%
Box [(0.6158109307289124, 0.13117200136184692), (0.9053990244865417, 0.2978983521461487)] car: 48%
Box [(0.8892407417297363, 0.19558095932006836), (0.933372974395752, 0.2684069573879242)] potted plant: 34%
Box [(0.0753115713596344, 0.15651819109916687), (0.13415342569351196, 0.22736744582653046)] motorcycle: 32%

Inference sent to client AMLEdgeNode.fb.d4.38.13.

Warning

If you encounter an output similar to the one below, follow the instructions in the Troubleshooting section (TensorFlow using old API):

terminate called after throwing an instance of 'Swig::DirectorMethodException'
    what():  SWIG director method error. In method 'process_inference': AttributeError: module 'tensorflow' has no attribute 'gfile'
Aborted (core dumped)

2.5.3. Next steps

Based on the information acquired, we have successfully generated the following image:

../../_images/inferred_image.png

2.6. Run multiple nodes of each kind

One of the advantages inherent to this architecture is its ability to support multiple models operating concurrently across multiple Inference Nodes, while several Edge Nodes request inferences in parallel. This design yields a highly efficient and scalable system, enabling diverse inference tasks to be executed in a distributed manner.
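
As an illustration only, the sketch below starts several Inference Nodes from one script, reusing the process_inference callback and DOMAIN_ID defined earlier. It assumes that several AsyncInferenceNode instances can coexist in a single process; the simpler alternative is just to launch inference_node_async.py and edge_node_async.py in as many terminals as desired:

from amlip_py.node.AsyncInferenceNode import AsyncInferenceNode, InferenceReplierLambda

# process_inference and DOMAIN_ID are the ones defined earlier in this demo
nodes = [
    AsyncInferenceNode(
        f'AMLInferenceNode_{i}',
        listener=InferenceReplierLambda(process_inference),
        domain=DOMAIN_ID)
    for i in range(3)
]

# Start every node; each one serves inference requests independently
for node in nodes:
    node.run()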

2.7. How to use your own model

To use your own model, simply download it and load it by passing the path to the function:

hub_model = hub.load(your_model_path)
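
For instance, hub.load accepts either a local directory containing an extracted SavedModel or a TensorFlow Hub handle, which is downloaded and cached automatically. The local path below is only illustrative; the URL is the same model already used in this demo:

import os
import tensorflow_hub as hub

# Option 1: a local directory with an extracted SavedModel (hypothetical path)
hub_model = hub.load(os.path.expanduser('~/models/my_detector'))

# Option 2: a TensorFlow Hub handle; the model is downloaded and cached
hub_model = hub.load('https://tfhub.dev/tensorflow/centernet/hourglass_512x512_kpts/1')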

2.8. Troubleshooting

2.8.1. TensorFlow using old API

Please be aware that Simple TensorFlow Serving is currently not compatible with TensorFlow 2.0 due to its reliance on the older API. Note that in TensorFlow 2.0 the gfile package has been relocated under the tf.io module. Therefore, if you intend to use TensorFlow 2.0, take this change in the package structure into account and update your code accordingly. Check the following issue for further information.

To update the code, please follow these steps:

  1. Locate the file label_map_util.py. (default path: .local/lib/python3.10/site-packages/object_detection/utils/label_map_util.py)

  2. Navigate to line 132 within the file.

  3. Replace tf.gfile.GFile with tf.io.gfile.GFile.
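
Alternatively, if you prefer not to edit the installed file, the following workaround sketch (not part of the demo code) aliases the old name at the top of the inference script, before any object_detection module is imported:

import tensorflow as tf

# Re-expose the TF 1.x gfile name on top of tf.io.gfile so that older code
# calling tf.gfile.GFile keeps working under TensorFlow 2.x
if not hasattr(tf, 'gfile'):
    tf.gfile = tf.io.gfile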

2.9. Next Steps

Now you can develop more functionalities in your application. See also this tutorial which explains how to take the image from a ROSbot 2R Camera.