
Demonstrators

Within the KiWi project, a number of demonstrators are being developed in cooperation with industry and public partners in the fields of agricultural equipment technology, service robotics, oncology, mechanical and plant engineering, and predictive quality. The demonstrators are presented below:

Collaborative robot

Monitoring the human worker in collaboration with the robot

The use of robots in highly automated manufacturing continues to increase, and direct collaboration between humans and robots will play an increasingly important role. The robot takes over repetitive, monotonous tasks, while the human focuses on specialized or difficult tasks that robots cannot yet perform. Safe collaboration rests on two factors: the worker must be aware of the robot's range of motion and forces, which are deliberately limited to minimize risk, and the robot in turn must be aware of the human worker, i.e. it must be able to recognize and assess the worker. To this end, a system has been developed that monitors the human worker using computer vision. It currently considers the following factors:

  1. Is an authorized worker present in front of the robot?
  2. What emotion is the worker displaying (neutral, friendly, startled, fearful, etc.)?
  3. Does the worker show signs of fatigue?

[Diagram: possible robot reactions based on the recognized characteristics]

Technical implementation

Neural networks alone are used to recognize the worker's various features. Thanks to their data-driven approach, they offer decisive advantages over traditional image processing algorithms: state-of-the-art recognition rates, effective scaling with the size of the data set, and simple transfer to new use cases. Owing to increasingly powerful edge hardware (Nvidia Jetson), recognition runs in near real time (approx. 25 ms). The following steps show the structure of the neural network pipeline and the architectures used.
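As a rough orientation beforehand, here is a minimal, self-contained sketch of such a per-frame loop in Python. The stage functions are dummy placeholders for the models described in steps 1 to 4 below, and the latency measurement is purely illustrative:

```python
import time

# Dummy stand-ins for the four recognition stages described below.
def detect_face(frame):      return frame      # step 1: SSD face detector, approx. 5 ms
def classify_emotion(face):  return "neutral"  # step 2: ResNet18 on FER2013, approx. 3 ms
def identify_worker(face):   return True       # step 3: transfer-learned CNN, approx. 3 ms
def check_fatigue(face):     return False      # step 4: dlib landmarks + eye check, approx. 8 ms

def process_frame(frame):
    """Run all stages on one camera frame and report the total latency."""
    t0 = time.perf_counter()
    face = detect_face(frame)
    result = {
        "authorized": identify_worker(face),
        "emotion": classify_emotion(face),
        "fatigued": check_fatigue(face),
    }
    result["latency_ms"] = (time.perf_counter() - t0) * 1000.0
    return result

print(process_frame(frame=None))  # dummy frame; a real system would read from a camera
```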

Step 1

The worker's face is detected and cropped from the image using a bounding box. A pre-trained face detection model based on an SSD architecture with a MobileNet backbone is used for this step.

Resolution: 320x240 | Recognition time: 5 ms
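A minimal sketch of this step using OpenCV's DNN module. The model file names are assumptions (any pre-trained SSD face detector in Caffe format fits this pattern), not the project's actual files:

```python
import cv2
import numpy as np

# Hypothetical model files for a pre-trained SSD face detector.
net = cv2.dnn.readNetFromCaffe("face_deploy.prototxt", "face_ssd.caffemodel")

def detect_face(frame, conf_threshold=0.5):
    """Return the bounding box (x1, y1, x2, y2) of the most confident face."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (320, 240)), 1.0,
                                 (320, 240), (104.0, 117.0, 123.0))
    net.setInput(blob)
    detections = net.forward()  # SSD output shape: (1, 1, N, 7)
    best = None
    for i in range(detections.shape[2]):
        conf = detections[0, 0, i, 2]
        if conf > conf_threshold and (best is None or conf > best[0]):
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            best = (conf, box.astype(int))
    return best[1] if best is not None else None

frame = cv2.imread("worker.jpg")      # hypothetical test image
box = detect_face(frame)
if box is not None:
    x1, y1, x2, y2 = box
    face_crop = frame[y1:y2, x1:x2]   # cropped face, input for steps 2 to 4
```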

Step 2

The worker's emotion is recognized. A convolutional neural network (CNN) with a ResNet18 backbone, trained on the FER2013 data set, is used for this.

Resolution: 48x48 | Recognition time: 3 ms
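A sketch of how such a classifier can be set up in PyTorch, assuming the standard seven FER2013 emotion classes; adapting the first convolution to one-channel 48x48 input is an assumption about the exact architecture:

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# The seven FER2013 classes, in the data set's usual order.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

model = models.resnet18(weights=None)
# Assumption: first conv adapted to grayscale input, head to 7 classes.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, len(EMOTIONS))
model.eval()  # trained weights would be loaded here in the real system

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Grayscale(),
    transforms.Resize((48, 48)),
    transforms.ToTensor(),
])

def classify_emotion(face_crop):
    """face_crop: HxWx3 uint8 array from step 1."""
    x = preprocess(face_crop).unsqueeze(0)  # shape (1, 1, 48, 48)
    with torch.no_grad():
        logits = model(x)
    return EMOTIONS[logits.argmax(dim=1).item()]
```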

Step 3

Authorized workers are recognized. Using transfer learning, the backbone of the CNN from step 2 is reused and fine-tuned to recognize the authorized workers.

Resolution: 48x48 | Recognition time: 3 ms
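A transfer learning sketch in PyTorch: the backbone is frozen and only a new classification head is trained on the authorized workers. The number of workers and the checkpoint name are hypothetical placeholders:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_WORKERS = 5  # hypothetical number of authorized workers

# Rebuild the step 2 backbone (one-channel input); in practice the trained
# emotion weights would be loaded before fine-tuning.
id_model = models.resnet18(weights=None)
id_model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# id_model.load_state_dict(torch.load("emotion_resnet18.pt"))  # hypothetical checkpoint
for param in id_model.parameters():
    param.requires_grad = False  # freeze the pre-trained backbone
id_model.fc = nn.Linear(id_model.fc.in_features, NUM_WORKERS)  # new trainable head

optimizer = torch.optim.Adam(id_model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(faces, labels):
    """One fine-tuning step on a batch of 48x48 grayscale worker face crops."""
    optimizer.zero_grad()
    loss = criterion(id_model(faces), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone keeps the features learned on FER2013 intact, so only a small amount of data per worker is needed for the new head.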

Step 4

Fatigue is detected. Relevant facial landmarks are located with the dlib library, the eye region is extracted from them and analyzed, and if the eyes close, a fatigue warning is issued.

Resolution: 112x112 | Recognition time: 8 ms
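A sketch of this step with dlib and its standard 68-point landmark predictor (downloaded separately). The text above only states that the eye region is analyzed; the eye aspect ratio (EAR) criterion and the 0.2 threshold used here are one common way to operationalize "eyes closed", not necessarily the project's exact method:

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eye_aspect_ratio(eye):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); small values mean a closed eye."""
    a = np.linalg.norm(eye[1] - eye[5])
    b = np.linalg.norm(eye[2] - eye[4])
    c = np.linalg.norm(eye[0] - eye[3])
    return (a + b) / (2.0 * c)

def eyes_closed(gray_frame, ear_threshold=0.2):
    """gray_frame: grayscale uint8 image; True if the first face's eyes are (near-)closed."""
    for rect in detector(gray_frame, 0):
        shape = predictor(gray_frame, rect)
        pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
        left, right = pts[36:42], pts[42:48]  # eye indices in the 68-point model
        ear = (eye_aspect_ratio(left) + eye_aspect_ratio(right)) / 2.0
        return ear < ear_threshold
    return False
```

In practice, a warning would typically be raised only when the EAR stays below the threshold for several consecutive frames, to distinguish normal blinking from drowsiness.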