Vincenza Tufano, Annalisa Letizia, Gianluca Toscano
The goal of this work is to estimate the respiratory rate by reconstructing the respiratory wave and blood volume pulse signals from facial videos of subjects, for applications in intensive care or in situations where skin sensors cannot be applied. Respiratory rate is a vital parameter whose monitoring is essential, since changes in it can, in most cases, be associated with a pathological condition.
With the development of telemedicine, and especially telehealth, technology has laid the foundations for remote monitoring of vital signs. This benefits not only patient health and the efficiency of the service provided but also the economy, since it reduces hospitalization costs. The intended application required investigating contactless techniques, which avoid contact between the sensor and the patient’s body, thereby avoiding the risk of skin infections and allowing patients to be monitored remotely.
After a bibliographic research phase, the technique that best meets our needs was identified on the basis of criteria illustrated in the next chapter. It achieves good results not only for respiratory rate estimation but also for heart rate estimation, since it can work in multi-task mode. The strategy is based on measuring subtle changes in the light reflected by the skin, as captured by a camera. To emphasize the regions of the face of greatest interest, a spatial attention mechanism generates attention masks that assign higher weights to the areas of interest; these masks are subsequently combined with the features extracted from the motion branch. The idea behind the project is to start from the chosen method and then modify it to improve its performance.
The specific goal is to improve the attention maps generated by the aspect branch. This can be achieved by converting frames from the RGB colour space to the YUV colour space.
The YUV colour space has three channels: luminance (Y) and two chroma components (U, V). By extracting only the V channel of each frame and multiplying it by the attention maps, it is possible to improve the identification of the regions of interest, and thus the learning and prediction of the network.
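The V-channel weighting described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the BT.601 conversion coefficients, the [0, 1] input range, the array shapes, and the function names `v_channel` and `weight_attention_with_v` are all assumptions, since the text does not specify them.

```python
import numpy as np


def v_channel(frame_rgb):
    """Return the V (red-difference chroma) channel of an RGB frame.

    Assumes frame_rgb has shape (H, W, 3) with values in [0, 1] and uses
    the BT.601 definition V = 0.877 * (R - Y); the coefficients actually
    used in the work are not stated.
    """
    rgb = np.asarray(frame_rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luminance
    return 0.877 * (r - y)


def weight_attention_with_v(attention_mask, frame_rgb):
    """Multiply an (H, W) attention mask element-wise by the frame's V channel.

    Assumes the mask and the frame share the same spatial size; in a real
    network the frame would first be resized to the mask resolution.
    """
    mask = np.asarray(attention_mask, dtype=np.float64)
    v = v_channel(frame_rgb)
    if mask.shape != v.shape:
        raise ValueError("attention mask and frame must have the same H x W size")
    return mask * v
```

Note that V defined this way is signed (it is zero for grey pixels and positive for reddish ones), so any rescaling applied before the multiplication is a further design choice that this sketch leaves open.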
Development of a model for the prediction of heart and respiratory rate starting from the reconstruction of blood volume pulse and respiratory wave signals using facial videos of subjects.
- Bibliographic research
- State-of-the-art analysis
- Pre-processing of signals
- Modification of the architecture of the neural network
- Model training
- Post-processing of signals
- Results analysis
- Implementation of the model in real-time mode
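As an illustration of the post-processing step, a rate can be read off a reconstructed signal by locating the dominant spectral peak within a physiologically plausible band. The sketch below is one common approach and is not necessarily the one adopted in this work; the function name `dominant_rate_per_minute`, the sampling rate, and the band limits are assumptions made for the example.

```python
import numpy as np


def dominant_rate_per_minute(signal, fs, band_hz):
    """Estimate a rate (cycles per minute) from the strongest spectral peak.

    signal  : 1-D reconstructed waveform (respiratory wave or blood volume pulse)
    fs      : sampling rate in Hz (for video, the frame rate)
    band_hz : (low, high) frequency band in Hz to search; an assumed choice
    """
    x = np.asarray(signal, dtype=np.float64)
    x = x - x.mean()  # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    low, high = band_hz
    in_band = (freqs >= low) & (freqs <= high)
    if not in_band.any():
        raise ValueError("no frequency bins fall inside the requested band")
    peak_hz = freqs[in_band][np.argmax(power[in_band])]
    return 60.0 * peak_hz
```

For example, with a 30 Hz video one might search roughly 0.08 to 0.5 Hz for respiration and 0.7 to 3 Hz for heart rate; these limits are illustrative values, not taken from the text.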
Modifying the attention masks of a model already present in the literature improved the prediction of the respiratory rate, reaching an MAE of 1.34, and gave good results for heart rate estimation, with an MAE of 1.79. The results obtained for the heart rate are in line with the other methods in the literature, while for the respiratory rate a higher measurement accuracy was achieved than with the other methods analysed.
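For reference, the mean absolute error (MAE) reported above is the average absolute difference between predicted and reference rates. A minimal sketch of the computation follows; the function name and the numbers in the usage line are purely illustrative and are not data from this work.

```python
import numpy as np


def mean_absolute_error(predicted, reference):
    """Return the mean of |predicted - reference| over all paired samples."""
    predicted = np.asarray(predicted, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")
    return float(np.mean(np.abs(predicted - reference)))


# Hypothetical rates for three video windows (not results from this work).
example_mae = mean_absolute_error([15.0, 18.0, 12.0], [16.0, 18.0, 14.0])
```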
- Testing on other datasets
- Elimination of the aspect branch
- Reduction of the prediction delay for the real-time operating mode