According to media reports, video-calling systems can switch views to highlight whoever is speaking, but for silent languages such as sign language they cannot trigger those algorithms. A new Google study may change that: a real-time sign language detection engine that can tell when someone is signing and when they stop.
A new paper published at ECCV by Google researchers describes how to do this efficiently and with low latency. Sign language detection is of little use if it introduces video delays or degrades quality, so the researchers' goal was to keep the model both lightweight and reliable.
The system first runs the video through a model called PoseNet, which estimates the positions of the body and limbs in each frame. This simplified visual information is then sent to a model trained on pose data from videos of German Sign Language (DGS), which compares the live poses to what it expects sign language to look like.
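To make the idea concrete, here is a minimal, hypothetical sketch of the pipeline's final step: deciding from pose keypoints whether someone is signing. The keypoint format, motion metric, and threshold are illustrative assumptions, not details from the paper (which uses a trained model rather than a fixed threshold).

```python
# Hypothetical sketch: flag "signing" vs. "idle" from pose keypoints.
# A real system (like the one in the paper) would feed pose features
# into a trained lightweight classifier instead of a fixed threshold.

def frame_motion(prev, curr):
    """Mean Euclidean displacement of keypoints between two frames."""
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(prev, curr)) / len(prev)

def is_signing(frames, threshold=0.005):
    """Return True if average keypoint motion across the clip is high.

    `frames` is a list of per-frame keypoint lists [(x, y), ...],
    with coordinates normalized to [0, 1] image space.
    The threshold value here is an arbitrary illustration.
    """
    if len(frames) < 2:
        return False
    motions = [frame_motion(a, b) for a, b in zip(frames, frames[1:])]
    return sum(motions) / len(motions) > threshold

# A still person vs. one moving their hands:
still = [[(0.5, 0.5), (0.4, 0.6)]] * 10
moving = [[(0.5 + 0.01 * i, 0.5), (0.4, 0.6 - 0.01 * i)] for i in range(10)]
print(is_signing(still), is_signing(moving))  # False True
```

The appeal of working on pose keypoints rather than raw pixels is exactly what the article describes: each frame is reduced to a handful of coordinates, so the downstream model can stay small and fast enough for live calls.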
This simple process alone predicts whether a person is signing with about 80% accuracy; with some additional optimizations, accuracy rises to 91.5%.
To avoid requiring existing calling systems to support a new "someone is signing" signal, the system uses a clever trick: a virtual audio source produces a 20 kHz tone, which is beyond the range of human hearing but detectable by computer audio systems. The tone is emitted whenever a person is signing, so speech detection algorithms treat them as if they were speaking aloud.
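A minimal sketch of how such an ultrasonic tone could be synthesized, assuming a 44.1 kHz sample rate; the function name and amplitude are illustrative, and the paper's actual audio-injection mechanism may differ.

```python
import math

SAMPLE_RATE = 44100  # Hz; must exceed 40 kHz so a 20 kHz tone is representable
TONE_HZ = 20000      # above typical human hearing, below the Nyquist limit

def ultrasonic_tone(duration_s, amplitude=0.8):
    """Return float samples of a 20 kHz sine wave.

    In a hypothetical client, these samples would be mixed into the
    outgoing audio stream whenever the sign-language detector fires,
    so the platform's speaker-detection logic picks the signer up.
    """
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE)
            for i in range(n)]

samples = ultrasonic_tone(0.05)  # a 50 ms burst
print(len(samples))  # 2205
```

The key design point is that the trick needs no cooperation from the video-call platform: existing loudness-based speaker detection already reacts to any audio energy on the channel, audible or not.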
At the moment, the system is just a demo.