Watch Your Words

GazeReader is a low-cost solution that uses a webcam to detect words that learners of a second language do not know so they can get help.

Nick Bild
8 months ago • Machine Learning & AI

In today's interconnected world, the ability to communicate effectively in more than one language has become a highly sought-after skill. The benefits of learning a second language are vast, including enhanced cognitive abilities, improved job prospects, and increased cultural understanding. Despite the advantages, many people struggle with the challenge of learning a new language, often facing barriers that impede their progress. However, with the help of technological solutions, many of these obstacles can be overcome.

Online language courses, language learning apps, and language exchange platforms have become increasingly popular solutions for language learners. These tools offer learners the flexibility to study at their own pace, anytime and anywhere. Apps like Duolingo and Babbel use gamification to make language learning fun and engaging, while language exchange platforms like HelloTalk and Tandem provide opportunities for learners to practice speaking with native speakers in a supportive environment.

But even with these tools, learners must still contend with the sheer size of the vocabulary of the language they are studying. When reading, they frequently come across unfamiliar words, which breaks their concentration and hinders understanding.

A clever use of technology has helped with this particular issue. Dedicated eye-tracking devices can watch a learner as they read a text. When the reader's gaze stays fixed on a particular word for longer than would be expected, it can be inferred that they are probably struggling to understand it. This can be used to trigger, for example, an automated dictionary lookup that provides help without taking the reader completely off task.
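The dwell-time idea can be illustrated with a minimal Python sketch. It assumes an eye tracker has already produced a list of fixations, each tagged with the word it landed on and its duration. The function name `flag_long_dwell` and the threshold values `base_ms` and `per_char_ms` are hypothetical and chosen only for illustration; they do not come from any particular eye-tracking system.

```python
from collections import defaultdict


def flag_long_dwell(fixations, base_ms=250.0, per_char_ms=30.0):
    """Return words whose total dwell time exceeds a length-scaled threshold.

    fixations: iterable of (word_index, word, duration_ms) tuples.
    base_ms and per_char_ms are illustrative assumptions, not measured values.
    """
    totals = defaultdict(float)  # word_index -> summed fixation time in ms
    words = {}                   # word_index -> word text
    for index, word, duration_ms in fixations:
        totals[index] += duration_ms
        words[index] = word

    flagged = []
    for index in sorted(totals):
        # Longer words are allowed more reading time before being flagged.
        threshold = base_ms + per_char_ms * len(words[index])
        if totals[index] > threshold:
            flagged.append(words[index])
    return flagged


sample = [
    (0, "the", 120.0),
    (1, "ubiquitous", 400.0),
    (1, "ubiquitous", 350.0),  # the reader returned to the same word
    (2, "cat", 180.0),
]
print(flag_long_dwell(sample))  # ['ubiquitous']
```

A flagged word could then be passed to a dictionary lookup. A fixed threshold like this is only a starting point, since reading speed varies a great deal between readers.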

However, these systems are complex and expensive, which puts them out of reach of the typical student of a new language. A team at Tsinghua University took notice of this and worked out a new solution named GazeReader that leverages a simple, inexpensive webcam of the sort that most people already have. By analyzing the images from the webcam with machine learning algorithms, they showed that it was possible to determine which word in a block of text a reader is gazing at.

Webcams, however, provide relatively low-resolution data that is certainly not on par with dedicated eye-tracking systems. To achieve acceptable performance given this limitation, the team started with Brown University’s WebGazer library, which predicts gaze location from a webcam. To refine these predictions, this positional information is paired with the text that is being read, and embeddings are generated with a long short-term memory (LSTM) model.
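To make the pairing of gaze position and text concrete, here is a deliberately naive Python sketch that snaps a noisy gaze point to the closest word on screen. It is not GazeReader's method, which relies on learned embeddings, and it does not call WebGazer, which is a JavaScript library. It simply assumes that gaze coordinates and per-word bounding boxes, both in pixels, are already available. The names `WordBox`, `nearest_word`, and `max_distance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """Screen-space bounding box of one rendered word (pixels)."""
    word: str
    left: float
    top: float
    right: float
    bottom: float


def distance_to_box(x, y, box):
    """Euclidean distance from a point to a box (0 if the point is inside)."""
    dx = max(box.left - x, 0.0, x - box.right)
    dy = max(box.top - y, 0.0, y - box.bottom)
    return (dx * dx + dy * dy) ** 0.5


def nearest_word(x, y, boxes, max_distance=40.0):
    """Return the word closest to the gaze point, or None if none is near.

    max_distance is an illustrative tolerance for webcam gaze noise.
    """
    if not boxes:
        return None
    best = min(boxes, key=lambda b: distance_to_box(x, y, b))
    if distance_to_box(x, y, best) > max_distance:
        return None
    return best.word


line = [
    WordBox("reading", 0, 0, 70, 20),
    WordBox("is", 80, 0, 95, 20),
    WordBox("fun", 105, 0, 135, 20),
]
print(nearest_word(90, 28, line))    # gaze lands just below "is"
print(nearest_word(500, 500, line))  # far from any text
```

With webcam-level noise, a nearest-box rule like this will often pick a neighboring word, which is one reason a system would want to bring in additional information about the text itself.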

The text is then fed into a pre-trained large language model for additional context. It can identify, for example, low-frequency words, which are more likely to give a reader trouble. Finally, this context information is paired with the gaze information and fed into a neural network classifier that predicts the word a reader is gazing at.
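The intuition behind combining gaze data with word frequency can be sketched in a few lines of Python. This is a hand-weighted logistic score standing in for the trained neural network described above, not the researchers' classifier. The function name `unknown_word_score` and the weights `w_dwell`, `w_rarity`, and `bias` are hypothetical values picked for illustration, and word frequency is assumed to be given in occurrences per million words.

```python
import math


def unknown_word_score(dwell_ms, frequency_per_million,
                       w_dwell=0.004, w_rarity=0.8, bias=-4.0):
    """Combine a gaze signal and a context signal into a score in (0, 1).

    All weights are illustrative assumptions; a real system would learn
    them from labeled reading data.
    """
    # Rarer words get a larger rarity value; clamp to avoid log10(0).
    rarity = -math.log10(max(frequency_per_million, 0.01))
    z = bias + w_dwell * dwell_ms + w_rarity * rarity
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing


rare_long = unknown_word_score(dwell_ms=800, frequency_per_million=0.5)
common_short = unknown_word_score(dwell_ms=150, frequency_per_million=5000)
print(rare_long > common_short)  # True
```

Because the rarity term does not depend on the camera, it can still separate likely trouble words when the dwell-time estimate is unreliable.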

A user study was conducted to assess the effectiveness of the GazeReader platform. Participants were asked to read from the Vocabulary Levels Test while the system analyzed them. After a number of trials, GazeReader’s predictions were found to reach an accuracy of 98.09%. Moreover, these tests revealed the importance of the additional context information provided by the language model. The overall performance of the model was found to be heavily dependent on the context data, which is not entirely surprising given the poor quality of webcam data.

This work focused entirely on the English language, but the researchers believe that GazeReader could be generalized to any alphabetic language, and they intend to explore that possibility in the future. Another area they wish to evaluate is how best to assist readers when they do encounter an unknown word. To date, they have focused solely on detecting the words that people have difficulty understanding. In any case, the low cost and practical design of GazeReader may help it to find a market among learners of second languages in the years to come.

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.