ROBINS: the robot to teach Spanish to deaf children

Learning a written language is important and supremely difficult for this population - sometimes frustrating and riddled with demotivation.

AdvancedFull instructions provided15 days731

Finalist

OpenCV AI Competition 2023

Popular Vote - Finalist

OpenCV AI Competition 2023

ROBINS: the robot to teach Spanish to deaf children

Things used in this project

Hardware components

Raspberry Pi 4 Model B

SparkFun Dual H-Bridge motor drivers L298

Rotary Encoder with Push-Button

Geared DC Motor, 12 V

Adafruit MONSTER M4SK - DIY Electronic Eyes Mask

NXP FRDM Board

NXP MK64FN1M0VLL12

OpenCV AI Kit: OAK-D

LED, RGB

Software apps and online services

OpenCV

TensorFlow

Hand tools and fabrication machines

3D Printer (generic)

Story

Currently, the use of technology provides various tools to face educational challenges in working with deaf children. In particular, in addressing the essential skills for learning to write and read, such as decoding or the ability to read individual words, and linguistic comprehension, understood as the ability to understand the meaning of what is read. Although the specific abilities of the deaf population are not clearly defined, due to the heterogeneity of personal and environmental factors (e.g., degree of hearing loss, age of diagnosis, age of provision of assistive technologies for hearing, preferred form of communication, educational context, hearing of parents), there are advances in the development of pedagogical strategies for teaching (Harris, 2015). In previous studies (Ioannou & Constantinou, 2018; Jouhtimäki et al., 2009; Mazlan et al., 2010; Mueller & Hurtig, 2009), the use of various technologies for this purpose has been promoted, however, the use of the game as a means for learning, this proposal will be deepened.

Learning a written language is both important and supremely difficult for this population - sometimes frustrating and riddled with demotivation. For this reason, it seems essential to offer deaf children ways to learn reading and writing that are motivating through play and at the same time effective. One of the initial skills to learn in reading and writing is the knowledge of letters. Therefore, the project will focus on reinforcing spelling ability. The next step is to be able to identify written words from the letters they are composed of, which has to do with building a vocabulary as a combination of certain written forms and their meaning.

On the other hand, the learning of different skills by children using robots is widely accepted in different fields (Belpaeme et al., 2018). It should be noted that the use of interactive devices, including toys, pets, assistants for people with disabilities, and Information and Communication Technology (ICT) tools for education, are used in different ways to obtain information. beneficial effects on the development of children (Woo et al., 2021). However, to our knowledge, we have not found an interactive robot, that can be modulated for the different activities in the teaching of sign language and, last but not least, that is low cost.

The present project will focus on evaluating and intervening in these fundamental steps of reading through the use of an interactive Robot termed ROBINS from the Spanish ROBot Interactivo para Niños Sordos. ROBINS is a 3D-printed robot with a touch-screen to program several activities and games, and two OAK-D cameras with artificial intelligence to automatically detect sign language (camera one) and sadness, anger, or disgust expressions (camera two) during the activities. In addition, ROBINS can move the eyes, mouth, eyebrows, and cheekbones to express happiness when the kid does correctly the task and surprise when the kid does not. Therefore, early detection and properly-friendly interaction can help reduce the frustration and lack of empathy presented in the classroom (Khan et al, 2019). We hypothesize that robotics plus artificial intelligence must work together to tackle similar problems with low-cost robots that support the education of deaf kids with automatic facial expressions and sign language recognition to ensure proper interactions.

Video 1 - First test with ROBINS

Video 2 - Colombian Sign language interpretation with ROBINS

Schematics

Full manual-guide of ROBINS

The robot was built to allow it to convey three fundamental facial expressions: neutral countenance, a dude expression, and a smile, which when combined with the nose's illuminated light color and the display with animated eyes, provide feedback to the user. The robot’s frame is formed from modular pieces made with 3d printers and PLA material, allowing for effortless customization and assembly. These components seamlessly come together to create a skeletal structure Fig 1a, that provides support for the robot's motors, electronics, processing unit, and camera. The body contains the robot’s electronics and drivers for the motors and also supports the computing unit (CPU), and the screen, and then the assembly is covered by the toy’s fabric.

Code

#! /usr/bin/python3
import serial.tools.list_ports
import time
#from multiprocessing import Process, Queue
#from PlayPS4 import MyController

counter=0
ports = []
starT = 0.1
end_T = 0.1
comm1 = serial.Serial()
#MyController.ps4Cini("fdfdfd")

def Available():
    global starT
    global end_T
    global comm1
    starT = time.process_time()
    for port in serial.tools.list_ports.comports():
        ports.append(port.name)
    #print(ports)
    #print([port.device for port in serial.tools.list_ports.comports()])
    print ('listo:')
    if ports[0] == 'ttyACM0':
        comm1 = serial.Serial(port='/dev/ttyACM0',baudrate = 230400,parity=serial.PARITY_NONE,stopbits=serial.STOPBITS_ONE,bytesize=serial.EIGHTBITS,timeout=1)
        print(type(comm1))
        print ("ok")
        #comm1.write("rgb,__red\n".encode('utf-8'))
        end_T = time.process_time()#time.time_ns()
        print("Time taken", end_T-starT, "ns")
        return True
    
    else:
        print ("Nope")
        end_T = time.process_time()
        return False
    
def Uart_comm(cmnd):
    if Available():
        comm1.write(cmnd.encode('utf-8'))
        comm1.close()

#! /usr/bin/python
# -*- coding: utf-8 -*-

# Python ctypes bindings for VLC
#
# Copyright (C) 2009-2017 the VideoLAN team
# $Id: $
#
# Authors: Olivier Aubert <contact at olivieraubert.net>
#          Jean Brouwers <MrJean1 at gmail.com>
#          Geoff Salmon <geoff.salmon at gmail.com>
#
# This library is free software; you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as
# published by the Free Software Foundation; either version 2.1 of the
# License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA

"""This module provides bindings for the LibVLC public API, see
U{http://wiki.videolan.org/LibVLC}.

You can find the documentation and a README file with some examples
at U{https://www.olivieraubert.net/vlc/python-ctypes/}.

Basically, the most important class is L{Instance}, which is used
to create a libvlc instance.  From this instance, you then create
L{MediaPlayer} and L{MediaListPlayer} instances.

Alternatively, you may create instances of the L{MediaPlayer} and
L{MediaListPlayer} class directly and an instance of L{Instance}
will be implicitly created.  The latter can be obtained using the
C{get_instance} method of L{MediaPlayer} and L{MediaListPlayer}.
"""

import ctypes
from ctypes.util import find_library
import os
import sys
import functools

# Used by EventManager in override.py
import inspect as _inspect
import logging
logger = logging.getLogger(__name__)

__version__ = "3.0.18121"
__libvlc_version__ = "3.0.18"
__generator_version__ = "1.21"
build_date  = "Wed Nov 16 12:04:29 2022 3.0.18"

# The libvlc doc states that filenames are expected to be in UTF8, do
# not rely on sys.getfilesystemencoding() which will be confused,
# esp. on windows.
DEFAULT_ENCODING = 'utf-8'

if sys.version_info[0] > 2:
    str = str
    unicode = str
    bytes = bytes
    basestring = (str, bytes)
    PYTHON3 = True
    def str_to_bytes(s):
        """Translate string or bytes to bytes.
        """
        if isinstance(s, str):
            return bytes(s, DEFAULT_ENCODING)
        else:
            return s

    def bytes_to_str(b):
        """Translate bytes to string.
        """
        if isinstance(b, bytes):
            return b.decode(DEFAULT_ENCODING)
        else:
            return b

    def len_args(func):
        """Return number of positional arguments.
        """
        return len(_inspect.signature(func).parameters)

else:
    str = str
    unicode = unicode
    bytes = str
    basestring = basestring
    PYTHON3 = False
    def str_to_bytes(s):
        """Translate string or bytes to bytes.
        """
        if isinstance(s, unicode):
            return s.encode(DEFAULT_ENCODING)
        else:
            return s

    def bytes_to_str(b):
        """Translate bytes to unicode string.
        """
        if isinstance(b, str):
            return unicode(b, DEFAULT_ENCODING)
        else:
            return b

    def len_args(func):
        """Return number of positional arguments.
        """
        return len(_inspect.getargspec(func).args)

# Internal guard to prevent internal classes to be directly
# instanciated.
_internal_guard = object()

def find_lib():
    dll = None
    plugin_path = os.environ.get('PYTHON_VLC_MODULE_PATH', None)
    if 'PYTHON_VLC_LIB_PATH' in os.environ:
        try:
            dll = ctypes.CDLL(os.environ['PYTHON_VLC_LIB_PATH'])
        except OSError:
            logger.error("Cannot load lib specified by PYTHON_VLC_LIB_PATH env. variable")
            sys.exit(1)
    if plugin_path and not os.path.isdir(plugin_path):
        logger.error("Invalid PYTHON_VLC_MODULE_PATH specified. Please fix.")
        sys.exit(1)
    if dll is not None:
        return dll, plugin_path

    if sys.platform.startswith('win'):
        libname = 'libvlc.dll'
        p = find_library(libname)
        if p is None:
            try:  # some registry settings
                # leaner than win32api, win32con
                if PYTHON3:
                    import winreg as w
                else:
                    import _winreg as w
                for r in w.HKEY_LOCAL_MACHINE, w.HKEY_CURRENT_USER:
                    try:
                        r = w.OpenKey(r, 'Software\\VideoLAN\\VLC')
                        plugin_path, _ = w.QueryValueEx(r, 'InstallDir')
                        w.CloseKey(r)
                        break
                    except w.error:
                        pass
            except ImportError:  # no PyWin32
                pass
            if plugin_path is None:
                # try some standard locations.
                programfiles = os.environ["ProgramFiles"]
                homedir = os.environ["HOMEDRIVE"]
                for p in ('{programfiles}\\VideoLan{libname}', '{homedir}:\\VideoLan{libname}',
                          '{programfiles}{libname}',           '{homedir}:{libname}'):
                    p = p.format(homedir = homedir,
                                 programfiles = programfiles,
                                 libname = '\\VLC\\' + libname)
                    if os.path.exists(p):
                        plugin_path = os.path.dirname(p)
                        break
            if plugin_path is not None:  # try loading
                 # PyInstaller Windows fix
                if 'PyInstallerCDLL' in ctypes.CDLL.__name__:
                    ctypes.windll.kernel32.SetDllDirectoryW(None)
                p = os.getcwd()
                os.chdir(plugin_path)
                 # if chdir failed, this will raise an exception
                dll = ctypes.CDLL('.\\' + libname)
                 # restore cwd after dll has been loaded
                os.chdir(p)
            else:  # may fail
                dll = ctypes.CDLL('.\\' + libname)
        else:
            plugin_path = os.path.dirname(p)
            dll = ctypes.CDLL(p)

    elif sys.platform.startswith('darwin'):
        # FIXME: should find a means to configure path
        d = '/Applications/VLC.app/Contents/MacOS/'
        c = d + 'lib/libvlccore.dylib'
        p = d + 'lib/libvlc.dylib'
        if os.path.exists(p) and os.path.exists(c):
            # pre-load libvlccore VLC 2.2.8+
            ctypes.CDLL(c)
            dll = ctypes.CDLL(p)
            for p in ('modules', 'plugins'):
                p = d + p
                if os.path.isdir(p):
                    plugin_path = p
                    break
        else:  # hope, some [DY]LD_LIBRARY_PATH is set...
            # pre-load libvlccore VLC 2.2.8+
            ctypes.CDLL('libvlccore.dylib')
            dll = ctypes.CDLL('libvlc.dylib')

    else:
        # All other OSes (linux, freebsd...)
        p = find_library('vlc')
        try:
            dll = ctypes.CDLL(p)
        except OSError:  # may fail
            dll = None
        if dll is None:
            try:
                dll = ctypes.CDLL('libvlc.so.5')
            except:
                raise NotImplementedError('Cannot find libvlc lib')

    return (dll, plugin_path)

# plugin_path used on win32 and MacOS in override.py
dll, plugin_path  = find_lib()

class VLCException(Exception):
    """Exception raised by libvlc methods.
    """
    pass

try:
    _Ints = (int, long)
except NameError:  # no long in Python 3+
    _Ints =  int
_Seqs = (list, tuple)

# Used for handling *event_manager() methods.
class memoize_parameterless(object):
    """Decorator. Caches a parameterless method's return value each time it is called.

    If called later with the same arguments, the cached value is returned
    (not reevaluated).
    Adapted from https://wiki.python.org/moin/PythonDecoratorLibrary
    """
    def __init__(self, func):
        self.func = func
        self._cache = {}

    def __call__(self, obj):
        try:
            return self._cache[obj]
        except KeyError:
            v = self._cache[obj] = self.func(obj)
            return v

    def __repr__(self):
        """Return the function's docstring.
        """
        return self.func.__doc__

    def __get__(self, obj, objtype):
      """Support instance methods.
      """
      return functools.partial(self.__call__, obj)

# Default instance. It is used to instanciate classes directly in the
# OO-wrapper.
_default_instance = None

def get_default_instance():
    """Return the default VLC.Instance.
    """
    global _default_instance
    if _default_instance is None:
        _default_instance = Instance()
    return _default_instance

def try_fspath(path):
    """Try calling os.fspath
    os.fspath is only available from py3.6
    """
    try:
        return os.fspath(path)
    except (AttributeError, TypeError):
        return path

_Cfunctions = {}  # from LibVLC __version__
_Globals = globals()  # sys.modules[__name__].__dict__

def _Cfunction(name, flags, errcheck, *types):
    """(INTERNAL) New ctypes function binding.
    """
    if hasattr(dll, name) and name in _Globals:
        p = ctypes.CFUNCTYPE(*types)
        f = p((name, dll), flags)
        if errcheck is not None:
            f.errcheck = errcheck
        # replace the Python function
        # in this module, but only when
        # running as python -O or -OO
        if __debug__:
            _Cfunctions[name] = f
        else:
            _Globals[name] = f
        return f
    raise NameError('no function %r' % (name,))

def _Cobject(cls, ctype):
    """(INTERNAL) New instance from ctypes.
    """
    o = object.__new__(cls)
    o._as_parameter_ = ctype
    return o

def _Constructor(cls, ptr=_internal_guard):
    """(INTERNAL) New wrapper from ctypes.
    """
    if ptr == _internal_guard:
        raise VLCException("(INTERNAL) ctypes class. You should get references for this class through methods of the LibVLC API.")
    if ptr is None or ptr == 0:
        return None
    return _Cobject(cls, ctypes.c_void_p(ptr))

class _Cstruct(ctypes.Structure):
    """(INTERNAL) Base class for ctypes structures.
    """
    _fields_ = []  # list of 2-tuples ('name', ctypes.<type>)

    def __str__(self):
        l = [' %s:\t%s' % (n, getattr(self, n)) for n, _ in self._fields_]
        return '\n'.join([self.__class__.__name__] + l)

    def __repr__(self):
        return '%s.%s' % (self.__class__.__module__, self)

class _Ctype(object):
    """(INTERNAL) Base class for ctypes.
    """
    @staticmethod
    def from_param(this):  # not self
        """(INTERNAL) ctypes parameter conversion method.
        """
        if this is None:
            return None
        return this._as_parameter_

class ListPOINTER(object):
    """Just like a POINTER but accept a list of etype elements as an argument.
    """
    def __init__(self, etype):
        self.etype = etype

    def from_param(self, param):
        if isinstance(param, _Seqs):
            return (self.etype * len(param))(*param)
        else:
            return ctypes.POINTER(param)

# errcheck functions for some native functions.
def string_result(result, func, arguments):
    """Errcheck function. Returns a string and frees the original pointer.

    It assumes the result is a char *.
    """
    if result:
        # make a python string copy
        s = bytes_to_str(ctypes.string_at(result))
        # free original string ptr
        libvlc_free(result)
        return s
    return None

def class_result(classname):
    """Errcheck function. Returns a function that creates the specified class.
    """
    def wrap_errcheck(result, func, arguments):
        if result is None:
            return None
        return classname(result)
    return wrap_errcheck

# Wrapper for the opaque struct libvlc_log_t
class Log(ctypes.Structure):
    pass
Log_ptr = ctypes.POINTER(Log)

# Wrapper for the opaque struct libvlc_media_thumbnail_request_t
class MediaThumbnailRequest:
    def __new__(cls, *args):
        if len(args) == 1 and isinstance(args[0], _Ints):
            return _Constructor(cls, args[0])

# FILE* ctypes wrapper, copied from
# http://svn.python.org/projects/ctypes/trunk/ctypeslib/ctypeslib/contrib/pythonhdr.py
class FILE(ctypes.Structure):
    pass
FILE_ptr = ctypes.POINTER(FILE)

if PYTHON3:
    PyFile_FromFd = ctypes.pythonapi.PyFile_FromFd
    PyFile_FromFd.restype = ctypes.py_object
    PyFile_FromFd.argtypes = [ctypes.c_int,
                              ctypes.c_char_p,
                              ctypes.c_char_p,
                              ctypes.c_int,
                              ctypes.c_char_p,
                              ctypes.c_char_p,
                              ctypes.c_char_p,
                              ctypes.c_int ]

    PyFile_AsFd = ctypes.pythonapi.PyObject_AsFileDescriptor
    PyFile_AsFd.restype = ctypes.c_int
    PyFile_AsFd.argtypes = [ctypes.py_object]
else:
    PyFile_FromFile = ctypes.pythonapi.PyFile_FromFile
    PyFile_FromFile.restype = ctypes.py_object
    PyFile_FromFile.argtypes = [FILE_ptr,
                                ctypes.c_char_p,
                                ctypes.c_char_p,
                                ctypes.CFUNCTYPE(ctypes.c_int, FILE_ptr)]

    PyFile_AsFile = ctypes.pythonapi.PyFile_AsFile
    PyFile_AsFile.restype = FILE_ptr
    PyFile_AsFile.argtypes = [ctypes.py_object]

def module_description_list(head):
    """Convert a ModuleDescription linked list to a Python list (and release the former).
    """
    r = []
    if head:
        item = head
        while item:
            item = item.contents
            r.append((item.name, item.shortname, item.longname, item.help))
            item = item.next
        libvlc_module_description_list_release(head)
    return r

def track_description_list(head):
    """Convert a TrackDescription linked list to a Python list (and release the former).
    """
    r = []
    if head:
        item = head
        while item:
            item = item.contents
            r.append((item.id, item.name))
            item = item.next
        try:
            libvlc_track_description_release(head)
        except NameError:
            libvlc_track_description_list_release(head)

    return r

# Generated enum types #

class _Enum(ctypes.c_uint):
    '''(INTERNAL) Base class
    '''
    _enum_names_ = {}

    def __str__(self):
        n = self._enum_names_.get(self.value, '') or ('FIXME_(%r)' % (self.value,))
        return '.'.join((self.__class__.__name__, n))

    def __hash__(self):
        return self.value

    def __repr__(self):
        return '.'.join((self.__class__.__module__, self.__str__()))

    def __eq__(self, other):
        return ( (isinstance(other, _Enum) and self.value == other.value)
              or (isinstance(other, _Ints) and self.value == other) )

    def __ne__(self, other):
        return not self.__eq__(other)

class LogLevel(_Enum):
    '''Logging messages level.
\note future libvlc versions may define new levels.
    '''
    _enum_names_ = {
        0: 'DEBUG',
        2: 'NOTICE',
        3: 'WARNING',
        4: 'ERROR',
    }
LogLevel.DEBUG   = LogLevel(0)
LogLevel.ERROR   = LogLevel(4)
LogLevel.NOTICE  = LogLevel(2)
LogLevel.WARNING = LogLevel(3)

class DialogQuestionType(_Enum):
    '''@defgroup libvlc_dialog libvlc dialog
@ingroup libvlc
@{
@file
libvlc dialog external api.
    '''
    _enum_names_ = {
        0: 'DIALOG_QUESTION_NORMAL',
        1: 'DIALOG_QUESTION_WARNING',
        2: 'DIALOG_QUESTION_CRITICAL',
    }
DialogQuestionType.DIALOG_QUESTION_CRITICAL = DialogQuestionType(2)
DialogQuestionType.DIALOG_QUESTION_NORMAL   = DialogQuestionType(0)
DialogQuestionType.DIALOG_QUESTION_WARNING  = DialogQuestionType(1)

class EventType(_Enum):
    '''Event types.
    '''
    _enum_names_ = {
        0: 'MediaMetaChanged',
        1: 'MediaSubItemAdded',
        2: 'MediaDurationChanged',
        3: 'MediaParsedChanged',
        4: 'MediaFreed',
        5: 'MediaStateChanged',
        6: 'MediaSubItemTreeAdded',
        0x100: 'MediaPlayerMediaChanged',
        257: 'MediaPlayerNothingSpecial',
        258: 'MediaPlayerOpening',
        259: 'MediaPlayerBuffering',
        260: 'MediaPlayerPlaying',
        261: 'MediaPlayerPaused',
        262: 'MediaPlayerStopped',
        263: 'MediaPlayerForward',
        264: 'MediaPlayerBackward',
        265: 'MediaPlayerEndReached',
        266: 'MediaPlayerEncounteredError',
        267: 'MediaPlayerTimeChanged',
        268: 'MediaPlayerPositionChanged',
        269: 'MediaPlayerSeekableChanged',
        270: 'MediaPlayerPausableChanged',
        271: 'MediaPlayerTitleChanged',
        272: 'MediaPlayerSnapshotTaken',
        273: 'MediaPlayerLengthChanged',
        274: 'MediaPlayerVout',
        275: 'MediaPlayerScrambledChanged',
        276: 'MediaPlayerESAdded',
        277: 'MediaPlayerESDeleted',
        278: 'MediaPlayerESSelected',
        279: 'MediaPlayerCorked',
        280: 'MediaPlayerUncorked',
        281: 'MediaPlayerMuted',
        282: 'MediaPlayerUnmuted',
        283: 'MediaPlayerAudioVolume',
        284: 'MediaPlayerAudioDevice',
        285: 'MediaPlayerChapterChanged',
        0x200: 'MediaListItemAdded',
        513: 'MediaListWillAddItem',
        514: 'MediaListItemDeleted',
        515: 'MediaListWillDeleteItem',
        516: 'MediaListEndReached',
        0x300: 'MediaListViewItemAdded',
        769: 'MediaListViewWillAddItem',
        770: 'MediaListViewItemDeleted',
        771: 'MediaListViewWillDeleteItem',
        0x400: 'MediaListPlayerPlayed',
        1025: 'MediaListPlayerNextItemSet',
        1026: 'MediaListPlayerStopped',
        0x500: 'MediaDiscovererStarted',
        1281: 'MediaDiscovererEnded',
        1282: 'RendererDiscovererItemAdded',
        1283: 'RendererDiscovererItemDeleted',
        0x600: 'VlmMediaAdded',
        1537: 'VlmMediaRemoved',
        1538: 'VlmMediaChanged',
        1539: 'VlmMediaInstanceStarted',
        1540: 'VlmMediaInstanceStopped',
        1541: 'VlmMediaInstanceStatusInit',
        1542: 'VlmMediaInstanceStatusOpening',
        1543: 'VlmMediaInstanceStatusPlaying',
        1544: 'VlmMediaInstanceStatusPause',
        1545: 'VlmMediaInstanceStatusEnd',
        1546: 'VlmMediaInstanceStatusError',
    }
EventType.MediaDiscovererEnded          = EventType(1281)
EventType.MediaDiscovererStarted        = EventType(0x500)
EventType.MediaDurationChanged          = EventType(2)
EventType.MediaFreed                    = EventType(4)
EventType.MediaListEndReached           = EventType(516)
EventType.MediaListItemAdded            = EventType(0x200)
EventType.MediaListItemDeleted          = EventType(514)
EventType.MediaListPlayerNextItemSet    = EventType(1025)
EventType.MediaListPlayerPlayed         = EventType(0x400)
EventType.MediaListPlayerStopped        = EventType(1026)
EventType.MediaListViewItemAdded        = EventType(0x300)
EventType.MediaListViewItemDeleted      = EventType(770)
EventType.MediaListViewWillAddItem      = EventType(769)
EventType.MediaListViewWillDeleteItem   = EventType(771)
EventType.MediaListWillAddItem          = EventType(513)
EventType.MediaListWillDeleteItem       = EventType(515)
EventType.MediaMetaChanged              = EventType(0)
EventType.MediaParsedChanged            = EventType(3)
EventType.MediaPlayerAudioDevice        = EventType(284)
EventType.MediaPlayerAudioVolume        = EventType(283)
EventType.MediaPlayerBackward           = EventType(264)
EventType.MediaPlayerBuffering          = EventType(259)
EventType.MediaPlayerChapterChanged     = EventType(285)
EventType.MediaPlayerCorked             = EventType(279)
EventType.MediaPlayerESAdded            = EventType(276)
EventType.MediaPlayerESDeleted          = EventType(277)
EventType.MediaPlayerESSelected         = EventType(278)
EventType.MediaPlayerEncounteredError   = EventType(266)
EventType.MediaPlayerEndReached         = EventType(265)
EventType.MediaPlayerForward            = EventType(263)
EventType.MediaPlayerLengthChanged      = EventType(273)
EventType.MediaPlayerMediaChanged       = EventType(0x100)
EventType.MediaPlayerMuted              = EventType(281)
EventType.MediaPlayerNothingSpecial     = EventType(257)
EventType.MediaPlayerOpening            = EventType(258)
EventType.MediaPlayerPausableChanged    = EventType(270)
EventType.MediaPlayerPaused             = EventType(261)
EventType.MediaPlayerPlaying            = EventType(260)
EventType.MediaPlayerPositionChanged    = EventType(268)
EventType.MediaPlayerScrambledChanged   = EventType(275)
EventType.MediaPlayerSeekableChanged    = EventType(269)
EventType.MediaPlayerSnapshotTaken      = EventType(272)
EventType.MediaPlayerStopped            = EventType(262)
EventType.MediaPlayerTimeChanged        = EventType(267)
EventType.MediaPlayerTitleChanged       = EventType(271)
EventType.MediaPlayerUncorked           = EventType(280)
EventType.MediaPlayerUnmuted            = EventType(282)
EventType.MediaPlayerVout               = EventType(274)
EventType.MediaStateChanged             = EventType(5)
EventType.MediaSubItemAdded             = EventType(1)
EventType.MediaSubItemTreeAdded         = EventType(6)
EventType.RendererDiscovererItemAdded   = EventType(1282)
EventType.RendererDiscovererItemDeleted = EventType(1283)
EventType.VlmMediaAdded                 = EventType(0x600)
EventType.VlmMediaChanged               = EventType(1538)
EventType.VlmMediaInstanceStarted       = EventType(1539)
EventType.VlmMediaInstanceStatusEnd     = EventType(1545)
EventType.VlmMediaInstanceStatusError   = EventType(1546)
EventType.VlmMediaInstanceStatusInit    = EventType(1541)
EventType.VlmMediaInstanceStatusOpening = EventType(1542)
EventType.VlmMediaInstanceStatusPause   = EventType(1544)
EventType.VlmMediaInstanceStatusPlaying = EventType(1543)
EventType.VlmMediaInstanceStopped       = EventType(1540)
EventType.VlmMediaRemoved               = EventType(1537)

class Meta(_Enum):
    '''Meta data types.
    '''
    _enum_names_ = {
        0: 'Title',
        1: 'Artist',
        2: 'Genre',
        3: 'Copyright',
        4: 'Album',
        5: 'TrackNumber',
        6: 'Description',
        7: 'Rating',
        8: 'Date',
        9: 'Setting',
        10: 'URL',
        11: 'Language',
        12: 'NowPlaying',
        13: 'Publisher',
        14: 'EncodedBy',
        15: 'ArtworkURL',
        16: 'TrackID',
        17: 'TrackTotal',
        18: 'Director',
        19: 'Season',
        20: 'Episode',
        21: 'ShowName',
        22: 'Actors',
        23: 'AlbumArtist',
        24: 'DiscNumber',
        25: 'DiscTotal',
    }
Meta.Actors      = Meta(22)
Meta.Album       = Meta(4)
Meta.AlbumArtist = Meta(23)
Meta.Artist      = Meta(1)
Meta.ArtworkURL  = Meta(15)
Meta.Copyright   = Meta(3)
Meta.Date        = Meta(8)
Meta.Description = Meta(6)
Meta.Director    = Meta(18)
Meta.DiscNumber  = Meta(24)
Meta.DiscTotal   = Meta(25)
Meta.EncodedBy   = Meta(14)
Meta.Episode     = Meta(20)
Meta.Genre       = Meta(2)
Meta.Language    = Meta(11)
Meta.NowPlaying  = Meta(12)
Meta.Publisher   = Meta(13)
Meta.Rating      = Meta(7)
Meta.Season      = Meta(19)
Meta.Setting     = Meta(9)
Meta.ShowName    = Meta(21)
Meta.Title       = Meta(0)
Meta.TrackID     = Meta(16)
Meta.TrackNumber = Meta(5)
Meta.TrackTotal  = Meta(17)
Meta.URL         = Meta(10)

class State(_Enum):
    '''Note the order of libvlc_state_t enum must match exactly the order of
See mediacontrol_playerstatus, See input_state_e enums,
and videolan.libvlc.state (at bindings/cil/src/media.cs).
expected states by web plugins are:
idle/close=0, opening=1, playing=3, paused=4,
stopping=5, ended=6, error=7.
    '''
    _enum_names_ = {
        0: 'NothingSpecial',
        1: 'Opening',
        2: 'Buffering',
        3: 'Playing',
        4: 'Paused',
        5: 'Stopped',
        6: 'Ended',
        7: 'Error',
    }
State.Buffering      = State(2)
State.Ended          = State(6)
State.Error          = State(7)
State.NothingSpecial = State(0)
State.Opening        = State(1)
State.Paused         = State(4)
State.Playing        = State(3)
State.Stopped        = State(5)

class TrackType(_Enum):
    '''N/A
    '''
    _enum_names_ = {
        -1: 'unknown',
        0: 'audio',
        1: 'video',
        2: 'ext',
    }
TrackType.audio   = TrackType(0)
TrackType.ext     = TrackType(2)
TrackType.unknown = TrackType(-1)
TrackType.video   = TrackType(1)

class VideoOrient(_Enum):
    '''N/A
    '''
    _enum_names_ = {
        0: 'top_left',
        1: 'top_right',
        2: 'bottom_left',
        3: 'bottom_right',
        4: 'left_top',
        5: 'left_bottom',
        6: 'right_top',
        7: 'right_bottom',
    }
VideoOrient.bottom_left  = VideoOrient(2)
VideoOrient.bottom_right = VideoOrient(3)
VideoOrient.left_bottom  = VideoOrient(5)
VideoOrient.left_top     = VideoOrient(4)
VideoOrient.right_bottom = VideoOrient(7)
VideoOrient.right_top    = VideoOrient(6)
VideoOrient.top_left     = VideoOrient(0)
VideoOrient.top_right    = VideoOrient(1)

class VideoProjection(_Enum):
    '''N/A
    '''
    _enum_names_ = {
        0: 'rectangular',
        1: 'equirectangular',
        0x100: 'cubemap_layout_standard',
    }
VideoProjection.cubemap_layout_standard = VideoProjection(0x100)
VideoProjection.equirectangular         = VideoProjection(1)
VideoProjection.rectangular             = VideoProjection(0)

class MediaType(_Enum):
    '''Media type
See libvlc_media_get_type.
    '''
    _enum_names_ = {
        0: 'unknown',
        1: 'file',
        2: 'directory',
        3: 'disc',
        4: 'stream',
        5: 'playlist',
    }
MediaType.directory = MediaType(2)
MediaType.disc      = MediaType(3)
MediaType.file      = MediaType(1)
MediaType.playlist  = MediaType(5)
MediaType.stream    = MediaType(4)
MediaType.unknown   = MediaType(0)

class MediaParseFlag(_Enum):
    '''Parse flags used by libvlc_media_parse_with_options()
See libvlc_media_parse_with_options.
    '''
    _enum_names_ = {
        0x0: 'local',
        0x1: 'network',
        0x2: 'fetch_local',
        0x4: 'fetch_network',
        0x8: 'do_interact',
    }
MediaParseFlag.do_interact   = MediaParseFlag(0x8)
MediaParseFlag.fetch_local   = MediaParseFlag(0x2)
MediaParseFlag.fetch_network = MediaParseFlag(0x4)
MediaParseFlag.local         = MediaParseFlag(0x0)
MediaParseFlag.network       = MediaParseFlag(0x1)

class MediaParsedStatus(_Enum):
    '''Parse status used sent by libvlc_media_parse_with_options() or returned by
libvlc_media_get_parsed_status()
See libvlc_media_parse_with_options
See libvlc_media_get_parsed_status.
    '''
    _enum_names_ = {
        1: 'skipped',
        2: 'failed',
        3: 'timeout',
        4: 'done',
    }
MediaParsedStatus.done    = MediaParsedStatus(4)
MediaParsedStatus.failed  = MediaParsedStatus(2)
MediaParsedStatus.skipped = MediaParsedStatus(1)
MediaParsedStatus.timeout = MediaParsedStatus(3)

class MediaSlaveType(_Enum):
    '''Type of a media slave: subtitle or audio.
    '''
    _enum_names_ = {
        0: 'subtitle',
        1: 'audio',
    }
MediaSlaveType.audio    = MediaSlaveType(1)
MediaSlaveType.subtitle = MediaSlaveType(0)

class MediaDiscovererCategory(_Enum):
    '''Category of a media discoverer
See libvlc_media_discoverer_list_get().
    '''
    _enum_names_ = {
        0: 'devices',
        1: 'lan',
        2: 'podcasts',
        3: 'localdirs',
    }
MediaDiscovererCategory.devices   = MediaDiscovererCategory(0)
MediaDiscovererCategory.lan       = MediaDiscovererCategory(1)
MediaDiscovererCategory.localdirs = MediaDiscovererCategory(3)
MediaDiscovererCategory.podcasts  = MediaDiscovererCategory(2)

class PlaybackMode(_Enum):
    '''Defines playback modes for playlist.
    '''
    _enum_names_ = {
        0: 'default',
        1: 'loop',
        2: 'repeat',
    }
PlaybackMode.default = PlaybackMode(0)
PlaybackMode.loop    = PlaybackMode(1)
PlaybackMode.repeat  = PlaybackMode(2)

class VideoMarqueeOption(_Enum):
    '''Marq options definition.
    '''
    _enum_names_ = {
        0: 'Enable',
        1: 'Text',
        2: 'Color',
        3: 'Opacity',
        4: 'Position',
        5: 'Refresh',
        6: 'Size',
        7: 'Timeout',
        8: 'X',
        9: 'Y',
    }
VideoMarqueeOption.Color    = VideoMarqueeOption(2)
VideoMarqueeOption.Enable   = VideoMarqueeOption(0)
VideoMarqueeOption.Opacity  = VideoMarqueeOption(3)
VideoMarqueeOption.Position = VideoMarqueeOption(4)
VideoMarqueeOption.Refresh  = VideoMarqueeOption(5)
VideoMarqueeOption.Size     = VideoMarqueeOption(6)
VideoMarqueeOption.Text     = VideoMarqueeOption(1)
VideoMarqueeOption.Timeout  = VideoMarqueeOption(7)
VideoMarqueeOption.X        = VideoMarqueeOption(8)
VideoMarqueeOption.Y        = VideoMarqueeOption(9)

class NavigateMode(_Enum):
    '''Navigation mode.
    '''
    _enum_names_ = {
        0: 'activate',
        1: 'up',
        2: 'down',
        3: 'left',
        4: 'right',
        5: 'popup',
    }
NavigateMode.activate = NavigateMode(0)
NavigateMode.down     = NavigateMode(2)
NavigateMode.left     = NavigateMode(3)
NavigateMode.popup    = NavigateMode(5)
NavigateMode.right    = NavigateMode(4)
NavigateMode.up       = NavigateMode(1)

class Position(_Enum):
    '''Enumeration of values used to set position (e.g. of video title).
    '''
    _enum_names_ = {
        -1: 'disable',
        0: 'center',
        1: 'left',
        2: 'right',
        3: 'top',
        4: 'top_left',
        5: 'top_right',
        6: 'bottom',
        7: 'bottom_left',
        8: 'bottom_right',
    }
Position.bottom       = Position(6)
Position.bottom_left  = Position(7)
Position.bottom_right = Position(8)
Position.center       = Position(0)
Position.disable      = Position(-1)
Position.left         = Position(1)
Position.right        = Position(2)
Position.top          = Position(3)
Position.top_left     = Position(4)
Position.top_right    = Position(5)

class TeletextKey(_Enum):
    '''Enumeration of teletext keys than can be passed via
libvlc_video_set_teletext().
    '''
    _enum_names_ = {
        7471104: 'red',
        6750208: 'green',
        7929856: 'yellow',
        6422528: 'blue',
        6881280: 'index',
    }
TeletextKey.blue   = TeletextKey(6422528)
TeletextKey.green  = TeletextKey(6750208)
TeletextKey.index  = TeletextKey(6881280)
TeletextKey.red    = TeletextKey(7471104)
TeletextKey.yellow = TeletextKey(7929856)

class VideoLogoOption(_Enum):
    '''Option values for libvlc_video_{get,set}_logo_{int,string}.
    '''
    _enum_names_ = {
        0: 'logo_enable',
        1: 'logo_file',
        2: 'logo_x',
        3: 'logo_y',
        4: 'logo_delay',
        5: 'logo_repeat',
        6: 'logo_opacity',
        7: 'logo_position',
    }
VideoLogoOption.logo_delay    = VideoLogoOption(4)
VideoLogoOption.logo_enable   = VideoLogoOption(0)
VideoLogoOption.logo_file     = VideoLogoOption(1)
VideoLogoOption.logo_opacity  = VideoLogoOption(6)
VideoLogoOption.logo_position = VideoLogoOption(7)
VideoLogoOption.logo_repeat   = VideoLogoOption(5)
VideoLogoOption.logo_x        = VideoLogoOption(2)
VideoLogoOption.logo_y        = VideoLogoOption(3)

class VideoAdjustOption(_Enum):
    '''Option values for libvlc_video_{get,set}_adjust_{int,float,bool}.
    '''
    _enum_names_ = {
        0: 'Enable',
        1: 'Contrast',
        2: 'Brightness',
        3: 'Hue',
        4: 'Saturation',
        5: 'Gamma',
    }
VideoAdjustOption.Brightness = VideoAdjustOption(2)
VideoAdjustOption.Contrast   = VideoAdjustOption(1)
VideoAdjustOption.Enable     = VideoAdjustOption(0)
VideoAdjustOption.Gamma      = VideoAdjustOption(5)
VideoAdjustOption.Hue        = VideoAdjustOption(3)
VideoAdjustOption.Saturation = VideoAdjustOption(4)

class AudioOutputDeviceTypes(_Enum):
    '''Audio device types.
    '''
    _enum_names_ = {
        -1: 'Error',
        1: 'Mono',
        2: 'Stereo',
        4: '_2F2R',
        5: '_3F2R',
        6: '_5_1',
...

This file has been truncated, please download it to see its full contents.

import cv2
import numpy as np
from collections import namedtuple
from math import ceil, sqrt, exp, pi, floor, sin, cos, atan2


class HandRegion:
    def __init__(self, pd_score, pd_box, pd_kps=0):
        self.pd_score = pd_score # Palm detection score 
        self.pd_box = pd_box # Palm detection box [x, y, w, h] normalized
        self.pd_kps = pd_kps # Palm detection keypoints

    def print(self):
        attrs = vars(self)
        print('\n'.join("%s: %s" % item for item in attrs.items()))


SSDAnchorOptions = namedtuple('SSDAnchorOptions',[
        'num_layers',
        'min_scale',
        'max_scale',
        'input_size_height',
        'input_size_width',
        'anchor_offset_x',
        'anchor_offset_y',
        'strides',
        'aspect_ratios',
        'reduce_boxes_in_lowest_layer',
        'interpolated_scale_aspect_ratio',
        'fixed_anchor_size'])

def calculate_scale(min_scale, max_scale, stride_index, num_strides):
    if num_strides == 1:
        return (min_scale + max_scale) / 2
    else:
        return min_scale + (max_scale - min_scale) * stride_index / (num_strides - 1)

def generate_anchors(options):
    """
    option : SSDAnchorOptions
    # https://github.com/google/mediapipe/blob/master/mediapipe/calculators/tflite/ssd_anchors_calculator.cc
    """
    anchors = []
    layer_id = 0
    n_strides = len(options.strides)
    while layer_id < n_strides:
        anchor_height = []
        anchor_width = []
        aspect_ratios = []
        scales = []
        # For same strides, we merge the anchors in the same order.
        last_same_stride_layer = layer_id
        while last_same_stride_layer < n_strides and \
                options.strides[last_same_stride_layer] == options.strides[layer_id]:
            scale = calculate_scale(options.min_scale, options.max_scale, last_same_stride_layer, n_strides)
            if last_same_stride_layer == 0 and options.reduce_boxes_in_lowest_layer:
                # For first layer, it can be specified to use predefined anchors.
                aspect_ratios += [1.0, 2.0, 0.5]
                scales += [0.1, scale, scale]
            else:
                aspect_ratios += options.aspect_ratios
                scales += [scale] * len(options.aspect_ratios)
                if options.interpolated_scale_aspect_ratio > 0:
                    if last_same_stride_layer == n_strides -1:
                        scale_next = 1.0
                    else:
                        scale_next = calculate_scale(options.min_scale, options.max_scale, last_same_stride_layer+1, n_strides)
                    scales.append(sqrt(scale * scale_next))
                    aspect_ratios.append(options.interpolated_scale_aspect_ratio)
            last_same_stride_layer += 1
        
        for i,r in enumerate(aspect_ratios):
            ratio_sqrts = sqrt(r)
            anchor_height.append(scales[i] / ratio_sqrts)
            anchor_width.append(scales[i] * ratio_sqrts)

        stride = options.strides[layer_id]
        feature_map_height = ceil(options.input_size_height / stride)
        feature_map_width = ceil(options.input_size_width / stride)

        for y in range(feature_map_height):
            for x in range(feature_map_width):
                for anchor_id in range(len(anchor_height)):
                    x_center = (x + options.anchor_offset_x) / feature_map_width
                    y_center = (y + options.anchor_offset_y) / feature_map_height
                    # new_anchor = Anchor(x_center=x_center, y_center=y_center)
                    if options.fixed_anchor_size:
                        new_anchor = [x_center, y_center, 1.0, 1.0]
                        # new_anchor.w = 1.0
                        # new_anchor.h = 1.0
                    else:
                        new_anchor = [x_center, y_center, anchor_width[anchor_id], anchor_height[anchor_id]]
                        # new_anchor.w = anchor_width[anchor_id]
                        # new_anchor.h = anchor_height[anchor_id]
                    anchors.append(new_anchor)
        
        layer_id = last_same_stride_layer
    return np.array(anchors)


def decode_bboxes(score_thresh, scores, bboxes, anchors):
    """
    wi, hi : NN input shape
    mediapipe/calculators/tflite/tflite_tensors_to_detections_calculator.cc
    # Decodes the detection tensors generated by the model, based on
    # the SSD anchors and the specification in the options, into a vector of
    # detections. Each detection describes a detected object.

    https://github.com/google/mediapipe/blob/master/mediapipe/modules/palm_detection/palm_detection_cpu.pbtxt :
    node {
        calculator: "TensorsToDetectionsCalculator"
        input_stream: "TENSORS:detection_tensors"
        input_side_packet: "ANCHORS:anchors"
        output_stream: "DETECTIONS:unfiltered_detections"
        options: {
            [mediapipe.TensorsToDetectionsCalculatorOptions.ext] {
            num_classes: 1
            num_boxes: 896
            num_coords: 18
            box_coord_offset: 0
            keypoint_coord_offset: 4
            num_keypoints: 7
            num_values_per_keypoint: 2
            sigmoid_score: true
            score_clipping_thresh: 100.0
            reverse_output_order: true

            x_scale: 128.0
            y_scale: 128.0
            h_scale: 128.0
            w_scale: 128.0
            min_score_thresh: 0.5
            }
        }
    }

    scores: shape = [number of anchors 896]
    bboxes: shape = [ number of anchors x 18], 18 = 4 (bounding box : (cx,cy,w,h) + 14 (7 palm keypoints)
    """
    regions = []
    scores = 1 / (1 + np.exp(-scores))
    detection_mask = scores > score_thresh
    det_scores = scores[detection_mask]
    if det_scores.size == 0: return regions
    det_bboxes = bboxes[detection_mask]
    det_anchors = anchors[detection_mask]
    scale = 128 # x_scale, y_scale, w_scale, h_scale

    # cx, cy, w, h = bboxes[i,:4]
    # cx = cx * anchor.w / wi + anchor.x_center 
    # cy = cy * anchor.h / hi + anchor.y_center
    # lx = lx * anchor.w / wi + anchor.x_center 
    # ly = ly * anchor.h / hi + anchor.y_center
    det_bboxes = det_bboxes* np.tile(det_anchors[:,2:4], 9) / scale + np.tile(det_anchors[:,0:2],9)
    # w = w * anchor.w / wi (in the prvious line, we add anchor.x_center and anchor.y_center to w and h, we need to substract them now)
    # h = h * anchor.h / hi
    det_bboxes[:,2:4] = det_bboxes[:,2:4] - det_anchors[:,0:2]
    # box = [cx - w*0.5, cy - h*0.5, w, h]
    det_bboxes[:,0:2] = det_bboxes[:,0:2] - det_bboxes[:,3:4] * 0.5

    for i in range(det_bboxes.shape[0]):
        score = det_scores[i]
        box = det_bboxes[i,0:4]
        kps = []
        # 0 : wrist
        # 1 : index finger joint
        # 2 : middle finger joint
        # 3 : ring finger joint
        # 4 : little finger joint
        # 5 : 
        # 6 : thumb joint
        # for j, name in enumerate(["0", "1", "2", "3", "4", "5", "6"]):
        #     kps[name] = det_bboxes[i,4+j*2:6+j*2]
        for kp in range(7):
            kps.append(det_bboxes[i,4+kp*2:6+kp*2])
        regions.append(HandRegion(float(score), box, kps))
    return regions

def non_max_suppression(regions, nms_thresh):

    # cv2.dnn.NMSBoxes(boxes, scores, 0, nms_thresh) needs:
    # boxes = [ [x, y, w, h], ...] with x, y, w, h of type int
    # Currently, x, y, w, h are float between 0 and 1, so we arbitrarily multiply by 1000 and cast to int
    # boxes = [r.box for r in regions]
    boxes = [ [int(x*1000) for x in r.pd_box] for r in regions]        
    scores = [r.pd_score for r in regions]
    indices = cv2.dnn.NMSBoxes(boxes, scores, 0, nms_thresh)
    return [regions[i[0]] for i in indices]

def normalize_radians(angle):
    return angle - 2 * pi * floor((angle + pi) / (2 * pi))

def rot_vec(vec, rotation):
    vx, vy = vec
    return [vx * cos(rotation) - vy * sin(rotation), vx * sin(rotation) + vy * cos(rotation)]

def detections_to_rect(regions):
    # https://github.com/google/mediapipe/blob/master/mediapipe/modules/hand_landmark/palm_detection_detection_to_roi.pbtxt
    # # Converts results of palm detection into a rectangle (normalized by image size)
    # # that encloses the palm and is rotated such that the line connecting center of
    # # the wrist and MCP of the middle finger is aligned with the Y-axis of the
    # # rectangle.
    # node {
    #   calculator: "DetectionsToRectsCalculator"
    #   input_stream: "DETECTION:detection"
    #   input_stream: "IMAGE_SIZE:image_size"
    #   output_stream: "NORM_RECT:raw_roi"
    #   options: {
    #     [mediapipe.DetectionsToRectsCalculatorOptions.ext] {
    #       rotation_vector_start_keypoint_index: 0  # Center of wrist.
    #       rotation_vector_end_keypoint_index: 2  # MCP of middle finger.
    #       rotation_vector_target_angle_degrees: 90
    #     }
    #   }
    
    target_angle = pi * 0.5 # 90 = pi/2
    for region in regions:
        
        region.rect_w = region.pd_box[2]
        region.rect_h = region.pd_box[3]
        region.rect_x_center = region.pd_box[0] + region.rect_w / 2
        region.rect_y_center = region.pd_box[1] + region.rect_h / 2

        x0, y0 = region.pd_kps[0] # wrist center
        x1, y1 = region.pd_kps[2] # middle finger
        rotation = target_angle - atan2(-(y1 - y0), x1 - x0)
        region.rotation = normalize_radians(rotation)
        
def rotated_rect_to_points(cx, cy, w, h, rotation, wi, hi):
    b = cos(rotation) * 0.5
    a = sin(rotation) * 0.5
    points = []
    p0x = cx - a*h - b*w
    p0y = cy + b*h - a*w
    p1x = cx + a*h - b*w
    p1y = cy - b*h - a*w
    p2x = int(2*cx - p0x)
    p2y = int(2*cy - p0y)
    p3x = int(2*cx - p1x)
    p3y = int(2*cy - p1y)
    p0x, p0y, p1x, p1y = int(p0x), int(p0y), int(p1x), int(p1y)
    return [(p0x,p0y), (p1x,p1y), (p2x,p2y), (p3x,p3y)]

def rect_transformation(regions, w, h):
    """
    w, h : image input shape
    """
    # https://github.com/google/mediapipe/blob/master/mediapipe/modules/hand_landmark/palm_detection_detection_to_roi.pbtxt
    # # Expands and shifts the rectangle that contains the palm so that it's likely
    # # to cover the entire hand.
    # node {
    # calculator: "RectTransformationCalculator"
    # input_stream: "NORM_RECT:raw_roi"
    # input_stream: "IMAGE_SIZE:image_size"
    # output_stream: "roi"
    # options: {
    #     [mediapipe.RectTransformationCalculatorOptions.ext] {
    #     scale_x: 2.6
    #     scale_y: 2.6
    #     shift_y: -0.5
    #     square_long: true
    #     }
    # }
    scale_x = 2.6
    scale_y = 2.6
    shift_x = 0
    shift_y = -0.5
    for region in regions:
        width = region.rect_w
        height = region.rect_h
        rotation = region.rotation
        if rotation == 0:
            region.rect_x_center_a = (region.rect_x_center + width * shift_x) * w
            region.rect_y_center_a = (region.rect_y_center + height * shift_y) * h
        else:
            x_shift = (w * width * shift_x * cos(rotation) - h * height * shift_y * sin(rotation)) #/ w
            y_shift = (w * width * shift_x * sin(rotation) + h * height * shift_y * cos(rotation)) #/ h
            region.rect_x_center_a = region.rect_x_center*w + x_shift
            region.rect_y_center_a = region.rect_y_center*h + y_shift

        # square_long: true
        long_side = max(width * w, height * h)
        region.rect_w_a = long_side * scale_x
        region.rect_h_a = long_side * scale_y
        region.rect_points = rotated_rect_to_points(region.rect_x_center_a, region.rect_y_center_a, region.rect_w_a, region.rect_h_a, region.rotation, w, h)

def warp_rect_img(rect_points, img, w, h):
        src = np.array(rect_points[1:], dtype=np.float32) # rect_points[0] is left bottom point !
        dst = np.array([(0, 0), (h, 0), (h, w)], dtype=np.float32)
        mat = cv2.getAffineTransform(src, dst)
        return cv2.warpAffine(img, mat, (w, h))

def distance(a, b):
    """
    a, b: 2 points in 3D (x,y,z)
    """
    return np.linalg.norm(a-b)

def angle(a, b, c):
    # https://stackoverflow.com/questions/35176451/python-code-to-calculate-angle-between-three-point-using-their-3d-coordinates
    # a, b and c : points as np.array([x, y, z]) 
    ba = a - b
    bc = c - b
    cosine_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    angle = np.arccos(cosine_angle)

    return np.degrees(angle)

#! /usr/bin/python3
import numpy as np
import vlc #importing vlc player module
from pyPS4Controller.controller import Controller
import serial.tools.list_ports
import pyautogui
import time
import os, signal
import threading

Direc = "//home/ur/Videos/playlist/"
Img_Path = '//home/ur/Videos/'
files = os.listdir(Direc)
media_player = vlc.MediaPlayer() # creating vlc media player object
Imedia = vlc.Instance
#media_player.toggle_fullscreen()
i = 0
h="Load Img"
timer_ready = False

ports = []
starT = 0.1
end_T = 0.1
comm1 = serial.Serial()

songs = os.listdir(Direc)
songs = np.sort(songs)
print(f"Files in the directory: {Direc}")
print('---')
print(*songs, sep='\n')
print('---')

def Available():
    global comm1
    for port in serial.tools.list_ports.comports():
        ports.append(port.name)
    print ('listo:')
    if ports[0] == 'ttyACM0':
        comm1 = serial.Serial(port='/dev/ttyACM0',baudrate = 230400,parity=serial.PARITY_NONE,stopbits=serial.STOPBITS_ONE,bytesize=serial.EIGHTBITS,timeout=1)
        print(type(comm1))
        print ("ok")
        return True
    
    else:
        print ("Nope")
        return False
    
def Uart_comm(cmnd):
    if Available():
        comm1.write(cmnd.encode('utf-8'))
        comm1.close()

def LoadMidImg(s):
        global starT
        global end_T
        global timer_ready
        print (s)
        if timer_ready == True:
                media = vlc.Media(Img_Path + 'Controller01.png')
                media_player.set_media(media)
                media_player.play()
                #time.sleep(2.5)
                timer_ready = False
        end_T = time.process_time()#time.time_ns()
        timevar = end_T-starT
        print("Time taken", str(timevar), "ns")

def CreateTimer():
        global timer_ready
        if timer_ready == False:
                timer_ready = True
                t = threading.Timer(10.0, LoadMidImg, [h])
                t.start()
                print ('timer ready')
        # else:
                # t.cancel()
                # t = threading.Timer(10.0, LoadMidImg, [h])
                # t.start()
                # print ('timer ready')

class MyController(Controller):
    
    def __init__(self, **kwargs):
        Controller.__init__(self, **kwargs)
        
    def get_uppercase(self, str_name):
        uppercase = []
        for i in range(len(str_name)):
            if str_name[i].isupper():
                uppercase.append(str_name[i])
        if(len(uppercase) > 0):
            return "".join(uppercase)
        else:
            return "No found uppercase"
                
    def on_x_press(self):
       print("Hello")
       #media_player.play()

    def on_x_release(self):
       print("Goodbye")
       media_player.stop()
    def on_square_press(self):
        media_player.pause()
    def on_triangle_press(self):
        global i
        global songs
        media = vlc.Media('//home/ur/Videos/playlist/' + songs[i])
        media_player.set_media(media)
        media_player.play()
        
        mediaval = media_player.audio_get_track_description()
        print (mediaval)

    def on_right_arrow_press(self):
        global i
        global songs
        global timer_ready
        global starT
        global end_T

        starT = time.process_time()
        upletters = self.get_uppercase(songs[i])
        print(upletters)
        print (len(upletters))
        Uart_comm("rgb,yello\n")
        time.sleep(0.2)
        Uart_comm("chars," + upletters + "  \n")
        media = vlc.Media('//home/ur/Videos/playlist/' + songs[i])
        media_player.set_media(media)
        media_player.play()
        time.sleep(0.2)
        MediaDuration = media_player.get_length()
        print(MediaDuration)
        i = i + 1
        if i >= len(songs):
            i = 0

        CreateTimer()

    def on_left_arrow_press(self):
        global i
        global songs
        i = i - 1
        if i < 0:
            i = len(songs) - 1
        upletters = self.get_uppercase(songs[i])
        print(upletters)
        print (len(upletters))
        media = vlc.Media('//home/ur/Videos/playlist/' + songs[i])
        media_player.set_media(media)
        media_player.play()
        
    def on_R1_press(self):
        media_player.set_time(media_player.get_time() + 1000)

    def on_L1_press(self):
        media_player.set_time(media_player.get_time() - 1000)
        
    def on_options_press(self):
        pyautogui.press('s')
        time.sleep(1)
        
    def on_circle_press(self):
        pyautogui.press('w')
        time.sleep(1)

    def on_share_press(self):
        pyautogui.press("q")
        time.sleep(2)
        os.kill(os.getppid(), signal.SIGHUP)
        #os._exit(0)
        
    def on_R2_press(self, value):
        value = (value + 32431) * 0.0015335
        value = int(value)
        stvalue = str(value)
        print(stvalue)
        print (len(stvalue))
        
    def on_L2_press(self, value):
        value = (value + 32431) * 0.0015335
        value = int(value)
        stvalue = str(value)
        print(stvalue)
        if (len(stvalue) < 2):
                #Uart_comm("spm,S180,P" + stvalue + " \n")
                print("spm,S180,P0" + stvalue + "\n")
        else:
                print("spm,S180,P" + stvalue + "\n")

controller = MyController(interface="/dev/input/js0", connecting_using_ds4drv=False)
controller.listen(timeout=60)

#! /usr/bin/python3
import numpy as np
import copy
import itertools
import collections
from collections import namedtuple
import mediapipe_utils as mpu
import depthai as dai
import cv2
from pathlib import Path
import time
import argparse
import uart_comm

video_en = True
# from multiprocessing import Process, Queue
#from uart_comm import CommRob

characters = ['A', 'B', 'C', 'D', 
              'E', 'F', 'G', 'H', 
              'I', 'K', 'L', 'M', 
              'N', 'O', 'P', 'Q', 
              'R', 'S', 'T', 'U', 
              'V', 'W', 'X', 'Y']

FINGER_COLOR = [(128, 128, 128), (80, 190, 168), 
         (234, 187, 105), (175, 119, 212), 
         (81, 110, 221)]

JOINT_COLOR = [(0, 0, 0), (125, 255, 79), 
            (255, 102, 0), (181, 70, 255), 
            (13, 63, 255)]

# def to_planar(arr: np.ndarray, shape: tuple) -> list:

def to_planar(arr: np.ndarray, shape: tuple) -> np.ndarray:
    resized = cv2.resize(arr, shape, interpolation=cv2.INTER_NEAREST).transpose(2,0,1)
    return resized

class HandTrackerASL:
    def __init__(self,
                pd_path="models/palm_detection_6_shaves.blob", 
                pd_score_thresh=0.65, pd_nms_thresh=0.3,
                lm_path="models/hand_landmark_6_shaves.blob",
                lm_score_threshold=0.5,
                show_landmarks=True,
                show_hand_box=True,
                asl_path="models/hand_asl_6_shaves.blob",
                asl_recognition=True,
                show_asl=True):

        self.pd_path = pd_path
        self.pd_score_thresh = pd_score_thresh
        self.pd_nms_thresh = pd_nms_thresh
        self.lm_path = lm_path
        self.lm_score_threshold = lm_score_threshold
        self.asl_path = asl_path
        self.show_landmarks=show_landmarks
        self.show_hand_box = show_hand_box
        self.asl_recognition = asl_recognition
        self.show_asl = show_asl

        anchor_options = mpu.SSDAnchorOptions(num_layers=4, 
                                min_scale=0.1484375,
                                max_scale=0.75,
                                input_size_height=128,
                                input_size_width=128,
                                anchor_offset_x=0.5,
                                anchor_offset_y=0.5,
                                strides=[8, 16, 16, 16],
                                aspect_ratios= [1.0],
                                reduce_boxes_in_lowest_layer=False,
                                interpolated_scale_aspect_ratio=1.0,
                                fixed_anchor_size=True)
        
        self.anchors = mpu.generate_anchors(anchor_options)
        self.nb_anchors = self.anchors.shape[0]
        print(f"{self.nb_anchors} anchors have been created")

        self.preview_width = 700#576
        self.preview_height = 400 #324

        self.frame_size = None

        self.ft = cv2.freetype.createFreeType2()
        self.ft.loadFontData(fontFileName='HelveticaNeue.ttf', id=0)

        self.right_char_queue = collections.deque(maxlen=5)
        self.left_char_queue = collections.deque(maxlen=5)

        self.previous_right_char = ""
        self.right_sentence = ""
        self.previous_right_update_time = time.time()
        self.previous_left_char = ""
        self.left_sentence = ""
        self.previous_left_update_time = time.time()


    def create_pipeline(self):
        print("Creating pipeline...")
        pipeline = dai.Pipeline()
        pipeline.setOpenVINOVersion(version = dai.OpenVINO.Version.VERSION_2021_2)
        self.pd_input_length = 128

        print("Creating Color Camera...")
        cam = pipeline.createColorCamera()
        cam.setPreviewSize(self.preview_width, self.preview_height)
        cam.setInterleaved(False)
        cam.setBoardSocket(dai.CameraBoardSocket.RGB)
        cam_out = pipeline.createXLinkOut()
        cam_out.setStreamName("cam_out")
        cam.preview.link(cam_out.input)

        print("Creating Palm Detection Neural Network...")
        pd_nn = pipeline.createNeuralNetwork()
        pd_nn.setBlobPath(str(Path(self.pd_path).resolve().absolute()))
        pd_in = pipeline.createXLinkIn()
        pd_in.setStreamName("pd_in")
        pd_in.out.link(pd_nn.input)
        pd_out = pipeline.createXLinkOut()
        pd_out.setStreamName("pd_out")
        pd_nn.out.link(pd_out.input)

        print("Creating Hand Landmark Neural Network...")          
        lm_nn = pipeline.createNeuralNetwork()
        lm_nn.setBlobPath(str(Path(self.lm_path).resolve().absolute()))
        self.lm_input_length = 224
        lm_in = pipeline.createXLinkIn()
        lm_in.setStreamName("lm_in")
        lm_in.out.link(lm_nn.input)
        lm_out = pipeline.createXLinkOut()
        lm_out.setStreamName("lm_out")
        lm_nn.out.link(lm_out.input)

        print("Creating Hand ASL Recognition Neural Network...")          
        asl_nn = pipeline.createNeuralNetwork()
        asl_nn.setBlobPath(str(Path(self.asl_path).resolve().absolute()))
        self.asl_input_length = 224
        asl_in = pipeline.createXLinkIn()
        asl_in.setStreamName("asl_in")
        asl_in.out.link(asl_nn.input)
        asl_out = pipeline.createXLinkOut()
        asl_out.setStreamName("asl_out")
        asl_nn.out.link(asl_out.input)

        print("Pipeline created.")
        return pipeline


    def pd_postprocess(self, inference):
        scores = np.array(inference.getLayerFp16("classificators"), dtype=np.float16) # 896
        bboxes = np.array(inference.getLayerFp16("regressors"), dtype=np.float16).reshape((self.nb_anchors,18)) # 896x18
        # Decode bboxes
        self.regions = mpu.decode_bboxes(self.pd_score_thresh, scores, bboxes, self.anchors)
        # Non maximum suppression
        self.regions = mpu.non_max_suppression(self.regions, self.pd_nms_thresh)
        mpu.detections_to_rect(self.regions)
        mpu.rect_transformation(self.regions, self.frame_size, self.frame_size)

    
    def lm_postprocess(self, region, inference):
        region.lm_score = inference.getLayerFp16("Identity_1")[0]    
        region.handedness = inference.getLayerFp16("Identity_2")[0]
        lm_raw = np.array(inference.getLayerFp16("Squeeze"))
        
        lm = []
        for i in range(int(len(lm_raw)/3)):
            # x,y,z -> x/w,y/h,z/w (here h=w)
            lm.append(lm_raw[3*i:3*(i+1)]/self.lm_input_length)
        region.landmarks = lm


    def lm_render(self, frame, original_frame, region):
        cropped_frame = None
        hand_bbox = []
        if region.lm_score > self.lm_score_threshold:
            palmar = True
            src = np.array([(0, 0), (1, 0), (1, 1)], dtype=np.float32)
            dst = np.array([ (x, y) for x,y in region.rect_points[1:]], dtype=np.float32) # region.rect_points[0] is left bottom point !
            mat = cv2.getAffineTransform(src, dst)
            lm_xy = np.expand_dims(np.array([(l[0], l[1]) for l in region.landmarks]), axis=0)
            lm_xy = np.squeeze(cv2.transform(lm_xy, mat)).astype(np.int)
            if self.show_landmarks:
                list_connections = [[0, 1, 2, 3, 4], 
                                    [5, 6, 7, 8], 
                                    [9, 10, 11, 12],
                                    [13, 14 , 15, 16],
                                    [17, 18, 19, 20]]
                palm_line = [np.array([lm_xy[point] for point in [0, 5, 9, 13, 17, 0]])]

                # Draw lines connecting the palm
                if region.handedness > 0.5:
                    # Simple condition to determine if palm is palmar or dorasl based on the relative 
                    # position of thumb and pinky finger
                    if lm_xy[4][0] > lm_xy[20][0]:
                        cv2.polylines(frame, palm_line, False, (255, 255, 255), 2, cv2.LINE_AA)
                    else:
                        cv2.polylines(frame, palm_line, False, (128, 128, 128), 2, cv2.LINE_AA)
                else:
                    # Simple condition to determine if palm is palmar or dorasl based on the relative 
                    # position of thumb and pinky finger
                    if lm_xy[4][0] < lm_xy[20][0]:
                        cv2.polylines(frame, palm_line, False, (255, 255, 255), 2, cv2.LINE_AA)
                    else:
                        cv2.polylines(frame, palm_line, False, (128, 128, 128), 2, cv2.LINE_AA)
                
                # Draw line for each finger
                for i in range(len(list_connections)):
                    finger = list_connections[i]
                    line = [np.array([lm_xy[point] for point in finger])]
                    if region.handedness > 0.5:
                        if lm_xy[4][0] > lm_xy[20][0]:
                            palmar = True
                            cv2.polylines(frame, line, False, FINGER_COLOR[i], 2, cv2.LINE_AA)
                            for point in finger:
                                pt = lm_xy[point]
                                cv2.circle(frame, (pt[0], pt[1]), 3, JOINT_COLOR[i], -1)
                        else:
                            palmar = False
                    else:
                        if lm_xy[4][0] < lm_xy[20][0]:
                            palmar = True
                            cv2.polylines(frame, line, False, FINGER_COLOR[i], 2, cv2.LINE_AA)
                            for point in finger:
                                pt = lm_xy[point]
                                cv2.circle(frame, (pt[0], pt[1]), 3, JOINT_COLOR[i], -1)
                        else:
                            palmar = False
                    
                    # Use different colour for the hand to represent dorsal side
                    if not palmar:
                        cv2.polylines(frame, line, False, (128, 128, 128), 2, cv2.LINE_AA)
                        for point in finger:
                            pt = lm_xy[point]
                            cv2.circle(frame, (pt[0], pt[1]), 3, (0, 0, 0), -1)

            # Calculate the bounding box for the entire hand
            max_x = 0
            max_y = 0
            min_x = frame.shape[1]
            min_y = frame.shape[0]
            for x,y in lm_xy:
                if x < min_x:
                    min_x = x
                if x > max_x:
                    max_x = x
                if y < min_y:
                    min_y = y
                if y > max_y:
                    max_y = y

            box_width = max_x - min_x
            box_height = max_y - min_y
            x_center = min_x + box_width / 2
            y_center = min_y + box_height / 2

            # Enlarge the hand bounding box for drawing use
            draw_width = box_width/2 * 1.2
            draw_height = box_height/2 * 1.2
            draw_size = max(draw_width, draw_height)

            draw_min_x = int(x_center - draw_size)
            draw_min_y = int(y_center - draw_size)
            draw_max_x = int(x_center + draw_size)
            draw_max_y = int(y_center + draw_size)

            hand_bbox = [draw_min_x, draw_min_y, draw_max_x, draw_max_y]

            if self.show_hand_box:

                cv2.rectangle(frame, (draw_min_x, draw_min_y), (draw_max_x, draw_max_y), (36, 152, 0), 2)

                palmar_text = ""
                if region.handedness > 0.5:
                    palmar_text = "Right: "
                else:
                    palmar_text = "Left: "
                if palmar:
                    palmar_text = palmar_text + "Palmar"
                else:
                    palmar_text = palmar_text + "Dorsal"
                self.ft.putText(img=frame, text=palmar_text , org=(draw_min_x + 1, draw_max_x + 15 + 1), fontHeight=14, color=(0, 0, 0), thickness=-1, line_type=cv2.LINE_AA, bottomLeftOrigin=True)
                self.ft.putText(img=frame, text=palmar_text , org=(draw_min_x, draw_max_x + 15), fontHeight=14, color=(255, 255, 255), thickness=-1, line_type=cv2.LINE_AA, bottomLeftOrigin=True)

            if self.asl_recognition:
                # Enlarge the hand bounding box for image cropping 
                new_width = box_width/2 * 1.5
                new_height = box_height/2 * 1.5

                new_size = max(new_width, new_height)

                min_x = int(x_center - new_size)
                min_y = int(y_center - new_size)
                max_x = int(x_center + new_size)
                max_y = int(y_center + new_size)
                
                if min_x < 0:
                    min_x = 0
                if min_y < 0:
                    min_y = 0
                if max_x > frame.shape[1]:
                    max_x = frame.shape[1] - 1
                if max_y > frame.shape[0]:
                    max_y = frame.shape[0] - 1

                # Crop out the image of the hand
                cropped_frame = original_frame[min_y:max_y, min_x:max_x]

        return cropped_frame, region.handedness, hand_bbox


    def run(self):
        global video_en
        device = dai.Device(self.create_pipeline())
        device.startPipeline()

        q_video = device.getOutputQueue(name="cam_out", maxSize=1, blocking=False)
        q_pd_in = device.getInputQueue(name="pd_in")
        q_pd_out = device.getOutputQueue(name="pd_out", maxSize=4, blocking=True)
        q_lm_out = device.getOutputQueue(name="lm_out", maxSize=4, blocking=True)
        q_lm_in = device.getInputQueue(name="lm_in")
        q_asl_out = device.getOutputQueue(name="asl_out", maxSize=4, blocking=True)
        q_asl_in = device.getInputQueue(name="asl_in")

        while True:
            in_video = q_video.get()
            video_frame = in_video.getCvFrame()

            h, w = video_frame.shape[:2]
            self.frame_size = max(h, w)
            self.pad_h = int((self.frame_size - h)/2)
            self.pad_w = int((self.frame_size - w)/2)

            video_frame = cv2.copyMakeBorder(video_frame, self.pad_h, self.pad_h, self.pad_w, self.pad_w, cv2.BORDER_CONSTANT)
            
            frame_nn = dai.ImgFrame()
            frame_nn.setWidth(self.pd_input_length)
            frame_nn.setHeight(self.pd_input_length)
            frame_nn.setData(to_planar(video_frame, (self.pd_input_length, self.pd_input_length)))
            q_pd_in.send(frame_nn)

            annotated_frame = video_frame.copy()

            # Get palm detection
            inference = q_pd_out.get()
            self.pd_postprocess(inference)

            # Send data for hand landmarks
            for i,r in enumerate(self.regions):
                img_hand = mpu.warp_rect_img(r.rect_points, video_frame, self.lm_input_length, self.lm_input_length)
                nn_data = dai.NNData()   
                nn_data.setLayer("input_1", to_planar(img_hand, (self.lm_input_length, self.lm_input_length)))
                q_lm_in.send(nn_data)

            # Retrieve hand landmarks
            for i,r in enumerate(self.regions):
                inference = q_lm_out.get()
                self.lm_postprocess(r, inference)
                hand_frame, handedness, hand_bbox = self.lm_render(video_frame, annotated_frame, r)
                # ASL recognition
                if hand_frame is not None and self.asl_recognition:
                    hand_frame = cv2.resize(hand_frame, (self.asl_input_length, self.asl_input_length), interpolation=cv2.INTER_NEAREST)
                    hand_frame = hand_frame.transpose(2,0,1)
                    nn_data = dai.NNData()
                    nn_data.setLayer("input", hand_frame)
                    q_asl_in.send(nn_data)
                    asl_result = np.array(q_asl_out.get().getFirstLayerFp16())
                    asl_idx = np.argmax(asl_result)
                    # Recognized ASL character is associated with a probability
                    asl_char = [characters[asl_idx], round(asl_result[asl_idx] * 100, 1)]
                    selected_char = asl_char
                    prueba = 'Vino'
                    print("prub")
                    print(prueba[0], selected_char[0],selected_char[1])
                    uart_comm.Uart_comm("AIchar," + selected_char[0] + " \n")
                    # if selected_char[0] == prueba[0]:
                        # uart_comm.Uart_comm("FE,_smile\n")
                    # if characters[asl_idx] == 'I':
                        # uart_comm.Uart_comm("AIchar," + selected_char[0] + " \n")
                    # if characters[asl_idx] == 'C':
                        # #queueS1.put("ccccc\n")
                        # uart_comm.Uart_comm("rgb,yello\n")
                    # if characters[asl_idx] == 'L':
                        # #queueS1.put("ddddd\n")
                        # uart_comm.Uart_comm("FE,surpri\n")
                    # if characters[asl_idx] == 'W':
                        # #queueS1.put("gtgtgt\n")
                        # uart_comm.Uart_comm("rgb,__off\n")
                    
                    current_char_queue = None
                    if handedness > 0.5:
                        current_char_queue = self.right_char_queue
                    else:
                        current_char_queue = self.left_char_queue
                    current_char_queue.append(selected_char)
                    # Peform filtering of recognition resuls using the previous 5 results
                    # If there aren't enough reults, take the first result as output
                    if len(current_char_queue) < 5:
                        selected_char = current_char_queue[0]
                    else:
                        char_candidate = {}
                        for i in range(5):
                            if current_char_queue[i][0] not in char_candidate:
                                char_candidate[current_char_queue[i][0]] = [1, current_char_queue[i][1]]
                            else:
                                char_candidate[current_char_queue[i][0]][0] += 1
                                char_candidate[current_char_queue[i][0]][1] += current_char_queue[i][1]
                        most_voted_char = ""
                        max_votes = 0
                        most_voted_char_prob = 0
                        for key in char_candidate:
                            if char_candidate[key][0] > max_votes:
                                max_votes = char_candidate[key][0]
                                most_voted_char = key
                                most_voted_char_prob = round(char_candidate[key][1] / char_candidate[key][0], 1)
                        selected_char = (most_voted_char, most_voted_char_prob)
                    
                    if self.show_asl:
                        gesture_string = "Letter: " + selected_char[0] + ", " + str(selected_char[1]) + "%"
                        textSize = self.ft.getTextSize(gesture_string, fontHeight=14, thickness=-1)[0]
                        cv2.rectangle(video_frame, (hand_bbox[0] - 5, hand_bbox[1]), (hand_bbox[0] + textSize[0] + 5, hand_bbox[1] - 18), (36, 152, 0), -1)
                        self.ft.putText(img=video_frame, text=gesture_string , org=(hand_bbox[0], hand_bbox[1] - 5), fontHeight=14, color=(255, 255, 255), thickness=-1, line_type=cv2.LINE_AA, bottomLeftOrigin=True)

            video_frame = video_frame[self.pad_h:self.pad_h+h, self.pad_w:self.pad_w+w]
            if video_en:
                cv2.imshow("hand tracker", video_frame)
            key = cv2.waitKey(1) 
            if key == ord('q') or key == 27:
                break
            elif key == 32:
                # Pause on space bar
                cv2.waitKey(0)
            elif key == ord('1'):
                self.show_hand_box = not self.show_hand_box
            elif key == ord('2'):
                self.show_landmarks = not self.show_landmarks
            elif key == ord('s'):
                video_en = False
                #self.show_asl = not self.show_asl
            elif key == ord('w'):
                video_en = True
                #self.show_asl = not self.show_asl

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--pd_m", default="models/palm_detection_6_shaves.blob", type=str,
                        help="Path to a blob file for palm detection model (default=%(default)s)")
    parser.add_argument("--lm_m", default="models/hand_landmark_6_shaves.blob", type=str,
                        help="Path to a blob file for landmark model (default=%(default)s)")
    parser.add_argument("--asl_m", default="models/hand_asl_6_shaves.blob", type=str,
                        help="Path to a blob file for ASL recognition model (default=%(default)s)")
    parser.add_argument('-asl', '--asl', default=True, 
                        help="enable ASL recognition")
    args = parser.parse_args()
    
    ht = HandTrackerASL(pd_path=args.pd_m, lm_path=args.lm_m, asl_path=args.asl_m, asl_recognition=args.asl)
    ht.run()