Research & Scholarly Activity

Research Areas:

  • AI Ethics: Bias and Fairness Algorithms, Privacy, Human-AI Interaction
  • Computer Vision (CV): Computational Modeling, Facial Expression Recognition, Body Pose Estimation, Gender Classification, Image Processing in Marine Science
  • Human-Computer Interaction (HCI): Visual Search, Eye-Tracking, Cognitive Load, Designing for Gender and Diversity, Inclusiveness, Refugee Informatics
  • Human-Automation Interaction (HAI): User Performance, Eye-Tracking, Cognitive Load, Error Detection, HCI for Social Good


Research Lab: Room MH325

Sample Research Projects: 

Gender Classification Accuracy via Two-Dimensional Body Joints Using a C5 Pre-trained Model Based on ResNet-152

With the increasing demand for gender-related data in applications such as border security systems and targeted marketing, gender classification (female/male) has gained significant importance in computer vision and deep learning. Existing research has focused on gender classification from facial images, external appearance, or gait. However, there is a lack of studies specifically targeting gender classification using two-dimensional body joints. This paper introduces a novel prediction pipeline to enhance the accuracy of gender classification based solely on two-dimensional joint images. Our proposed approach uses a deep learning (convolutional neural network) technique. We conducted experiments on the BBC Pose and Short BBC Pose datasets, preprocessing the images by filtering out frames with missing human figures, removing background noise, and labeling the joints through transfer learning with a C5 pre-trained model based on ResNet-152. Our results demonstrate that the deep learning method outperforms the other approaches, successfully classifying gender (female and male) from two-dimensional joint images with an accuracy of 66.5%.
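As a rough sketch of the transfer-learning stage described above (not the project's actual code): the snippet assumes the joint images are rendered as 224x224 RGB tensors and trains only a new binary head on an ImageNet-pretrained ResNet-152; the data loader is hypothetical.

```python
# Minimal sketch (assumptions: joint images rendered as 224x224 RGB tensors;
# the training loader is illustrative, not from the paper).
import torch
import torch.nn as nn
from torchvision import models

# Load ResNet-152 pretrained on ImageNet and freeze the backbone,
# so only the new classification head is trained (transfer learning).
model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a binary (female/male) head.
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)

def train_epoch(loader):
    model.train()
    for images, labels in loader:  # images: (B, 3, 224, 224), labels: (B,)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```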


Analyzing Bias in Refugee Perception through Face Swapping: An Eye-Tracking Study with Ordinary Procrustes Analysis (OPA)

Refugee populations face ethnocentric biases, with Syrian refugees particularly affected by Western media bias, in contrast to coverage of the Ukrainian crisis. This study assesses bias toward Ukrainian and Syrian refugees using human-computer interaction methods. Using eye-tracking, we analyzed participants' decision-making and pupil-size data to study attitudes toward refugees. Original refugee images and face swapping via Ordinary Procrustes Analysis were employed. The results reveal a correlation between authentic refugee images and increased donations, underscoring the potential of computer vision and HCI for social good.
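The face-swapping step relies on Ordinary Procrustes Analysis to align facial landmarks between two faces. Below is a minimal sketch of that alignment, assuming 68-point (x, y) landmark arrays from an external detector; the random arrays are placeholders for detected landmarks.

```python
# Minimal sketch of the OPA alignment step (assumption: 68-point facial
# landmarks as (x, y) arrays; landmark detection itself is out of scope).
import numpy as np
from scipy.spatial import procrustes

# Two hypothetical landmark sets: source face and target face.
source_landmarks = np.random.rand(68, 2)   # stand-in for detected points
target_landmarks = np.random.rand(68, 2)

# Ordinary Procrustes Analysis removes translation, scale, and rotation,
# returning standardized versions of both shapes plus a disparity score.
mtx_target, mtx_source, disparity = procrustes(target_landmarks, source_landmarks)
print(f"Shape disparity after alignment: {disparity:.4f}")
```

The disparity score quantifies the residual shape difference once translation, scale, and rotation are removed; in a face-swapping pipeline, the fitted transform would then be used to warp the source face onto the target geometry.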


Untraining Ethnocentric Biases about Gender Roles (Designing for Gender and Diversity, Inclusiveness)

Human interaction with art, and how ethnocentric and gender biases manifest in that context through human-computer interface design, has been poorly studied to date. We leverage art as a stimulus to untrain gender bias. The interface presents digital representations of a database of 19th-century Middle Eastern paintings by European artists. This study offers quantitative insight into measuring biases, thoughtful interaction with art as a stimulus, and how we can begin to untrain these ethnocentric and gender biases.


Measuring Initial Cognitive Load to Predict User Response in Complex Visual Tasks 

We show how cognitive load measured during the initiation of a complex visual task can predict the user's response. We measure cognitive load using pupil size and microsaccade rate. Our study aims to find a significant correlation between initial cognitive load and the final user response to the task. This study provides new insights into initial cognitive processes, with practical applications in adaptive user interface design, early warning controls, and error detection in human performance.
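A simplified sketch of the two measures named above, assuming gaze coordinates in degrees of visual angle at a fixed sampling rate; the velocity threshold follows the widely used Engbert-Kliegl heuristic, and a full implementation would also enforce a minimum event duration.

```python
# Simplified sketch (assumptions: gaze positions in degrees of visual angle,
# fixed sampling rate; the lambda constant follows the common
# Engbert-Kliegl microsaccade-detection heuristic).
import numpy as np

def microsaccade_rate(gaze_xy, sample_rate_hz, lam=6.0):
    """Count velocity-threshold crossings per second as a microsaccade proxy."""
    velocity = np.gradient(gaze_xy, axis=0) * sample_rate_hz  # deg/s per axis
    # Median-based velocity SD, robust to saccadic outliers.
    sigma = np.sqrt(np.maximum(
        np.median(velocity**2, axis=0) - np.median(velocity, axis=0)**2, 1e-12))
    # A sample exceeding the elliptical threshold is a candidate event sample.
    radial = np.sum((velocity / (lam * sigma))**2, axis=1)
    onsets = np.diff((radial > 1.0).astype(int)) == 1  # rising edges only
    duration_s = len(gaze_xy) / sample_rate_hz
    return onsets.sum() / duration_s

def baseline_corrected_pupil(pupil, baseline_samples=60):
    """Subtract a pre-task baseline so pupil size reflects change in load."""
    return pupil - np.mean(pupil[:baseline_samples])
```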


Visual Task Classification Using Classic Machine Learning and CNNs

Understanding which type of visual task is being performed is important for developing intelligent user interfaces. In this work, we investigate standard machine learning and deep learning methods for identifying the task type from eye-tracking data, including both raw numerical data and visual representations of users' gaze scan paths and pupil size. To this end, we experimented with computer vision algorithms such as Convolutional Neural Networks (CNNs) and compared the results to classic machine learning algorithms.
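A minimal sketch of the two classifier families being compared, with illustrative stand-in data (the feature count, number of task types, and the 64x64 scan-path rendering size are assumptions, not details from the study):

```python
# Sketch of classic ML on numerical features vs. a CNN on scan-path images.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# --- Classic ML on raw numerical eye-tracking features ---
X = np.random.rand(200, 8)          # stand-in: 8 features per trial
y = np.random.randint(0, 3, 200)    # stand-in: 3 task types
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("RF accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# --- Small CNN on scan-path images (e.g., 64x64 grayscale renderings) ---
cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 3),     # 3 task-type logits
)
logits = cnn(torch.randn(4, 1, 64, 64))  # one forward pass on dummy images
```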

Studies can be carried out not only on identifying the task type from the observations, but also on identifying user attributes such as age from viewing patterns. The domains of user behavior and user psychology, combined with computational techniques, open paths to many possible research ideas.

Figure 1. (a) Scan path of the "Waldo" image in (left) free-viewing and (right) fixation conditions; darker points show points looked at repeatedly. (b) Scan path of the "Waldo" image with colors reflecting pupil dilation, in the same conditions.

Figure 2. (a) Original scan path of a random user looking at a Puzzle image in the fixation condition. (b) Synthetically produced scan path for a Puzzle image in the fixation condition.


Improving the Performance of Deep Learning in Facial Emotion Recognition

In this study, we use models such as VGG16, ResNet-50, Inception-v3, and SENet-50, and apply transfer learning with preprocessing techniques on the FER-2013 dataset. We also combine these models with image-processing filters such as unsharp masking and histogram equalization, resulting in an ensemble model.
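The two filters named above can be sketched as follows, assuming 48x48 grayscale FER-2013 faces loaded with OpenCV; the random array below is a placeholder image, and the blur sigma and mask weights are illustrative choices.

```python
# Sketch of the two preprocessing filters: histogram equalization and
# unsharp masking (assumption: 48x48 8-bit grayscale FER-2013 faces).
import cv2
import numpy as np

def preprocess_face(gray):
    # Histogram equalization spreads intensity values, improving contrast.
    equalized = cv2.equalizeHist(gray)
    # Unsharp mask: subtract a blurred copy to emphasize edges and details.
    blurred = cv2.GaussianBlur(equalized, (0, 0), sigmaX=3)
    sharpened = cv2.addWeighted(equalized, 1.5, blurred, -0.5, 0)
    return sharpened

face = np.random.randint(0, 256, (48, 48), dtype=np.uint8)  # stand-in image
out = preprocess_face(face)
```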

Better detection of human emotions can help children with autism, enable blind people to read facial expressions, allow robots to interact with humans more naturally, and support driver safety by monitoring attention while driving. FER can also enhance the emotional intelligence of applications and improve customer experience through emotion recognition.

Figure 3. CNN model architecture: 5 convolutional layers, 3 max pooling layers, and 3 fully connected layers
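As a rough illustration of the Figure 3 architecture: only the layer counts come from the caption; the channel widths, kernel sizes, and the 48x48 grayscale input (FER-2013's native size) are assumptions.

```python
# Sketch matching the caption's layer counts: 5 conv layers,
# 3 max-pooling layers, 3 fully connected layers.
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
    nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 12 -> 6
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 512), nn.ReLU(),
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 7),  # FER-2013 has 7 emotion classes
)
```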