Arkaprabha Basu

About

Hello! I am a PhD student at the Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, and a CoSTAR Fellow. My research focuses on Vision-Language Models (VLMs), diffusion models, MLLM hallucination detection, and multimodal medical analysis.

Previously, I worked as a Project Associate at TCG CREST, Kolkata, where I developed multimodal explainable AI and medical report generation systems using large language models such as Llama, Gemma-2, and Phi-3.

I obtained my Master of Technology in Computer Science from the University of Hyderabad in 2023. During my internship at the Indian Statistical Institute, Kolkata, I worked on the digital reconstruction of heritage temple tiles using GANs.

Research Interests

My work spans computer vision, natural language processing, and their intersection in multimodal learning. Current focus areas include:

Vision-Language Models: Studying hallucination phenomena in MLLMs and developing methods for better alignment between visual and textual representations.

Medical AI: Building explainable systems for clinical report generation, radiology image analysis, and transforming complex medical reports into patient-friendly language.

Generative Models: Exploring diffusion models and GANs for image super-resolution, heritage reconstruction, and creative applications.

Publications

ARREST: Adversarial Resilient Regulation Enhancing Safety and Truth in Large Language Models (Accepted)
Sharanya Dasgupta, Arkaprabha Basu, Sujoy Nath, Swagatam Das
EACL 2026 (CORE Rank: A)
HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs (Oral)
Sujoy Nath, Arkaprabha Basu, Sharanya Dasgupta, Swagatam Das
ICVGIP 2026
HalluShift: Measuring Distribution Shifts towards Hallucination Detection in Large Language Models
Sharanya Dasgupta, Sujoy Nath, Arkaprabha Basu, Pourya Shamsolmoali, Swagatam Das
IJCNN 2025 (CORE Rank: B)
From Complexity to Clarity: Transforming Chest X-ray Reports with Chained Prompting
Sujoy Nath, Arkaprabha Basu, Kushal Bose, Swagatam Das
AAAI 2025 Student Abstract (CORE Rank: A*)
Prompting to Bridge the Gap: Accelerating Research in Multi-Modal Medical Report Generation for Alzheimer's Disease (Workshop)
Arkaprabha Basu, Avisek Gupta, Swagatam Das, Pourya Shamsolmoali
IJCNN 2025 · INNS DLIA Workshop (CORE Rank: B)
Improved Alzheimer's Disease Detection with Dynamic Attention Guided Multi-modal Fusion
Arkaprabha Basu, Sourav Raha, Avisek Gupta, Swagatam Das
ICPR 2024 (CORE Rank: B)
Digital Restoration of Cultural Heritage With Data-Driven Computing: A Survey
Arkaprabha Basu, Sandip Paul, Sreeya Ghosh, Swagatam Das, Bhabatosh Chanda, Chakravarthy Bhagvati, Václav Snasel
IEEE Access (Q1, Top 8%)
Do Pre-processing and Class Imbalance Matter to the Deep Image Classifiers for COVID-19 Detection? An Explainable Analysis
Arkaprabha Basu, Sourav Das, Sankha Mullick, Swagatam Das
IEEE TAI 2022 (Q1, Top 20%)
On Regenerative and Discriminative Learning from Digital Heritages: A Fractal Dimension based Approach
Sreeya Ghosh, Arkaprabha Basu, Sandip Paul, Swagatam Das, Bhabatosh Chanda
ADCIS 2022
Information Preservation with Wasserstein Autoencoders: Generation Consistency and Adversarial Robustness
Anish Chakrabarty, Arkaprabha Basu, Swagatam Das
Statistics & Computing (Q1)
Near-Perfect Image Super-Resolution: Exploring Multiple Divergence Measures in Fully Convolutional GAN (Under Review)
Arkaprabha Basu, Kushal Bose, Sankha Subhra Mullick, Anish Chakrabarty, Swagatam Das
Elsevier EAAI (Q1, Top 12%)
Face Mask Recognition using Advanced Face Cut Algorithm for COVID-19 Safety Measures
Arkaprabha Basu, Md Firoj Ali
ICCCNT 2021

Blog Posts

I write about AI, deep learning, and research on Medium.

Research Experience

PhD Student — CVSSP, University of Surrey
Guildford, UK · October 2025 — Present
Research focus: Vision-Language Models (VLM), diffusion models, MLLM hallucination & alignment, multimodal medical analysis. CoSTAR Fellow.
Project Associate — TCG CREST
Kolkata, India · February 2024 — December 2024
Developed multimodal explainable AI systems and medical report generation using Llama, Gemma-2, and Phi-3. Focused on clinical readability and faithfulness objectives.
Project Linked Person — Indian Statistical Institute
Kolkata, India · July 2022 — March 2023
Temple tile reconstruction (Bankura Terracotta): Developed 3D-aware few-shot GAN with super-resolution, automated defect detection/replacement pipeline, and Streamlit visualization tool.
Project Linked Person — Indian Statistical Institute
Kolkata, India · January 2021 — October 2021
COVID-19 detection from chest X-rays: Image enhancement + class-imbalance aware ResNet50; XAI methods to highlight abnormal rib-cage regions.

Industry Experience

Data Scientist — HappyMonk.ai
Bengaluru, India · April 2023 — December 2023
Edge anomaly detection for multi-camera surveillance using YOLOv8/YOLOv5, SlowFast & Re-ID. Optimized TAO/ETLT → ONNX → TensorRT pipelines. Deployed on NVIDIA DeepStream.

Education

PhD in Computer Vision / AI
CVSSP — University of Surrey · 2025 — Present
Master of Technology in Computer Science
University of Hyderabad · 2021 — 2023 · CGPA: 9.3/10
Master of Science in Computer Science
Pondicherry Central University · 2018 — 2020 · CGPA: 8.74/10
Bachelor of Science in Computer Science
Ramakrishna Mission Residential College (Autonomous), Narendrapur · 2015 — 2018 · 82.13%

Curriculum Vitae

Download my full CV: PDF

Thesis Projects

Digital Reconstruction of Temple Tiles
ISI Kolkata & University of Hyderabad · July 2022 — March 2023
Handwritten Character Recognition
Pondicherry Central University · December 2019 — May 2020