Hoang M. Truong

Truong Minh Hoang

Hi! I’m a recent graduate with high distinction from the University of Science, Viet Nam National University Ho Chi Minh City (VNU-HCM).

My research focuses on Computer Vision methods for visual world understanding. I am particularly interested in how AI systems perceive and interpret the world through vision, with a current emphasis on egocentric (first-person) video understanding.

I aspire to build a long-term research career at the intersection of AI, Computer Vision, and Robotics, contributing to both academic and real-world innovations.

Publications

Semantic Alignment in Hyperbolic Space for Open-Vocabulary Semantic Segmentation

Hoang M. Truong, Hai Nguyen-Truong, Dang Huynh

CVPR Workshop 2026

Summary We tackle semantic misalignment in open-vocabulary semantic segmentation with two contributions: (1) HyRo, a hyperbolic rotation module that refines angular relationships in the Poincaré ball while preserving hierarchical structure, and (2) a hyperbolic fine-tuning framework that decouples semantic alignment (angle) from hierarchical alignment (radius). This enables more accurate pixel-level predictions and state-of-the-art performance.

Paper Website GitHub

Dual-Path Enhancements in Event-Based Eye Tracking: Augmented Robustness and Adaptive Temporal Modeling

Hoang M. Truong, Vinh-Thuan Ly, Thuan-Phat Nguyen, Huy G. Tran, Tram T. Doan

CVPR Workshop 2025

Summary We improve event-based eye tracking for AR/VR, addressing abrupt eye movements and noise with (1) a robust augmentation pipeline that includes temporal shift, spatial flip, and event deletion, and (2) KnightPupil, a hybrid model that combines EfficientNet-B3, BiGRU, and an LTV-SSM to handle sparse, noisy inputs.

Paper arXiv

TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints

Vinh-Thuan Ly, Hoang M. Truong, Xuan-Huong Nguyen

ICCV Workshop 2025

Summary We introduce TinyGiantVLM, a lightweight vision-language model for warehouse-scale spatial reasoning. The framework combines RGB and depth data through a Mixture-of-Experts module to handle high-modality inputs and diverse question types, demonstrating that compact models can match larger systems in spatial reasoning tasks.

Website arXiv

Honors and Awards

2025, Jensen Huang Scholarship, NVIDIA
2025, Odon Vallet Scholarship
2025, Mathematics Development Scholarship, Vietnam Institute for Advanced Study in Mathematics (VIASM)
2025, AmCham Scholarship, American Chamber of Commerce in Vietnam