Hoang M. Truong

Truong Minh Hoang

Hi! I’m a recent graduate with high distinction from the University of Science, Viet Nam National University Ho Chi Minh City (VNU-HCM).

My research focuses on Computer Vision methods for visual world understanding. I am particularly interested in how AI systems perceive and interpret the world through vision, with a current emphasis on egocentric (first-person) video understanding.

I aspire to build a long-term research career at the intersection of AI, Computer Vision, and Robotics, contributing to both academic and real-world innovations.

Publications


Semantic Alignment in Hyperbolic Space for Open-Vocabulary Semantic Segmentation

Hoang M. Truong, Hai Nguyen-Truong, Dang Huynh

CVPR Workshop 2026

Summary: We tackle semantic misalignment in open-vocabulary semantic segmentation by proposing (1) HyRo, a hyperbolic rotation module that refines angular relationships in the Poincaré ball while preserving hierarchical structure, and (2) a hyperbolic fine-tuning framework that decouples semantic alignment (angle) from hierarchical alignment (radius), enabling more accurate pixel-level predictions and state-of-the-art performance.


Dual-Path Enhancements in Event-Based Eye Tracking: Augmented Robustness and Adaptive Temporal Modeling

Hoang M. Truong, Vinh-Thuan Ly, Thuan-Phat Nguyen, Huy G. Tran, Tram T. Doan

CVPR Workshop 2025

Summary: We improve event-based eye tracking for AR/VR, targeting abrupt eye movements and noise, with (1) a robust augmentation pipeline including temporal shift, spatial flip, and event deletion, and (2) KnightPupil, a hybrid model that combines EfficientNet-B3, a BiGRU, and an LTV-SSM to handle sparse, noisy inputs.


TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints

Vinh-Thuan Ly, Hoang M. Truong, Xuan-Huong Nguyen

ICCV Workshop 2025

Summary: We introduce TinyGiantVLM, a lightweight vision-language model that performs comparably to larger systems on warehouse-scale spatial reasoning. Our framework combines RGB and depth data through a Mixture-of-Experts module to handle high-modality inputs and diverse question types.

Honors and Awards

  • 2025, Jensen Huang Scholarship, NVIDIA
  • 2025, Odon Vallet Scholarship
  • 2025, Mathematics Development Scholarship, Vietnam Institute for Advanced Study in Mathematics (VIASM)
  • 2025, AmCham Scholarship, American Chamber of Commerce in Vietnam