Undergraduate Student @ University of Science, VNU-HCM
My general interests are Deep Learning, Computer Vision, and Multimodal Models, and their applications to real-world problems. Currently, my research focuses on Vision-Language Models (VLMs) and Multimodal Large Language Models (MLLMs).
Publications
Bridging the Training-Deployment Gap: Gated Encoding and Multi-Scale Refinement for Efficient Quantization-Aware Image Enhancement
Designing an efficient image enhancement model for RGB photos. The model improves the visual quality of images to match photos taken with a Canon 70D DSLR, while maintaining computational efficiency for mobile deployment. The 8-bit quantized model achieved 21.050 dB PSNR and 0.725 SSIM on the DPED dataset with only 915K parameters.
Performing 3D image segmentation to detect surfaces in ancient scrolls. Experimenting with 2.5D approaches using MONAI, 3D segmentation using nnUNetv2, and post-processing methods to improve segmentation quality.
Enhancing traffic video understanding and captioning by developing a rigorous pipeline that integrates spatial and temporal information to boost the performance of existing vision-language models.
Implementing and comparing Mask R-CNN and DeepLabV3 for semantic segmentation on the Cityscapes dataset. Training both models from scratch and evaluating them with metrics such as mIoU and pixel accuracy.
Building multiple models, such as ResNet50 and ViT (base_patch16_224), to classify the artist, genre, and style of paintings, and ResNet50 + LSTM to classify the combination of all styles in the ArtGAN dataset. Building a framework to find similar paintings in the National Gallery of Art dataset using a query image. Experimenting with multiple metrics and comparing their performance.
Implementing research papers in deep learning, computer vision, and natural language processing as a personal repository to practice and understand different methods from scratch.
Converting the game's first-order logic (FOL) rules to conjunctive normal form (CNF) and performing forward chaining and backward chaining to solve the game. Evaluating the performance and runtime of forward and backward chaining and comparing them with A*, backtracking, and brute-force solvers.
Supporting speakers, MCs, and guests, coordinating stage access, and collaborating with the technical team to ensure smooth event operations at GStar Summit 2026.