Current Position

Ph.D. Student

University of California, Los Angeles

Ph.D. in Physics and Biology in Medicine Graduate Program

September 2022 - Present

Los Angeles, CA

Research Focus

My research centers on Vision-Language foundation models for medical imaging. I develop robust AI systems that can understand and reason about both visual and textual information, particularly in high-stakes medical domains where accuracy and reliability are paramount.

Currently pursuing a Ph.D. in Physics and Biology in Medicine, with a research emphasis on Vision-Language Foundation models, self-supervised learning techniques, and parameter-efficient fine-tuning for medical image analysis.

Major Research Projects

Dec 2023 - Present
Active

Vision-Language Foundation Models

Objective: Develop robust foundation models combining visual and textual understanding for accurate anatomical structure segmentation in CT scans, particularly focusing on lung segmentation and pulmonary disease detection.

Technical Approach

  • Dual-loss Training: Combining cosine similarity loss between image and text embeddings with distillation loss using MedSAM as teacher model
  • Zero-shot Classification: Generate high-resolution, multi-channel probability maps for various anatomical regions
  • Pseudo-labeling Pipeline: Use foundation model outputs to create training labels for fine-tuning H-SAM architecture
  • Knowledge Distillation: Transfer knowledge from large foundation models to efficient deployment models
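The dual-loss training above can be sketched roughly as follows, assuming paired image/text embeddings and per-pixel segmentation logits from the student and a frozen MedSAM teacher. The function name `dual_loss`, the `alpha` weighting, and the temperature-scaled KL distillation term are illustrative assumptions, not the exact published formulation:

```python
import torch
import torch.nn.functional as F

def dual_loss(img_emb, txt_emb, student_logits, teacher_logits,
              alpha=0.5, temperature=2.0):
    """Combine an image-text alignment loss with a distillation loss.

    img_emb, txt_emb: (B, D) embeddings from the image and text encoders.
    student_logits, teacher_logits: (B, C, H, W) segmentation logits;
    the teacher logits would come from a frozen MedSAM forward pass.
    """
    # Alignment term: push matched image/text pairs toward cosine similarity 1.
    align = (1.0 - F.cosine_similarity(img_emb, txt_emb, dim=-1)).mean()

    # Distillation term: match softened teacher and student distributions (KL).
    t = temperature
    distill = F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)

    return alpha * align + (1.0 - alpha) * distill
```

The convex combination lets a single `alpha` trade off embedding alignment against fidelity to the teacher's segmentation behavior.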

Expected Impact

This approach aims to outperform current state-of-the-art CT segmentation methods while providing comprehensive understanding of anatomical structures. The research has potential for direct clinical translation in pulmonary medicine.

Foundation Models · Knowledge Distillation · Medical Imaging · Zero-shot Learning

Jul 2023 - Present
Ongoing

Fine-tuning Vision Foundation Models with Deep Layer Adapters

Purpose: Explore parameter-efficient fine-tuning strategies for vision transformers in medical image segmentation tasks.

Methodology

  • Dataset: VinDr-RibCXR (245 chest X-ray images for rib segmentation)
  • Models: SAM-Adapter with ViT-B (12 blocks) and ViT-H (32 blocks)
  • Training: 50 epochs, batch size 2, AdamW optimizer, IoU loss
  • Experiment Design: Systematic placement of adapters starting from deepest layers
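The deepest-layers-first placement scheme can be illustrated with a minimal sketch. `BottleneckAdapter` and `adapter_block_indices` are hypothetical names, and the bottleneck design shown is the standard adapter pattern rather than the exact SAM-Adapter module:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, GELU, up-project, residual connection."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.act = nn.GELU()
        self.up = nn.Linear(dim // reduction, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

def adapter_block_indices(num_blocks: int, num_adapters: int) -> list:
    """Indices of the transformer blocks that receive adapters,
    filling from the deepest block backward toward the input."""
    if not 0 <= num_adapters <= num_blocks:
        raise ValueError("num_adapters must lie in [0, num_blocks]")
    return list(range(num_blocks - num_adapters, num_blocks))
```

For example, 8 adapters on a 12-block ViT-B would occupy blocks 4-11, while 22 adapters on a 32-block ViT-H would occupy blocks 10-31.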

Key Findings

ViT-B Results

Optimal: 7-9 adapters in deepest layers

F1 Score: 0.8240-0.8259 (vs 0.7773 baseline)

ViT-H Results

Optimal: 20-24 adapters in deepest layers

F1 Score: 0.8457-0.8469 (vs 0.7697 baseline)

Key Insight: Placing adapters in roughly the deepest 60-75% of blocks (7-9 of 12 for ViT-B, 20-24 of 32 for ViT-H) yielded the best performance in both architectures.

Parameter-Efficient Fine-tuning · Vision Transformers · Adapter Networks · Medical Segmentation

Oct 2022 - Jul 2023
Completed

Self-Supervised Learning for Chest X-ray Segmentation

Challenge: Address limited annotated data in medical image segmentation by leveraging large-scale unannotated datasets.

Two-Phase Approach

  • Phase 1: ConvNeXt-based DINO pretraining on 148,447 unannotated chest X-rays
  • Phase 2: Fine-tuning on limited annotated datasets (3, 10, 50 cases)
  • Task: Multi-label segmentation of lungs, cardiomediastinum, and airways
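Phase 2 can be sketched as follows, under the assumption that the DINO-pretrained encoder returns a spatial feature map; `build_finetune_model` and the 1x1-conv head are illustrative stand-ins for the actual decoder:

```python
import torch
import torch.nn as nn

def build_finetune_model(encoder: nn.Module, feat_dim: int, num_classes: int,
                         freeze_encoder: bool = False) -> nn.Module:
    """Attach a segmentation head to a pretrained encoder for fine-tuning.

    `encoder` is assumed to map a (B, C, H, W) image batch to a
    (B, feat_dim, H', W') feature map; the 1x1-conv head stands in for a
    full decoder and emits one channel per anatomical structure.
    """
    if freeze_encoder:
        for p in encoder.parameters():
            p.requires_grad = False  # keep the pretrained weights fixed
    head = nn.Conv2d(feat_dim, num_classes, kernel_size=1)
    return nn.Sequential(encoder, head)
```

Exposing `freeze_encoder` as a flag makes it easy to compare full fine-tuning against the frozen-encoder strategy on each annotation budget.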

Breakthrough Results

  • Dramatic Improvement: DSC improved from 0.24 to 0.56 with only 3 annotated cases
  • Consistent Benefits: Performance gains across all anatomical structures
  • Model Scaling: Larger models consistently performed better
  • Training Strategy: Freezing the pretrained encoder sometimes enhanced performance

Self-Supervised Learning · DINO · Data Scarcity · ConvNeXt

2024 - Present
Active

Automatic Radiology Report Generation (RRG) Evaluation

Problem: Current evaluation methods for automatic radiology report generation lack clinical factual accuracy assessment and nuanced understanding.

Two-Stage Solution

Stage 1: JUN Metric Development

Develop an interpretable, component-based metric (JUN - Judging Understanding of Nuance) that assesses:

  • Clinical concepts accuracy
  • Factual status verification
  • Descriptive modifiers correctness
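A minimal sketch of component-based scoring along these three axes; `JunScore`, `component_f1`, and the weights are hypothetical illustrations, not the actual JUN definition:

```python
from dataclasses import dataclass

def component_f1(ref: set, cand: set) -> float:
    """Set-overlap F1 between reference and candidate items for one component."""
    if not ref and not cand:
        return 1.0  # both reports silent on this component: perfect agreement
    tp = len(ref & cand)
    if tp == 0:
        return 0.0
    precision = tp / len(cand)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)

@dataclass
class JunScore:
    """Per-component scores in [0, 1] for one (reference, candidate) pair."""
    concept: float   # clinical concepts (e.g. "pleural effusion") covered
    status: float    # factual status (present / absent / uncertain) agreement
    modifier: float  # descriptive modifiers (severity, laterality, location)

    def overall(self, weights=(0.5, 0.3, 0.2)) -> float:
        parts = (self.concept, self.status, self.modifier)
        return sum(w * s for w, s in zip(weights, parts))
```

Keeping the three components separate is what makes the metric interpretable: a low overall score can be traced to a missed concept, a contradicted status, or a wrong modifier.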

Stage 2: Large-Scale Dataset Generation

Leverage JUN to generate a comprehensive dataset of (Reference, Candidate, Detailed Score Vector) tuples for training next-generation LLM-based evaluators.

Clinical Relevance

This work aims to provide robust tools for developing safer AI in clinical documentation, ensuring AI-generated reports meet the rigorous standards required in healthcare settings.

Clinical AI · Report Generation · Evaluation Metrics · LLM Training

Research Publications from UCLA

Journal Publications

  • Jin Kim, Muhammad Wahi-Anwa, Sangyun Park, Shawn Shin, John M. Hoffman, Matthew S. Brown - "Autonomous Computer Vision Development with Agentic AI", arXiv
  • Jin Kim, Matthew Brown, Dan Ruan - "Dual-path Radiology Report Generation: Fusing Pathology Classification with Language Model", MICCAI 2025 Workshop - Vision-Language Model for medical applications
  • Lasse Hansen, ..., Jin Kim, Dan Ruan, ... - "Learn2Reg 2024: New Benchmark Datasets Driving Progress on New Challenges", Learn2Reg 2024 (under review)

Conference Papers

  • Sangyun Park*, Jin Kim*, Yuchen Cui, Matthew Sherman Brown - "TRACE: Textual Reasoning for Affordance Coordinate Extraction", ICCV 2025 Workshop (Under Review) - Vision-Language Model application
  • Sangyun Park*, Jin Kim* - "SPAR: Spatial Precision with Articulated Reasoning", ICCV 2025 Workshop (Under Review)
  • Jin Kim, Matthew Brown, Dan Ruan - "Improving Foundation Models with Deep Layer Adapters for Medical Image Segmentation", RSNA 2024 (Oral) - Medical Image Analysis presentation
  • Jin Kim, Matthew Brown, Dan Ruan - "Self-Supervised Learning Without Annotations to Improve Lung Chest X-Ray Segmentation", SPIE 2024 🏆 - Medical Image Analysis presentation

Key Achievements

🏆

SPIE 2024 Winner

1st place at SPIE 2024 Live Demonstrations Workshop for "SimpleMind: A Cognitive AI software environment"

📄

15+ Publications

First author on multiple high-impact papers in top-tier venues including RSNA, SPIE, and MICCAI

🎓

PhD Qualifier

Passed qualifying exam on "Leveraging Foundation Models, Knowledge Distillation, and Pseudo-Labeling for Robust Lung Segmentation"

👥

Research Mentorship

Mentored undergraduate students in deep learning and medical image analysis research projects

Technical Skills Developed

AI/ML Frameworks

PyTorch · TensorFlow · Transformers · MedSAM · DINO · SAM

Medical Imaging

CT Segmentation · Chest X-Ray Analysis · Medical Image Registration · DICOM Processing · 3D Image Analysis

Research Methods

Self-Supervised Learning · Knowledge Distillation · Foundation Models · Vision-Language Models · Parameter-Efficient Fine-tuning