TADP

Text-image Alignment for Diffusion-based Perception. CVPR 2024.

TADP leverages text-image alignment in diffusion models for dense visual perception tasks. By extracting and aligning internal representations from text-to-image diffusion models, TADP achieves strong performance on monocular depth estimation and semantic segmentation without task-specific training data.

Published at CVPR 2024.

Links: