[TIL] Video Diffusion Models and Simulators - 다은이의 컴퓨터 공부

Today's seminar topic: can a video diffusion model function as a simulator that reflects real-world dynamics?

☑️ Learning Interactive Real-World Simulators (Jan 2024) - ICLR 2024 Outstanding Paper

With a good world simulator, humans could interact with a much wider range of diverse scenes.
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
In this paper, simulation is carried out as action-in-video-out conditional video generation: the model takes an action as input and generates the resulting video.
- We can simulate the visual outcome of both high-level instructions such as “open the drawer”
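To make the action-in-video-out idea concrete, here is a minimal sketch of the rollout loop. The learned video generator is replaced by a hypothetical toy function (`toy_video_model`) that just shifts the last frame by the action, so this only illustrates the interface (past frames + action in, new frames out), not the paper's actual model. All names and parameters are assumptions for illustration.

```python
import numpy as np

def toy_video_model(context_frames, action, num_new_frames=2):
    """Hypothetical stand-in for a learned conditional video generator.

    context_frames: array of shape (T, H, W) holding past observations.
    action: (dy, dx) integer shift, a toy version of a low-level control.
    Returns num_new_frames frames; each one shifts the previous frame by the action.
    """
    frame = context_frames[-1]
    new_frames = []
    for _ in range(num_new_frames):
        frame = np.roll(frame, shift=action, axis=(0, 1))
        new_frames.append(frame)
    return np.stack(new_frames)

def rollout(first_frame, actions, num_new_frames=2):
    """Action in, video out: condition on the frames so far, append the output."""
    video = first_frame[None]  # shape (1, H, W)
    for action in actions:
        new_frames = toy_video_model(video, action, num_new_frames)
        video = np.concatenate([video, new_frames], axis=0)
    return video

# A single bright pixel that the "actions" move around the image.
first = np.zeros((8, 8))
first[0, 0] = 1.0
video = rollout(first, actions=[(0, 1), (1, 0)], num_new_frames=2)
```

In a real system, `toy_video_model` would be a conditional video diffusion model, and the action could also be a text instruction instead of a shift.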

Simulation data generated this way can improve performance on OOD (out-of-distribution) domains. In other words, it serves as data augmentation.
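As a rough illustration of using simulated data as augmentation, the sketch below mixes real and simulated samples into one training set. The function name `mix_datasets` and the `sim_fraction` parameter are hypothetical, and the mixing scheme is an assumption for illustration; the notes above do not say how the paper combines the data.

```python
import random

def mix_datasets(real, simulated, sim_fraction=0.5, size=None, seed=0):
    """Sample a training set in which roughly sim_fraction of items are simulated.

    real, simulated: non-empty lists of samples.
    size: number of samples to draw (defaults to len(real) + len(simulated)).
    """
    rng = random.Random(seed)
    if size is None:
        size = len(real) + len(simulated)
    mixed = []
    for _ in range(size):
        # Pick the source dataset first, then a random sample from it.
        source = simulated if rng.random() < sim_fraction else real
        mixed.append(rng.choice(source))
    return mixed

real_data = ["real_0", "real_1", "real_2"]
sim_data = ["sim_0", "sim_1"]
train_set = mix_datasets(real_data, sim_data, sim_fraction=0.5, size=10)
```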

☑️ Diffusion Models Are Real-Time Game Engines (Aug 2024)

the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality.
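- Uses a video diffusion model to build a game that interacts with the user in real time.
- The neural network is trained on the game DOOM.
According to the paper, it was trained as shown below. [Figure: training overview, image not included]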
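As a loose illustration of what training on recorded gameplay could look like, the sketch below turns one trajectory into next-frame prediction samples: a window of past frames and actions as the condition, and the following frame as the target. This assumes a next-frame prediction setup; the function name, `context_len`, and the optional `noise_std` corruption of the context are hypothetical choices for this sketch, not details taken from these notes.

```python
import numpy as np

def make_training_samples(frames, actions, context_len=4, noise_std=0.0, seed=0):
    """Build (context_frames, context_actions, target_frame) samples.

    frames: array (T, H, W) of recorded frames.
    actions: array (T,), where actions[t] is the action taken at frames[t].
    noise_std: if > 0, Gaussian noise is added to the context frames.
    """
    rng = np.random.default_rng(seed)
    samples = []
    for t in range(context_len, len(frames)):
        context = frames[t - context_len:t].astype(float)
        # Optionally corrupt the context so the model sees imperfect inputs.
        context = context + noise_std * rng.standard_normal(context.shape)
        samples.append((context, actions[t - context_len:t], frames[t]))
    return samples

# Toy trajectory: 10 frames of size 2x2 and one integer action per frame.
frames = np.arange(10 * 2 * 2).reshape(10, 2, 2)
actions = np.arange(10)
samples = make_training_samples(frames, actions, context_len=4)
```

A generative model would then be trained to produce `target_frame` given the context frames and actions, and at play time it would be run autoregressively on its own outputs.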
