A multi-modal translation layer that overlays bilingual subtitles on videos, translates text inside images, and preserves formatting in PDFs using GPT-5 and Gemini models.
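
As a rough illustration of the bilingual-subtitle idea, the sketch below pairs each original subtitle cue with its translation and emits SubRip (SRT) text that a video player or renderer could overlay. It is not the project's actual implementation: the `Cue` type, `format_timestamp`, `bilingual_srt`, and the `translate` callback are all hypothetical names introduced here, and the callback stands in for whatever model call (GPT-5, Gemini, or otherwise) produces the translation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Cue:
    """One subtitle cue (hypothetical type for this sketch)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # original-language line


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def bilingual_srt(cues: Iterable[Cue], translate: Callable[[str], str]) -> str:
    """Build SRT text with the original line above its translation.

    `translate` is an assumed callback; in practice it would wrap a
    model request, but any str -> str function works here.
    """
    blocks = []
    for index, cue in enumerate(cues, start=1):
        translated = translate(cue.text)
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}\n"
            f"{cue.text}\n{translated}\n"
        )
    # SRT separates cues with a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Stand-in dictionary translator, for demonstration only.
    demo = {"Hello": "Hola", "Bye": "Adios"}
    cues = [Cue(0.0, 1.5, "Hello"), Cue(61.25, 63.0, "Bye")]
    print(bilingual_srt(cues, demo.__getitem__))
```

Passing the translator in as a callback keeps the subtitle formatting independent of any particular model, so the same function works with a stub in tests and a real model client in use.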