VLM Scene Intelligence
Moving beyond basic OCR, mask uses a vision-language model (VLM) to understand the context of comics, UI elements, charts, and rough sketches.
Translate text with masks and understand visuals with VLM. Privacy-first, on-device magic for macOS.
Perfect for Comics, Games, Slides, & Hand-drawn notes.
ドカン!!
このプロトタイプ、来週までに仕上げる。
Boom!
This prototype must be finished by next week.
Select only what matters. Translation overlays naturally in place without breaking your visual continuity.
Powered by local native APIs and advanced visual models.
Apple's native Vision OCR handles text extraction locally. Images do not leave your device unless you explicitly route them to a cloud VLM.
Switch between OpenAI, Gemini, Qwen, and DeepSeek, or run entirely offline with Ollama, depending on your needs.
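For the curious, the routing rule described above can be sketched in a few lines of Python. This is only an illustration, not mask's actual code: the names `Provider`, `RoutingDecision`, and `route_image` are hypothetical, and treating Ollama as the only local provider is an assumption drawn from the provider list on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Provider(Enum):
    # Provider names are taken from the page; everything else is hypothetical.
    OPENAI = "openai"
    GEMINI = "gemini"
    QWEN = "qwen"
    DEEPSEEK = "deepseek"
    OLLAMA = "ollama"


# Assumption: only Ollama runs on the user's machine in this sketch.
LOCAL_PROVIDERS = {Provider.OLLAMA}


@dataclass(frozen=True)
class RoutingDecision:
    send_image: bool  # may the captured image be handed to the VLM?
    reason: str       # short human-readable explanation


def route_image(provider: Provider, cloud_opt_in: bool) -> RoutingDecision:
    """Decide whether a captured image may be passed to the chosen VLM."""
    if provider in LOCAL_PROVIDERS:
        # A local model never sends the image off the device.
        return RoutingDecision(True, "local model, image stays on device")
    if cloud_opt_in:
        # Cloud providers are used only after an explicit user choice.
        return RoutingDecision(True, "user explicitly routed to a cloud VLM")
    # Default: keep the image local and rely on on-device OCR text only.
    return RoutingDecision(False, "no opt-in, image stays on device")
```

With this sketch, `route_image(Provider.OPENAI, cloud_opt_in=False)` keeps the image on the device, while `route_image(Provider.OLLAMA, cloud_opt_in=False)` allows the local model to see it. The default-deny branch is the design choice that matters: a cloud provider is reached only when the user has opted in.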
Traditional OCR just reads characters. mask's VLM actually understands the scene in front of it.
| Scenario | Standard OCR | mask with VLM |
|---|---|---|
| Whiteboard notes with sketches | Fragmented words, garbage characters | Recognizes context as "App architecture diagram" |
| Stylized comic sound effects | Fails to detect stylized fonts | Interprets the visual "ドカン" and outputs "Boom!" |
| Data charts and graphs | Translates axis labels in isolation | Summarizes the chart trend automatically |
Translate UI menus and story dialogue on the fly, without alt-tabbing and breaking your immersion.
Mask just the speech bubbles. Keep the beautiful art untouched while instantly reading the dialogue.
Translate dense foreign-language slides while preserving the meaning of complex diagrams and flowcharts.
Digitize and translate rough handwritten concepts and diagrams shared by international colleagues.
Your screen is your business. We keep it that way.
Native ScreenCaptureKit API
100% On-device OCR
Rendered instantly
Experience the magic of ambient translation on your Mac today.
Download mask for macOS