Enables text-only LLMs to analyze images by routing them to an OpenAI-compatible vision backend, supporting local files, URLs, and data URLs.