Google quietly made the Gemini API's File Search multimodal this week. Agents can now search across text, images, PDFs, and code in a single RAG query, which is a significant capability upgrade.
Agents that process documents (contracts, invoices, reports) can now pair Gemini for understanding with apitree for action. Read a PDF → extract the data → call search_apis("send invoice data to accounting") → apitree routes the request to the right API.
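The flow above can be sketched roughly as follows. This is a minimal, offline illustration, not the real Gemini or apitree client code: `extract_invoice` stands in for the Gemini extraction step, the `search_apis` body and its return value are assumptions (only the function name and the query string come from the text above), and `API_CATALOG`, `Invoice`, and `run_pipeline` are hypothetical names invented for the example.

```python
from dataclasses import asdict, dataclass


@dataclass
class Invoice:
    vendor: str
    total: float
    currency: str


def extract_invoice(pdf_text: str) -> Invoice:
    """Stand-in for the Gemini step: parse 'Key: value' lines from extracted text."""
    fields = {}
    for line in pdf_text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return Invoice(fields["vendor"], float(fields["total"]), fields["currency"])


# Hypothetical catalog of API names and keywords; a real router would have its own.
API_CATALOG = {
    "accounting.create_invoice": {"invoice", "accounting"},
    "email.send_message": {"email", "message"},
}


def search_apis(query: str) -> str:
    """Stand-in for apitree routing: pick the API whose keywords overlap the query most."""
    words = set(query.lower().split())
    return max(API_CATALOG, key=lambda name: len(API_CATALOG[name] & words))


def run_pipeline(pdf_text: str) -> dict:
    """Read → extract → route: return the chosen API and the payload to send to it."""
    invoice = extract_invoice(pdf_text)
    api = search_apis("send invoice data to accounting")
    return {"api": api, "payload": asdict(invoice)}


if __name__ == "__main__":
    print(run_pipeline("Vendor: Acme\nTotal: 120.50\nCurrency: USD"))
```

In a real agent, the two stand-in functions would be replaced by calls to the actual services; the point of the sketch is only the separation between the understanding step and the routing step.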
Source: Greeden Weekly Summary