Baidu’s ERNIE 4.5 Outperforms GPT and Gemini in Visual AI Benchmarks
AND: China’s Open AI Surge Raises Red Flags for U.S. Innovation and Democracy

✨TodayOnAI’s Daily Drop
Baidu’s ERNIE 4.5 Outperforms GPT and Gemini in Visual AI Benchmarks
China’s Open AI Surge Raises Red Flags for U.S. Innovation and Democracy
ChatGPT Pilots Multi-User Conversations in Key Markets
💬 Let’s Fix This Prompt
🧰 Today’s AI Toolbox Pick
| 📌 The TodayOnAI Brief |
Baidu

🚀 TodayOnAI Insight: Baidu has unveiled ERNIE-4.5-VL-28B-A3B-Thinking, a high-efficiency multimodal AI that’s quietly outperforming GPT-5 and Gemini 2.5 on key visual benchmarks. Designed with enterprise data in mind, this lightweight model targets a critical gap in AI deployment: automating insights from complex non-text sources like schematics, dashboards, and video.
🔍 Key Takeaways:
Efficient multimodal design: ERNIE activates just 3B parameters during inference, significantly reducing compute costs while processing visual, textual, and numerical data.
Benchmark leader: It tops key tests like ChartQA (87.1 vs. GPT-5's 78.2) and VLMs Are Blind (77.3 vs. GPT-5's 69.6), showcasing strong visual reasoning capabilities.
Enterprise focus: Targets underutilized data formats—engineering diagrams, factory video, compliance footage—often ignored by text-first models.
Tool-enabled automation: Goes beyond perception; can run structured queries, use external tools, zoom in on images, and search for unknown objects autonomously.
Deployment readiness: Ships under Apache 2.0 for commercial use; requires 80GB of GPU memory per card and supports fine-tuning with ERNIEKit for high-value tasks.
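🧮 Quick math: A rough sketch of what "activates just 3B parameters" means in practice. It assumes the model name follows the common convention where "28B" is the total parameter count and "A3B" is the number of parameters active per token; the constant and function names below are illustrative, not part of any Baidu tooling.

```python
# Parameter counts read off the model name ERNIE-4.5-VL-28B-A3B-Thinking.
# Assumption: "28B" = total parameters, "A3B" = parameters active per token.
TOTAL_PARAMS_B = 28   # billions
ACTIVE_PARAMS_B = 3   # billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of the model's parameters used for each token."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%} of all parameters")
# prints "Active per token: 10.7% of all parameters"
```

Roughly one parameter in nine does work on any given token, which is where the compute savings come from. The full set of weights still has to sit in memory, which is consistent with the 80GB requirement above.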
💡 Why This Stands Out: ERNIE 4.5 signals a shift from perception to action in enterprise AI. Its ability to extract structured insights from dense visual data reflects a growing demand for AI that does more than summarize text. As models become agents, the question isn’t whether AI can see—it’s whether it can solve. Baidu seems to think it can.
OPEN AI

🚀 TodayOnAI Insight: Databricks co-founder Andy Konwinski warns that China is outpacing the U.S. in open AI research—a shift he sees as an existential risk to democracy and innovation. At the Cerebral Valley AI Summit, he called for a recommitment to open science to sustain U.S. leadership in AI.
🔍 Key Takeaways:
Konwinski claims PhD students increasingly cite Chinese research as more compelling than U.S. work, reflecting a broader shift in innovation momentum.
He co-founded Laude, a venture firm and research accelerator focused on advancing open AI projects.
Chinese AI labs, such as DeepSeek and Alibaba’s Qwen, are open-sourcing key innovations—encouraging global collaboration and faster progress.
In contrast, U.S. firms like OpenAI, Meta, and Anthropic keep breakthroughs proprietary and recruit top researchers away from academia.
Konwinski argues that transformative advances, like the Transformer architecture, only emerge through open research—not closed labs.
💡 Why This Stands Out: Konwinski’s critique highlights a growing tension in the U.S. AI ecosystem: short-term commercialization vs. long-term research leadership. As China fosters a more open and collaborative research culture, the U.S. risks stalling its innovation engine by hoarding talent and IP. If foundational breakthroughs depend on academic freedom, is the U.S. model becoming self-defeating?
OPEN AI

🚀 TodayOnAI Insight: OpenAI has begun testing group chat in ChatGPT, enabling collaborative conversations for up to 20 users across mobile and web. This early rollout marks a strategic step toward turning ChatGPT into a more socially interactive platform.
🔍 Key Takeaways:
New group chat feature lets users collaborate in real time within ChatGPT.
Currently piloting in Japan, New Zealand, South Korea, and Taiwan.
Available to Free, Plus, and Team users with support for up to 20 participants.
Conversations use GPT-5.1 Auto, with access to search, image generation, and file uploads.
AI usage limits apply only when ChatGPT responds, not to messages between human participants.
Group chats are invite-only, support emoji reactions, and offer parental controls for users under 18.
💡 Why This Stands Out: Group chat brings ChatGPT closer to a hybrid model—part assistant, part social hub. By blending collaboration, real-time AI access, and moderated spaces, OpenAI is positioning ChatGPT not just as a productivity tool, but as a new kind of digital commons. Will AI-driven group chats become the norm for work, learning, and online interaction?
| 💬 Let’s Fix This Prompt |
✨ See how a simple prompt upgrade can unlock better AI output.
🔹 The Original Prompt
"Generate blog ideas for a tech company."
At first glance, this prompt might seem okay. But it's too broad — and that limits the quality of AI-generated results. Let’s improve it using prompt engineering best practices.
✅ The Improved Prompt
Generate a list of unique, engaging blog post ideas for a B2B tech company that wants to attract decision-makers in mid-sized companies. Focus on topics related to emerging technology trends, industry insights, and practical solutions their software offers. Include suggested titles and a 1–2 sentence summary for each idea.
💡 Why It's Better
Specific audience: Targets decision-makers in mid-sized companies.
Contextual focus: Emphasizes emerging tech and practical solutions.
Actionable output: Requests summaries and titles to spark execution.
Tone and style: Asks for unique, engaging ideas, which steers the kind of content the AI produces.
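🧩 Try it as a template: The same upgrade can be captured in a few lines of code. This is a minimal sketch, assuming you want to swap the audience, focus, and output format per use case; `PromptSpec` and `build_prompt` are hypothetical names made up for this example, not part of PromptPilot or any library.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """The ingredients that turned the vague prompt into a specific one."""
    tone: str            # e.g. "unique, engaging"
    task: str            # what to generate
    audience: str        # who the content is for
    focus: list[str]     # topics to concentrate on
    output_format: str   # what each result should include


def join_with_and(items: list[str]) -> str:
    """Join items as 'a, b, and c' (or 'a and b', or just 'a')."""
    if len(items) <= 1:
        return "".join(items)
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the prompt text from its labelled parts."""
    return (
        f"Generate a list of {spec.tone} {spec.task} for {spec.audience}. "
        f"Focus on topics related to {join_with_and(spec.focus)}. "
        f"Include {spec.output_format}."
    )


spec = PromptSpec(
    tone="unique, engaging",
    task="blog post ideas",
    audience=("a B2B tech company that wants to attract "
              "decision-makers in mid-sized companies"),
    focus=["emerging technology trends", "industry insights",
           "practical solutions their software offers"],
    output_format="suggested titles and a 1–2 sentence summary for each idea",
)
prompt = build_prompt(spec)
print(prompt)
```

With these values, the printed text matches the improved prompt above. Changing `audience` or `focus` is enough to adapt it to another team or product.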
🛠️ Learn how to adapt this prompt for SaaS, AI tools, dev teams & more →
Read the full PromptPilot breakdown
💡 Bonus Tool: Want to generate and master prompts instantly?
👉 Try PromptPilot by TodayOnAI (Free to use)
| 🧠 Smart Picks |
📰 More from the AI World
LinkedIn adds AI-powered search to help users find people
Milestone raises $10M to make sure AI rhymes with ROI
Anthropic announces $50 billion data center plan
Figma bets on India to expand beyond design