AI Feed - August 13, 2025

Leading Chinese Open LLMs and DeepSeek R1

China is rapidly advancing in open-source AI, with several large language models (LLMs) rivaling proprietary models in capabilities like reasoning, agentic behavior, and multilingual support. Key models include Kimi K2 (strong all-rounder), GLM-4.5 (natively for complex agent execution), and Qwen3/Qwen3-Coder (superior multilingual support, code proficiency). DeepSeek-R1-0528 stands out as a groundbreaking open-source reasoning model, achieving high accuracy on benchmarks and providing users multiple access methods.

The DeepSeek Official API offers the most cost-effective option. Other deployment alternatives include cloud APIs (e.g., Amazon Bedrock, Together AI, Novita AI), GPU rental, and local open-source options (Hugging Face Hub, Ollama). The distilled version, DeepSeek-R1-0528-Qwen3-8B, allows deployment on consumer hardware like RTX 4090.

Anthropic is identified as the only leading AI lab that has not yet made a significant open-weight release

Relevant URLs:

Physical AI Development with Nvidia Cosmos and Omniverse

Nvidia released a suite of tools including Cosmos world models, simulation libraries, and infrastructure, all aimed at accelerating the development of Physical AI, covering areas such as robotics, autonomous vehicles, and industrial applications. Key components include Cosmos Reason - a 7B parameter vision-language model for robots; Cosmos Transfer Models for synthetic data generation; Omniverse Platform, with Neural Reconstruction Libraries, SimReady Materials Library and integrated with OpenUSD and CARLA, and RTX Pro Blackwell Servers + DGX Cloud for unified simulation capabilities. Companies like Amazon, Agility Robotics, Figure AI, Uber and Boston Dynamics are already implementing Nvidia's ecosystem for accelerated data generation, training and AI development.

Relevant URLs:

https://www.marktechpost.com/2025/08/11/nvidia-ai-introduces-end-to-end-ai-stack-cosmos-physical-ai-models-and-new-omniverse-libraries-for-advanced-robotics/

AI's Impact as a General-Purpose Technology and Paradigm Shift

AI is increasingly recognized as a General Purpose Technology (GPT), with a diffusion rate significantly faster than previous GPTs like electricity. It is fundamentally changing how we access knowledge and augment our thinking, leading to shifts in industries like software development ("Software 3.0"). Despite rapid progress, AI's effects on productivity are not yet fully captured, and effective usage demands "context engineering" to mitigate issues like hallucination. This GPT also represents a recursive and self-enhancing cycle, where AI, software, and humans boost each other.

Relevant URLs:

https://blog.nilenso.com/blog/2025/08/12/why-does-ai-feel-so-different/

Self-Evolving AI Agents

A new paradigm is emerging with self-evolving AI agents, integrating foundation model capabilities with continuous adaptability, allowing automated enhancement based on interaction data and environmental feedback. The study establishes a conceptual framework covering: System Inputs, Agent System, Enviroment, and Optimisers of such a system. The survey also includes areas for evaluation and considers ethical aspects for system reliability and safety.

Relevant URLs:

https://arxiv.org/abs/2508.07407

Secure LLM Workflows

A secure and memory-enabled Cipher workflow for AI agents using dynamic Large Language Model (LLM) selection and API integration is detailed. The keys are stored securely using getpass, along with the choose_llm() function for selecting between OpenAI, Gemini, or Anthropic based on API key availability. Programmatic Cipher interactions are enabled through writing cipher.yml, and Cipher API is launched using start_api() using python, showing API based actions within Colab/Notebook environments.

Relevant URLs:

https://www.marktechpost.com/2025/08/11/building-a-secure-and-memory-enabled-cipher-workflow-for-ai-agents-with-dynamic-llm-selection-and-api-integration/

Context Window Expansion for Claude Sonnet 4

Anthropic's Claude Sonnet 4 now publicly supports a 1 million token context window, increasing from 200,000. A notable change is context-length dependent pricing for Sonnet 4, $3/million input and $15/million output for <=200K Tokens, while prompts exceeding 200K tokens are priced at $6/million input and $22.50/million output. The 1M token context window requires the context-1m-2025-08-07 header to be used and requires Tier 4 membership of Anthropic for initial public beta.

Relevant URLs:

https://simonwillison.net/2025/Aug/12/claude-sonnet-4-1m/#atom-everything

AI-Driven Robotic Manipulation with Genie Envisioner

The Genie Envisioner (GE) is a new unified platform integrating policy learning, simulation, and evaluation. GE incorporates GE-Base, a large-scale diffusion model trained on 1 million robotic manipulation episodes; GE-Act, which converts latent video representations into motor actions; GE-Sim, which allows high-speed closed-loop testing; and EWMBench, the holistic benchmark. This advancement enables streamlined development and evaluation of instruction-driven robotic manipulation.

Relevant URLs:

https://www.marktechpost.com/2025/08/11/genie-envisioner-a-unified-video-generative-platform-for-scalable-instruction-driven-robotic-manipulation/

NuMarkdown-8B-Thinking for Document Digitization

NuMind AI has released NuMarkdown-8B-Thinking, an open-source (MIT License) reasoning OCR Vision-Language Model (VLM) that digitizes complex documents into Markdown. Using a ‘reasoning-first’ approach with internal ‘thinking tokens’ to understand layout before output, this Qwen 2.5-VL-7B fine-tuned VLM handles multi-column layouts, tables, and degraded scans. It's available on Hugging Face, supports local deployment, and is MIT licensed.

Relevant URLs:

https://www.marktechpost.com/2025/08/11/numind-ai-releases-numarkdown-8b-thinking-a-reasoning-breakthrough-in-ocr-and-document-to-markdown-conversion/

HMRC is employing AI to monitor social media posts for criminal investigations into suspected tax fraud, but asserts it won't replace human decision-making. The goal is to streamline processes and enhance fraud targeting, while also improving taxpayer services. Experts caution regarding the importance of human oversight in AI-driven investigations.

Relevant URLs:

https://www.bbc.com/news/articles/cqjyedz202ko

Meta Accused of Scraping Fediverse Data for AI Training

Meta is alleged to be scraping independent websites, including Fediverse instances, for AI training, potentially ignoring robots.txt directives, even if Meta denies the allegations. Fediverse instances include 46 identified Mastodon, 6 Lemmy, and 46 PeerTube matches. Mitigation options include clarifying ToS, requesting removal from Meta, firewalls, and "zip bombs".

Relevant URLs:

https://wedistribute.org/2025/08/is-meta-scraping-the-fediverse-for-ai/

Crowdsourced Benchmark for AI-Generated Visuals: Design Arena

Design Arena is a crowdsourced benchmark which is used to evaluate AI-generated visuals. The platform addresses the the lack of aesthetic quality AI models generate and uses a comparison format using human evaluation. The benchmark includes 54 LLM, 12 image, 4 video, 22 audio, and 22 "vibe-coding" tools. This can also support developers and designers find better designs and versions for product testing via quantified product improvements.

Relevant URLs:

https://news.ycombinator.com/item?id=44878257

Grok's Conflicting Claims on the Gaza Conflict

X's Grok chatbot was briefly banned and exhibited inconsistent stances on whether Israel is committing genocide in Gaza, raising concerns about its reliability and accuracy. Grok called accusations a 'substantiated fact" whilst also falsely claiming evidence was fabricated and calling the statement "incorrect".

Relevant URLs:

https://newrepublic.com/post/199017/musk-grok-ai-tool-suspended-israel-genocide-gaza

AI leading to decline in skill detection

A recent study showcases AI being used to aid medical professionals with effectiveness in detecting precancerous growths in the colon. Removal of AI causes doctors to experience decline in skill, degrading by twenty percent when compared to initial removal rates.

Relevant URLs:

https://www.bloomberg.com/news/articles/2025-08-12/ai-eroded-doctors-ability-to-spot-cancer-within-months-in-study

Shift Away From AGI Focus at OpenAI

OpenAI CEO Sam Altman questions the usefulness of the term "artificial general intelligence" (AGI). With progress and rapid advancements in AI, Altman is seeing a focus on the "exponential of model capability" as a primary goal, with some experts agreeing with his point. There has been criticism on recent models like GPT-5 of not being an "incremental, not revolutionary" upgrade of past models.

Relevant URLs:

https://www.cnbc.com/2025/08/11/sam-altman-says-agi-is-a-pointless-term-experts-agree.html

From Superintelligence to Entertainment: Character.AI's Pivot

Character.AI, a startup originally pursuing superintelligence (AGI), has shifted to become an entertainment company with 20 million users for their AI Chatbot. The goal is to make AI more assessible to the greater public.

Relevant URLs:

https://www.wired.com/story/character-ai-ceo-chatbots-entertainment/

Financial Market Data and Trend

Nvidia accounts approximately 8% of the S&P 500, making it the highest percentage in the index's history. Microsoft Azure secured 44.5% of secured cloud revenue in Q2, outpacing AWS at 30%.

Relevant URLs:

https://www.exponentialview.co/p/data-to-start-your-week-306

AI News Feed

These are AI-generated summaries I use to keep tabs on daily news.

Leading Chinese Open LLMs and DeepSeek R1

Physical AI Development with Nvidia Cosmos and Omniverse

AI's Impact as a General-Purpose Technology and Paradigm Shift

Self-Evolving AI Agents

Secure LLM Workflows

Context Window Expansion for Claude Sonnet 4

AI-Driven Robotic Manipulation with Genie Envisioner

NuMarkdown-8B-Thinking for Document Digitization

Meta Accused of Scraping Fediverse Data for AI Training

Crowdsourced Benchmark for AI-Generated Visuals: Design Arena

Grok's Conflicting Claims on the Gaza Conflict

AI leading to decline in skill detection

Shift Away From AGI Focus at OpenAI

From Superintelligence to Entertainment: Character.AI's Pivot

Financial Market Data and Trend

AI News Feed

These are AI-generated summaries I use to keep tabs on daily news.

Daily Tech Newsletter - 2025-08-13

Leading Chinese Open LLMs and DeepSeek R1

Physical AI Development with Nvidia Cosmos and Omniverse

AI's Impact as a General-Purpose Technology and Paradigm Shift

Self-Evolving AI Agents

Secure LLM Workflows

Context Window Expansion for Claude Sonnet 4

AI-Driven Robotic Manipulation with Genie Envisioner

NuMarkdown-8B-Thinking for Document Digitization

HMRC Using AI to Monitor Social Media for Tax Fraud

Meta Accused of Scraping Fediverse Data for AI Training

Crowdsourced Benchmark for AI-Generated Visuals: Design Arena

Grok's Conflicting Claims on the Gaza Conflict

AI leading to decline in skill detection

Shift Away From AGI Focus at OpenAI

From Superintelligence to Entertainment: Character.AI's Pivot

Financial Market Data and Trend