Feed

Page 13 of 15

ASI existential risk: reconsidering alignment as a goal

michaelnotebook.com

reality doesn't care about human psychology. When alignment to anticipated power will lead to unhealthy outcomes, a thriving civilization requires people willing to act in defiance of the zeitgeist, not merely follow the incentive gradient of immediate rewards. I believe the arguments for xrisk are good enough that there is a moral obligation for anyone working on AGI to investigate this risk with deep seriousness, and to act even if it means giving up their own short-term interests.

Link

On Jagged AGI: o3, Gemini 2.5, and everything after

www.oneusefulthing.org

In some tasks, AI is unreliable. In others, it is superhuman. You could, of course, say the same thing about calculators, but it is also clear that AI is different. It is already demonstrating general capabilities and performing a wide range of intellectual tasks, including those that it is not specifically trained on. Does that mean that o3 and Gemini 2.5 are AGI? Given the definitional problems, I really don’t know, but I do think they can be credibly seen as a form of “Jagged AGI” - superhuman in enough areas to result in real changes to how we work and live, but also unreliable enough that human expertise is often needed to figure out where AI works and where it doesn’t. Of course, models are likely to become smarter, and a good enough Jagged AGI may still beat humans at every task, including in ones the AI is weak in.

Link

Model Context Protocol has prompt injection security problems

simonwillison.net

As more people start hacking around with implementations of MCP (the Model Context Protocol, a new standard for making tools available to LLM-powered systems) the security implications of tools built on that protocol are starting to come into focus.

Link

Import AI 406: AI-driven software explosion; robot hands are still bad; better LLMs via pdb | Import AI

jack-clark.net

Researchers with Forethought, an AI research organization, think it’s likely that modern AI research will yield AI systems capable of building their successors. Forethought expects that at some point in the future it’ll be possible to build AI Systems for AI R&D Automation (ASARA).

Why this matters – LLMs are more powerful than we think, they just need the right tools: Systems like this are yet another example of the ‘capability overhang’ which surrounds us – you can make LLMs better merely by pairing them with the right tools and, these days, you don’t need to do any adaption of the LLMs for those tools beyond some basic prompting. Put another way: if you paused all AI progress today, systems would continue to advance in capability for a while solely through the creation of better tools.
Read more: debug-gym: A Text-Based Environment for Interactive Debugging (arXiv).
Get the software here: debug-gym (Microsoft site).

Link

MCP: The new USB-C for AI that’s bringing fierce rivals together

arstechnica.com

MCP has also rapidly begun to gain community support in recent months. For example, just browsing this list of over 300 open source servers shared on GitHub reveals growing interest in standardizing AI-to-tool connections. The collection spans diverse domains, including database connectors like PostgreSQL, MySQL, and vector databases; development tools that integrate with Git repositories and code editors; file system access for various storage platforms; knowledge retrieval systems for documents and websites; and specialized tools for finance, health care, and creative applications.

To make the connections behind the scenes between AI models and data sources, MCP uses a client-server model. An AI model (or its host application) acts as an MCP client that connects to one or more MCP servers. Each server provides access to a specific resource or capability, such as a database, search engine, or file system. When the AI needs information beyond its training data, it sends a request to the appropriate server, which performs the action and returns the result.

Link

No elephants: Breakthroughs in image generation

www.oneusefulthing.org

Over the past two weeks, first Google and then OpenAI rolled out their multimodal image generation abilities. This is a big deal. Previously, when a Large Language Model AI generated an image, it wasn’t really the LLM doing the work. Instead, the AI would send a text prompt to a separate image generation tool and show you what came back. The AI creates the text prompt, but another, less intelligent system creates the image. For example, if prompted “show me a room with no elephants in it, make sure to annotate the image to show me why there are no possible elephants” the less intelligent image generation system would see the word elephant multiple times and add them to the picture. As a result, AI image generations were pretty mediocre with distorted text and random elements; sometimes fun, but rarely useful.

Link

Import AI 405: What if the timelines are correct? | Import AI

jack-clark.net

This article is full of all sorts of interesting information from questions about LLM consciousness to security threats to LLM agents with the capability of doing months of work.

The paper is worth reading because it represents an earnest attempt by a thoughtful human to confront the impossibly large question we’ll need to deal with in the next decade or so – how conscious might LLMs be?

Individuals working with AI performed just as well as teams without AI, showing a 0.37 standard deviation improvement over the baseline. This suggests that AI effectively replicated the performance benefits of having a human teammate – one person with AI could match what previously required two-person collaboration.

“By automating complex tasks previously requiring human ingenuity and extensive effort, AI models can significantly lower the barriers to entry for malicious actors of all attack levels,” Google writes. “”Our evaluations revealed that current AI cyber evaluations often overlook critical areas. While much attention is given to AI-enabled vulnerability exploitation and novel exploit development, our analysis highlights AI’s significant potential in under-researched phases like evasion, detection avoidance, obfuscation, and persistence. Specifically, AI’s ability to enhance these stages presents a substantial, yet often underestimated, threat.”

Significant and sustained growth: “We find that the 50% time horizon has been growing exponentially from 2019–2024 on our tasks,” METR writes. The analysis means METR thinks there’s a high chance AI systems will be able to tackle tasks that take a human a month (167 working hours) by 2030 – or potentially earlier, if a recent uptick in the trajectory due to the arrival of new reasoning models holds.

Link

Revenge of the junior developer | Sourcegraph Blog

sourcegraph.com

I believe the AI-refusers regrettably have a lot invested in the status quo, which they think, with grievous mistakenness, equates to job security. They all tell themselves that the AI has yet to prove that it’s better than they are at performing X, Y, or Z, and therefore, it’s not ready yet.

But from where I’m sitting, they’re the ones who aren’t ready. I lay this all out in detail, my friends, so you can help yourselves.

Regardless of why the luddites aren’t adopting it, they have lost. Junior devs have the high ground, and the battle is now over. Not only are junior devs on average adopting AI faster, but junior devs are also – surprise! – cheaper. If companies are going to make cuts to pay for their devs to win with tokens, which devs do you think they’re gonna keep?

Link

Page 13 of 15