
Page 15 of 16

Import AI 406: AI-driven software explosion; robot hands are still bad; better LLMs via pdb | Import AI

jack-clark.net

Researchers with Forethought, an AI research organization, think it’s likely that modern AI research will yield AI systems capable of building their successors. Forethought expects that at some point in the future it’ll be possible to build AI Systems for AI R&D Automation (ASARA).

Why this matters – LLMs are more powerful than we think, they just need the right tools: Systems like this are yet another example of the ‘capability overhang’ which surrounds us – you can make LLMs better merely by pairing them with the right tools and, these days, you don’t need to do any adaptation of the LLMs for those tools beyond some basic prompting. Put another way: if you paused all AI progress today, systems would continue to advance in capability for a while solely through the creation of better tools.
Read more: debug-gym: A Text-Based Environment for Interactive Debugging (arXiv).
Get the software here: debug-gym (Microsoft site).
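The tool-pairing idea above can be sketched as a simple loop: the model sees the debugger's output so far and replies with the next pdb command. This is a hypothetical sketch, not debug-gym's actual API; `ask_model` stands in for any LLM call and is stubbed with a scripted sequence here.

```python
def ask_model(transcript):
    """Stub for an LLM call: a real system would prompt the model with the
    debugging transcript so far. Here we return a canned pdb command sequence."""
    script = ["b buggy.py:12", "c", "p items", "q"]
    return script[len(transcript)]

def debug_loop(run_pdb_command, max_steps=8):
    """Feed each pdb command's output back to the model until it quits."""
    transcript = []
    for _ in range(max_steps):
        command = ask_model(transcript)
        output = run_pdb_command(command)
        transcript.append((command, output))
        if command == "q":
            break
    return transcript

# Usage with a fake debugger backend standing in for a real pdb session:
fake_pdb = lambda cmd: f"(Pdb) ran {cmd}"
steps = debug_loop(fake_pdb)
```

The point is that nothing in the loop is model-specific: the "tool" side is just a text channel, which is why basic prompting suffices.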

Link

MCP: The new USB-C for AI that’s bringing fierce rivals together

arstechnica.com

MCP has also rapidly begun to gain community support in recent months. For example, just browsing this list of over 300 open source servers shared on GitHub reveals growing interest in standardizing AI-to-tool connections. The collection spans diverse domains, including database connectors like PostgreSQL, MySQL, and vector databases; development tools that integrate with Git repositories and code editors; file system access for various storage platforms; knowledge retrieval systems for documents and websites; and specialized tools for finance, health care, and creative applications.

To make the connections behind the scenes between AI models and data sources, MCP uses a client-server model. An AI model (or its host application) acts as an MCP client that connects to one or more MCP servers. Each server provides access to a specific resource or capability, such as a database, search engine, or file system. When the AI needs information beyond its training data, it sends a request to the appropriate server, which performs the action and returns the result.
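The request-response flow described above rides on JSON-RPC 2.0, which MCP uses as its wire format. As a minimal sketch (the method name `tools/call` and the `name`/`arguments` params follow the MCP specification; the tool name and SQL here are made up for illustration):

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# The MCP client (the AI model's host application) sends this to a server,
# e.g. a database connector, over stdio or HTTP:
msg = make_tool_call(1, "query_database", {"sql": "SELECT 1"})
```

The server executes the named tool and returns a JSON-RPC response with the result, which the host feeds back into the model's context.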

Link

No elephants: Breakthroughs in image generation

www.oneusefulthing.org

Over the past two weeks, first Google and then OpenAI rolled out their multimodal image generation abilities. This is a big deal. Previously, when a Large Language Model AI generated an image, it wasn’t really the LLM doing the work. Instead, the AI would send a text prompt to a separate image generation tool and show you what came back. The AI creates the text prompt, but another, less intelligent system creates the image. For example, if prompted “show me a room with no elephants in it, make sure to annotate the image to show me why there are no possible elephants” the less intelligent image generation system would see the word elephant multiple times and add them to the picture. As a result, AI image generations were pretty mediocre with distorted text and random elements; sometimes fun, but rarely useful.

Link

Import AI 405: What if the timelines are correct? | Import AI

jack-clark.net

This article is full of all sorts of interesting information, from questions about LLM consciousness to security threats to LLM agents capable of doing months of work.

The paper is worth reading because it represents an earnest attempt by a thoughtful human to confront the impossibly large question we’ll need to deal with in the next decade or so – how conscious might LLMs be?

Individuals working with AI performed just as well as teams without AI, showing a 0.37 standard deviation improvement over the baseline. This suggests that AI effectively replicated the performance benefits of having a human teammate – one person with AI could match what previously required two-person collaboration.

“By automating complex tasks previously requiring human ingenuity and extensive effort, AI models can significantly lower the barriers to entry for malicious actors of all attack levels,” Google writes. “Our evaluations revealed that current AI cyber evaluations often overlook critical areas. While much attention is given to AI-enabled vulnerability exploitation and novel exploit development, our analysis highlights AI’s significant potential in under-researched phases like evasion, detection avoidance, obfuscation, and persistence. Specifically, AI’s ability to enhance these stages presents a substantial, yet often underestimated, threat.”

Significant and sustained growth: “We find that the 50% time horizon has been growing exponentially from 2019–2024 on our tasks,” METR writes. The analysis means METR thinks there’s a high chance AI systems will be able to tackle tasks that take a human a month (167 working hours) by 2030 – or potentially earlier, if a recent uptick in the trajectory due to the arrival of new reasoning models holds.
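The "month-long tasks by 2030" claim is a straightforward exponential extrapolation. As a back-of-the-envelope check, assuming a roughly 1-hour horizon in early 2025 and a roughly 7-month doubling time (assumed figures in the ballpark of what METR reports, not taken from the excerpt above):

```python
import math

def months_until(target_hours, current_hours=1.0, doubling_months=7.0):
    """Months of exponential growth needed for the 50% time horizon
    to grow from current_hours to target_hours."""
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months

# 167 working hours is about one human month of work.
months = months_until(167)
print(f"~{months:.0f} months, i.e. roughly {2025 + months / 12:.0f}")
```

Reaching 167 hours takes about 7.4 doublings, or roughly 52 months from early 2025, which lands around 2029 and is consistent with METR's "by 2030, or earlier" framing.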

Link

Revenge of the junior developer | Sourcegraph Blog

sourcegraph.com

I believe the AI-refusers regrettably have a lot invested in the status quo, which they think, with grievous mistakenness, equates to job security. They all tell themselves that the AI has yet to prove that it’s better than they are at performing X, Y, or Z, and therefore, it’s not ready yet.

But from where I’m sitting, they’re the ones who aren’t ready. I lay this all out in detail, my friends, so you can help yourselves.

Regardless of why the luddites aren’t adopting it, they have lost. Junior devs have the high ground, and the battle is now over. Not only are junior devs on average adopting AI faster, but junior devs are also – surprise! – cheaper. If companies are going to make cuts to pay for their devs to win with tokens, which devs do you think they’re gonna keep?

Link

Kagi is a better search engine than Google — but it costs $10 a month | The Verge

www.theverge.com

Using Kagi feels a lot like using Google a decade ago, and I mean that in a good way. You type in a search, and it returns a page full of links. It has image search, video search, maps, news, and even a podcast-specific tab I’ve found very useful. Search for something topical, and you’ll get a few links followed by a side-scrolling carousel of news stories. Search for a person, and Kagi virtually always starts with a short excerpt of their Wikipedia page.

Link

Import AI 404: Scaling laws for distributed training; misalignment predictions made real; and Alibaba’s good translation model | Import AI

jack-clark.net

We are not making dumb tools here – we are training synthetic minds. These synthetic minds have economic value which grows in proportion to their intelligence. The ‘reward system’ of the world is flowing resources into the building of smarter synthetic minds. As we make these things smarter, they will more and more display a propensity to think about themselves as distinct from us.

**Really powerful AI could wreck society by making governments too powerful:**
*…The problem with AGI is that it could make governments way better, which destroys freedom…*
Researchers with Texas A&M University and the Foundation for American Innovation have considered how powerful AI systems could alter the balance of power between citizens and government. Their takeaway isn’t very reassuring – powerful AI systems are highly likely to either a) create a “‘despotic Leviathan’ through enhanced state surveillance and control”, or b) foster an “‘absent Leviathan’ through the erosion of state legitimacy relative to AGI-empowered non-state actors”.

Link

Career advice in 2025. | Irrational Exuberance

lethain.com

I can’t give advice on what you should do, but if you’re finding this job market difficult, it’s certainly not personal. My sense is that’s basically the experience that everyone is having when searching for new roles right now. If you are in a role today that’s frustrating you, my advice is to try harder than usual to find a way to make it a rewarding experience, even if it’s not perfect. I also wouldn’t personally try to sit this cycle out unless you’re comfortable with a small risk that reentry is quite difficult: I think it’s more likely that the ecosystem is meaningfully different in five years than that it’s largely unchanged.

Link

AI and the Uncertain Future of Work

matthewbilyeu.com

Software is eating the world, but AI is eating software. The industry has so far witnessed a monotonically increasing demand for software: as abstraction layer after abstraction layer enabled more software to be created more easily, it seems not to have lessened the demand for applications or the workers who produce them. But that software over the years was not writing itself… The technological advancement of recent AI feels like a difference in kind, not just degree.

How do things look when AIs themselves run or mostly run companies? The most glaring downside would be the displacement of millions of human workers. Robbed of their livelihoods, where would these folks get the funds to buy the widgets being churned out by robots? The middle class would evaporate, leaving extreme inequality, with the few monstrously rich wielding armies of AIs, and the rest competing for the remaining physical jobs.

Link
