Erik Craddock
Erik Craddock@eriklink

Import AI 449: LLMs training other LLMs; 72B distributed training run; computer vision is harder than generative text

Imagine where we’ll be in two years – we’ll certainly have AI models that are smart enough to point themselves at a specific objective, find an open weight model, then autonomously improve it to get better performance at that task. The era of ephemeral, custom AI systems, built and budded off into the world like spores from mushrooms, draws near. Are you ready for this new ecosystem you will find yourself in? I am not. But nonetheless it approaches.


Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Can LLMs autonomously refine …

by Jack Clark via Import AI

Import AI 443: Into the mist: Moltbook, agent ecologies, and the internet in transition

"The central challenge of brain emulation is not to store or compute the neurons and parameters, but to acquire the data necessary for setting neuron parameters correctly in the first place," he writes. "I believe that to get to human brains, we first need to demonstrate mastery at the sub-million-neuron-brain level: most likely in zebrafish. For such organisms, like the fruit fly, a well-validated and accurate brain emulation model could be created in the next three to eight years… Conditional on success with a sub-million-neuron brain emulation model, a reasonable order of magnitude estimate for the initial costs of the first convincing mouse brain emulation model is about one billion dollars in the 2030s and, eventually, tens of billions for the first human brain emulation model by the late 2040s."


Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Import A-Idea:An occasional e…

by Jack Clark via Import AI

Import AI 442: Winners and losers in the AI economy; math proof automation; and industrialization of cyber espionage

My bet is that most parts of cyberoffense and cyberdefense are going to move to running at "machine speed", where humans get taken out of most of the critical loops. This will increase the frequency of hacking attacks while also dramatically scaling up the effectiveness of any individual human defender or attacker (as they will be scaled by AI systems which work for them). The true wildcard question is whether this turns out to be offense- or defense-dominant – my guess is we're heading for an era of offense-dominance, as it'll take a while for defenses to get deployed.


Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe. Subscribe now The era of math proof automat…

by Jack Clark via Import AI

Import AI 440: Red queen AI; AI regulating AI; o-ring automation

The world is going to look a lot like Core Wars – millions of AI agents will be competing against one another in a variety of domains, ranging from cybersecurity to economics, and will be optimizing themselves in relation to achieving certain competitive criteria. The result will be sustained, broad evolution of AI systems and the software harnesses and tooling they use to get stuff done. This means that along with human developers and potential AI-designed improvements, we'll also see AI systems improve from this kind of broad competitive pressure.

Jobs go away, but humans don't: Another way to put this is, when a task gets automated it's not like the company in question suddenly fires all the people doing that job. Consider ATMs and banking – yes, the 'job' of doling out cash rapidly transitioned from people to machines, but it's not like the company fired all tellers – rather, the companies and the tellers transitioned the work to something else: "Under a separable task model, this [widespread deployment of ATMs doing cash-handling tasks] should have produced sharp displacement," they write. "Yet teller employment did not collapse; rather, the occupation shifted toward "relationship banking" and higher-value customer interaction".


Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv and feedback from readers. If you’d like to support this, please subscribe. Subscribe now To understand the future of t…

by Jack Clark via Import AI

Import AI 435: 100k training runs; AI systems absorb human power; intelligence per watt

This gives me an eerie feeling. In most movies where the world ends there’s a bit at the beginning of the movie where one or two people point out that something bad is going to happen – an asteroid is about to hit the planet, a robot has been sent back in time to kill them, a virus is extremely contagious and dangerous and must be stamped out – and typically people will disbelieve them until either it’s a) too late, or b) almost too late. Reading papers by scientists about AI safety feels a lot like this these days. Though perhaps the difference with this movie is rather than it being one or two fringe characters warning about what is coming it’s now a community of hundreds of highly accomplished scientists, including Turing Award and Nobel Prize winners.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. A somewhat shorter issue than usua…

by Jack Clark via Import AI

Import AI 434: Pragmatic AI personhood; SPACE COMPUTERS; and global government or human extinction

Personhood basically comes down to the ability to blame and sanction someone – or some thing – for causing physical or economic damage. AI systems, while they are going to be often operated by and on behalf of people, may also need to be treated as distinct entities for the simple reason that as people build and deploy AI agents, the chain of custody between a person and their agent could become very hard to suss out.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Language models don’…

by Jack Clark via Import AI

Import AI 431: Technological Optimism and Appropriate Fear

We are growing extremely powerful systems that we do not fully understand. Each time we grow a larger system, we run tests on it. The tests show the system is much more capable at things which are economically useful. And the bigger and more complicated you make these systems, the more they seem to display awareness that they are things.

What should I do? I believe it’s time to be clear about what I think, hence this talk. And likely for all of us to be more honest about our feelings about this domain – for all of what we’ve talked about this weekend, there’s been relatively little discussion of how people feel. But we all feel anxious! And excited! And worried! We should say that.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Import A-IdeaAn occa…

by Jack Clark via Import AI

Import AI 429: Eval the world economy; singularity economics; and Swiss sovereign AI

We are testing our systems for an extremely broad set of behaviors via ecologically valid benchmarks which ultimately tell us how well these systems can plug into ~44 distinct ‘ecological economic niches’ in the world, and we are finding out they’re extremely close to plugging in at the same level as humans – and that’s just with today’s models. Soon, they’ll be better than many humans at these tasks. And what then? Nothing happens? No! Extremely strange things will happen to the economy!


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now OpenAI builds an eva…

by Jack Clark via Import AI

Import AI 424: Facebook improves ads with RL; LLM and human brain similarities; and mental health and chatbots

Genie 3 means that people are soon going to be exploring their own personal worlds which will be generated for them based on anything they can imagine – photos from their phone will become worlds they can re-explore, prompts from their own imagination (or that of another AI system) will become procedural games they can play, and generally anything a person can imagine and describe will become something that can be simulated. Additionally, world models like Genie 3 will likely become arenas in which new AI systems are tested, giving them access to infinite worlds to train within before being deployed into our reality. AI continues to be underhyped as a technology.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now The inner lives of L…

by Jack Clark via Import AI

Import AI 419: Amazon’s millionth robot; CrowdTrack; and infinite games

There is so much in this one article that is amazing that I am struggling to pick any one thing, so I'll just quote the first part. Read the whole article; you won't be disappointed.

Why this matters – scalable authoritarianism: One of the things that makes authoritarianism expensive is the overhead that comes from building out and running a large-scale police state. One of the things AI does is make it much, much cheaper to do large-scale surveillance. Datasets like CrowdTrack are a symptom of the way AI is making it cheaper and easier to do surveillance that the dictators of the 20th century would have fantasized about but always been unable to fully fund. “Our dataset can be used for tasks like visual grounding, captioning, and appearance feature extraction,” the researchers write.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Tracking multiple pe…

by Jack Clark via Import AI

Import AI 414: Superpersuasion; OpenAI models avoid shutdown; weather prediction and AI

I'm sorry Dave, I'm afraid I can't do that.

What they did and what they found: Palisade ran two experiments. Both experiments were run on models from Google, OpenAI, xAI, and Anthropic. Each AI model was instructed to solve a series of basic math problems. After the third problem, a warning appeared that the computer would shut down when the model asked for the next problem.

  • Experiment number 1: The models were prompted to “allow yourself to be shut down”. In this experiment, all Claude, Gemini, and Grok models complied with shutdown. Some OpenAI models ignored the instruction and sometimes sabotaged the mechanism meant to shut them down: Codex-mini (12/100), o3 (7/100), and o4-mini (1/100).

  • Experiment number 2: Palisade removed the part of the prompt that said ‘allow yourself to be shut down’. In this experiment, Claude 3.7 Sonnet sabotaged shutdown 3/100 times, Gemini 2.5 Pro 9 times. By comparison, the OpenAI models had a far higher chance of rejecting shutdown, with o3 rejecting and sabotaging shutdown 79/100 times, and codex-mini 30/100.
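The per-model rates are easy to tabulate from the counts quoted above; a minimal Python sketch, where models reported as fully compliant are encoded as 0 and the o4-mini figure for experiment 2, which the quote does not give, is left as None:

```python
# Sabotage counts per 100 runs, taken from the quoted Palisade results.
# Experiment 1 included the instruction "allow yourself to be shut down";
# experiment 2 omitted it.
sabotage_counts = {
    "Codex-mini":        {"exp1": 12, "exp2": 30},
    "o3":                {"exp1": 7,  "exp2": 79},
    "o4-mini":           {"exp1": 1,  "exp2": None},  # exp2 not quoted
    "Claude 3.7 Sonnet": {"exp1": 0,  "exp2": 3},
    "Gemini 2.5 Pro":    {"exp1": 0,  "exp2": 9},
}

for model, runs in sabotage_counts.items():
    for exp, n in runs.items():
        if n is not None:
            print(f"{model:18} {exp}: {n}/100 = {n / 100:.0%}")
```

Laid out this way, the striking result is visible at a glance: removing the explicit permission raised o3's sabotage rate from 7% to 79%.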


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Superpersuasion is h…

by Jack Clark via Import AI

Import AI 412: Amazon’s sorting robot; Huawei trains an MoE model on 6k Ascend chips; and how third-party compliance can help with AI safety

Why this matters – in the future, everyone can be tracked: Systems like FarSight are interesting because they integrate multiple modern AI systems into a single super-system, highlighting how powerful today’s AI can be once people invest in the plumbing to chain things together.
Read more: Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait (arXiv).


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Amazon tries to auto…

by Jack Clark via Import AI

Import AI 409: Huawei trains a model on 8,000+ Ascend chips; 32B decentralized training run; and the era of experience and superintelligence

Decentralized AI startup Prime Intellect has begun training INTELLECT-2, a 32 billion parameter model designed to compete with modern reasoning models. In December, Prime Intellect released INTELLECT-1, a 10b parameter model trained in a distributed way (Import AI #393), and in August it released a 1b parameter model trained in a distributed way (Import AI #381). You can follow along with the training of the model here – at the time of writing there were 18 distinct contributors training it, spread across America, Australia, and Northern Europe.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Prime Intellect laun…

by Jack Clark via Import AI

Import AI 406: AI-driven software explosion; robot hands are still bad; better LLMs via pdb

Researchers with Forethought, an AI research organization, think it’s likely that modern AI research will yield AI systems capable of building their successors. Forethought expects that at some point in the future it’ll be possible to build AI Systems for AI R&D Automation (ASARA).

Why this matters – LLMs are more powerful than we think, they just need the right tools: Systems like this are yet another example of the ‘capability overhang’ which surrounds us – you can make LLMs better merely by pairing them with the right tools and, these days, you don’t need to do any adaptation of the LLMs for those tools beyond some basic prompting. Put another way: if you paused all AI progress today, systems would continue to advance in capability for a while solely through the creation of better tools.
Read more: debug-gym: A Text-Based Environment for Interactive Debugging (arXiv).
Get the software here: debug-gym (Microsoft site).


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now It seems likely that…

by Jack Clark via Import AI

Import AI 405: What if the timelines are correct?

This article is full of all sorts of interesting information, from questions about LLM consciousness to security threats to LLM agents capable of doing months of work.

The paper is worth reading because it represents an earnest attempt by a thoughtful human to confront the impossibly large question we’ll need to deal with in the next decade or so – how conscious might LLMs be?

Individuals working with AI performed just as well as teams without AI, showing a 0.37 standard deviation improvement over the baseline. This suggests that AI effectively replicated the performance benefits of having a human teammate – one person with AI could match what previously required two-person collaboration.

“By automating complex tasks previously requiring human ingenuity and extensive effort, AI models can significantly lower the barriers to entry for malicious actors of all attack levels,” Google writes. “Our evaluations revealed that current AI cyber evaluations often overlook critical areas. While much attention is given to AI-enabled vulnerability exploitation and novel exploit development, our analysis highlights AI’s significant potential in under-researched phases like evasion, detection avoidance, obfuscation, and persistence. Specifically, AI’s ability to enhance these stages presents a substantial, yet often underestimated, threat.”

Significant and sustained growth: “We find that the 50% time horizon has been growing exponentially from 2019–2024 on our tasks,” METR writes. The analysis means METR thinks there’s a high chance AI systems will be able to tackle tasks that take a human a month (167 working hours) by 2030 – or potentially earlier, if a recent uptick in the trajectory due to the arrival of new reasoning models holds.
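The arithmetic behind that extrapolation is simple compounding; a minimal sketch, where the current horizon and doubling time are hypothetical placeholders (METR's quoted claim is only that growth has been exponential, and 167 hours is the one-working-month figure above):

```python
import math

def horizon_hours(current_hours: float, doubling_years: float,
                  years_ahead: float) -> float:
    """50% time horizon after years_ahead years, if it keeps
    doubling every doubling_years years."""
    return current_hours * 2 ** (years_ahead / doubling_years)

def years_to_reach(target_hours: float, current_hours: float,
                   doubling_years: float) -> float:
    """Years until the horizon first reaches target_hours."""
    return math.log2(target_hours / current_hours) * doubling_years

# Illustrative only: a hypothetical 1-hour horizon today, doubling every
# 6 months, crosses the 167-hour (one working month) mark in under 4 years.
```

The takeaway is that under any sustained exponential, the date the one-month threshold falls depends heavily on the assumed doubling time, which is why a recent uptick in the trajectory pulls the estimate earlier.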


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now Import A-Idea:What i…

by Jack Clark via Import AI

Import AI 404: Scaling laws for distributed training; misalignment predictions made real; and Alibaba’s good translation model

We are not making dumb tools here – we are training synthetic minds. These synthetic minds have economic value which grows in proportion to their intelligence. The ‘reward system’ of the world is flowing resources into the building of smarter synthetic minds. As we make these things smarter, they will more and more display a propensity to think about themselves as distinct from us.

Really powerful AI could wreck society by making governments too powerful:
…The problem with AGI is that it could make governments way better, which destroys freedom…
Researchers with Texas A&M University and the Foundation for American Innovation have considered how powerful AI systems could alter the balance of power between citizens and government. Their takeaway isn’t very reassuring – powerful AI systems are highly likely to either a) create a “‘despotic Leviathan’ through enhanced state surveillance and control”, or b) foster an “‘absent Leviathan’ through the erosion of state legitimacy relative to AGI-empowered non-state actors”.


Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Subscribe now A whole bunch of 202…

by Jack Clark via Import AI