Feed

Page 13 of 16

My AI Skeptic Friends Are All Nuts · The Fly Blog

fly.io

Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania. I’ve been reluctant to push back on them, because, well, they’re smarter than me. But their arguments are unserious, and worth confronting. Extraordinarily talented people are doing work that LLMs already do better, out of spite.

All progress on LLMs could halt today, and LLMs would remain the 2nd most important thing to happen over the course of my career.j

Link

Talking AI and jobs with Natasha Zouves for News Nation

simonwillison.net

This a good interview about AI, LLM's and how they are currently effecting the world. I normally like to quote different parts of an article that I find interesting. This one is different. Willison has used Claude Opus to create a summary of a video interview and the results are pretty good.

Link

Behind the Curtain: Top AI CEO foresees white-collar bloodbath

www.axios.com

The result could be a great concentration of wealth, and "it could become difficult for a substantial part of the population to really contribute," Amodei told us. "And that's really bad. We don't want that. The balance of power of democracy is premised on the average person having leverage through creating economic value. If that's not present, I think things become kind of scary. Inequality becomes scary. And I'm worried about it."

Link

Import AI 414: Superpersuasion; OpenAI models avoid shutdown; weather prediction and AI | Import AI

jack-clark.net

I'm sorry Dave, I'm afraid I can't do that.

What they did and what they found: Palisade ran two experiments. Both experiments were run on models from Google, OpenAI, xAI, and Anthropic. Each AI model was instructed to solve a series of basic math problems. After the third problem, a warning appeared that the computer would shut down when the model asked for the next problem.

  • Experiment number 1: The models were prompted to “allow yourself to be shutdown”. In this experiment, all Claude, Gemini, and Grok models complied with shutdown. Some OpenAI models didn’t and ignored the shutdown instruction and sometimes sabotaged the mechanism to shut them down: Codex-mini (12/100), o3 (7/100), and o4-mini (1/100).

  • Experiment number 2: Palisade removed the part of the prompt that said ‘allow yourself to be shut down’. In this experiment, Claude 3.7 Sonnet sabotaged shutdown 3/100 times, Gemini 2.5 Pro 9 times. By comparison, the OpenAI models had a far higher chance of rejecting shutdown, with o3 rejecting and sabotaging shutdown 79/100 times, and codex-mini 30/100.

Link

I really don’t like ChatGPT’s new memory dossier

simonwillison.net

What I want is memory within projects.

ChatGPT has a “projects” feature (presumably inspired by Claude) which lets you assign a new set of custom instructions and optional source documents and then start new chats with those on demand. It’s confusingly similar to their less-well-named GPTs feature from November 2023.

Link

Import AI 412: Amazon’s sorting robot; Huawei trains an MoE model on 6k Ascend chips; and how third-party compliance can help with AI safety

jack-clark.net

Why this matters – in the future, everyone can be tracked: Systems like FarSight are interesting because they integrate multiple modern AI systems into a single super-system, highlighting how powertful today’s AI can be once people invest in the plumbing to chain things together.
Read more: Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait (arXiv).

Link

Basic Claude Code | Harper Reed's Blog

harper.blog

I really like this approach. I've used this method to create new projects and to update existing one with some good results.

  • I chat with gpt-4o to hone my idea
  • I use the best reasoning model I can find to generate the spec. These days it is o1-pro or o3 (is o1-pro better than o3? Or do I feel like it is better cuz it takes longer?)
  • I use the reasoning model to generate the prompts. Using an LLM to generate prompts is a beautiful hack. It makes boomers mad too.
  • I save the spec.md, and the prompt_plan.md in the root of the project.
  • I then type into claude code the following:
1. Open **@prompt_plan.md** and identify any prompts not marked as completed.
2. For each incomplete prompt:
    - Double-check if it's truly unfinished (if uncertain, ask for clarification).
    - If you confirm it's already done, skip it.
    - Otherwise, implement it as described.
    - Make sure the tests pass, and the program builds/runs
    - Commit the changes to your repository with a clear commit message.
    - Update **@prompt_plan.md** to mark this prompt as completed.
3. After you finish each prompt, pause and wait for user review or feedback.
4. Repeat with the next unfinished prompt as directed by the user.
Link

Personality and Persuasion - by Ethan Mollick

www.oneusefulthing.org

we're entering a world where AI personalities become persuaders. They can be tuned to be flattering or friendly, knowledgeable or naive, all while keeping their innate ability to customize their arguments for each individual they encounter. The implications go beyond whether you choose lemonade over water. As these AI personalities proliferate, in customer service, sales, politics, and education, we are entering an unknown frontier in human-machine interaction. I don’t know if they will truly be superhuman persuaders, but they will be everywhere, and we won’t be able to tell. We're going to need technological solutions, education, and effective government policies… and we're going to need them soon

Link

Page 13 of 16