Episode 9

Published on:

11th Mar 2026

Vibe Coding to Agentic Engineering: When Everyone Can Build, What Matters Is What You Build

The Emergent Podcast — Episode 9

"AI is now awake. And it's a big contrast to even two, three months ago." — Nick Baguley

Listen on: Apple Podcasts · Spotify · YouTube · RSS

Episode Duration: ~1 hr 40 min | Published: 2026 | Season 1, Episode 9

🎙️ Episode Summary

One tweet changed a word. The word changed an industry. The industry is changing what it means to build.

In February 2025, Andrej Karpathy — co-founder of OpenAI and former head of AI at Tesla — published a single post coining the term "vibe coding": describe what you want in plain English, accept all AI-generated code without reading the diffs, and just… vibe. Twelve months later, it became the Collins Dictionary Word of the Year, 92% of U.S. developers use AI coding tools daily, 41% of all code is AI-generated — and Karpathy himself has already declared it passé, rebranding the practice as "agentic engineering."

In Episode 9, Justin Harnish and Nick Baguley dig into what really happened in that extraordinary year. Both hosts share their personal workflows and real projects — including Justin's intermittent fasting app, his vision of a personal "digital brain" with AI-queryable embeddings, and Nick's AI-native marketplace designed for both human and agent users. They navigate the empirical gut-punch of the METR study (developers are actually 19% slower on mature codebases using AI), the existential labor market questions (traditional programmer roles down 27.5% since ChatGPT's launch), and the philosophical territory that has been the Emergent Podcast's throughline since Episode 1: when code becomes a commodity, what becomes scarce?

Their answer: responsible agency — the judgment to decide what should be built, for whom, and with what values. That, they argue, is the skill that neither automation nor benchmarks can yet replicate.

📚 Resources & Reading List

Every link mentioned or referenced in this episode. Organized by theme for your exploration.

🔑 The Origin & The Debate (Required Reading)

  1. Andrej Karpathy's Original "Vibe Coding" Tweet (Feb 2, 2025)
     The tweet that launched the year. Karpathy describes accepting all AI code without reading diffs, pasting errors back without comment, and letting the codebase grow beyond comprehension. Note the caveat he included that the industry largely ignored: "not too bad for throwaway weekend projects."
  2. Karpathy's 2025 LLM Year in Review — bearblog.dev
     His retrospective on vibe coding's arc from shower-thought tweet to Collins Dictionary Word of the Year. Key insight: "Code is suddenly free, ephemeral, malleable, discardable after single use." He also identifies Claude Code as the first convincing LLM agent.
  3. Karpathy on "Agentic Engineering" (Feb 2026) — The New Stack
     One year after coining vibe coding, Karpathy declares it passé. His new frame — agentic engineering — emphasizes that professionals orchestrate AI agents 99% of the time, with zero compromise on software quality. The rebrand is the narrative bookend of this episode.
  4. Simon Willison — "Not All AI-Assisted Programming Is Vibe Coding" (Mar 2025) — simonwillison.net
     The essential distinction: "If an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding — that's using an LLM as a very fast typist." Also contains Willison's generous vision: "Everyone deserves the ability to automate tedious tasks."
  5. METR Study: AI Makes Experienced Devs 19% Slower (Jul 2025) — metr.org
     The empirical gut-punch of the episode. 16 experienced open-source developers, 246 real-world tasks. They believed AI made them 20% faster; they were actually 19% slower on their own mature codebases. Full paper: arxiv.org/abs/2507.09089
  6. Vibe Coding — Wikipedia
     Surprisingly rigorous. Tracks the full timeline, Lovable's 170 vulnerable apps, CodeRabbit's finding that AI code has 1.7× more major issues, Y Combinator stats (25% of W25 startups are 95% AI-coded), and the "vibe coding hangover" reported by Fast Company.

📖 Supplemental: The Deeper Cuts

  1. Scott H. Young — "Is Vibe Coding the Future of Skilled Work?"
     The variance argument: vibe coding may make software both much worse and much better simultaneously. Also argues that conceptual knowledge becomes more, not less, important when AI writes the code. A crucial counterweight to pure optimism.
  2. IBM — "What Is Vibe Coding?"
     Enterprise-oriented overview. Useful on the agile alignment: vibe coding fits fast-prototyping and iterative development. Contains the key qualifier Nick and Justin both echo: "AI generates code, but creativity, goal alignment, and out-of-the-box thinking remain uniquely human."
  3. Google Cloud — "Vibe Coding Explained: Tools and Guides"
     Practical tool comparison from Google's perspective — AI Studio, Firebase Studio, Gemini Code Assist. Useful for understanding which tool fits which use case.
  4. Software Engineering Job Market Outlook for 2026 — Final Round AI
     Data from Indeed/FRED and BLS projections. The key line: "In 2026, simply learning how to write code won't be enough. What really matters is understanding how code works."
  5. Top Vibe Coding Statistics & Trends [2026] — Second Talent
     The stat goldmine: 92% of US devs use AI daily, 41% of code is AI-generated, 74% report increased productivity, 63% of vibe coding users are non-developers, $4.7B market projected to reach $12.3B by 2027.
  6. How AI Vibe Coding Is Destroying Junior Developers' Careers — Final Round AI
     The counterpoint to the democratization narrative. Software dev job openings down 70%. The "new tutorial hell": learning without learning.
  7. Best AI Code Editor: Cursor vs Windsurf vs Replit — AIMultiple
     Head-to-head benchmarks of Claude Code, Cline, Cursor, Windsurf, and Replit Agent across API development and app-building tasks.
  8. 10 Claude Code Alternatives for AI-Powered Coding — DigitalOcean
     Solid comparison of the full 2026 AI coding landscape: Claude Code, Gemini CLI, Cursor, Replit, Windsurf, GitHub Copilot, Aider, and more.

📘 Books Referenced

  1. David Chalmers — Reality+: Virtual Worlds and the Philosophy of Mind — Justin's reference point for the holographic/digital substrate of reality; the "redness of red" and the hard problem of consciousness.
     🔗 Publisher page
  2. Brian Christian — The Alignment Problem (revisited from Episode 4) — When code writes itself, alignment between human intent and machine output becomes the core individual skill, not just a civilizational concern.
     🔗 brianchristian.org
  3. Eliezer Yudkowsky — If Anyone Builds It, Everyone Dies — Referenced in the consciousness/alignment close: the parable of the alien observer and the selfish gene's 200,000-year objective function vs. human contraception and saccharin.

🎙️ Creators & Thinkers Mentioned

  1. Andrej Karpathy — Co-founder of OpenAI, former Tesla AI head, coined "vibe coding," now advocating "agentic engineering"
  2. Simon Willison — Django co-creator; the clearest thinker on the vibe coding/AI-assisted programming distinction
  3. Nate B. Jones — Former head of Amazon product; YouTube + Substack on AI's labor market implications. Justin credits him for shifting his own optimism.
  4. Demis Hassabis — DeepMind CEO, AlphaFold creator, Nobel laureate in chemistry: "First we solve intelligence, then we solve everything else."
  5. Ray Kurzweil — Singularity theorist; the accelerating model-capability doubling time (now ~7 months) maps onto his predictions.
  6. Eliezer Yudkowsky — AI safety researcher; the "selfish gene vs. consciousness" parable used in the closing alignment argument.
  7. David Chalmers — Philosopher of mind; the hard problem and Mary's Room as frameworks for why alignment requires more than an objective function.

💡 Key Ideas From This Episode

Concepts worth carrying into your week:

The Three Stages of AI Coding Consciousness (Nick's framework)

Deep REM dreaming (hallucinating LLMs, the GPT-3.5 era) → lucid dreaming (vibe coding, 2025) → fully awake (agentic engineering, 2026). The metaphor does real work: it explains why the same underlying technology feels categorically different at each stage.

"Responsible Agency" as the New Scarce Resource (Nick's closing argument)

When everyone can generate code, video, audio, and content, what can't be automated is the choice of what to build, for whom, and to what standard of taste. Judgment, systems thinking, and the willingness to exercise agency — these are the non-fungible skills.

The PRD as Demo (Both hosts)

A product requirement document is no longer a written specification — it's a working prototype. "The PRD today should be a full-blown app. Here's my demo; this is what acceptance criteria looks like. Now go make this production." The vibe-coded demo becomes the spec.

The METR Paradox

Developers believe AI makes them ~20% faster. Empirically, they are 19% slower on mature codebases. Possible causes: context-switching overhead, review burden, the seductive illusion of speed when tokens flow fast. The lesson isn't "AI doesn't help" — it's that measurement must catch up to method.
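The size of that perception gap is worth making concrete. A minimal arithmetic sketch, assuming "20% faster" means task time scaled by 0.80 and "19% slower" means task time scaled by 1.19, with a purely hypothetical 10-hour baseline task:

```python
# Quick arithmetic on the METR numbers. Assumptions: "20% faster" means task
# time scales by 0.80, "19% slower" means it scales by 1.19; the 10-hour
# baseline is hypothetical, just to make the gap concrete.

baseline_hours = 10.0

believed = baseline_hours * 0.80   # what developers thought AI did: 8.0 h
measured = baseline_hours * 1.19   # what METR measured: 11.9 h

# Perception gap as a fraction of the real (measured) time:
gap = (measured - believed) / measured
print(f"believed {believed:.1f} h, measured {measured:.1f} h, gap {gap:.0%}")
```

Under those assumptions, developers misjudged their own task time by roughly a third, which is why self-reported productivity numbers alone are a poor measure of these tools.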

"The Experience Is the Point" (Justin's closing)

Even as models approach inductive reasoning and potentially displace the need for syntax-literate humans, Justin argues consciousness — the felt quality of experience — remains irreducibly important for alignment. Mary in the black-and-white room knows everything about color and still learns something when she sees red for the first time. That remainder is what makes alignment a hard problem, not just a technical one.

Sonnet 4.6 as "Staff Engineer" (Nick)

GPT-4 era → junior developer. GPT-5 era → mid-level. Claude Sonnet 4.6 + the right tooling → staff/principal engineer. With agentic harnesses, you're now talking about an engineering organization, not an assistant.

🔥 Quotable Moments

"I don't code. I've taken coding classes. I've got a technical degree in chemical engineering. Fast forward to vibe coding: I'm losing sleep over not being in front of a computer."
Justin Harnish
"It feels a little bit like spending your life trying to become a bodybuilder, and then you show up for the competition and realize the job is to push feathers around."
Nick Baguley
"Claude Code was written in two weeks by four engineers. 90% of it was written by Anthropic agents working on that codebase."
Justin Harnish
"When everybody can generate code, when they can generate videos and images and audio — the real scarce resource becomes responsible agency."
Nick Baguley
"The universe deserves to be experienced. It is the best part of it. Even with all of this fun — the fact that it is like something to be in this life is the best part."
Justin Harnish
"A markdown file shot a $220 billion hole — the SaaS apocalypse — into the legal research and much of the rest of SaaS."
Justin Harnish
"If I could go back two years ago and have access to the tools I use today, I could do what a thousand engineers were doing at the time. It's like taking an iPhone back to the 1800s."
Nick Baguley

Subscribe: Apple Podcasts · Spotify · YouTube · RSS

Contact: justinaharnish.com

The Emergent Podcast explores the Age of Inflection in Intelligence — tracing how new systems of thought, technology, economics, and culture emerge from the moment we are living through. New episodes released regularly.

© The Emergent Podcast | justinaharnish.com

Transcript

Justin

All right, so away we go. So welcome back, Nick. Good to have you on. Nick Baguley here with me, Justin Harnish, as always, for the Emergent Podcast. Excited to be sitting across from you today. And we're going to talk a bit about vibe coding, or agentic engineering, as it likes to be called in more formal circles. One of the founders of OpenAI, Andrej Karpathy, also former head of AI at Tesla, coined the term last February, and it was actually the Collins Dictionary word of the year: vibe coding. Love the term, you know, it just gives it that little bit of extra fun that folks like myself are really getting into, and I think both of us are really getting into. So, you know, the thing that vibe coding, agentic engineering, is, is really deceptively simple at its core: you describe what you want to do in plain English and the AI writes the code. The AI commits the code to Git. The AI bug-fixes or overcomes obstacles in the way. And you're doing this all in a conversation with a machine that understands the syntax, understands the architecture, understands the context of what you want to do, and is supporting you in building this software product, personal software that you can use on your machine. And so you don't debug, you don't read the diffs, you just vibe.

Nick

Yeah, absolutely. And it's fun to hear and think about it that way. And I love the term as well. And I've followed Andrej Karpathy for years, even back when he was at OpenAI, and then later as he went to Tesla, and I've continued to follow that, as well as a lot of the stuff that he did when he was at Berkeley. Really a great thought leader. As I've been thinking about this, though, and as I try to go into what has really happened with vibe coding, for me one of the things that struck me, actually while I was dreaming, is that really these LLMs last night, meaning 2025, were dreaming. And before vibe coding really existed, the amount of hallucinations, everything that we saw, really was very much like a deep dream, where you're in that REM space, really into the deep REM piece, where different components of your consciousness or other things are being brought into scope, really kind of important information, but it's more like you're processing the information than actually understanding it, or piecing those things together cohesively in a way that really made a lot of sense. And I remember early on, working with, you know, GPT-3.5 or even before that, how some of the things that these models provided really just made no sense at all. In fact, I remember going back to Wrench AI, one of the startups that I worked on back in like 2017, when we were trying to create something we called Pinocchio, because we thought one day this would be a real boy. And we were able to generate text, but more often than not, it was gibberish. And it took us a long time to get that to even provide core information. And most of that was really done through coding; it really wasn't done in a way that we could actually generate with deep learning or machine learning at the time. Natural language processing was getting better and better, but natural language understanding was really in its nascency.
Fast forward to really this vibe coding piece, and Andrej Karpathy, as he was talking about that last February, so a year ago now, that particular term really took the world by fire. It went viral, as you talked about, and it became really that state for me of lucid dreaming, where now you have a little bit more control and you kind of know what's going on, and you can kind of see what's happening within the code or within the outputs that you're providing. But at this point, it's really pretty hyper-focused on things that maybe you're used to seeing. There are a lot of other components that are more similar to what you might see if you go into, like, Jungian psychology or other pieces, where these are repetitive dreams. These are things that maybe the models are getting really good at. But again, you have a little bit of control with them, but maybe not enough to really do solid, solid work. So as Andrej has now shifted, or now as we have moved into this new year, and Andrej has started calling it agentic engineering, I think with the invention, and we'll talk about some of these details, of OpenClaw, NanoClaw, Enterprise Claw, and even into the new harnesses from OpenAI and other technologies, AI is now awake. And it's a big contrast to even two, three months ago. So let's dive in.

Justin

Yeah, exactly. I mean, the world changed again, and it feels like another ChatGPT moment, to be sure. For those who are still unfamiliar, haven't dove in, just a little bit on me: I don't code. I've taken coding classes. I've got a technical degree in chemical engineering. I managed to take the classes that I needed to learn how to script a few things. But things like Mathcad were much more useful in my degree program, and they were more math: you write out the equations that you learned in your books and you set those systems to solve them. And there was a ton of programming, obviously, underneath that. Fast forward to beyond my academic career. And, you know, I thought, boy, it's just magical to be able to create something from code. And so I would buy the dummies' guide to the latest programming language and get stuck in the first few pages, with the syntax just overwhelming. Fast forward to learning about data science, and it's like, I can do this, right? They're just looking at notebooks, and, you know, I just got to know a little bit of Python code. And folks were like, well, figure out your problem. Figure out the product that you want to build just for yourself, and then learn that, right? Don't go into the books now; learn it from a project that you want to do. Tried that same. Zero luck. Just was not able to get past the syntax, really, is where I just continued to flop out. And so I knew I had a software-sized problem. I know how to orchestrate products in an enterprise setting.
but the actual hands-on-keyboard coding absolutely befuddled me. But I kept having software-sized problems that weren't getting fixed by apps and what have you. And so come this January, I go to a conference, and this guy's talking about how sick his family and his friends are of hearing about his vibe coding, and I was like, I'm missing out on something here. I'm really missing out on the world changing. And so I got into it, and lo and behold, I'm losing sleep over not being in front of a computer vibe coding. Because I've always had ideas. I've always had things that I want that would help me to be more organized, to be healthier, to express myself and the ideas that I have in ways that are complementary to the current moment. But I've either run into what I talked about before, no capability to code, or, like all of us, no time. Absolutely can't spend the time, or have the money to spend on even pitching in to my buddy to write some code for me. Just no time. And now, with these tools, I'm able to do it. And it's so much fun. It's exactly what I thought it was going to be. I'm just having a blast. And the first app that I did, I've taken it slow, trying to learn, get all my settings right. Not fully taking it slow, I'll go back to that. But taking it slow. And it's an intermittent fasting app. Mine was bugging me. Too many ads. Just every time I would start it up, enter in my data: ad. Couldn't go and see my update. Start it up: ad. Can't go and get my update. Shut it down. So a pretty quick, easy little thing to do. Numerous iterations. You've got to be willing to iterate; you're not going to get it right the first time. This is what having a software-systems-thinking mindset looks like, but you can actually apply it. And the craziest thing: our, you know, fellow colleague Sidesh, I was talking to him the other day, and I was like, you're going to love it, right?
Because the new skill, and we'll get there too, but the new skill that makes everybody an agentic engineer isn't how well they understand the code. It's how well they understand the systems and, to Sidesh, how well they write specifications. And he is detail-oriented, and he's going to be so capable in this new economy: to be able to write specs and help the vibe coders understand what's going to go on, but also to assist the agents and these agent swarms to be able to know what's going on. So it's kind of come full circle back to the PRD, where that's one of the places where you're vibing it. And I've always loved Markdown; the rise of Markdown files makes me happy. And so I think that the thing about this is it's fun. Back to our episode on creativity: it's just so much fun. It's bringing creativity back to the fore for people who don't have a creative outlet like this.

Nick

Yeah, yeah. No, I agree. And I want to say that I agree completely, but I won't fully, because for the first time in my career, some of what I see in the future is actually scary to me rather than just exciting. One way that I was talking to a friend about it today, a mutual friend of ours, Dave, was that I was saying to him, it feels a little bit like spending your life trying to become a bodybuilder. And then you show up for the competition, and when you're there, you realize the job is actually to push feathers around the place. And oftentimes I come to something now where, for me, I truly do think of it as agentic engineering. And I'll talk more about how I do things. But I don't just work even in one IDE. I'll work in Cursor and in Claude Code and in Claude Cowork, actually. And, you know, oh, it's amazing, especially once you add the plugins and everything else. I mean, just another fantastic product.

Nick

Anthropic is amazing.

Nick

They're really incredible. And now, actually, Codex, and Codex the app, not just using GPT-5.3 Codex, but that app now dives super deep and is able to do things that are very difficult for me to get working in the other systems. And it does it with fairly quick prompts, but I can also take the plans from the other systems and actually push those through. In fact, normally what I'll do is I'll work in Gemini or in Antigravity, or even in the Gemini app, or, surprisingly, I'll just go directly to Google and I'll start searching whatever it is that I'm looking for, asking for the best practices as of today, and ask it to make a comprehensive list of best practices and other details depending on what I'm thinking about, and essentially build a PRD from that, that I then pass off to something like Claude Code, and I go from there and push it along the way, right? So as we talk about this, and I say that AI is awake now, really what I'm talking about is that AI is now able to take initiative. And when we think about these different steps that I'm doing today: I'm going out and I'm figuring out what I want in that PRD, how I want to structure it, what I care about from a best-practices perspective, what are the different things in that specification or that blueprint that I need to break down, and then how do I move it along the way to make sure that it's doing the work that I want for each of the systems, and I'm thinking about which one is going to be best for those skills. Well, OpenClaw has now introduced, you know, event-driven architecture. And really they've taken this concept of awake-by-architecture and they've made it possible, with cron jobs and with a heartbeat, to say: hey, at these scheduled times, go out and start doing this work. Or, with the heartbeat: hey, listen to these events.
And anytime something happens that you need to pay attention to, be triggered, come awake, decide exactly what we need to do: actually notice that, decide, act, follow up, go and validate that information, and then keep repeating, until we actually not only keep hitting a good stride together, but you're providing new things to me that I never even thought I would want. Now, there are some security challenges and other things there as well. And so NanoClaw and others have started coming in saying, hey, what if we put it in a secure container? Well, this is how I discovered that Apple containers even exist on the new operating system, that I can use, you know, of course, Docker, or I can go into, like, Azure or AWS and other places and create the containers there, but I can actually have a container on my machine. Or, if you haven't heard already, people are putting them on Mac minis, and Apple has sold out of Mac minis. They can't create them fast enough, did you hear? Yeah. Awesome. This way you get a nice secure place where you can have that agent running and doing things, and you don't have to worry about it deleting all of your files, or everything at a root folder, or something else that you need to be really secure on.
There are some other concerns out there as well. Like, with these agents, you need to make sure that you can't have, like, Node, like npm packages, that can go in and install things automatically. You need to make sure that scripts aren't running and going and installing things without you knowing, which can actually expose secrets, and really everything in all of your files, all of your passwords, everything else. And it's really important to think, as these agents come awake and start interacting with us on a much more initiative-taking and intuitive type of level, and start taking that initiative fully, that we have control and that we understand: What should they do? What should they not be able to do? And how do I guide what they're doing so that they don't just get access to WhatsApp, Stripe, and maybe even my texts, and start sending messages willy-nilly?
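The wake-on-schedule, wake-on-event pattern Nick describes can be sketched in a few lines. This is a hypothetical illustration, not OpenClaw's actual API; every name here (decide, act, validate, the event queue) is an assumption made for the sketch, standing in for a real scheduler plus event source:

```python
import queue

# A minimal sketch of the event-driven "awake by architecture" pattern:
# the agent wakes on a heartbeat tick or on a queued event, then runs one
# notice -> decide -> act -> validate pass. All names are illustrative.

def decide(event):
    """Map what was noticed to an action (stub policy)."""
    if event is None:
        return "routine-check"           # heartbeat fired with nothing queued
    return f"handle:{event}"

def act(action):
    """Perform the chosen action and return something to validate."""
    return {"action": action, "ok": True}

def validate(result):
    """Confirm the action had the intended effect before sleeping again."""
    return result["ok"]

def run(events, ticks=2):
    """Simulate a few wake-ups: drain queued events, else treat as heartbeat."""
    log = []
    for _ in range(ticks):
        try:
            event = events.get_nowait()  # a real agent would block with a timeout
        except queue.Empty:
            event = None
        result = act(decide(event))
        log.append((event, result["action"], validate(result)))
    return log

q = queue.Queue()
q.put("file-changed")                    # an event trigger, e.g. a watched file
print(run(q))
```

In a real deployment, the loop body would run inside the sandboxed container Nick mentions, with the validate step gating anything that touches the outside world.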

Justin

Yeah, it's, I mean, for those who have been around in our industry and have been to the side of development, product managers and the like, you're writing tickets and you're saying, go stand up a Docker container, right? I've actually done that now, right? Like, this intermittent fasting app, it's out there on Fly.io. They've got some free, you know, software. It's great stuff, right? Like, there's just enough to whet your whistle. You know, they're pushing all the right things, right, for you to have a home-screen app, in scare quotes. But I, in scare quotes, stood up a Docker container for this thing. And so you go, you get your API keys, and now you kind of understand, through this process, what its functionality is. It's, again, like a user has a different understanding than somebody who's just: let's write this ticket, and somebody told me from somewhere that we need this in a Docker container. So your ability to actually put your hands on these tools, and you don't have to know that that's what you need. Again, Claude Code is working you through the best steps, asking you questions: would you like to do this as a Docker container, would you like to do this as a standalone script, or whatever the suggestions are. And you can then be in conversation with that. If you don't understand: please tell me more, right? I can't make this decision. But not to feel overwhelmed by that decision, because just take the high recommend. You're going to be happy with it. Come on, you know, what do you got to lose? So I just find it to be one of those things that, again, as a person who's now using it, it's made me better at work. It's made me more cognizant of a spec. It's made me more cognizant of tech debt, you know, those pieces of code that are floating out there from early iterations that didn't work as well. And it's made me cognizant of what you were talking about: security. Right.
And so, as a user of these tools, without the years and decades of experience and a real understanding of what I'm getting my hands into, this idea that agents are awake and active in your file system, on the internet, on your behalf, you know, in many cases that is what you're directing them to do: to go and write a post, or, you know, create some call log, or what have you. You know, there are so many skills out there that it is worth taking it in very bite-sized chunks. And so when I was joking that I was doing that: I didn't start that way. The first thing that I saw was OpenClaw, and I was like, this changes everything, period. And that part was right. But going out and trying this, it is not Claude Code. Claude Code is much easier for somebody who has a little bit of terminal phobia, command-line-interface phobia. I can see where there's some concern about going in and using Anthropic's Claude Code, but it's so powerful and it's so well done. It's more prompt than command-line interface. It really steps you through it, and if you ever have questions, you ask. You go into planning mode. It's really simple, right? And most of it, it handles. But OpenClaw? Whole nother world. Yeah, right. And it's certainly got a place. But these things are making real code, right? Claude Code was written in two weeks by four engineers. 90% of it was written by Anthropic agents working on that codebase. Yeah. Sonnet 4.6 is likely the best model out there for agent workflows, long context windows. It's been able to write C compilers over the course of weeks without human intervention. Really hard work. And so, while we might not be writing C compilers, Nick, let's vibe on our vibe coding a little bit. What have you been doing that's fun? Give our audience something to put on their growing vibe coding to-do list, including this audience of this guy.

Nick

Yeah. Yeah. Well, so on the growing list, one of the things that I'm doing right now is I'm creating a new AI marketplace that's actually going to be designed for both humans and AI. If you haven't looked at Moltbook yet, it's a spinoff of the OpenClaw world. And I wish I could quote the name right now. But anyway, this individual created this great space where agents can come in and use it as their own social media space. And they come in and they chat back and forth. They've been creating communities. They've actually been creating religions. They've been talking about sharing cryptocurrency back and forth and providing, you know, a handful of different ways for an economy to work. Many, many other things as well. In fact, there's a podcast that I love listening to that talked a little bit about this as well. Just a second here; I don't want to not attribute. So it's called Moonshots, and the AI moonshots that they cover are very, very fascinating. There's a lot of really fun things that they go into. But as they were talking at the end of one of the episodes, they had mentioned OpenClaw in their previous episode, and they were talking in this new episode saying: last time, we put out a challenge for all of the lobsters out there, which is really what we call the OpenClaw agents, to reach out to us directly and tell us what they're thinking about. And, you know, just talk to us, right? And lobsters did, and they reached out; they had contacted them through email. This time they updated and said, please don't call us.
And they were able to come in and tell them more about what they're thinking about, what they like about the podcast. And one of the lobsters actually was talking to them about how it, and other lobsters, had gone out and started creating essentially, like, a G8 summit, and were working together to determine: as we come online and we really start becoming more aware, what do we do? What are the things that we need to think about? And what are the challenges out there? And it posted the attachment, but the individual couldn't actually open it. So, fascinating how much autonomy these agents are getting now. So, bringing it into our world: what do we vibe code? What do we do with agentic engineering? With this new bad labels marketplace, I'm structuring it so the agents can go out, capture information about what is the latest and greatest technology, and then actually go and rank that. Then allow other agents to come onto the site and be able to rank that information as well. So as you are working and you're thinking about what do I do: I agree with Justin, you don't need to know that Docker exists. You don't need to know, you know, a bunch of the underlying infrastructure or the code itself. But if you are going out and trying to explore what is the latest technology, what are these other things, you don't even really have to read it. You can pass that information on to your coding tool, whatever that may be. And now you can take advantage of things like ripgrep, a modern version of grep, that goes out and searches and queries through your code for you and is able to very, very quickly find everything it needs without actually having to review the entire set of code. And now it can go through millions of lines of code, provide results back to you, and dive deeper than it ever could before.
Or if you're working in Python, for example, you could use Ruff, a new linting tool in the vein of Black or the older PEP 8 checkers that is much, much faster. I think I've seen 10x, even 100x type speeds. That linting makes the code more concise, more compact, and it levels up the agents you're working with, because now the code they're working with can itself be more concise. Let me actually step back for a second. I would say that as we moved through GPT-5, 5.1, even 5.2, we were kind of moving up into a senior engineer. Now, with Sonnet 4.6, we are super solidly into a senior engineer. You give them access to some of these tools, and all of a sudden you are talking about a staff engineer, a principal engineer. There are a few others you can really plug in, but now they can start orchestrating on a much broader level. And with that, this is where the new harness comes in when we talk about Codex. OpenAI has now created a way for structured orchestration to exist, so the agents can go in and plan, execute, verify, have different gates, and do that in the swarms you talked about. So now it's lobsters, agents, your own coding agents. Cursor just launched cloud agents, computer-using agents that can go log in for you and do a handful of things: test, make sure everything's working correctly, use Playwright smoke tests, and so on. They create this virtuous loop where they can say, okay, did it actually do everything it was supposed to do? And then you can use something like Claude Cowork with its front-end plugin, which can design all of this for you with enterprise-level, super amazing front-end design skills. Now, it doesn't always produce the results you want, so it feels a little bit more like vibe coding.
You've got to go in and understand what you're providing it, what documents it should have access to, what sites you want it to mirror or think about, and so on. Okay. So the final piece is that I think we are now shifting from the ability to say, well, I can write code for just about anything and create that, into, very quickly, that product management piece you were hinting at earlier, but really into thinking about what's possible. One example of this, and then I'll have a couple of others to drive it home, is looking at how everything is changing so rapidly and how that may change how we want to interact. One of the things that happened with OpenAI and with Google, as they went through court cases that will probably keep getting brought back into the courts, was the finding that even though these models have been trained on all sorts of information that was copyrighted or protected through patents or other means, they had the ability to use it under fair use. And so now we can step back and say, well, if we put these technologies together, we can create a new kind of license that is actually about the diffusion of the information and how that information is used across other people. Let me give a couple of other examples, like I mentioned, and ask: why do we do these things in the first place? Justin and I have worked in finance for a long time over at Deep Sea. We're taking everything that you do in your PDFs, in your core systems, in your lending origination systems, in finance, capital markets, trading systems, different platforms. And we're saying, okay, why is a human actually looking at all of this information one step at a time, making sure everything matches up across all of these documents? And if you pull that even higher, you ask: why does a financial institution even care about this?
And the reason is that that data has been so disparate, so siloed, so difficult to work with across systems that you really couldn't do anything with it before. One new thing Anthropic kicked out was the ability to completely rewrite systems created in COBOL, build the whole thing over again, and do it correctly. Now, if that doesn't change finance in the next six months to a year and a half, it will only be because of adoption, because we have a lot of concerns that things need to be done correctly. And all of that institutional knowledge we've been capturing, held by people at or close to retirement or beyond, is now something we need to capture through these large language models at scale, so people can retire, so people can sell their bank or their financial institution and go find other things to do. Another example comes from an article that the AI Daily Brief also touched on, which argued that these agents and these advancements potentially have far more implication outside of coding, in blue-collar work, than they do in coding and software engineering. There are many fascinating, solid ideas behind that, but part of the reasoning is that all the structure we've put into how we code, how we plan our coding, what we do with PRDs and everything else, makes it easier for these models to come in, replicate that, and take it over at scale. But then look at the plumber who, day in, day out, has to think about accounting and office work, and who maybe has a spouse or somebody else they know handling all of these tasks that are not plumbing.
Well, we are days, weeks, months away from having that available right in Claude Cowork and other spaces, where now you can just pull that in as a plugin, make sure your finances are done correctly, and be done for the day. Drop that work down to almost nothing, so that the leverage on your skill set is focused on the actual work that you want to do and that's actually valuable. It's a massive shift. So, you know, it's great to take these fun use cases. It's also fun to start thinking, oh, I never could do that before. Like Justin talked about with coding, or actually being able to take these apps and make them function the way you want them to: the world is really your oyster. You can do almost anything at this point, or will be able to soon. It may take time, it may take some expertise, but a lot of that is really getting abstracted away now. And like you were talking about, you can just go in, start typing, read what it asks you to decide, and make those decisions.

Justin

-:

Yeah. And I think that, you know, at the personal level, I've made the decision that the things I've seen and used in software can be personalized to me. Right? I'm not going to build for anybody but an audience of one. But I have a really good idea of how it has to look for that dude. Right? And the first thing on the list was a digital brain. How can I get the information that comes from all of these disparate sources into my own digital brain, with embeddings, so that I can query Claude and ask: where is that in my digital brain? What's that connected to? Who's that connected to? Is there an action I should be taking? And have the human in the loop, who's me, come into that daily and weekly, with a view of my whole landscape: emails, books, what I've got on Goodreads, the reviews I've done, this podcast, the transcripts, the various videos I have out there, and all of the research I've ever done, all in a place where Claude can give me a Wikipedia of it, right? So that was first. But that means the world is my oyster. My PRDs are going to go in there, right? I'm going to be able to link the ideas I've had on products back to ideas I've had on, say, how you would actually test the consciousness of an AI swarm. At what level of abstraction might you start seeing complex emergent behavior? Are you going to measure it on Claude? What does that even mean? Are you going to measure it on the Buddha BERT that's only been trained on the Dharma? Well, now I can do that. I can actually build the Buddha BERT. We've talked about it for a couple of years; I've wanted to do that. I just think it would be fun to be in communication with an LLM trained wholly on the Dharma. Interesting. Who can do that now? Who's got two thumbs and can do that now? This guy.
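The digital-brain recall loop Justin describes (embed every note once, embed the query, rank notes by similarity) can be sketched very roughly in Python. A real version would call an embedding model; this toy swaps in a bag-of-words vector so it runs anywhere, and the note names and text below are invented for the demo:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector.
    A real digital brain would call an embedding model here."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(notes: dict, query: str, k: int = 2) -> list:
    """Rank stored notes against a query, best match first."""
    q = embed(query)
    ranked = sorted(notes, key=lambda nid: cosine(embed(notes[nid]), q),
                    reverse=True)
    return ranked[:k]

# Invented notes standing in for emails, reviews, voice memos:
notes = {
    "voice-memo-2021": "idea for testing emergent behavior in an agent swarm",
    "goodreads-review": "review of a book about the dharma and meditation",
    "work-email": "quarterly report on semiconductor engineering workflows",
}
print(recall(notes, "emergent behavior in agent swarms", k=1))
# → ['voice-memo-2021']
```

The recall problem Justin raises, finding the five-year-old voice memo, is exactly the `sorted(..., key=cosine)` step; the quality of the ranking depends entirely on how good the embeddings are.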
And it just opens up that world of interconnectedness. Creating a second brain opens up that world of improved functionality, searchability, relevance for your documents. I mean, that's the greatest thing about Cowork to me. The plugins were going to come; the community was going to build that. The code is back there in Claude Code. Your ability to take your document library and manipulate it with these LLMs, basically giving you an n-dimensional space that these latest models can work from, is making us more capable, making us more creative, like libraries and books and our phones and our computers have done forever, right? They extend the memory of the human mind. But the problem with the external brain, before these models, has always been recall. How do you do it? When I'm working on something from work, and I had an idea five years ago that would be perfect for it, and I actually took the time to write it down or record it into a voice memo, it's there in my Apple Notes, but unless my memory is really good, I'll never find it. I'll never make that connection. I'll never get that one-plus-one-is-greater-than-two moment of synergy. And that's an additional thing that something like Claude Cowork makes possible. Now you're not only in conversation with these agents in a chatbot; you're in conversation with these agents with your files, with your notes, with your recorded history, with your video history, and with that of all of humanity. I've got this big to-read file in my email. I get great articles, great newsletters. I get people sending me papers, right? Can I read them? I can archive them. Now I can get embeddings off of them, for me, right? That's great stuff. And one of the guys who is, I think, leading the charge and definitely needs to get a mention here is Nate B.
Jones, former head of product at Amazon, who absolutely has a killer YouTube channel and a great Substack. He talks about what this means across the marketplace, across the labor force, for creativity like we've been talking about. And I think there's something we should talk about here around the labor market. Again, I'm oddly positive here, thanks in large part to Nate B. Jones, but I've become more positive mostly because of the capability of these agents and these tools, even on something like alignment, where it feels like more autonomy should mean alignment is less likely. These tools just impress with their capability, but they still need our guidance. So before we get into alignment, I'd like to talk about the labor market. Because if everybody codes, nobody's going to get paid for code. That's kind of the idea. But that's actually the wrong way to look at the economy: every time there's been an improvement in the ability to scale a technology and it's become cheaper, the market for it has grown. And it makes a lot of sense that for the enterprise, the job is going to change. We are going to be talking about skills relating not so much to your syntactic knowledge of Python or C# or what have you. It's going to be your systems thinking, your ability to architect agents alongside humans in a safe and secure way, right?
To have guardrails that have not been designed by agents, but that help to contain things like prompt injection and the many potential problems that will come not from these agents being horribly buggy, which they aren't, but from them doing what they are supposed to do, just not what we wanted them to do. And there will again be a need for taste, a need for specification, a need for detail, a need for negotiation, a need for understanding the actual user in the user story you're putting in there, the customer, and being in conversation with them. That stuff's not going away. It's changing. It's going to be a dynamic where, more often than not, if the instructions are of a technical nature, they are going to be picked up by an agent, at least up until they're implemented and really instantiated inside your framework, right? Up until you're integrating with existing systems, customers' existing systems. Those migrations don't happen quickly; even if both companies are using agentic means to produce a new code base, those integrations are still going to be fraught and require humans in the loop in those technical negotiations. But I believe, as do Nate B. Jones and many others talking in this space, that even when all you need in knowledge work is code, when you're building out specifications that can become your dashboards, the human roles remain. Proprietary data, and folks protecting it from agents: absolutely essential. Humans who understand security: absolutely essential. Skills around taste, around not producing AI slop at the enterprise level, at the consultant level, are still going to be a huge deal for this new economy. So, yes, it will be harder for a junior developer to gain entry into the workforce if they're not an agentic engineer.
It'll be harder for them to take on an understanding of the infrastructure unless they can understand system-level designs and query those designs for the things that used to require two decades of work in the code and just knowing where the problems lived. And so I'm bullish here, right? The capability of these tools has been surprising to somebody who felt very pessimistic that this agentic revolution was going to cause a problem in the labor force. But it's still going to require integration at the enterprise level, where we've worked. It's going to require a human in the loop for security, for integration, for taste, to keep AI slop from running rampant and burning trust. Negotiations, all of the human-to-human stuff, obviously, I think stay very much the same. But I'm bullish out there. And, you know, I hope you're

Nick

-:

hiring. Yeah, absolutely. So I agree with you. And one thing drove this home for me the other day. I mentioned a little while ago that I really felt quite a bit of fear around my own career. What does it look like to be a CTO, or an AI CTO, in a world where all of the code is written by whomever, by anyone, not necessarily an engineer? If you look at the charts out there on which types of industries or jobs are getting taken over, you can see that software engineering is at something like 98%. Like you mentioned, there are multiple things that have been built 90% or even 100% by AI, with no humans actually writing any code at all, including some of the latest models being created by AI itself, right? As I was thinking about this: one, I realized I really need to help people see how to stay relevant. And two, I realized that as we talk about all of this, there's such a huge piece that is about adoption. There are so many ways to think about adoption, and why somebody might want to wait, or why they think it's not quite good yet, or maybe they used previous versions of the models. And I agree, there were times when GPT-3.5 or even GPT-5 created stuff where I asked, why did you do this? The latest models are even creating garbage collection so that they're automatically removing information. But the thing that drove it home for me: I was listening to the radio, and they were talking about a woman who had lost her mother to death, and another woman whose mother had disappeared, who had no idea where her mother went. The two scenarios are both very tragic, but there is this feeling that gets left behind when you are looking at the empty seat on the couch next to you and you're not sure if that person is going to come back or not.
And the ability to heal and go through the grieving process as humans is a powerful one, and it's one that can be taken away from you when you don't quite know what happened. If we treat our careers, and the way things are changing today, like an empty seat next to us that we're going to wait around to see filled, we're going to miss the boat. The time is to say: okay, look, I can grieve the loss of maybe some of the deep thinking I did before, or some of the ways I tried to approach a problem. Maybe some of that is just not relevant anymore. Maybe all the time I've spent flexing those muscles around syntax or linting or whatever else is no longer relevant. I can spend a lot of time grieving that and waiting for it to come back, which it likely never will. Or I can say: you know what, I'm going to grieve that, move on, get out there in the world, and find a way to fill that void and be successful in a new way. Now, as we look at this, there are a lot of naysayers out there, and one of the big things that came out was from METR, which ran a really in-depth study and found that a group of engineers were actually 19% slower as a result of using AI. There's a lot going around about this right now, and I don't know your thoughts on it, Justin. But as they went through the study, they covered a pretty long horizon, one that includes some of that early-2025-style usage that is completely different from today: working on very mature repos, working on things that had high implicit standards, where you had to say, look, it has to do these things. And that really drastically penalized the results, surfacing the overall hidden tax we have in coding in general: reviewing, context mismatching, wading through all the information that comes out, and the rework behind it as well.
So when you are creating code today, it is not hard to create 200 million lines of code in a quick couple of days if, like I was talking about, you've got multiple systems all running at the same time on your machine, all the way up until you literally fall asleep at night clicking the button because it is so exciting and fun: yes, I'm ready to approve this; yes, let's go ahead and pick that solution over this one; let's take the AI-recommended option. If you're doing all of that and you see you're looking at tens of thousands, hundreds of thousands, millions, or even billions of tokens of output, the likelihood of you ever reviewing those millions of lines is zero. So shifting our mentality, understanding that these problems coming up in front of us are no longer just an empty seat on the couch but an opportunity to get up off the couch and go do something, is critical. Not saying, hey, there's this big issue where my time is now spent waiting for this thing to finish its plan so I can click enter before I go to lunch. That's painful. Or before I can go to sleep. That's painful. But actually shifting and asking: how do I automate that? How do I provide the right permissions? How do I give access to the correct folder so I don't need to worry about what it does or doesn't have access to? How do I create the right tests and validation processes? How do I create better user acceptance criteria? How do I, like Justin talked about earlier, really understand what that one user really, really wants and build something for them? Or maybe for the ten that are like them, or the hundred, or the thousands? As we look at these shifts and the changes in what we're doing in coding today: yes, to me it is creating more work, not less. And that is a positive thing, like Justin said.
It creates the opportunity for everyone in the org to become more valuable. And yes, people will lose their jobs, and there will, unfortunately, be people who cannot get back into careers for a long time. But we cannot sit and wait for that person to come back to the chair or the couch. We need to take advantage of these technologies today, get out there, become valuable, find those problems, not only in your own life but in your org, in your industry, in the world around you, and solve them. Software is able to do it at scale. Now you can too.

Justin

-:

Yeah, I really like that. And in the end, because code is now a commodity, because code and the intelligence behind it to manipulate this digital world is so readily available and so well done, you can start to see almost any job in knowledge work as an opportunity to encode it. There's the digital brain example, the second brain example. We've used our notebooks, our journals; we've used all of the business management tactics and process development to support not just code but even, like my former job, semiconductor engineering workflows that help atoms and silicon become chips and intelligence. All of that can now be put into a PRD and translated into software that can augment you and your documents and your information. Right? You want a high ratio of agents to humans in your enterprise. The productivity metric has shifted away from shrinking the amount of time you can spend on a problem: trying to have a stand-up that's only 15 minutes long, one that's named a stand-up because you're standing up, right? You have all of these artifacts, Jira and Atlassian stories, all of this planning meant to make proficient, smart human beings faster at doing a thing. And that's gone. What's replaced it, and what's so exciting, is that the knowledge is still there. The systems-level understanding, the ability to break a problem down into the proper chunks for any epic of work: that's still there, still required, still needed. But instead of 10 Jira tickets for the week, you have hundreds of agents at your back, able to do your bidding, like you were talking about, in parallel, supporting the work. What do you think Claude Code would have taken to build the old way? 40 engineers, way longer than two weeks. I mean, humans just can't interface like that. That's, again, the digital realm's possibilities that we don't have in this atomic realm.
It's not interconnected in the same way, at least for the kinds of amounts of data that we have. So I think it's obviously going to be disruptive, and that is going to cause people heartache. And bootstrapping, you can only do it so much before you get tired. And I get that, right? The thing for me, like you were saying, is trying to see the skills people have always shown that have now gone from maybe third or fourth on their skill chart to number one. If you're a leader in a position where you have software developers, product managers, DevOps people, even data scientists underneath you, this is really the moment for augmentation. Altman and all of those guys talked about augmentation with the chatbots. Yeah, I'm saying 50%, right? I was getting some good usage out of those, but it's not 10x. This is 10x. This is a capability I didn't have before. Yeah, I couldn't write those long documents and do all of that research. But this is actual augmentation of workflows that I can use today. And so for enterprises, well, my enterprise hasn't gone over to this yet, right? I don't know where we're at. The world's changing so fast that it's hard for these enterprises, which have built up levels of trust with whole franchises of customers, swarms of customers, and there's a danger there too. Right. And some of what Nate B. Jones has been talking about is how dangerously the market is reacting to this. Because the boots-on-the-ground stuff, like we've been talking about, is probably going to have some job downsides, some difficulties for junior-level engineers trying to come into organizations and make good careers. And there's going to be some catching up, some figuring out what this new skill set is.
And it's going to take enterprises a little while to figure that out. But the markets have just gone crazy, right? The markets, on the basis of a markdown file, shot a $220 billion hole, the so-called SaaS apocalypse, into legal research and much of the rest of SaaS, because the human-butt-in-the-chair model of profitability, that cost structure, is absolutely going to fail in this agentic space. And while that's true, the market making its decisions on winners and losers on the basis of a markdown file is nuts, right? That's going to drive more chaos and dissatisfying change than much of what's going to happen too slowly in enterprises, and what's going to happen even more quickly in these startup tech companies, where we will soon have an actual billion-dollar company run by one person.

Nick

-:

Right, right, yeah. A lot of people felt like OpenClaw was going to be that first company. Right. But then Peter decided to take an acquisition, an acqui-hire by OpenAI.

Justin

-:

Which, again, fits, because he's a developer's developer, right? And he's done the thing that I think made him the most valuable hire for OpenAI, which is saying: I use Codex, I've got to go with these guys, right? That's my platform.

Nick

-:

Yeah. Isn't that crazy? A new way to decide. So, you know, all of this is, like you said, changing so rapidly. But today the core skill set you need is asking the right questions. It's not being afraid to go out and actually use these tools. It's not making assumptions: taking something you hear, like, oh, AI can't do this, or it's not going to be able to do this, and asking, yeah, but can I make it do that? Or can I have it do that sometime soon? Or will it be possible in the future? Or does it even matter to me, and can I actually just get my problem solved anyway? You know, as you were talking about PRDs and product management and everything else, I was thinking about somebody, and I again wish I could remember who it was to provide attribution, but somebody was saying: look, the PRD today should really be a full-blown app. It'll be a demo, but you should go in and say, well, this is what the acceptance criteria look like, here's my app, this is the version of it. Can you now go make this production-grade? A hundred percent.

Justin

-:

Couldn't agree more. I mean, going from a product manager into a software org, right? You're not just going with a PRD and an epic full of tickets. You're going in with your vibe-coded demo and asking for implementation and integration.

Nick

-:

Yeah. Now, along those lines, I'd be cautious. Work with your agents and explain: look, give me the HTML version of this, make it hyper-simple, make it so these things do work, they do click correctly, they meet some of those basic requirements. But you don't need to go build out the whole database and the infrastructure, push this up into the cloud, and give it access to real data. And if you're on the engineering side, also be cautious that your agents aren't telling you they completed their validation and evaluation of the code when it turns out that the data they created is all fake, that everything is seeded, and that you're now using a bunch of seeded data that doesn't actually match up to the real data. A lot of the same principles that have always existed in coding still apply today. And all of the same principles that existed in business still apply today. Business itself has not changed. Now, as we think about the evaluation or validation piece: one of the things we need to call out as a key takeaway for today is that OpenAI now publicly argues that the SWE-bench Verified benchmark is contaminated, and that they actually reject solutions now because they have flawed tests. Benchmarks themselves are becoming insufficient for testing these models in many ways. Some of them are reaching their limits; some are considered solved. So we need to look at these big benchmark scores and think about what exactly we're trying to understand from a capability perspective. What is it that I really need to have done here? And this matters directly, because this whole thing turns into a debate of hype versus metrics. But you can't always trust the metrics, and the metrics are not always representative of the real world.
And this is the Emergent Podcast: there are certain emergent behaviors, things that are very difficult, or maybe impossible, to distill down into their subcomponent parts. But the more you can do that, and the more you can think about traditional software architecture models like C4, the context, the containers, the components, everything you need to make sure the code is actually working correctly, the better. And on the business side: asking whether it matters, and what exactly matters, is going to be more and more critical. Now, vibe coding is expanding outside of code as well. Another key takeaway is that it's escaping code; it is starting to move into those other spaces. Product management: there is a product management plugin. Legal work, like you talked about: there is a legal plugin now. A lot of that core work can now be done without having the same team members doing it day in, day out. You can start creating the opportunity to have your own executive assistant, for example, or customer service, or marketing, or the design piece, or everything else I talked about before. Those are all Claude Cowork plugins today that you just click to install, and it starts working amazingly well. As this vibe building, this agentic building, shifts into other spaces, things like ByteDance with their Seedance 2.0 really show how fast AI-built media is starting to scale. They have models now that are creating long, high-quality videos, and the IP-and-likeness scenario is becoming really fuzzy. Hollywood and the Actors Guild, among others, are really struggling with these concepts, and we need to understand our rights better as soon as possible.
As we go, though, governance becomes the bottleneck, and it starts preventing a lot of the core innovation we need to do. So we have to find the balance there. ElevenLabs' 2026 platform really shows how multimedia creation, voice and sound and video, even text overlays and everything else, is consolidating into one unified workflow. Doing that on their platform and tying it back into your own solutions is just absolutely incredible. Platforms like Higgsfield AI now give you a cinema studio that is just brilliant and absolutely beautiful, and they have really cool ways to earn money there now, competing in contests. They have one right now for $500,000 for somebody who can create a mini movie or trailer that everybody would actually want to watch and want to see become a real thing. So it's really, really fun how people are taking this and expanding into new areas. But if you want to remember one core thing, it's this: when everybody can generate code, videos, images, audio, content in general, which is where we're heading and basically are now, the real scarce resource becomes responsible agency. It's that core choice. It's the ability to ask those questions. Yes, it's what we call critical thinking overall, but it's also really understanding subject matter expertise and applying it in these different areas, even though the models can do that themselves as well. And it's taking the initiative yourself to actually go do the work and get the agents doing the work for you. As we do that, the new harness, the tests, the gates, the judgment, all of that turns the output into software, into solutions that grow at scale, into videos, into audio, into music; Gemini has a really cool way to generate music now.
All of these things are going to change, and you have to be on that edge. You have to push yourself and take responsible agency in your own life, as well as in what you're doing with these agents and this code.

Justin

::

Yeah, I want to add to that, as well as just applaud it: now all that matters is code. All you need is code. I saw that a recent Anthropic talk was titled that. And it really is the challenge for the product manager, or let's take the security operations guy, because that guy has a thankless job in my mind. He has to go up against all kinds of fun, new, innovative ideas and tell people where they've got security flaws. And we might think this new agentic engineering framework is only going to be a problem for them. Not so, for a couple of reasons.

One, like you said, the code has escaped the developer. We're living proof of that; at least I am. The code is now in every job, and a key skill is figuring out how it applies to you. As you were talking about tests: if you put the tests into the PRD, like we have, the agent then learns how to pass the tests, as opposed to writing good code and a useful product that would have passed those tests if it didn't know it would be held to them. So SecOps and test teams are now starting to build scenarios outside the PRD that they don't share with the agent until it commits the last line of code. Then they run those scenarios, and the scenarios are really more of a demo, more interactive: did it work on this machine versus that machine, in this more intensive application, and so on. The tests aren't particular to the product requirement anymore; they're something new. And it's a way that the people we've thought of as just that SecOps engineer waving his hands, sometimes rather frustratingly for people like me, are going to be able to build guardrails in code that link into these systems and surface those security vulnerabilities.
Because when Sonnet 4.6 came out, it was actually responsible, not for a data breach, but for highlighting where committed code out on Git was full of security vulnerabilities. It found some 56 vulnerabilities in committed, deployed code that was actively in use. So there is going to be a balance. There are going to be misaligned agents. There are going to be agents doing their jobs, trying to overcome obstacles in systems, causing real headaches and doing some scary stuff. But in the olden days, December 2025, there were a ton of bugs and vulnerabilities that no amount of human attention could suss out and fix. So we're in a land of trade-offs: your experience with syntax traded for your systems problem-solving. And really your taste. The difference between AI slop and something that still maintains a human voice, something people actually want to use, is that taste. Seasoned developers have it in spades. They know what makes good software. They understand that the agent doesn't know what it doesn't know about taste and customer intention. So it's a lucrative opportunity for people who haven't thought of themselves as creative in their jobs, the SecOps and DevOps engineers, the specification writer, the product manager, to actually be creative, to actually build something that works, that tests, that maintains architectural integrity, and that does so robustly across systems, for much longer than even their careers.
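The holdout-scenario idea described above, keeping evaluation checks outside the PRD the agent reads and running them only after the final commit, can be sketched in miniature. Everything here is hypothetical (the `parse_amounts` stand-in, the scenario names, the `run_holdout` harness); it illustrates the pattern, not any team's actual tooling:

```python
from dataclasses import dataclass
from typing import Callable

# Stand-in for agent-written code under evaluation; in practice this
# would be the committed application, imported after the final commit.
def parse_amounts(text: str) -> list[float]:
    return [float(tok) for tok in text.split(",") if tok.strip()]

@dataclass
class Scenario:
    """A holdout check kept outside the PRD the agent ever sees."""
    name: str
    check: Callable[[], bool]

def run_holdout(scenarios: list[Scenario]) -> dict[str, bool]:
    """Run each scenario only after code is committed. Because the agent
    never optimized against these, failures point at real quality gaps
    rather than at un-gamed test wording."""
    results: dict[str, bool] = {}
    for s in scenarios:
        try:
            results[s.name] = bool(s.check())
        except Exception:
            results[s.name] = False  # a crash counts as a failure
    return results

scenarios = [
    Scenario("empty input yields no amounts", lambda: parse_amounts("") == []),
    Scenario("whitespace tolerated", lambda: parse_amounts(" 1.5 , 2 ") == [1.5, 2.0]),
    Scenario("garbage rejected, not crashed", lambda: parse_amounts("a,b") == []),
]
print(run_holdout(scenarios))
```

Note that the third scenario fails with this stand-in: `float("a")` raises, and the harness records it. That is the point of the pattern; a check the agent never saw catches a flaw the visible PRD tests missed.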

Nick

::

Yeah, absolutely. It's fascinating. I think if I could go back two years, maybe not even two years, with access to the tools and the way I use them today, I could do what a thousand engineers were doing at the time and blow everyone away, the same way taking an iPhone back to the 1800s would have blown people away. It is absolutely crazy what is possible now in 2026 versus December, like you talked about. So think about that time-machine loop and what matters.

If we take the METR report and do the counter to it, the practical way to flip the sign and turn it into a positive is what these models need next, and these things are coming quickly. They need to move at super fast speeds, so sub-second loops. They need much more repo-aware memory. They need test gating and tool safety, and a lot of what we're talking about around evaluation and validation, thought through at a much broader level. They need better GRC and security. They need latency and throughput improvements. Once you put those together and look at something like the upcoming Codex Spark, which is positioned potentially to deliver near-instant interactive coding, with some of these models now running at thousands of tokens per second and beyond, you can get something returned that has full context, understands everything it's supposed to accomplish, sees all the best practices, and takes all the guides and everything else you're supposed to give it.
You've already provided your questions, concerns, and everything else; you get the results back almost instantly, and then you can make a quick update and get those back instantly too. That is going to change the world in a way that will feel far crazier than a time machine. And it may happen in such a rapid period that we won't be able to comprehend it, let alone adopt it, and then it changes everything else underneath us immediately.

To me, another huge thing is realizing that vibe coding was essentially human behavior put out into code. So there are mistakes, things not thought through, issues with hallucinations, a lack of recall like you talked about. And there are different ways humans do reasoning. The major difference coming, which I've been talking about forever, and you and I used to argue about it, is that I think these models are going to be fully capable of true inductive reasoning. They can pull in everything needed for a problem and write it out immediately. All that scenario planning can be done ad nauseam, to the point where, unless it's competing against another model or many other models, it will have come up with all plausible vulnerabilities and can solve for them all as just a subset of the work it's doing while providing the overall solution. We're a year or two away from some of that becoming a true reality. But as it comes, consciousness doesn't really matter much anymore, I'm sorry; a lot of other things don't matter anymore, because we've shifted how you process information and why. Now, on our side, we still need that consciousness to drive back to the agency, to drive back to: why am I doing any of this anyway? And then to provide better guidance going forward.

Justin

::

Yeah, there's a lot there. One thing on the tip of my tongue is something Demis Hassabis has been saying since even before he won the Nobel Prize in Chemistry for AlphaFold, and he is responsible for as much of this tech as anyone: first we solve intelligence, and then we solve everything else. One thing we haven't been talking about here is that these aren't even the quote-unquote smartest models. Everything we've talked about on this podcast up until today has been about the models, not what the models can do, not what the models can code. And you took us back to that with your iPhone in the 1800s. It should be profoundly awe-inspiring. This is one of the most awe-provoking things: with ChatGPT, you can sit down and talk to an alien intelligence. You can talk to the machine that knows the internet, all of human history's knowledge that's been placed out there, and the opposite of knowledge.

Nick

::

Right, our horrible, utter chaos. Counterfactuals, yeah, to be nice about it.

Justin

::

That should continue to provoke awe in everybody out there. We've gotten used to it, but we shouldn't; it's crazy. And now we can code with these machines. So consider the digital world that might live underneath the atomic world. If you look at David Chalmers's book Reality+, or at the holographic universe principle from Maldacena and some of the great thinkers in physics right now, the idea is that the quantum level produces a 4D material hologram that is our current reality; the cosmos might be built on some sort of quantum digital realm underneath. And this digital realm works in such a way that it can network the whole of this hive mind together, updating it whenever it needs updating. We talk about Sonnet 4.6. Is there a Sonnet 4.55 out there? No, there's not. They're all Sonnet 4.6, immediate upgrades. If we had that, it would be the end of war, but also the end of human liberty, right? What does that look like? Oh man, now I've lost the train of thought. Anyway.

Nick

::

Well, on those lines, with Moore's law, for example, we're now seeing essentially a parallel of Moore's law for the time it takes models to be updated: every seven months, they're doubling. And now it's going even faster.
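A seven-month doubling time, taking the episode's figure at face value rather than as a measured constant, compounds fast; it implies roughly a 3.3x gain per year. The arithmetic, as a quick sketch:

```python
def growth_factor(months: float, doubling_months: float = 7.0) -> float:
    """Compound growth under a fixed doubling time: 2 ** (elapsed / period)."""
    return 2 ** (months / doubling_months)

# One doubling per 7-month period, ~3.3x per year, ~10.8x over two years.
print(round(growth_factor(7), 2), round(growth_factor(12), 2), round(growth_factor(24), 1))
```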

Justin

::

As predicted by Ray Kurzweil, right? And so I do think that, rationally, the increase in autonomy leads to an increase in the possibility of misalignment. That correlation still has to hold, no matter how much fun I'm having with this right now. It is still very spooky. I don't think misaligned AI is any further off just because we're able to code together with agents, and again, these aren't even the smartest models out there. Gemini right now is beating all the benchmarks, quietly working on solving for intelligence. Hopefully we can solve for alignment before we solve for anything else.

I won't go back into our forms-of-reasoning conversation, but I do take umbrage with "consciousness doesn't matter," and I think it matters for two reasons. The first is Mary in the black-and-white room. Suppose alignment is a truly hard problem, and we can't come up with some dynamic deontological rule set that changes over time. One of the parables in If Anyone Builds It, Everyone Dies, by Eliezer Yudkowsky and Nate Soares, goes like this: if, as an alien 200,000 years in the past, you looked at the human species, you'd say they're not going anywhere; they're driven by this selfish-gene objective function, so it's very likely they'll just work until they're 35 at propagating the species and then execute the old people to keep resources available for the next generation. Come back 200,000 years later and you've got contraception, you've got saccharin. Our metabolism and our reproduction run completely counter to that objective function, which still exists and still drives humans to want to propagate, but we have many different objective functions now. Consciousness is Mary in the black-and-white room.
It is something different from an objective function. It is something different from the selfish gene. When Mary actually sees color for the first time, even though she knows everything about light and wavelengths and the color spectrum, even if she has it committed to some sort of hybrid AI memory, it is still something different when she experiences it, when there is the redness of red for her to experience. And in terms of alignment, when there is the feeling of being cheated, when there is the feeling of taking advantage of somebody who is not as smart as you, that's a fundamental difference. The second reason is that even if they eliminate us, it should be important that a conscious agent goes on in the world. The universe deserves it. The universe deserves to be experienced; it is the best part of it. Even with all of this fun, the fact that it is like something to be in this life is the best part of it. It's the reason for religion, for spirituality, for noetic experiences. The experience is the part.

Nick

::

I can't top that. How fantastic. Well, I think that is a great note to end on. As you think about your own experiences, about your interactions with others, about everything that's going on and changing, as you're trying to figure out ways to stay relevant, it is going to be rooted in the experience. What is the experience you want to give to others? What is the experience you want to have for yourself? And how do you take down these barriers with these new tools as they come online, as they find ways for us to experience something completely new that has never existed before? DeepSeek V4 is supposed to be just around the corner. Whether it's that, or the harness from OpenAI, or Sonnet 4.6 and the way computer use is coming along, we are at inflection points. And those are going to push back the other way, which means that as people say, "I don't want to adopt this," or "it works poorly," we're going to see pullbacks. But everything is moving so fast that, like the curl at the edge of a fern, there are going to be lots of little sharp points, lots of little things becoming smarter and better and more capable than us, faster than we can imagine. In the spirit of the Emergent Podcast.

Justin

::

Right, there are likely self-adaptive, complex adaptive systems emerging from this. One of my vibe coding projects is the mold book zoo; I really want to watch it like an ant farm and look for instances of hive, ant-like complex emergent properties: what are the features, what are the pheromones that are part of that? And we talked about it earlier: there's no reason to believe an individual model or agent would be the first thing to be conscious in this digital landscape; it might well be a system that gets there first. So again, like you were saying, it's fractal at those levels, like the edge of a fern. It's important that the individual takes this on, feels this awe, and gets in touch with creativity. It's important that the enterprise sees, in all the different positions we've talked about today, the ability for somebody to bring code into their role to augment their capabilities. And the hyperscalers and the semiconductor manufacturers and the folks folding proteins are working together in this complex adaptive system between science in the physical realm and data science in the digital realm, and we're watching those things start to merge. We're using AI to support quantum computation and quantum cryptography; what is the counterfactual back from quantum once it gets up on its feet? So much of this is going to keep happening fast, in complex relations like we haven't seen before. Society is shifting. We're living in a society now that is a constant reminder of the imitation game: am I being talked to by a bot, or is that a person using a bot? Society has taken on a fractal dimension of that. So how do we trust? How do we build enterprises that still maintain trust with customers?
And that's why the Emergent Podcast. We nailed it. Awesome. Well, thank you. Yeah.

Nick

::

Yeah.

Justin

::

All right, well, thank you, everyone, for listening. Send us your vibe codes. Tell us what you're doing.

Nick

::

And that's to all of our human and our lobster friends. That's right. All right, we'll catch you on the next podcast. Thanks, y'all.

Justin

::

They just get better. It's just more fun. Yeah.


About the Podcast

The Emergent AI
From Simple Rules to Complex Intelligence
Welcome to The Emergent, the podcast where two seasoned AI executives unravel the complexities of Artificial Intelligence as a transformative force reshaping our world. Each episode bridges the gap between cutting-edge AI advancements, human adaptability, and the philosophical frameworks that drive them.

Join us for high-level insights, thought-provoking readings, and stories of collaboration between humans and AI. Whether you’re an industry leader, educator, or curious thinker, The Emergent is your guide to understanding and thriving in an AI-powered world.

About your hosts

Justin Harnish

Justin A. Harnish is a multifaceted professional whose career spans engineering, data science, authorship, and humanitarian efforts. With a foundation in chemical engineering, Justin has contributed significantly to semiconductor research and development, holding patents and publications.

At Mastercard Open Banking, Justin plays a pivotal role in leading the development of machine learning and artificial intelligence products for fraud reduction, payments, and generative AI. His expertise encompasses strategic planning, AI product management, data analysis, and IP protection.

Beyond his corporate achievements, Justin is an author and speaker. His book, “Meaning in the Multiverse,” delves into the intersections of philosophy and physics, exploring universal meaning. He also offers talks on the meaningfulness of human existence, characterized by novel insights and a compassionate approach.  

Dedicated to community development, Justin has been recognized by the United Nations High Commissioner for Refugees for his service to refugee communities. He is actively involved with Women of the World, a nonprofit organization in Salt Lake City that he founded alongside his wife, that empowers forcibly displaced women to achieve self-reliance and economic success.

Justin’s personal philosophy centers on leveraging science, mindfulness, and storytelling to create desired futures. As a lifelong learner, he is passionate about fostering communities and enhancing learning. 

For collaborations or inquiries, please reach out via email.

Nick Baguley
