LLMs: An Operator’s View


Many of you who read my articles are operators of some kind.

You may run one or many teams, or even a whole company. And, even if you are not a manager by definition, you may wield a great deal of influence over directions and decisions.

In the midst of the current LLM explosion, we as operators find ourselves amongst:

  • A blistering pace of improvement in the capabilities of LLMs. New models and products are being released at a rate that is hard to keep up with.
  • Immense noise and hype online making all sorts of claims, good and bad, about what the future holds.
  • An expectation from our companies to go full-on with “AI”, which typically means LLMs, both in developer tooling and in customer-facing products. AI is the new data is the new cloud.
  • Echoes in the industry that we are all now overstaffed as a result of productivity gains: that everyone should do more with less, and that AI is the answer to that.

Note: this article is not a technical overview of how to build products with LLMs. Instead, the intent is to touch upon what leadership should do from the perspective of the productivity of teams and organizations, and consequently how we should think about spending our budgets to make that happen. There are plenty of hot takes out there on AI. This is not intended to be one of them.

What we’ll cover related to LLMs is:

  • The (real) rising floor of developer productivity.
  • The changing size of organizations.
  • The increasing importance of code reviews.
  • The changing nature of interviews and identifying talent in short spaces of time.

The intent is that this should provoke thought and discussion, and will hopefully help you think about how to allocate your budget and focus in the coming months and years.

The Floor Is Rising

With Copilot, Cursor, Cline, and other LLM-based developer tools, the floor of developer productivity is rising.

At the time of writing in 2025, I believe even the most AI-skeptical developers are now seeing the productivity gains that LLMs can provide. Yes, several years ago the promise and the hype far outweighed the consistent proof of benefits, but in a post-GPT-4 world, LLMs have become an integral part of the developer’s toolkit, even if just for fast research or rubber ducking rather than agentic pair programming.

I don’t know many developers that would give them up now, myself included. I go too fast with them to go back to the old way of doing things.

If for some reason (!) you haven’t fully leaned into LLM-assisted coding yet, the benefits are plentiful:

  • LLMs are fantastic for getting over the cold start problem of a new idea. You can go from nothing to a throwaway prototype in no time at all, starting with a vague prompt of what you want and iterating on it. There are numerous “vibe coding” projects that are generating some serious revenue.
  • You can use a prompt to sketch out whole architecture ideas with back of the envelope calculations and tradeoffs.
  • Copilot-style autocompletion is now very good at unlocking the next step in your thought process.
  • Agent-based tools like Cursor or Copilot Chat, when kept under control, can be a great way to get a lot of boilerplate code written quickly.
  • Writing tests, and therefore driving up code coverage, is now much easier. LLMs can write tests for you, and agent-based tools can execute the red-green cycle for you as you go.

If you haven’t yet spent an afternoon or evening with Cursor, then please, please, please make time and see how fast you can go from a blank page to a fully functioning hobby project. It is incredible how fast you can go from nothing to something.

So in terms of the Gartner hype cycle, we are clearly on the slope of enlightenment. The tools are getting better, and they are getting better fast. It is unclear how far the Bitter Lesson will take us, and predictions currently range from being on the cusp of the plateau of productivity to full-blown AGI, but it is clear that an organization that does not embrace LLMs will be left behind by its competitors.

As an operator, up-skilling your team to use these tools is now essential. Securing the necessary budget to give everyone access to the Pro tiers of ChatGPT, Cursor, or whatever tools represent the best fit for your team is a table stakes activity. And yes, this does mean that your budget will increase, but the productivity gains from an existing team will more than make up for it. Trade the cost of hiring new people for the cost of acquiring tooling.

You should also take the adoption of this tooling seriously. It is not just a case of giving everyone subscriptions and hoping for the best. You need to invest time and effort into training your team on how to use these tools effectively.

  • Run a survey to see what tools your team is already using and how they are using them. As part of the survey, identify which of your engineers are already fully ingrained in the new LLM workflows and which are not.
  • Identify champions based on the previous point and have them run training sessions, and also overindex on pair programming with those who are less familiar with (or more skeptical of) the tools.
  • Promote a culture of sharing best practices and tips for using LLMs. Get your champions to lean in and share their workflows and processes with the rest of the team. Videos work wonders here.
  • Track the usage of AI tools over time as you adopt them. For example, Cursor offers team analytics, and you can see how many lines of code are being generated and accepted. Use this as part of the feedback loop to see how your team is progressing. Is usage increasing or decreasing? Why?
  • Cross-reference the usage data with other metrics you are collecting. For example, how is the average number of commits to the codebase changing as tool usage increases? What about the number of incidents or reported bugs? What’s happening with your DORA metrics as a result?

Focus on showing that the tools are making a difference, and this too can be motivation to bring skeptical engineers on board.

Organization Sizes Are Changing

Given that the way that we create software has changed, there is another operator’s consideration: the size of your organization.

Layoffs have been rife since the end of ZIRP. Overlapping this period has been the rise of LLMs, and in some cases the two have been conflated: organizations haven’t shrunk purely because of AI efficiency gains, nor purely because of the macroeconomic environment; the two are becoming somewhat intertwined, if you believe what these companies are saying.

However, it is true that, from a company-operator’s perspective, the productivity gains from LLMs (hard to quantify exactly, but definitely real) allow you to do more with less.

And amongst a tricky economic environment, instead of staying same-sized and increasing output, there has been a trend in many organizations to reduce headcount and combine this with AI tooling to (sort of) maintain the same level of output.

If you think about it, many of the world’s largest companies are (or were) staffed to pre-LLM productivity levels off the back of ZIRP, and you could argue that a large chunk of money that used to pay salaries has consequently been exchanged for a smaller chunk of money that pays for tokens and subscriptions.

One could even argue, especially at large companies, that if all developers could go, let’s say, twice as fast with the new tooling, then other bottlenecks would appear that would limit the speed of progress anyway, so less really is more.

These bottlenecks may already exist: the sometimes glacial speed of making decisions, the amount of change and new features that your users can stomach at once, the time it takes to go through cycles of shipping and learning and iterating and so on.

Many companies are already at the point where they effectively block the speed of their own progress in ways other than the number of developers they have. Making those developers faster may not actually help them ship more features; in fact, it may make things worse.

Maybe you work for a company like this.

Going back to the operator’s perspective, if you currently work for a small or medium-sized company, a good idea would be to focus your attention on giving everyone access to the right tools and training to become more productive before you go on another hiring spree. Get everyone coding like they should be coding in 2025 first, assess and prove the productivity gains, get your tooling in place, and then look at hiring more people.

And remember that tooling goes beyond developers: we’re talking about all employees. A pro subscription to ChatGPT is just as useful for a marketer as it is for a developer. Giving each employee in a 50-person company a ChatGPT Pro subscription is still cheaper than hiring a senior developer or two. Think about macro efficiency gains across the whole organization, not just in engineering.
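To make the budget conversation concrete, here’s the back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not actual list prices; plug in your vendor’s real pricing and your local compensation data:

```python
# Back-of-the-envelope: org-wide AI subscriptions vs. an extra hire.
# All figures are illustrative assumptions -- substitute real numbers.
HEADCOUNT = 50
MONTHLY_SUBSCRIPTION = 200       # assumed per-seat "Pro" tier cost, USD
SENIOR_DEV_TOTAL_COMP = 180_000  # assumed fully-loaded annual cost, USD

annual_tooling = HEADCOUNT * MONTHLY_SUBSCRIPTION * 12
print(f"Tooling for everyone: ${annual_tooling:,}/year")  # $120,000/year
print(f"One senior developer: ${SENIOR_DEV_TOTAL_COMP:,}/year")
```

Even with these deliberately generous per-seat numbers, equipping the entire company costs less than one additional senior hire.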

Reviews Are More Important Than Ever

The flip side of the productivity gains is that more code is being written, and, most importantly, not all of it has been as carefully thought through as handcrafted code.

If you’ve used Cursor without specifically prompting it to slow down, go step by step, and ask for your input frequently, you’ve likely seen it go off and blast out hundreds of lines of code that are hard to keep track of.

Now, this is great for getting a prototype up and running, but it is not so great for production code: the code generation starts to go faster than you can meaningfully comprehend it as a human, and bugs can be introduced that are hard to spot. In the best case, the code can be messy or unoptimized. In the worst case, it can be full of security holes that could seriously compromise your organization.

As such, with the faster production of code, it is more important than ever as an operator to ensure you have a strong review process in place: if your most senior engineers were getting a half-arsed rubber-stamp thumbs up from their peers (not advised, but it happens), you now need to ensure that all code is scrutinized, as its origins are less clear.

You could:

  • Make it clear to your organization that even though LLMs can generate lots of code almost instantly, human reviewers can only digest so much. Keep PRs small, commits clear, and code easy to read.
  • Increase the number of required reviewers on your PRs. For example, go from one reviewer to two. You could also have engineers flag their own PRs that have heavy LLM usage to call out that they need extra scrutiny.
  • Give people a refresher on security best practices (shock horror!) so they can be better aware of when LLMs are generating code that is insecure.
  • Make improvements in your incident postmortem process to ensure that you are learning from your mistakes. Share any production issues that stemmed from overlooked generated code widely across the organization so that everyone can learn from them.
  • Investigate AI tools such as DeepCode by Snyk or Graphite’s Diamond that could help detect issues in code before it is even reviewed by a human.
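The “keep PRs small” point can even be enforced mechanically. Here’s a minimal sketch of a CI guardrail; the 400-line threshold and the ref names are assumptions to tune for your team:

```python
# A minimal CI guardrail that flags oversized diffs for extra review.
# The threshold is an assumed team policy, not a universal rule.
import subprocess

MAX_CHANGED_LINES = 400  # assumed review-size budget; tune to taste

def count_numstat(numstat_output: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_output.splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added != "-":  # binary files report "-" for both counts
            total += int(added) + int(deleted)
    return total

def diff_too_large(base: str, head: str) -> bool:
    """True if the diff between two git refs exceeds the budget."""
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...{head}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_numstat(out) > MAX_CHANGED_LINES
```

You could wire this into CI to fail the build, or, more gently, to auto-apply a “needs extra review” label so humans know to slow down.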

Am I Even Interviewing You? Or Your LLM?

The typical tech interview process for individual contributors, which involves some combination of coding challenges, whiteboarding, and system design, has had another curveball thrown at it by LLMs.

When interviewing remotely, we may have previously been concerned about candidates using a search engine to look up answers, but now we have to consider that they might be side-channeling all of the questions to an LLM.

If you are an interviewer, how can you tell whether, off to the side of the Google Meet window, there is another browser window with a prompt open? By the time you have described the system design specification, the candidate could easily have typed it into the prompt and gotten an incredibly detailed and plausible answer back.

And hey, don’t just take my word for it, try it: open Grok and type “I am doing a system design interview. Help me with it. I have to design Instagram from scratch. Give me back of the envelope calculations and follow the structure of the ByteByteGo book.”

Scary, huh?

If your candidates are good at placing their windows on the screen in the right places and keeping their eye movement under control, you might not even notice that they are doing it. How are we meant to get good signal from candidates now that we can’t figure out if we’re talking to them or a prompt?

If you want to test a candidate completely without LLM assistance, you could ask them to share their entire screen so you can see what is going on. However, this feels invasive. A lighter-touch version is to have the interviewer share their screen instead, tackling problems together via pair programming and high-bandwidth conversation, where it would be hard for the candidate to type the questions into a prompt and pass off the answers as their own.

Alternatively, you could go in the complete opposite direction: accept that LLMs are now part of the job, and like the rest of the article, embrace them.

For example, if you want to hire people that can tackle large and ambiguous problems quickly with LLMs, get them to demonstrate these skills in the interview. This is similar to exams at school that are open book: you can use whatever resources you want, but you have to demonstrate that you know how to use them and that you can think critically about the answers that they give you.

The choice is yours as an interviewer: either allow LLMs or don’t, but be explicit about it ahead of the interview so that the candidate knows what to expect. If you do allow LLMs, you should also be clear about the rules of conduct in the interview: are they allowed to use them for everything? Are they allowed to use them for some things? What are the boundaries? Don’t make them guess.

Regardless of which way you go, you’ll need to adapt your interview process to ensure that you are getting the right signal.

  • Having candidates solve LeetCode problems is not going to work. LLMs can easily dump code for doubly linked lists and binary trees, and annotate the answers with all of the big-O complexities attached.
  • Instead, questions that you ask should be sufficiently ambiguous that part of the interview is figuring out the specific requirements of the problem and what the code or system should do. Doing this in a conversational manner is a great way to see how the candidate thinks, and if you’ve not allowed LLMs to be used, it should be obvious through long periods of silence if they’re trying to bend the rules.
  • Spot-test their knowledge: the interviewer should be able to interrogate components of the candidate’s solution as they go along, asking questions about the design and implementation that highlight whether the candidate actually has knowledge here, or at least is able to think about their solution critically and from first principles. For example, if they think a cache should be implemented, ask them why, and what the tradeoffs are. Ask for some examples of caches they have used before and how they worked. Pick a point in the solution and go fully down the rabbit hole with them. Think latency, throughput, and failure modes. Answers should be fairly instantaneous if they know their stuff.
  • If candidates come to a solution quickly, see if there are alternative ways in which they could have approached the problem. For experienced engineers, it should be possible to have a conversation about the tradeoffs of different approaches to the one they have taken. If they’ve used a batch processing system, ask about how a streaming system could look. If they’ve written code that is synchronous, ask them how they would make it asynchronous, and so on. Probe deeper.
  • Use methods of collaboration that LLMs are not good at. For example, a shared whiteboard is a great way to think about problems together, interactively, proving that you are really working together with the candidate in the same way that you would if they were a new hire.

Design your interview process to find the kinds of candidates that you really want to work with. If you’re looking for people that are great at using LLMs, then have your interview process find these people. Be open about it.

If instead you value candidates that are great at coding or design solutions unassisted despite the tools we all now have available, that’s also fine, but be open about that too. Let them know way ahead of time that this is how it’s going to be. You can’t have it both ways, and you need to design your process accordingly to get the right signal.

And That’s a Wrap

If you haven’t already, you need to start bringing your team(s) into the present day. Software development isn’t just changing, it has changed, and if you haven’t been adapting already, you’re getting left behind. This isn’t just important for your company, but it’s also incredibly important for your employees: you owe them access to the best tools available to do their jobs.

Happy prompting.

Should managers still code?


This month we have a mailbag question from a reader who asks:

Hi James,

Your newsletter “being in the details” resonated with me.

I would like to ask your opinion about managers writing code. I skim PRs but don’t critically review them and don’t write code. Should EMs write code in their day job?

Thanks!

It’s a great question, and given the higher scrutiny we’ve seen on the role of engineering managers in the past few years, it’s worth spending some time on it.

Before we go any deeper, the short answer is that it depends exactly on what you mean by coding. I think that there is a big difference between being in the code and writing code. All managers should be in the code, but not all managers should be writing code.

But spending some time digging into the nuances of the question can, I think, highlight both the practical aspects of ensuring that managers are sufficiently in the code, and, more importantly, identify the existential worry that many engineering managers may have about their role, and what they can do about it.

Let’s get going.

Refresher: measuring your impact

We’ll begin by revisiting Andy Grove’s equation for measuring a manager’s impact, which states that the output of a manager is the output of their team, plus the output of the neighboring teams under their influence. This is always useful to refer to when thinking about how to spend your time. I contemplate it a lot.

There are, of course, a multitude of ways to increase your output as a manager. These include, but are not limited to:

  • Hiring and retaining great people.
  • Owning the team’s strategy and roadmap, and ensuring efficient execution.
  • Making decisions to ensure that the team is working on the right things and saying no to the things that don’t matter.
  • Dealing with fires, escalations, and other crises that pop up all of the time.
  • Building a strong culture within the team so that people are engaged, challenged, and motivated.
  • Mentoring and coaching your reports so they get better and can have more work delegated to them, thus increasing output further.
  • Managing the team’s stakeholders so they can offer their steer to the team early and often.
  • Actively performance managing the team so that superstars can continue to shine and underperformers can be coached or exited.
  • Building close working relationships with other teams so that smooth collaboration happens across the organization, leading to a better and more cohesive product.

And so on.

Now, it isn’t a stretch to say that engineering managers, who typically get to a senior individual contributor level before transitioning to management, are also often very good at writing code.

However, there’s a whole team of people working for them who are also very good at writing code, and typically they won’t be spending as much time on the items listed above, which means they can be more productive at writing code than the manager can.

Surely, then, it makes sense for managers to focus on the things that only they can do, and leave the coding to the people who have more time for it and are better at it as a result?

Right?

Or is there something else at play here?

2025 cultural backdrop

The last few years have created somewhat of an existential crisis for engineering managers.

The tail end of the Covid-19 pandemic saw the zero-percent interest rate environment come to an end, bringing high inflation and interest rates, the end of cheap debt and plentiful investment, and a consequent slowdown in growth and corrective layoffs at many technology companies. Throughout this, managers have felt the brunt of the “great flattening”, where the number of managers has been reduced in favor of more individual contributors.

This has typically meant that the average number of direct reports per manager has increased, the amount of scope per manager has increased too, and the number of total layers in the org chart has shrunk.

This, therefore, feeds into a narrative that managers worth their salt need to be more productive, more impactful, and more efficient than before.

Depending on where you read your opinions on the internet, it can also build a more toxic picture: that managers are just unnecessary overhead for having some number of individual contributors; that it is a non-technical job for those who aren’t as good as others at actually building the product.

Hmm.

For those that made a conscious decision to move into management and learn this new role, and especially for those who take their craft seriously as managers, it can feel like the tide has turned against them.

As such, engineering managers, and the organizations that they work for, are pushing for ways for managers to differentiate themselves, typically by being more technical, more hands-on, and more in the details than they may have been before.

This isn’t a bad thing. When done right, it can be extremely beneficial. I wrote extensively about being in the details from a senior leadership perspective.

I would argue that being in the details is the key tenet of not just being a great manager in the climate that we find ourselves in, but also being a great manager full stop.

For senior leaders such as Directors and VPs, being in the details covers ideas such as having ICs report to you, as well as managers; doing regular, hands-on deep dives into the architecture and codebases your team owns; mixing up 1:1s with pair programming sessions, code reviews, and other technical activities, and more.

But what does this mean for frontline engineering managers? Is the new normal just about writing more code and doing less of the other things that peacetime managers would normally do?

Writing code versus being in the code

It’s more nuanced than that.

Here is a list of statements that represent how I would want my engineering managers to be in terms of their relationship with the codebase:

  • Should they be able to write code? Yes.
  • Should they understand how the codebase and their features and services are built? Yes.
  • Should they be able to do code reviews? Yes.
  • Should they review all design documents and architecture proposals from their team? Yes.
  • Should they be able to debug and triage production issues? Yes.
  • Should they be able to pair program with their reports? Yes.
  • Should they be accountable for the quality of the code that their team produces? Yes.
  • Should they write code themselves? Maybe.

Why maybe?

It depends on the manager, the team, and the organization. As a senior leader, I would rather my managers be in the code as per the above list, but not necessarily putting themselves in the critical path by writing code, given that they are likely to be interrupted more often, have more meetings, and be pulled in more directions than their reports.

I expect you to know how everything works: if I asked you to show me how a feature works by tracing through the code, you could do it. However, I would also know that there are other people on the team who are better placed to be the primary implementers of features in the area that you understand really well.

But, if you are a manager that is absolutely itching to write code and stay close to the details, then so be it and more power to you.

Here are some approaches that might work for you:

  • Explicitly set aside uninterrupted time for coding. This could be a day a week, or a few hours a day, or whatever works for your team. Make sure that your team knows that you are doing this, to keep interruptions to a minimum. Block it out in your calendar, and set your statuses accordingly.
  • Pair program with your reports. This is a great way to get into the code, and also to mentor your reports at the same time. It’s a win-win. Sure, you’re not taking the lead on the coding yourself, but you’re in the details by proxy of collaborating with others. I love doing this myself.
  • Do code reviews. Don’t just skim PRs (sorry, reader!), but really dig into them: run the branch locally, test it, think critically about the design and the implementation, and provide feedback. Record a video of your review to highlight things that could be better.
  • Increase your coding involvement during specific occasions. Depending on how you and your team work, you may find that prototyping phases are where you can get your hands dirty effectively. That code isn’t going into production, so go to town! Alternatively, you might be great at digging in during incidents or times of high stress. Find the times that work for you and your team, and lean into them. This approach isn’t about spending a consistent amount of time coding every week; it’s about finding the right times to be in the code where you can be most effective.
  • Find time for exploratory coding that expands the knowledge of your team. Block out some regular time to stay on top of the latest technologies in a way that is fun and engaging for you, sharpens your skills, but also brings back learnings to your team. For example, if your team uses Apache Flink and the latest version has new aggregation functionality that could make your pipelines simpler, spend a couple of hours doing a prototype with it for a team-specific use case and then share it back.

And finally…

Going back to the original question: should managers write code? I think it’s best to rephrase that.

Should managers be in the code? Yes, absolutely.

Should managers write code? Maybe, but it also depends on what you mean by writing code.

If you mean being the primary implementer of features, then probably not. If you mean being an integral part of how your team produces code, then yes, absolutely. I recommend it highly.

So there we are.

I enjoyed doing this month’s article as a mailbag question. If you ever have a question that you think would make a good article, please send it in! I’d love to hear from you, and I’ll try my best to answer it.

Happy coding (in some form), managers.