Imagine something happened today that didn’t go the way you wanted. Or the way you’d expected it to go. Maybe it was a decision you made that didn’t work out. Or a decision somebody else made that impacted you. Your choice, your circumstance.
We have a range of reactions we can reach for. They range from curiosity on one side to frustration and anger on the other. We all have defaults.
But as we accumulate wisdom, the most worthy defaults are those that swing the dial to the side of curiosity.
As I was mulling this idea after a few experiences recently, I realized this is just another way of articulating the idea of a learner mindset versus a judger mindset – from Marilee Adams’ book, Change Your Questions, Change Your Life.
It comes down to whether we habitually ask ourselves and others learning questions or judging questions.
And whether we do it when things are not going our way.
An investigation into Meta’s AI smart glasses by two of Sweden’s most established newspapers, Svenska Dagbladet and Göteborgs-Posten, took them to Kenya, where Meta’s data annotation partner Sama employs human data annotators. These workers help make the glasses “smart” by annotating and labeling the various images users see.
As part of this job, the annotators get a window into the lives of the wearers – except sometimes the window is much more revealing than the glasses’ wearer might realize.
“I saw a video where a man puts the glasses on the bedside table and leaves the room. Shortly afterwards his wife comes in and changes her clothes”, one of them says.
“Someone may have been walking around with the glasses, or happened to be wearing them, and then the person’s partner was in the bathroom, or they had just come out naked”, an employee says.
“There are also sex scenes filmed with the smart glasses – someone is wearing them having sex. That is why this is so extremely sensitive. There are cameras everywhere in our office, and you are not allowed to bring your own phones or any device that can record”, an employee says.
One annotator sums it up: “You think that if they knew about the extent of the data collection, no one would dare to use the glasses”.
There’s a cost to smart glasses getting smarter. It helps to be thoughtful about when we’re comfortable with our data becoming training data.
An “um” or a “like” adds nothing good to a sentence – beyond making you sound less thoughtful.
I’ve known this for a while. Gone through phases where things got better. Then regressed.
But this is one of the projects I want to go after now. Sixth time (given at least five recorded attempts from the past) is a charm.
No new insight here. No clever framework. Just the recognition that it’s time to try again – and that the right system for me is simple: frequent check-ins and daily reminders.
In 2015, two geographers noticed solar panels popping up on houses in their small US state of Connecticut. Curious, they set out to see if they could figure out what predicted who had them. Would they be in richer homes? Or in areas with higher population density?
Early adopters of solar panels tend to be people who are interested in innovative technology, who find an installer they trust, and who think having solar panels will benefit them.
But once an early adopter made their choice, the geographers found, a cluster would spring up around them. Having solar panels on a house near you, where you could see them and talk to a real live person who had them, it turned out, was the biggest predictor of whether you’d get them yourself.
Soon the Connecticut study was being replicated – in Sweden, in China, and in Germany, where they actually put a number on it. Rooftop solar installations were most influential, they found, on neighbors who lived within one kilometer (source: TED ideas).
The truth, of course, is this applies well beyond installing solar panels. Solar panels are just physical manifestations of the proximity principle.
People who prioritize their health are more likely to have friends who prioritize their health. And so on.
At the end of every round trip around the sun, I write a summary of the biggest lessons I’ve learnt. They’re like software release notes and this is version 37. As I think of the biggest lessons I’ve learnt, I look for the biggest ways I’ve changed how I operate. To learn and not to do is not to learn after all.
On the face of it, this past year had a lot going on.
The craft of product management changed completely thanks to Large Language Models – both in what we build and how we build. I spent much of the year learning the new “what,” shipping two products that rank among the most meaningful of my career. I went through a full career exploration and made a significant change. Health stayed a high priority – walking more, eating better, being more thoughtful about what I put into my body and when. I made progress on some significant learning projects with my kids. And late in the year, I made a commitment to be the most patient version of myself with my wife – realizing, only 13 and a half years after marriage, that I’d had my priorities backward. Better late than never.
Any one of those could make for a significant year. All of them happened in the same twelve months. And of course, interspersed were all the expected fumbles, stumbles, and downs that are part of the day to day.
But when I look underneath all of what went well, there’s just one foundation – learning how to learn.
In every single case, the same process played out.
First, get clear on what I’m solving for – block out the noise, identify the real priority.
Second, break the goal into smaller commitments and make progress in daily increments, sometimes weekly.
Third, check in every week without fail. When setbacks inevitably come – and they always do – tune out the noise, focus on what I control, change what I need to, and recommit.
That’s it. That’s the whole system.
This blog started nearly 19 years ago from a simple realization – I needed to become a learning-focused person. I wasn’t one. And 19 years of daily writing later, I’m only now beginning to appreciate just how much learning how to learn changes a life.
I think it is the foundation of a life with agency. When you commit to learning in small, daily increments, the benefits compound in ways you simply cannot see when you start. You build proficiency through practice. You build insight through reflection. And slowly – thread by thin thread – you build the kind of quiet unshakeable confidence that flows from insight and proficiency.
Perhaps most importantly, you start to see yourself as someone who can make and keep commitments – the foundation of integrity. The kind that makes you a person of value to the people around you.
19 years in. Still learning that learning follows pain and that the obstacle is the way.
I’ve been rewatching The Last Dance – the documentary about Michael Jordan’s time with the Chicago Bulls. Every time I do, something new jumps out.
This time it was a comment from Michael Jordan’s college coach. He described Michael’s insatiable appetite to get better – and then said something that stuck with me. Michael had the ability to turn it on and turn it off at will.
But, boy, he never turned it off.
Someone else added another layer to this. Michael knew that every time he played, there was somebody in that arena seeing him for the first time. And he had this deep desire to always show them excellence. It showed up in all the little things.
Aristotle said excellence is not an act but a habit. We are what we repeatedly do.
Watching Michael Jordan go about his business is an embodiment of that. Early in his career, despite all the distractions that come with being a young, famous basketball player, he lived like he was still in college. Head down. Show up. Perform at his absolute best. Make sure the team never lost.
It got me thinking – how good would it be to have somebody say that about you in your craft? That you had the ability to turn it on or off. But boy, you never turned it off.
What image instantly comes to mind when you think of the world’s deadliest animals? Mine immediately went to hippopotamuses, which are known to be deadly to anyone who strays too close to them in the water. Or perhaps snakes.
This chart from the Our World in Data team does a beautiful job of showing us the power of availability bias.
The deadliest animal – by far – is the mosquito, followed by humans.
Incredible to think the mosquito impact is separate from sandflies and Tsetse flies.
The saying – “If you think you’re too small to make an impact, try going to sleep with a mosquito in the room” – takes on an entirely new meaning when you look at this chart.
Notes on LLM RecSys Product – Edition 4 of a newsletter focused on building LLM powered products.
Quick recap: We’ve covered the central thesis that LLM recsys is the core primitive of AI-native products, how teacher models enable painful self-awareness, and how the eval loop (not just evals) drives systematic improvement.
The eval loop only works if your teacher model knows what “good” looks like. That requires rigorous product policy.
What Product Policy Is
Product policy is the crystallization of the product team’s intuition. It’s the best understanding of what a great user experience looks like.
PRDs describe deterministic features we want to build. Product Policy, on the other hand, defines behavior to exhibit – and gets encoded into the teacher model, the production models, and the eval loop.
You won’t know if your definition of quality matches what users actually value until you test. If your true north metrics – typically laddering into user retention or end outcomes – improve, your intuition was right. If not, you refine. The Product Policy evolves with user signal.
Policy Encodes Judgment
For small products, the Policy could be written entirely by one author.
However, for large products, you typically need multiple contributors who will need to debate every gray area (of which there will be many). These debates matter because policy decisions cascade through the entire system. They define what you measure, what counts as “good,” what users see.
This is where product leaders must be hands-on. You can’t delegate the constitution. Your understanding of the user and your judgment must show.
One note – even in cases of complex policies, I would recommend having one author so your Policy reads as coherent. It forces the entire team to align to one point of view vs. “you take that section and I’ll write this.”
The Rubric
Policy lives in a rubric. This could be a binary 0/1 or a more sophisticated graded rubric. Here’s an example – let’s imagine you’re an e-commerce product team and are building out the policy for a product query.
You look at the most common product queries and pick a popular one – “toaster”
Let’s explore what the rubric might look like –
Next, assuming the rubric feels right, we might make a policy decision – show only results rated 3-4. Filter out 0s and 1s, and show 2/Fair with a different UX (e.g., “Related”).
This in turn means toaster ovens are out – even though they toast. This is judgment made operational. And it cascades through millions of queries.
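As a sketch, here’s what that rubric and policy decision might look like in code. The scale labels, item names, and threshold below are hypothetical, following the toaster example above.

```python
# Hypothetical 0-4 relevance rubric for the query "toaster", following the
# policy decision above: show 3-4, surface 2/Fair as "Related", filter 0-1.
# All names and scores are illustrative, not from any real product.

RUBRIC = {
    4: "Great - a toaster that directly matches the query intent",
    3: "Good - a toaster, but a weaker match (e.g., a niche variant)",
    2: "Fair - an adjacent product a shopper might still want",
    1: "Poor - a loosely related accessory",
    0: "Irrelevant - does not belong in the results",
}

def apply_policy(results):
    """Split rated results into main results, 'Related', and filtered-out."""
    main = [r for r in results if r["score"] >= 3]
    related = [r for r in results if r["score"] == 2]
    filtered = [r for r in results if r["score"] <= 1]
    return main, related, filtered

# Toy rated results - the toaster oven gets a 2, so it lands in "Related"
results = [
    {"item": "2-slice stainless toaster", "score": 4},
    {"item": "retro 4-slice toaster", "score": 3},
    {"item": "toaster oven", "score": 2},
    {"item": "toaster-shaped phone case", "score": 0},
]
main, related, filtered = apply_policy(results)
print([r["item"] for r in main])     # shown as main results
print([r["item"] for r in related])  # shown under a "Related" treatment
```

The point isn’t the code – it’s that a single threshold choice like `score >= 3` is the judgment that cascades through millions of queries.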
The Golden Dataset
The golden dataset brings the rubric to life with examples. A complex policy might need a golden set of 500 or so examples to bring the various items in the judgment to life.
Not just “this is a 3” but “this is a 3 because…” – high-quality examples with detailed chain of thought.
The golden dataset evolves over time as you learn from user signal. What you thought was a 3 might become a 2. What you filtered out might need to be included.
The author of the policy should drive the golden dataset process – ensuring consistency, adjudicating disagreements, maintaining the chain of thought quality.
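To make this concrete, a single golden dataset entry might look like the sketch below. The fields and wording are hypothetical – the key is pairing every score with its chain of thought.

```python
# A hypothetical golden dataset entry: not just "this is a 3" but
# "this is a 3 because..." - the rationale is what teaches the model.
golden_example = {
    "query": "toaster",
    "result": "retro 4-slice toaster",
    "score": 3,
    "rationale": (
        "It is a real toaster, so it satisfies the core intent, but the "
        "retro 4-slice form factor is a niche match rather than the default "
        "a typical shopper expects - Good, not Great."
    ),
}

# The full golden set is a few hundred such entries, re-scored over time
# as user signal comes in (a 3 might become a 2, and vice versa).
golden_set = [golden_example]
assert all({"query", "result", "score", "rationale"} <= set(e) for e in golden_set)
```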
The Two Gates
Policy quality is measured through two gates:
Gate 1: Is the policy clear?
Have multiple raters score the same examples using your rubric. Measure inter-rater agreement using Cohen’s kappa.
Cohen’s kappa measures agreement beyond chance – because even random guessing produces some agreement. (Here’s the Wikipedia article and the Scikit implementation.)
The interpretation:
κ > 0.8 = strong agreement (policy is clear)
κ 0.6-0.8 = moderate (policy needs refinement)
κ < 0.6 = weak (policy is ambiguous)
If your raters can’t agree whether a result is a 2 or a 3, the policy isn’t clear enough. Sharpen definitions, add examples, debate more.
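As a sketch, Cohen’s kappa for two raters can be computed from scratch in a few lines (scikit-learn’s `cohen_kappa_score` does the same). The ratings below are made up for illustration.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's label distribution
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two raters scoring the same 8 results on the 0-4 rubric (made-up data)
rater_a = [3, 3, 2, 3, 1, 2, 3, 0]
rater_b = [3, 2, 2, 3, 1, 2, 3, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 0.64 - moderate: this policy needs refinement
```

Here the raters agree on 6 of 8 items (75%), but because chance alone would produce about 31% agreement, kappa lands in the “moderate” band rather than the “strong” one.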
Gate 2: Does the model encode it?
Once Gate 1 passes, measure how well the model’s judgments match your golden dataset. Same metric: Cohen’s kappa between model output and human-labeled examples.
Gate 1 tells you if humans understand the policy. Gate 2 tells you if the model does.
One Policy, Not Many
A common mistake: architecting your system with multiple policies for different stages – e.g., one for ranking, another for filtering, another for personalization.
The problem: every model is imperfect. If each stage operates at 80% quality, your end-to-end experience is 0.8 × 0.8 × 0.8 ≈ 51%.
Three “pretty good” stages compound into a mediocre experience.
The better approach is to build one unified policy. Define what great looks like end-to-end. Train one teacher on that complete experience. Let the production stack learn from that unified signal.
This is harder to build. But it’s the only way to avoid compounding errors through your system.
Next up: The eval loop in practice. How teams actually run it, what metrics matter, and when to invest in policy versus when to keep it simple.
One of the things that’s fascinating about the recent conversation about AI’s impact – in the media and across the internet – is that I think we’re showing a certain proclivity for what a friend hilariously termed “doomporn.”
This is similar to the love for “hustleporn” that used to be so prevalent – where everybody was talking about how hard they were hustling and working.
Doomporn has similar characteristics. There’s a lot of love for articles and points of view that forecast doom. And these seem to not just capture imagination but also ripple into the markets in ways that are fascinating to watch. We can name any number of these articles in the past weeks.
Who’s right? Nobody knows.
And that’s one of four truths worth reminding ourselves of –
First, AI is going to bring negative impact along with positive impact – like every major technology. The internet connected all of us – but it also polarized all of us. Television and video gave us entertainment and information – but they also isolated us. Name the technology and you’ll see accompanying negative impact that is proportional to the positives. AI will be no different. It definitely behooves us to be thoughtful about the impact (e.g., Noah’s post).
Second, nobody knows the future. Everybody is guessing and placing bets. It’s possible some people’s bets are going to be better than others. But it’s certainly hard to call right now.
Third, everybody’s talking their book. This is such a massive financial story that there’s a lot of financial interest no matter what the point of view is. It’s natural for everybody to be biased toward their financial interest and their bets – because they want their bets to succeed.
Finally, while there’s a lot of talk of doom, and who knows, there might be a non-zero chance the threat is existential in the next decades – the truth is this is a moment of change. And the best way to deal with change is to embrace it.
If you’re in the workforce, the message is as clear as ever. Use the AI tools and make them part of your workflow.
The doomporn conversations might be annoying. But let that not bury the lede – the change is real. And we’re better off embracing it.
We were at a somewhat remote place for a few days recently and had rented a car. We ended up buying enough groceries to last us a few days and had enough food throughout – so right at the end of the trip, we took the rental car and drove straight back to the airport.
Our kids pointed out that we didn’t end up using the car at all. Maybe we shouldn’t have rented one. Maybe we could have taken a taxi for example.
The conversation we had then was about the fact that the car had brought us optionality. If we hadn’t gotten the right amount of food or groceries, we knew we could always drive out of the remote location, get to a grocery store, and make it back.
The meta lesson – optionality always costs something. This is obvious when we buy refundable tickets or accommodation anywhere we go. However, it is less obvious when the costs are hidden. For example, if we’re choosing a path that gives us optionality in our career, that’s totally fine. But there’s a cost to doing so vs. say specializing.
It isn’t that optionality is right or wrong. It’s just important to be thoughtful about when we choose to pay that cost – and when we choose not to.