OKR design as actually shipped, not as conference-talk theory. Outcome statements that drive decisions, key results that measure the right thing, scoring discipline, mid-quarter recalibration, and the difference between sandbagged OKRs (always 100%), aspirational OKRs (always 30%), and stretch OKRs (genuine ambition with quarterly accountability). Triggers on OKR design, OKR setting, key result design, OKR scoring, mid-quarter recalibration, OKR cascading, outcomes vs outputs, quarterly planning, goal setting. Also triggers when a team's OKRs are always hit and produce no learning, when OKRs are demoralizing because they were set as fantasy, or when the team uses OKR vocabulary but the practice has decayed.
A senior product leader's playbook for OKR design as actually shipped, not as conference-talk theory. Outcome statements that drive decisions, key results that measure the right thing, scoring discipline, mid-quarter recalibration, and the practical disciplines that distinguish OKRs from quarterly to-do lists or impossible-fantasy goal-setting.
OKRs are accountability infrastructure. When designed well, they produce a quarterly rhythm of ambitious goal-setting, mid-quarter learning, end-of-quarter scoring, and adjustment for the next quarter. When designed badly, they become a tax on the team that produces no signal: sandbagged OKRs that always hit 100% (no ambition, no learning), aspirational fantasy OKRs that nobody can hit (demoralizing, ignored after week 2), or vague OKRs that the team scores generously regardless of outcome.
This skill is OKR design as practical methodology. The teams that benefit from OKRs are the ones that hold the discipline: outcomes over outputs, key results that actually measure the outcome, scoring honestly even when uncomfortable, recalibrating mid-quarter when warranted, and using the OKR review cadence to drive learning rather than performance theater.
The voice is the senior product leader who has run OKRs in healthy organizations and watched the practice decay in others. Concrete, opinionated about what actually works, willing to call out the failure modes that conference talks gloss over.
When to use this skill: designing OKRs for the next quarter, auditing why current OKRs are not driving decisions, recalibrating an OKR practice that has decayed, or onboarding a team to OKRs that has never used them.
This skill spans OKR design and the operational rhythm around them. The PM-skill distinction:
okr-design (this skill) is outcomes (results to be achieved).
roadmap-planning is outputs (features and initiatives sequenced).
feature-launch-playbook is post-ship execution.
product-analytics-setup is measurement infrastructure (the metrics that key results depend on).
experiment-design is the discipline for testing whether specific initiatives produce outcomes.
discovery-research-synthesis informs which outcomes to pursue.
The audience: senior PMs, product directors, engineering leaders, executives setting org-wide OKRs, in-house teams operating in OKR-driven cultures.
What is not in scope: the broader strategic planning that decides which outcomes matter (other strategy frameworks); the execution of specific initiatives toward OKRs (covered by roadmap-planning, pm-spec-writing, feature-launch-playbook); the analytics infrastructure (covered by product-analytics-setup, analytics-strategy).
The keystone framing.
Sandbagged. OKRs designed to hit 100%. The key results target outcomes the team is already on track to deliver. End of quarter: 100% scores across the board. The team celebrates; nobody learns anything. Output: an OKR practice that produces no signal. The team has the same OKRs every quarter dressed in different vocabulary because nothing pushes them past where they would have gone anyway.
Aspirational-fantasy. OKRs that nobody can hit. 1000% growth in 90 days. Demoralizing, performative, ignored after week 2. Teams that ship aspirational-fantasy OKRs typically discover by week 6 that no realistic effort path produces the targets; they disengage; the OKRs become decoration on the planning doc that nobody references.
Stretch. Genuine ambition with quarterly accountability. Designed to hit 60-70% on average. Hits and misses both teach something. The 60% case ("we averaged 60% across our key results") is informative about what the team can deliver in a quarter; the 100% case is rare and usually signals sandbagging in retrospect; the 30% case signals either fantasy or something unexpected that the team ran into (which is also informative).
The litmus test. Look at the team's last four quarters of OKRs. If the average score is 95%+, the OKRs are sandbagged. If the average is below 30%, they are fantasy. If the average is 50-75%, the design is in the stretch zone. Adjust upcoming OKRs to bring the practice into stretch territory.
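The litmus test reduces to a small amount of arithmetic, sketched below under a few assumptions: scores are expressed on the 0.0-1.0 scale, the function name `classify_okr_practice` is hypothetical, and the "borderline" label for averages the text does not classify (30-50% and 75-95%) is this sketch's own choice rather than part of the method.

```python
from statistics import mean


def classify_okr_practice(quarterly_scores):
    """Classify an OKR practice from its recent quarterly average scores (0.0-1.0)."""
    if not quarterly_scores:
        raise ValueError("need at least one quarterly score")
    avg = mean(quarterly_scores)
    if avg >= 0.95:
        return "sandbagged"  # 95%+ average: targets were already on track
    if avg < 0.30:
        return "fantasy"     # below 30%: targets nobody could hit
    if 0.50 <= avg <= 0.75:
        return "stretch"     # 50-75%: the zone the practice should aim for
    # Assumption: the text gives no verdict for 30-50% or 75-95%,
    # so this sketch only flags those averages for a closer look.
    return "borderline"


# Hypothetical last four quarters of a team's average OKR scores.
print(classify_okr_practice([0.60, 0.70, 0.55, 0.65]))
```

A "borderline" result is a prompt to read the individual quarters, not a diagnosis in itself.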
Objectives are outcome statements. They name what the team is trying to achieve in the quarter.
Strong objective characteristics.
Worked examples.
Weak objective characteristics.
Detail in references/objective-design-patterns.md.
Key results measure progress toward the objective. They are the quantitative or testable indicators that show whether the objective is being achieved.
Strong key result characteristics.
Worked example. Objective: "Improve activation for new sign-ups."
Strong key results:
Each key result is measurable, ties to activation, the team can influence it through onboarding redesign work, and is time-bounded to the quarter.
Weak key results.
The 3-5 key results rule. Most objectives benefit from 3-5 key results. One key result is fragile (single measurement may not capture the outcome); 6+ key results dilute focus.
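As a quick audit aid, the rule can be written as a check over a draft objective. This is a minimal sketch: the function name and return labels are hypothetical, and treating two key results as "too_few" is an assumption, since the text only calls out one as fragile and six or more as diluting.

```python
def check_key_result_count(key_results):
    """Return 'ok', 'too_few', or 'too_many' for a draft objective's key results."""
    n = len(key_results)
    if n < 3:
        # One KR is fragile per the rule; flagging two as well is an assumption.
        return "too_few"
    if n > 5:
        return "too_many"  # 6+ key results dilute focus
    return "ok"


# Hypothetical draft: a single key result under an activation objective.
print(check_key_result_count(["Increase first-week activation rate from 32% to 45%"]))
```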
Detail in references/key-result-design-patterns.md.
When and how to cascade OKRs from leadership to teams.
The trade-off.
The middle path.
When to cascade strictly. Early-stage companies aligning around a small number of company priorities. Times of strategic shift where the org needs to move in a coordinated direction.
When to cascade loosely. Mature organizations with established team mandates. Cross-functional teams where strict cascading would over-constrain.
The honest disclosure. Cascading is harder than conference talks suggest. Most orgs over-cascade in the first few cycles and learn to relax it; some never learn and produce OKRs that are increasingly performative as they propagate down.
Detail in references/cascading-okrs-decisions.md.
End-of-quarter scoring is where OKR practice succeeds or decays.
The 0.0-1.0 scale. Each key result scores from 0.0 (no progress) to 1.0 (fully achieved). The objective scores as the average of its key results.
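One way to make the scale concrete is sketched below. The averaging of key results into an objective score follows the text; scoring a single key result by linear progress from baseline to target, clamped to 0.0-1.0, is an assumption of this sketch (teams may grade differently), and the `KeyResult` class and the sample `actual` value are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class KeyResult:
    name: str
    baseline: float  # value at the start of the quarter
    target: float    # value the key result commits to
    actual: float    # value measured at the end of the quarter

    def score(self) -> float:
        """Linear progress from baseline to target, clamped to 0.0-1.0 (assumed rule)."""
        if self.target == self.baseline:
            raise ValueError("target must differ from baseline")
        progress = (self.actual - self.baseline) / (self.target - self.baseline)
        return max(0.0, min(1.0, progress))


def objective_score(key_results):
    """An objective scores as the average of its key results."""
    return mean(kr.score() for kr in key_results)


# Activation rate moved from 32% toward a 45% target and ended at a hypothetical 40%.
activation = KeyResult("First-week activation rate", baseline=32, target=45, actual=40)
print(round(activation.score(), 2))
```

Because the formula divides by the signed gap between target and baseline, it also handles key results where lower is better, such as reducing time-to-first-value.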
The 60-70% target. Stretch OKRs are designed so that the average score across the team's OKRs is 0.6-0.7. A higher average suggests sandbagging; a lower one suggests fantasy or unexpected disruption.
Scoring honesty.
What 100% means. 100% scores warrant scrutiny. Either the OKR was sandbagged (under-set), the team had a great quarter (informative), or the team is rounding up. Investigate which.
What 30% means. 30% scores warrant scrutiny. Either the OKR was fantasy (over-set), the team encountered unexpected disruption (informative), or the work was deprioritized mid-quarter (also informative). Investigate which.
The compensation question. OKRs work best when not directly tied to compensation. When OKRs determine bonuses, sandbagging incentives become severe; teams set OKRs they know they can hit. Most healthy OKR cultures separate goal-setting from compensation.
Detail in references/scoring-discipline.md.
When OKRs should change vs when teams should adapt.
The default. OKRs hold for the quarter. Teams adapt their tactics to the OKR; OKRs do not change to match what the team is doing.
When to recalibrate.
When NOT to recalibrate.
The recalibration discipline. Recalibration should be rare (1-2 quarters out of 8). Frequent recalibration signals OKR design failure: either too aggressive or not strategically aligned. The recalibration itself should be transparent: surface what changed, why, and what the new targets are.
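The "1-2 quarters out of 8" guideline can be tracked with a simple history check. This is a sketch only: the function name, the boolean-per-quarter history format, and the default parameters are assumptions that mirror the numbers in the text.

```python
def recalibration_is_rare(recalibrated_quarters, window=8, max_recalibrations=2):
    """True if at most `max_recalibrations` of the last `window` quarters were recalibrated.

    `recalibrated_quarters` is a list of booleans, oldest first, one per quarter.
    """
    recent = recalibrated_quarters[-window:]
    return sum(recent) <= max_recalibrations


# Hypothetical history: recalibrated in 3 of the last 8 quarters.
history = [False, True, False, False, True, False, True, False]
print(recalibration_is_rare(history))
```

A `False` result does not say which failure is at work; per the text, it points to targets that are either too aggressive or not strategically aligned.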
Detail in references/mid-quarter-recalibration.md.
OKRs benefit from a structured review rhythm.
Weekly check-ins.
Mid-quarter review.
End-of-quarter review.
Quarterly retrospective.
Detail in references/review-cadence-templates.md.
Three concepts often conflated. Each serves a different purpose.
OKRs. Outcome targets for the quarter. "Improve activation for new sign-ups" is an objective; "Increase first-week activation rate from 32% to 45%" is a key result.
Roadmap items. Initiatives the team is building or doing. "Onboarding redesign" is a roadmap item. Roadmap items contribute to OKRs but are not the OKRs themselves.
Metrics. Ongoing measurements the team tracks. "First-week activation rate" is a metric. Metrics inform key results (which are quarterly targets on metrics) and are tracked continuously regardless of whether the team has an OKR aimed at them.
The relationship.
Common conflations.
Detail in references/okrs-vs-roadmap-vs-metrics.md.
Rapid-fire. Diagnoses in references/common-okr-failures.md.
When designing or auditing OKRs, walk these 12 considerations.
The output of the framework is OKRs that produce quarterly accountability infrastructure: ambitious goal-setting, mid-quarter learning, end-of-quarter scoring honest enough to inform the next quarter.
references/objective-design-patterns.md - Outcome-vs-output distinction. Strong vs weak objective characteristics. Worked examples across domains. The few-objectives discipline.
references/key-result-design-patterns.md - Measurable, outcome-aligned, within-influence, time-bounded characteristics. Strong vs weak key results. The 3-5 KR rule. Worked examples.
references/cascading-okrs-decisions.md - When to cascade strictly vs loosely. The middle path. Cascading anti-patterns. The honest disclosure about cascading difficulty.
references/scoring-discipline.md - The 0.0-1.0 scale. The 60-70% target. Scoring honesty. What 100% and 30% mean. The compensation question.
references/mid-quarter-recalibration.md - When to recalibrate vs adapt tactics. Strategic shift vs uncomfortable OKRs. The recalibration discipline.
references/review-cadence-templates.md - Weekly check-ins, mid-quarter review, end-of-quarter scoring, quarterly retrospective. Format and time investment per cadence.
references/okrs-vs-roadmap-vs-metrics.md - The three concepts and their relationships. Common conflations. The complete picture across all three.
references/okr-anti-patterns.md - 8+ anti-patterns including OKR-as-roadmap, sandbagging, fantasy, vanity metrics, OKR theater, compensation coupling.
references/common-okr-failures.md - 11+ failure patterns with diagnoses and cures.
OKRs at their best produce quarterly accountability that informs the org's strategic learning. The team commits to outcomes; works toward them; scores honestly; learns from the gap between target and outcome; designs the next quarter's OKRs better.
OKRs at their worst produce ritual that consumes time without producing signal. Sandbagged OKRs that always hit. Fantasy OKRs that nobody can. Vague OKRs that score generously regardless of outcome. The vocabulary persists; the practice has decayed.
The teams that benefit from OKRs are the ones that hold the discipline: outcomes over outputs, measurable key results, stretch ambition, scoring honesty, recalibration only when warranted, and the review cadence that drives learning rather than performance theater.
When in doubt about whether an OKR practice is working, ask: do the OKRs drive decisions about what to prioritize, do the scores produce learning that informs the next quarter, are key results actually measuring outcomes the team can influence, is the average score in the 60-70% range that stretch OKRs target? If yes to all of those, the practice is real. If no to any, the gap is where the OKR work is failing to produce the accountability infrastructure it is meant to provide.