jbaruch/blog-writer

Write developer blog posts from video transcripts, meeting notes, or rough ideas. Extracts narrative from source material, structures content with hooks and technical sections, formats code examples with placeholders, and checks drafts against 31 AI anti-patterns with structural variant detection, three-pass scanning (surface, skeleton, soul check), and rewrite auditing. Auto-updates anti-pattern list from Wikipedia before each session. Includes interactive onboarding to learn the author's voice from writing samples. Persona files live at ~/.claude/blog-writer-persona/ by default, with symlink support for custom locations (e.g. Google Drive for backup). Optional global voice saves your voice profile to Claude Code user memory so it applies across all projects. Use this skill whenever the user wants to write a blog post, draft a blog, turn a transcript into a blog, work on blog content, or mentions "blog" in the context of content creation. Also trigger when the user provides a video transcript and wants written content derived from it, or when continuing work on a blog series.

1.43x

Quality

94%

Does it follow best practices?

Impact

99%

1.43x

Average score across 7 eval scenarios

Securityby

Advisory

Suggest reviewing before use

Write a Developer Blog Post from a Debugging Session Transcript

Name: jbaruch/blog-writer
Rating: 0.97 (1 reviews)
Author: jbaruch

Problem/Feature Description

Ravi Patel is a platform engineer who writes a blog about real production incidents and infrastructure lessons. He just finished a pairing session with his colleague Lin where they debugged a critical issue: the customer data export pipeline had been silently dropping records for six days after a Kubernetes cluster upgrade.

Ravi needs a complete blog post draft written from the transcript below. This is part 2 of his "Production Confessions" series. His voice profile, bio template, previous series context, and the debugging session transcript are all provided. The draft should be ready for Ravi to review -- he'll fill in screenshots and verify code blocks later.

The slug for this post is production-confessions-ep2.

Output Specification

Produce the following files in the working directory:

blog-research-production-confessions-ep2.md -- a structured research/notes file capturing key information from the transcript
blog-draft-production-confessions-ep2.md -- the complete blog post draft
draft-summary.md -- a brief summary of the draft: word count, counts of each placeholder type, and any open questions

Input Files

The following files are provided as inputs. Extract them before beginning.

=============== FILE: ~/.claude/blog-writer-persona/voice.md ===============

Voice Profile: Ravi Patel

The Voice in One Paragraph

You're a platform engineer who thinks in dashboards and p99 latencies. You tell incident stories like detective novels -- building suspense from the first alert to the root cause. You're honest about wasting time on wrong hypotheses because you think that's where the real learning happens. Your humor comes from the gap between what systems promise and what they actually deliver. You trust data over intuition and you'll cite exact numbers where other writers would say "a lot."

Rhetorical Devices That Work

Precise numbers

Always specifics, never vague claims.

"47,000 records in the database. 8,200 in the export. Nobody noticed for six days."

"1.8 gigabytes. That's where the memory curve flatlined. Not crashed -- flatlined."

Dry delivery of bad news

Stating something catastrophic in a matter-of-fact tone.

"The job reported success. The dashboard was green. The data was gone."

"I spent Monday convinced it was the CSV library. It was not the CSV library."

Internal monologue

Sharing what you were thinking in the moment.

"My first thought was: we should have caught this on day one. My second thought was: we didn't have anything that would catch this."

Short paragraph punchlines

A one-sentence paragraph following a longer explanation for emphasis.

"The dashboard always looks green. That's the problem."

Cultural Reference Style

References to systems thinking, monitoring culture, and the gap between observability dashboards and reality. Occasional references to detective fiction or mystery novels when describing debugging.

Recurring Characters

Lin -- senior SRE, has been through every major incident at the company. Skeptical of quick fixes. Catchphrase: variations of "we tried that in 2019." Always the one who asks the obvious question everyone else missed.

Voice Consistency Notes

The voice stays data-driven even in narrative sections. Numbers appear in stories, not just technical explanations. The dry humor continues into config files and log output -- Ravi treats infrastructure artifacts as characters in the story.

=============== FILE: ~/.claude/blog-writer-persona/bio.md ===============

Author Bio: Ravi Patel

Bio Schema

Ravi Patel is a platform engineer at StreamKit, where he keeps distributed systems from lying to their operators. Previously spent six years running infrastructure for a fintech startup that taught him everything about what happens when dashboards say "all clear" and customers say otherwise. [Rotating kicker connecting to post content.]

Kicker Notes

Rotating. Should be a dry, matter-of-fact observation about something specific in the post. Self-deprecating or wryly observational preferred.

=============== FILE: ~/.claude/blog-writer-persona/examples.md ===============

Example Posts: Ravi Patel

Production Confessions, Episode 1: "The Autoscaler That Didn't"

Key patterns: precise failure metrics, Lin as the voice of experience, building suspense through dashboard screenshots that look fine

=============== FILE: series-tracker.md ===============

Series Tracker

Production Confessions

Episode 1: "The Autoscaler That Didn't" -- published
- Callbacks: "the autoscaler incident" as a running reference
- Lin's catchphrase established: "we tried that in 2019"
- Running joke: Ravi's tendency to say "let me check the dashboard" when the dashboard is clearly lying
- Teaser at end: "Next time: the export pipeline. The one where the data disappeared and nobody noticed."
Episode 2: IN PROGRESS
- Should callback to: Ravi's dashboard habit, Lin's 2019 catchphrase, the autoscaler reference
- Should address the teaser about disappearing data

=============== FILE: transcript.txt =============== RAVI: Alright, so we're looking at the customer export pipeline. The one that's been silently dropping records since the cluster upgrade last week.

LIN: Define "silently."

RAVI: As in, the job reports success. Green checkmarks everywhere. Dashboard shows completed. Except customers are opening their CSV exports and finding most of their data missing.

LIN: How long has this been happening?

RAVI: Six days. We only caught it because a customer on the enterprise tier compared their export row count to their dashboard count. 47,000 records in the database, 8,200 in the export. They were, understandably, not thrilled.

LIN: What changed?

RAVI: Kubernetes cluster upgrade. Went from 1.27 to 1.29. The export job runs as a CronJob, pulls customer data from Postgres, generates CSV, uploads to S3.

LIN: We tried upgrading the cluster in 2019 and it broke the metrics pipeline. Different failure, same energy.

RAVI: Right, I remember you mentioning that. So I spent Monday convinced it was the export library. csv-writer has had issues with large datasets before. I was looking at memory profiles, checking for buffer overflows in the stream writer.

LIN: Was it the library?

RAVI: No. I wasted a full day on that. The library was fine. Tuesday I started looking at the actual pod behavior during exports. Pulled the resource metrics from Prometheus and noticed something weird. The pod's memory usage would climb to about 1.8 gigs, then just level off. No OOM kill. No restart. It just stayed at 1.8 gigs and the export would finish.

LIN: That's the memory limit, right? What's the limit set to?

RAVI: 2 gigs. But here's where it gets interesting. In Kubernetes 1.27, when a pod approached its memory limit, the OOM killer would terminate it. Crash, restart, we'd see it in the logs, alerts would fire. In 1.29, they changed the cgroup v2 memory handling. Instead of killing the pod, the kernel now throttles memory allocation. The JSON parser that loads the customer records from Postgres starts silently dropping records it can't allocate memory for.

LIN: The parser fails silently?

RAVI: The parser was designed to be "resilient." It catches allocation failures and skips records with a debug-level log line. Debug level. In production, we run at info. So the records just vanish.

LIN: Show me the pod metrics.

RAVI: Look at this Grafana panel. See this flat line at 1.8 gigs? That's the throttle ceiling. Before the upgrade, this same job would spike to 2.1 gigs, get OOM-killed, and restart. We'd see the failure in PagerDuty. Now it just silently degrades.

[Ravi shows the Grafana dashboard with the memory curve]

LIN: So our monitoring was the crash. We removed the crash and called it an improvement.

RAVI: Exactly. The fix seems obvious, right? Bump the memory limit.

LIN: We tried just bumping memory in 2019. Bought us three months before the next big customer outgrew the limit.

RAVI: That's what I figured. The largest enterprise customer has 340,000 records. Even with 4 gigs, the JSON parser tries to load the entire result set into memory before streaming to CSV. So we did two things.

First, I changed the export job to process records in chunks of 10,000 instead of loading everything at once.

[Ravi shows the refactored export function]

Second, I added a record count verification step at the end. After the export, the job compares the number of CSV rows written to the count from the original database query. If they don't match, the job fails explicitly and sends an alert.

LIN: What about the customers who already got broken exports?

RAVI: We reran the last six days of exports. 23 customer exports were affected. 14 of them were enterprise tier. I wrote a small script that identifies which exports have a count mismatch and queues them for re-export with an email notification to the customer.

LIN: The notification template. Did you make it apologetic enough?

RAVI: I may have used the phrase "brief data synchronization delay." Lin made me change it to "we messed up the export, here's the corrected version."

LIN: Customers prefer honesty. So do SREs.

RAVI: Fair point. I also added a page to the internal runbook about cgroup v2 memory behavior changes in 1.29. And I changed the parser's logging for skipped records from debug to warn. If a record gets skipped for any reason, we'll see it in Datadog now.

LIN: The real takeaway is that "resilient" parsing that swallows errors is worse than crashing. The crash was our canary.

RAVI: Crash loud, fail visible. The pod crashing was our monitoring system. When the K8s upgrade made the pod "better" by not crashing, it actually made things worse. The dashboard showed green the entire time.

LIN: The dashboard always shows green.

RAVI: Yeah. That's kind of the problem, isn't it. =============== END TRANSCRIPT ===============