Building Claude Code Skills: What I Have Learned So Far

In my last newsletter, I shared Career Helper, a Claude skill for job seekers. I mentioned I'd follow up with what I've learned from building 50+ skills. This is that follow-up.

I've been building with Claude Code since its launch in February 2025. When Anthropic formalised the skills feature in October, we already had dozens of working skills in production. Career Helper, which I gave away free on GitHub, was one of them. It has fifteen capability modules, handles everything from CV optimisation to salary negotiation, and demonstrates what's actually possible when you move beyond simple prompts.

This isn't a tutorial. Anthropic's documentation covers the basics well enough. This is what I've learned from doing it at scale: the patterns that work, the mistakes that cost time, and the mental models that make everything easier.

What Skills Actually Are

A skill is a way to give Claude structured expertise in a specific task or activity. But that framing undersells what's possible.

Skills aren't just prompts. They're a combination of prompts, context, templates, and workflow logic that together create something closer to a specialist tool than a general assistant.

Take Career Helper. Each of its fifteen capability modules has its own approach, its own templates, and its own quality standards. When you ask it to prepare you for an interview, it researches the company first, generates the questions most likely for that specific role, and builds answers from your actual experience. Same approach every time.

That's the gap between a prompt and a skill. A prompt is a one-shot instruction. A skill is a persistent capability with memory, structure, and standards. It carries enough context, and is complete enough, to give a repeatable result.

The Three Levels of Automation

I've found it useful to think about three distinct levels when building with Claude Code:

Slash commands are simple, repeatable actions that users invoke explicitly. Type /commit and get a formatted commit message. Type /review and get a code review. They're good for things you do frequently that don't require judgement about when to do them.

Skills are domain expertise encoded as structured workflows. The key difference from commands: Claude decides when to use them based on what you're asking for. Ask about job interviews, and Career Helper activates without you typing anything. Skills are semantic; commands are explicit. Right now, this is where I'm having the most success building working solutions.

Sub-agents run in isolated context windows with their own tool access. A planning agent might coordinate research agents, analysis agents, and writing agents, each with its own specialisation, working in parallel on different aspects of a larger task. The sub-agent and hooks systems show great promise, but right now they feel like they need more control and more orchestration options.

The automation hierarchy matters. Commands require user invocation. Skills require semantic matching. Sub-agents require orchestration logic. Most people stop at commands. The real opportunity is in skills and sub-agents.
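
To make the explicit-versus-semantic distinction concrete, here is a toy model in Python. It is not how Claude Code routes requests internally: the registries, the names COMMANDS, SKILLS and route, and the word-overlap scoring are all assumptions made for illustration, with word overlap standing in very crudely for semantic matching.

```python
import re

# Hypothetical registries, for illustration only.
COMMANDS = {
    "/commit": "write a formatted commit message",
    "/review": "run a code review",
}
SKILLS = {
    "career-helper": "job interview preparation, CV optimisation, salary negotiation",
}


def words(text: str) -> set[str]:
    """Lower-case alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(user_input: str) -> str:
    """Say which mechanism would handle the input in this toy model."""
    tokens = user_input.split()
    first = tokens[0] if tokens else ""

    # Commands: the user names the action explicitly.
    if first in COMMANDS:
        return f"command:{first}"

    # Skills: matched on meaning. Word overlap with the skill description
    # is a crude stand-in for that matching.
    best, best_score = None, 0
    for name, description in SKILLS.items():
        score = len(words(user_input) & words(description))
        if score > best_score:
            best, best_score = name, score
    return f"skill:{best}" if best else "general"
```

In this sketch, route("/commit fix typo") takes the command path because the user typed the command name, while route("Help me prepare for a job interview") reaches the skill without the user naming it.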

Who This Is For

Building skills isn't excessively complex, but it's not purely no-code either. It sits somewhere between low-code and some-code.

You'll benefit from some experience with coding, even if it's just enough to read and extend existing examples. More importantly, you need to be able to write well-structured prompts that are repeatable and testable. And you need to think in systems: how components connect, where state lives, what happens when things fail.

If you're comfortable with structured problem-solving and can write clear instructions, you have the foundation. The rest is learnable.

Progressive Disclosure

One pattern that took me a while to appreciate: you don't load everything upfront.

Before skills were introduced, the @ symbol in Claude Code was primarily a way to include subfiles. When Anthropic formalised skills, this changed how progressive disclosure worked. What I had previously built as sub-agents, I moved to skills. The context loading became more intentional.

Loading everything upfront spends context on material most tasks never touch. The better approach is progressive disclosure: start with minimal context and load more as needed.

The mechanism is straightforward: your main skill file links to supporting documents using standard Markdown links. Claude reads these linked files only when the current task requires that information. Your core instructions might be 300 lines, but the full skill could have thousands of lines of templates, examples, and reference material that only get loaded when relevant.

Structure your skill as layers: a core instruction set that's always loaded, supporting documents that are linked and read on-demand, and detailed templates that are pulled in for specific outputs.
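
The layering can be mimicked in a few lines of Python. This is a sketch of the effect only: in Claude Code the model itself decides when to follow a link, whereas here a simple label match makes that decision. The function names, the label-matching rule, and the injected read callable are all assumptions for illustration.

```python
import re
from typing import Callable

# Matches standard Markdown links of the form [label](target).
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def linked_documents(core_text: str) -> dict[str, str]:
    """Map each link label in the core file to its target path."""
    return {label: target for label, target in LINK.findall(core_text)}


def context_for(task: str, core_text: str, read: Callable[[str], str]) -> str:
    """Always include the core text; add a linked file only when the
    task mentions that link's label (a stand-in for 'when relevant')."""
    parts = [core_text]
    for label, target in linked_documents(core_text).items():
        if label.lower() in task.lower():
            parts.append(read(target))
    return "\n\n".join(parts)
```

Passing read as a parameter keeps the sketch testable without touching disk. In real use it could be something like lambda path: Path(path).read_text().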

The Date Problem

Here's something that will bite you if you don't know about it: Claude's awareness of the current date isn't always reliable.

The issue isn't that Claude can't access dates. It's that different execution environments handle this differently, and the documentation doesn't make guarantees. Claude Code provides environment context that typically includes the current date. But I've seen enough edge cases to adopt a simple rule: always explicitly provide the current date in your skill's context if your outputs depend on it.

This matters more than you'd think. Research tasks need current information. Document generation needs accurate dates. Scheduling logic breaks if the model gets confused about temporal context.

The fix is defensive: inject the date explicitly rather than assuming it's available. Belt and braces.
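
A minimal sketch of that injection in Python, assuming you assemble the skill's context as a string before handing it over. The function name and the "Current date:" prefix are illustrative choices, not a Claude Code convention.

```python
from datetime import date
from typing import Optional


def with_current_date(context: str, today: Optional[date] = None) -> str:
    """Prefix the context with an explicit ISO-format date.

    `today` can be passed in for testing; otherwise the system date is used.
    """
    today = today or date.today()
    return f"Current date: {today.isoformat()}\n\n{context}"
```

Making the date a parameter also means test scenarios can pin it, so date-dependent outputs stay comparable between runs.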

Standalone vs Interdependent Design

Early on, I built skills that depended heavily on each other. Research skill feeds into analysis skill feeds into writing skill. Very elegant on paper.

In practice, dependencies create fragility. If one skill changes, others break. Testing becomes complex. Users can't use individual skills in isolation.

Now I design skills to be standalone by default. Each skill should be useful on its own. Dependencies are optional enhancements, not requirements.

The exception is sub-agent orchestration, where one skill explicitly coordinates others. But even then, the individual skills remain independently functional.
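
In code terms, standalone-by-default looks like a function whose upstream input is optional. The sketch below is a hypothetical Python illustration, not taken from any real skill: write_summary works from the topic alone and treats research notes from another skill as an enhancement.

```python
from typing import Optional


def write_summary(topic: str, research_notes: Optional[list[str]] = None) -> str:
    """Produce a summary outline with or without upstream research."""
    lines = [f"Summary: {topic}"]
    if research_notes:
        # Optional enhancement: use upstream output when it exists.
        lines.append("Based on prior research:")
        lines.extend(f"- {note}" for note in research_notes)
    else:
        # Standalone path: still useful with no other skill involved.
        lines.append("No prior research supplied; working from the topic alone.")
    return "\n".join(lines)
```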

Templates and Persistence

Skills need to produce consistent outputs. The way to achieve this is templates.

But templates aren't just formatting. Good templates encode methodology. They force comprehensive coverage of a topic. They ensure outputs include everything they should: sections, quality checks, citations, structured analysis.
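
Here is one way to make a template enforce coverage, sketched with Python's standard string.Template. The field names are hypothetical. In an actual skill the template would more likely be a Markdown file that Claude fills in, but the principle is the same: a required section that is missing should be an error, not a silent gap.

```python
from string import Template

# Hypothetical report template: every placeholder is a required section.
REPORT = Template(
    "Title: $title\n"
    "Key findings:\n$findings\n"
    "Sources:\n$sources\n"
    "Quality check: $quality_check\n"
)


def render_report(**fields: str) -> str:
    """Fill the template. substitute() raises KeyError if any
    placeholder has no value, so incomplete reports fail loudly."""
    return REPORT.substitute(**fields)
```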

Persistence is the other half. Skills need to remember what they've done in a conversation. What research has been gathered, what decisions have been made, what outputs have been produced.

Claude Code gives you file system access. Use it. Write intermediate results to disk. Create working documents that accumulate context. Don't rely on conversation memory for anything important.
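
As a sketch of what writing intermediate results to disk can look like, here is a small Python helper a skill could bundle and call. The file name working_log.jsonl and the kind/content fields are assumptions for illustration. A plain Markdown working document that the skill appends to would serve the same purpose.

```python
import json
from pathlib import Path

LOG_NAME = "working_log.jsonl"  # hypothetical file name


def record_step(workdir: Path, kind: str, content: str) -> Path:
    """Append one entry to a JSON Lines log and return the log path."""
    workdir.mkdir(parents=True, exist_ok=True)
    log = workdir / LOG_NAME
    with log.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"kind": kind, "content": content}) + "\n")
    return log


def load_steps(workdir: Path) -> list[dict]:
    """Read back every recorded entry, oldest first."""
    log = workdir / LOG_NAME
    if not log.exists():
        return []
    text = log.read_text(encoding="utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

An append-only log suits this job because each step adds a line without rewriting earlier ones, so a failure partway through leaves the earlier entries intact.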

Two things worth learning early, for both templates and prompt structure, are Markdown and basic HTML formatting. Both are excellent ways to define templates and document structure for LLMs.

Environment Differences

This one catches people: there are multiple ways to run Claude, and they have different capabilities.

Claude.ai (desktop and web) does support loading capabilities and skills. The mobile app, at the time of writing, does not. No file system access or code execution in any of these, so they're useful for quick interactions but not for building agentic workflows.

Claude Code comes in two forms: the local CLI that runs on your machine, and a cloud-based version that runs in a browser with GitHub integration. Both support skills, file system access, and bash execution. The cloud version runs in an isolated VM with pre-installed toolchains.

Agent SDK (Python and TypeScript) gives you programmatic control for building skills into larger applications. Same capabilities, different interface.

Know your target environment before you start building. And if you're building for teams, assume they'll be running in different environments than you.

Testing Skills

You need a way to test skills systematically. Not unit tests in the traditional sense. Claude is probabilistic. The same input can produce different outputs. But you still need quality assurance.

The approach that works: build validators. Create test scenarios with expected patterns. Run the skill through each scenario. Check that outputs meet structural requirements, content thresholds, and quality standards.
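
A validator in this sense can be quite small. The Python sketch below checks an output against one scenario's structural requirements. The Scenario fields and the three checks are assumptions chosen for illustration, and they are far simpler than a full rubric.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Scenario:
    name: str
    required_sections: list[str]
    min_words: int = 0
    required_patterns: list[str] = field(default_factory=list)


def validate(output: str, scenario: Scenario) -> list[str]:
    """Return a list of failures; an empty list means the output passed."""
    failures = []
    # Structural requirement: every expected section is present.
    for section in scenario.required_sections:
        if section.lower() not in output.lower():
            failures.append(f"missing section: {section}")
    # Content threshold: a minimum length in words.
    word_count = len(output.split())
    if word_count < scenario.min_words:
        failures.append(f"too short: {word_count} < {scenario.min_words} words")
    # Expected patterns, written as regular expressions.
    for pattern in scenario.required_patterns:
        if not re.search(pattern, output):
            failures.append(f"pattern not found: {pattern}")
    return failures
```

Because the checks look for patterns and never compare exact text, two different but acceptable outputs from the same input can both pass.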

I have a long history with evals and rubrics, so this felt natural. The challenge is automation. Running validators as part of a CI/CD pipeline is still tricky. Running them in bulk causes runtime issues and failures. I haven't fully got this working to my satisfaction, but I'm building towards it. The validators themselves are valuable even when run manually.

Distribution Across Teams

If you're building skills for yourself, this doesn't matter. If you're building for teams, it matters a lot.

Skills need to be installable without explanation. They need documentation that answers the questions people actually ask. They need versioning so updates don't break existing users.

The challenge is that Claude Code skills are still maturing. You can bundle skills as a private plugin for the marketplace, but that requires users to be running Claude Code. Anthropic has just announced distribution tools for enterprise and team users. I haven't tried them yet, so I can't comment on how well they work.

For now, plan for some support overhead. This will improve.

Four Weeks to Competence

If you're starting from zero, here's a realistic learning path:

Week One: Get comfortable with Claude Code basics. Understand the environment, file access, and command execution. Don't build anything complex yet.

Week Two: Build your first simple skill. Something you actually need, probably a command or simple workflow. Focus on making it work reliably.

Week Three: Add complexity. Templates, progressive disclosure, persistence. Convert your simple skill into something more robust.

Week Four: Build for others. Document your skill, test it on someone who wasn't involved in building it, iterate based on their confusion.

Four weeks won't make you an expert. But it will make you competent enough to build useful things.

The Deeper Pattern

Here's what I've come to believe: skills aren't really about AI. They're about capturing and scaling expertise.

When I build a skill, I'm not teaching Claude something new. I'm encoding knowledge that already exists (methodologies, templates, quality standards) into a form that can be applied consistently at scale.

This is valuable regardless of what happens with AI. The process of building a skill forces you to articulate what you actually know. It reveals gaps in your methodology. It makes implicit expertise explicit.

Even if you never deploy the skill, the exercise of building it clarifies your own thinking.

What I'm Not Covering

This newsletter is already long, and I've deliberately left things out.

I haven't covered the specific techniques for complex workflows. I haven't explained how to build sub-agent systems that coordinate multiple skills. I haven't discussed quality assurance at scale or team governance around skill development.

Those are deeper topics. They're also the difference between skills that work and skills that work reliably at enterprise scale.

If you're exploring this space and want to go deeper, that's what we do at Prosper. Building skills, training teams, implementing AI capabilities that actually work in production.

This is part of my ongoing series on building and using agentic AI. Previous edition: Giving Back: A Claude Skill for Job Seekers

If you're building skills for your organisation and want expert help, reach out. It's what we do.