Agent Skills Need Guardrails, Not Just Prompts

Use skills, hooks, scripts, and CI gates to turn repeated agent tasks into controlled engineering workflows.

Jun 30, 2026

A weak skill looks like a better prompt with a file name.

It says “format this Markdown,” “review this PR,” or “write slides in our style,” and then leaves the agent to guess the rest from the conversation. That may work once. It may even look impressive if the task is simple. But it breaks the moment the workflow has local rules, fragile output, approvals, generated files, or a quality gate that must run before a human wastes time reading the result.

That is where skills become interesting.

I think of a skill as something stronger than a prompt library. A prompt library stores things you ask often. A skill stores how a repeated task should be done, where the agent should look, which scripts or tools make the work reliable, and which checks decide whether the result is ready.

That difference sounds small until you put it next to real developer work. A pull request summary is easy. A pull request summary that reads the right diff, respects protected paths, identifies test gaps, refuses to invent risk, and leaves the reviewer with a smaller decision is a workflow. A release note is easy. A release note that verifies tickets, changelog fragments, generated docs, and version bumps is a workflow. A slide deck is easy to ask for. A deck that uses the right template, stays inside text limits, runs brand checks, asks for approval before build, and fails QA on placeholder text is automation.

Skills are where that workflow knowledge lives.

The Small Definition

A skill is a reusable, named bundle of procedural knowledge for an agent.

Most systems I use have the same basic shape:

a required instruction file, usually Markdown
metadata such as name and description
optional reference files for deeper guidance
optional scripts for deterministic work
optional assets such as templates, images, schemas, examples, or sample outputs

The description is the routing label. The agent sees many available skills, but it should not load all of them into every task. The description tells the agent when this skill applies. After that, the full body can give the workflow.

That gives skills their first practical property: progressive disclosure.

The agent does not need your whole release process when the user asks about a failing unit test. It needs the name and description of the release skill so it can decide that this is not the task. If the user does ask for release prep, the agent loads the full workflow. If that workflow points to references/versioning.md, scripts/check-release.py, or a release template, the agent reads or runs those only when needed.

That matters because context is not free. Even with large windows, stuffing every rule, reference, and example into the system prompt makes the agent worse at selection. It also makes maintenance painful. A good skill is loaded at the moment of use.

The mental model is simple:

user request
  -> skill metadata helps route the task
  -> SKILL.md gives the workflow
  -> references add deeper context only when needed
  -> scripts and assets make fragile steps repeatable
  -> hooks, tests, and CI decide whether the result is acceptable

The skill is one part of the system. It is the agent-facing contract for one kind of work.

How Skills Load

The exact mechanics vary by agent host, but the common pattern is stable.

First, the agent sees a lightweight catalog: skill names, descriptions, and sometimes locations or metadata. This catalog is enough for routing. It should stay small because it sits next to the user request, system instructions, project rules, and any other active context.

Then the agent decides whether a skill applies. A good description makes that decision cheap:

description: Use when preparing a pull request summary from a branch diff,
  including changed behavior, test evidence, risky files, and reviewer notes.

That is much better than:

description: Helps with PRs.

The first version tells the agent when to load the skill. The second version only names a general topic.

After activation, the agent reads the full SKILL.md. This is where the workflow lives: steps, guardrails, stop conditions, verification, and links to supporting material.

References, scripts, and assets are the last layer. The skill should tell the agent when to read a reference, when to execute a script, and when to use an asset as output material. A reference file is context. A script is executable behavior. An asset is usually something the final artifact uses.

This loading model is why skill authors should keep the main file lean. Put the routing signal in the description. Put the core workflow in SKILL.md. Put long domain details, schemas, examples, and brand rules in references. Put deterministic work in scripts. Put templates and reusable media in assets.
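As a rough illustration of this loading model, here is a minimal Python sketch. It assumes a hypothetical layout of `<skills_root>/<skill-name>/SKILL.md` with flat `key: value` front matter; `SkillEntry`, `build_catalog`, and `load_body` are names invented for this example, not the API of any agent host.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillEntry:
    """Catalog entry: only what the agent needs for routing."""
    name: str
    description: str
    path: Path  # location of SKILL.md; the body is not kept in the catalog


def parse_front_matter(text: str) -> dict[str, str]:
    """Parse flat `key: value` front matter between two `---` lines.

    Assumption: no nested keys. A real host would use a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as missing front matter


def build_catalog(skills_root: Path) -> list[SkillEntry]:
    """Collect name and description for every <skills_root>/*/SKILL.md."""
    catalog = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        meta = parse_front_matter(skill_file.read_text(encoding="utf-8"))
        if "name" in meta and "description" in meta:
            catalog.append(SkillEntry(meta["name"], meta["description"], skill_file))
    return catalog


def load_body(entry: SkillEntry) -> str:
    """Return the full workflow text, called only after the skill is selected."""
    text = entry.path.read_text(encoding="utf-8")
    if text.startswith("---"):
        parts = text.split("---", 2)
        if len(parts) == 3:
            return parts[2].strip()
    return text
```

The catalog holds only the routing fields, so its size grows with the number of skills rather than with the length of their workflows. The body, and anything it points to, is read only for the skill that was actually chosen.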

What Belongs in a Skill

Put information in a skill when it is both reusable and task-specific.

That rules out a lot of noise.

Always-on repository policy belongs in AGENTS.md, project rules, or whatever your agent host uses for global instructions. “Use Podman for containers” and “do not commit secrets” are baseline rules. The agent should see them before any specific workflow starts.

A skill is narrower. It answers a question like:

How do we prepare this repository’s pull requests?
How do we run a bounded refactor?
How do we publish a release note?
How do we generate this kind of customer deck?
How do we triage a dependency finding?
How do we review architecture boundaries?
How do we write and verify one type of documentation artifact?

The scope should be small enough that the skill can be tested. If you cannot tell whether the skill improved behavior, it is probably too vague.

This is a decent skeleton:

---
name: bounded-refactor
description: Use when making a small refactor that must preserve behavior, keep the diff narrow, and finish with tests plus a reviewer-facing summary.
---

Use this skill for refactors where the user names a specific class,
package, function, or behavior boundary.

Workflow:

1. Read the target code and the nearest tests before editing.
2. State the intended behavior boundary in one paragraph.
3. Make the smallest edit that achieves the requested change.
4. Run the narrowest relevant test first.
5. Run the broader test command only when the narrow test passes.
6. Summarize the diff as behavior preserved, implementation changed, and risk left.

Guardrails:

- Do not rename public APIs unless the user asked for that.
- Do not move files across module boundaries without calling it out first.
- Treat generated files as outputs, not source, unless this repo says otherwise.
- Stop if the first test failure is unrelated and explain the blocker.

This will not win a prose award. Good. It is a working contract.

The workflow tells the agent what to do. The guardrails tell it where to stop. The description tells it when to load the skill. That is already better than a prompt that says “please refactor carefully.”

The Automation Gradient

Skills start paying off when they connect instructions to deterministic pieces.

The gradient looks like this.

Prompt - one-time instruction inside a chat. Good for exploration. Weak for repeatability.

Command - a convenient entry point. Good for invocation. Often too small to hold the full workflow.

Skill - reusable workflow with routing metadata, steps, references, guardrails, and expected output.

Skill with scripts - the agent delegates fragile or repetitive operations to code. Good for parsing, validation, transformation, rendering, and checks.

Skill with hooks - deterministic checks fire at known events, regardless of whether the agent remembered to be careful.

Skill with CI gates - the repository repeats the load-bearing checks at the merge boundary.

Many teams stop too early on this gradient. They write a SKILL.md, call it automation, and then ask the agent to remember the entire quality loop. Work with consequences needs a stronger loop.

If a task produces a deck, a report, generated source, release notes, an API client, or a migration patch, the skill should usually point to a script or a check. The agent is good at interpretation, selection, and repair. Counting characters across 40 slides is a job for code.

For example, a deck-building skill should not say:

Make sure the deck follows our brand.

It should say something closer to:

1. Write the deck spec as JSON under `decks/<name>/<name>.json`.
2. Run `scripts/generate_outline.py --spec <spec> --fix`.
3. Resolve every `[OVER]` and `[LAYOUT]` warning before asking for approval.
4. After approval, run `scripts/build_from_template.py`.
5. Run `scripts/qa_deck.py`.
6. Do not call the deck ready until automatic QA passes and a visual review has happened.

Now the skill has moved past style advice. It describes a production path.
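To make the "job for code" concrete, here is a small Python sketch of a text-limit check. Everything specific in it is an assumption for illustration: the spec shape (`slides`, `title`, `bullets`), the limits, and the reading of `[OVER]` as "text too long" and `[LAYOUT]` as "too many items". It is not the actual `generate_outline.py`.

```python
import json

# Hypothetical limits; a real template would define its own.
MAX_TITLE_CHARS = 60
MAX_BULLET_CHARS = 90
MAX_BULLETS = 5


def check_deck(spec: dict) -> list[str]:
    """Return one warning line per limit violation in the deck spec."""
    warnings = []
    for number, slide in enumerate(spec.get("slides", []), start=1):
        title = slide.get("title", "")
        bullets = slide.get("bullets", [])
        if len(title) > MAX_TITLE_CHARS:
            warnings.append(
                f"[OVER] slide {number}: title has {len(title)} chars (max {MAX_TITLE_CHARS})"
            )
        if len(bullets) > MAX_BULLETS:
            warnings.append(
                f"[LAYOUT] slide {number}: {len(bullets)} bullets (max {MAX_BULLETS})"
            )
        for index, bullet in enumerate(bullets, start=1):
            if len(bullet) > MAX_BULLET_CHARS:
                warnings.append(
                    f"[OVER] slide {number}: bullet {index} has {len(bullet)} chars "
                    f"(max {MAX_BULLET_CHARS})"
                )
    return warnings


def check_deck_file(spec_path: str) -> list[str]:
    """Load a JSON deck spec from disk and check it."""
    with open(spec_path, encoding="utf-8") as handle:
        return check_deck(json.load(handle))
```

An empty list means the spec is inside the limits. Anything else is a concrete line the agent can repair before asking for approval.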

Skills and Hooks Solve Different Problems

Hooks are easy to confuse with skills because both shape agent behavior. They live at different layers.

A skill tells the agent how to do a task.

A hook runs when something happens.

The hook can be a Git hook, a pre-tool or post-tool hook in an agent host, a pre-commit scanner, a post-edit formatter, a pre-push policy check, or a CI job. The names vary by product. The pattern does not.

The part worth keeping is this: a hook should run even when the agent forgets the policy.

If a repository has a pre-commit hook that rejects secrets, the agent can be clever, tired, confused, or overconfident. The hook still runs. If it fails, the agent sees the output and can repair the diff. That is a much better loop than a sentence in a prompt that says “be careful with secrets.”

Skills and hooks work best when they share the same underlying checks.

skills/release-prep/SKILL.md
  -> tells the agent to update changelog fragments, bump versions, and run checks

scripts/check-release.py
  -> validates changelog, versions, generated docs, and forbidden placeholders

.githooks/pre-commit
  -> runs `scripts/check-release.py --staged`

.github/workflows/release-check.yml
  -> runs `scripts/check-release.py --all`

The skill gives the path. The hook gives fast feedback. CI gives the merge boundary.

The same idea applies to code:

The skill tells the agent how to make a bounded refactor.
The pre-commit hook rejects broad generated changes, secrets, formatting drift, or obvious policy violations.
The CI gate runs the full build, architecture tests, dependency policy, and security scanners.

This is where agent automation starts to feel less like a chat trick. The agent is inside a control loop. It can still make mistakes, but the workflow gives it a way to see the mistake before a reviewer becomes the first real gate.
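One way to let the hook and the CI job share a single check, as in the release-prep layout above, is to keep the logic in one function and vary only the file selection. The Python sketch below covers just the "forbidden placeholders" part. The placeholder pattern and the function names are assumptions, and for simplicity the staged mode reads working-tree content rather than the index.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical placeholder markers; adjust to what your release files use.
PLACEHOLDER = re.compile(r"\b(TODO|TBD|FIXME)\b")


def git_lines(*args: str) -> list[Path]:
    """Run a git command and return its output lines as paths."""
    out = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout
    return [Path(line) for line in out.splitlines() if line]


def files_for(mode: str) -> list[Path]:
    """'staged' for the pre-commit hook, anything else for the full CI run."""
    if mode == "staged":
        # Added, copied, or modified files in the index.
        return git_lines("diff", "--cached", "--name-only", "--diff-filter=ACM")
    return git_lines("ls-files")


def find_placeholders(paths: list[Path]) -> list[str]:
    """Return 'path:line: text' for every line that still has a placeholder."""
    findings = []
    for path in paths:
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries, directories, and missing files
        for number, line in enumerate(text.splitlines(), start=1):
            if PLACEHOLDER.search(line):
                findings.append(f"{path}:{number}: {line.strip()}")
    return findings


def run_check(mode: str) -> int:
    """Return a process exit code: 0 when clean, 1 when placeholders remain."""
    findings = find_placeholders(files_for(mode))
    for finding in findings:
        print(finding)
    return 1 if findings else 0
```

The hook would call `run_check("staged")` and CI would call `run_check("all")`. Because both run the same function, the agent sees the same error format locally and at the merge boundary.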

Guardrails Are Part of the Skill

Guardrails are stronger than “be careful” warnings. They are decisions about what the agent may do, what it must verify, and where human approval belongs.

A practical skill has at least a few explicit boundaries.

Scope - Which tasks should activate this skill, and which should not?

Inputs - Which files, commands, tickets, specs, or user-provided artifacts are the source of truth?

Outputs - What should exist at the end: a diff, a JSON spec, a deck, a report, a test, a PR summary?

Tool boundaries - Which tools are allowed or expected? Which tools are off limits for this workflow?

Stop conditions - When should the agent pause and ask, or report a blocker instead of improvising?

Approval points - Which step needs a human decision before generation, posting, publishing, or deployment continues?

Verification - Which commands, scripts, tests, or QA checks decide whether the work is ready?

You can encode these as prose, but the more fragile the task is, the more you should move toward scripts and gates.

For a developer workflow, I like guardrails like these:

Guardrails:

- Keep the diff inside the requested package unless the user approves a wider change.
- Treat changes to build files, CI workflows, security config, and suppression files as sensitive.
- Run the narrow test first. If it fails, fix that before running broader tests.
- Do not add broad suppressions such as `// nosemgrep`, `@SuppressWarnings`, or exclusion files without calling them out.
- If a generated file changes, run the generator again instead of editing the output by hand.
- If the tool that validates this workflow cannot run, report the workflow as blocked.

The last line matters. A check that could not run is different from “probably fine.” If the skill depends on a validator and the validator cannot run, the task did not pass. That may sound strict. It is also how you avoid turning QA into decoration.
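A small Python sketch of that rule, assuming the validator is an external command that exits with 0 on success. The `Outcome` names and the timeout default are choices made for this example.

```python
import subprocess
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"    # the validator ran and rejected the result
    BLOCKED = "blocked"  # the validator could not run at all


def run_validator(command: list[str], timeout_seconds: float = 300) -> Outcome:
    """Run a validator command and classify the result without guessing."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (FileNotFoundError, PermissionError, subprocess.TimeoutExpired):
        # Missing binary, no execute permission, or a hang: not a pass.
        return Outcome.BLOCKED
    return Outcome.PASSED if result.returncode == 0 else Outcome.FAILED


def is_ready(outcome: Outcome) -> bool:
    """Only an explicit pass counts. Blocked and failed are both 'not ready'."""
    return outcome is Outcome.PASSED
```

Keeping `BLOCKED` separate from `FAILED` gives the agent a different repair path for each. A failure means fix the work. A block means report the missing tool and stop.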

QA Gates Make Skills Honest

Skills need their own QA.

That starts with the skill file itself. Check the front matter. Check links. Check reference paths. Check for placeholder text. Check that the description is specific enough to route. If the skill points to scripts, run those scripts. If the skill produces artifacts, inspect the artifacts.
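Several of these file-level checks are easy to script. The Python sketch below is one possible shape, under stated assumptions: supporting files are referenced in backticks with paths starting `references/`, `scripts/`, or `assets/`, those paths resolve relative to the skill file's directory, and a description under eight words is too short to route on. All of those are illustrative choices, not rules from any agent host.

```python
import re
from pathlib import Path

# Backticked relative paths such as `scripts/check.py` (assumed convention).
REFERENCE = re.compile(r"`((?:references|scripts|assets)/[^`\s]+)`")
PLACEHOLDER = re.compile(r"\b(TODO|TBD|lorem ipsum)\b", re.IGNORECASE)
DESCRIPTION = re.compile(r"^description:[ \t]*(.*)$", re.MULTILINE)
MIN_DESCRIPTION_WORDS = 8  # hypothetical threshold


def lint_skill(skill_file: Path) -> list[str]:
    """Return a list of problems found in one skill file (empty means clean)."""
    text = skill_file.read_text(encoding="utf-8")
    problems = []
    if not text.startswith("---"):
        problems.append("missing front matter")
    match = DESCRIPTION.search(text)
    if match is None:
        problems.append("missing description")
    elif len(match.group(1).split()) < MIN_DESCRIPTION_WORDS:
        problems.append("description too short to route on")
    for relative in REFERENCE.findall(text):
        # Assumption: paths are relative to the directory holding the skill file.
        if not (skill_file.parent / relative).exists():
            problems.append(f"broken reference: {relative}")
    if PLACEHOLDER.search(text):
        problems.append("placeholder text present")
    return problems
```

A word count is only a crude proxy for "specific enough to route". It catches descriptions like "Helps with PRs." but cannot judge a long description that is still vague.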

Then test the behavior.

Do not only test the prompt that perfectly matches the skill. Test the near misses too.

A request that should activate the skill
A request that sounds similar but should not activate it
A request with missing input
A request with conflicting constraints
A request that should stop for approval
A request where the validator fails

This matters more for shared skills. A project skill used by one repository can evolve quickly. A global or team skill used across many repos becomes infrastructure. If it drifts, every user gets a slightly wrong assistant.
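The activation and near-miss cases can be kept as a small table that is replayed whenever a description changes. The Python sketch below shows only the harness shape. `keyword_router` is a stand-in so the example runs on its own; in practice the `route` callable would wrap whatever your agent host exposes for skill selection, which this example does not assume anything about.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RoutingCase:
    request: str
    expected_skill: Optional[str]  # None means no skill should activate


# Hypothetical cases for a hypothetical `pr-summary` skill.
CASES = [
    RoutingCase("Write a PR summary for this branch", "pr-summary"),  # should activate
    RoutingCase("Why is the PR build red?", None),                    # near miss
    RoutingCase("Fix the failing unit test in OrderService", None),   # unrelated
]


def keyword_router(request: str) -> Optional[str]:
    """Stand-in router for demonstration only; a real agent routes with a model."""
    text = request.lower()
    if "pr summary" in text or "pull request summary" in text:
        return "pr-summary"
    return None


def failed_cases(
    route: Callable[[str], Optional[str]], cases: list[RoutingCase]
) -> list[RoutingCase]:
    """Return every case where the router picked something unexpected."""
    return [case for case in cases if route(case.request) != case.expected_skill]
```

An empty result from `failed_cases(route, CASES)` means routing still matches expectations. The missing-input, approval, and validator-failure cases need checks on the agent's output rather than on routing, so they would live in a separate table.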

QA also needs output checks. A skill that generates Markdown can run a Markdown linter, link checker, or site build. A skill that changes Java can run focused tests, architecture checks, formatters, and dependency policy. A skill that creates slide decks can run text-limit checks, placeholder checks, brand checks, and a visual review step.

The skill should make these checks part of normal work. The hook or CI gate should make the high-risk ones hard to skip.

Developer Workflows That Deserve Skills

The best developer skills are boring in a good way. They encode the work that happens often, has local rules, and wastes time when done wrong.

Pull request preparation - Read the branch diff against the right base, summarize behavior changes, call out tests, list risky files, and propose a title. This is a good beginner skill because the source of truth is local and the output is easy to review.

Bounded refactor - Keep the change inside a named scope, preserve behavior, run tests in order, and report what moved. This is much better than asking an agent to “clean up” code.

Test gap analysis - Read a change, inspect nearby tests, identify missing assertions, and add only the tests that match the behavior. Agents often write tests that prove the implementation instead of the contract, so the skill needs to keep the contract visible.

Dependency and security triage - Pull scanner output, map findings to direct or transitive dependencies, check whether the vulnerable code path is reachable, and recommend patch, suppress, or accept-with-owner. The skill should not decide risk alone, but it can make the review smaller.

Architecture boundary review - Inspect package/module dependencies and turn stable findings into tests, for example with architecture rules. The skill should separate confirmed violations from suspicious edges that need human judgment.

Framework migration slice - Run a dry-run migration tool, inspect the patch, apply one slice, run tests, and stop before the diff becomes a rewrite of the whole system.

Release prep - Update changelog, versions, generated docs, release notes, and checks. This almost always deserves scripts because version text and generated files drift easily.

Documentation publishing - Read the canonical source, generate social copy, check links, verify image assets, and keep the copy inside platform limits. The work is simple, but it is easy to make one small wrong post at 8:08 in the morning and then spend the day looking at it.

Incident or runbook support - Gather logs, current status, recent deploys, and known rollback steps. The guardrail needs teeth here: the skill should distinguish read-only diagnosis from commands that change production.

These are plain skills. They are also exactly where teams lose time. Good automation usually starts with repeated annoyance, not with a grand architecture.

What Should Stay Out

Some things belong somewhere else.

A one-off answer does not need a skill. If the workflow is only needed once, write the prompt and move on.

Broad personality or tone preferences should usually live in always-on instructions or editing skills with a very clear scope. Do not hide repo-wide writing style inside a release skill.

Secrets belong outside skills. A skill can say where to find a secret through the approved mechanism. It should not contain the secret.

Credentials, deployment tokens, production URLs, and customer data need external controls. A skill can describe the safe path. It should not become a private vault with Markdown syntax highlighting.

Large reference manuals should not be pasted into SKILL.md. Put them in reference files and tell the agent when to read them. If a reference is huge, include search hints. The agent should not read 80 pages of API docs because one small task needed the JSON schema.

Finally, avoid all-purpose skills. “Use when coding in this repository” is a project instruction. A skill that always applies has become background policy.

The Special Cases

Skills get interesting around the edges.

Project, global, and team skills

A project skill lives with one repository. This is the default I prefer for developer workflows because the skill is reviewed with the code it affects. The team pulls the repo and gets the same workflow.

A global skill lives on one developer’s machine and follows them across workspaces. That is good for personal routines: standup notes, personal writing checks, local review style, or small helper workflows.

A team skill repository or packaged skill artifact makes sense when the same workflow should apply to many repos. At that point, versioning matters. You want release notes, compatibility notes, and a way to roll out changes without quietly changing every agent behavior overnight.

The sharing model changes ownership. A project skill belongs to that repo. A team skill belongs to the platform or enablement group that ships it.

Permissions and execution

Many agent hosts separate instruction loading from tool permission. A skill may tell the agent which command to run, but the shell, filesystem, browser, external API, or posting action may still need explicit approval.

That is a good boundary. The skill should name the commands and external calls it expects, and it should say what happens when the host refuses them.

For example:

This workflow needs read access to `publishing/<slug>/publishing-pack.md`,
execute access for `scripts/check-social-copy.py`, and Buffer access for
scheduling. If any of those permissions are missing, stop and report which
step is blocked.

The skill explains the workflow. The host still decides what the agent may do in this task.

Skill collisions

If two skills have similar names or descriptions, routing gets soft. The agent may load the wrong one or load both and mix the instructions into a bad compromise.

Make descriptions discriminating. “Use for release work” is weak. “Use when preparing a Java service release from changelog fragments, version files, and generated API docs” is better.

Collisions are also a reason to keep project-specific rules near the project. A repo-specific skill should usually win over a global habit.
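A cheap way to spot likely collisions before they reach an agent is to compare descriptions pairwise. The Python sketch below uses Jaccard overlap on word sets. The stopword list and the 0.5 threshold are arbitrary choices for this example, and word overlap is only a rough proxy for how a model would actually route.

```python
import re
from itertools import combinations

# Words that appear in almost every description and carry no routing signal.
STOPWORDS = {
    "a", "an", "and", "for", "from", "in", "of", "or", "the", "to",
    "use", "when", "with",
}


def tokens(description: str) -> set[str]:
    """Lowercase word set of a description, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return {word for word in words if word not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def likely_collisions(
    descriptions: dict[str, str], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Return (skill_a, skill_b, score) for pairs at or above the threshold."""
    pairs = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(
        sorted(descriptions.items()), 2
    ):
        score = jaccard(tokens(desc_a), tokens(desc_b))
        if score >= threshold:
            pairs.append((name_a, name_b, round(score, 2)))
    return pairs
```

Flagged pairs are candidates for a human look, not automatic failures. The usual fix is to make one description more discriminating or to merge the two skills.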

Skills with assets

Assets change the task from “write text” to “produce an artifact from a known base.”

Templates, slide masters, icon catalogs, image masks, OpenAPI examples, Terraform module snippets, legal boilerplate, or approved diagrams all belong here when the output must match a concrete format.

The skill should tell the agent whether to read the asset, copy it, transform it, or pass it to a script. File names are too weak as workflow instructions.

Skills with scripts

Scripts are where you move work that should be independent of language-model memory.

Use scripts for:

parsing structured files
rendering or generating artifacts
counting text limits
validating schemas
checking links
comparing generated output with source
extracting project facts
running repeatable transformations

The skill should explain how to run the script, what output matters, and which failures are blockers. The script should do the deterministic part. The agent should interpret, fix, and explain.

Multiple skills in one task

Sometimes one task really needs more than one skill. A publishing workflow may need an article voice skill, a publishing-pack skill, and a social-scheduling skill. That is fine if each skill owns a clear phase.

The danger is instruction conflict. One skill says “rewrite copy only when asked.” Another says “always improve copy before scheduling.” That conflict will show up as surprising behavior.

The better fix is clear ownership:

voice skill changes prose
publishing-pack skill creates assets
scheduling skill posts exact approved copy

The scheduling skill should explicitly say that it posts approved copy and only rewrites when the user asks. That small line can save a lot of accidental creativity.

Human approval in the middle

Some skills should pause.

Decks, release posts, external publishing, dependency suppressions, migrations, and destructive operations all need approval points. The skill should name those points directly.

For example:

After generating the outline, stop and ask for approval before building the final artifact.
After finding dependency suppressions, stop and list each suppression before editing policy files.
Before posting to external channels, verify the target account and show the exact copy.

This is still automation. Good automation has brakes.
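If the workflow is driven by a script, the brakes can be explicit in code. The Python sketch below models a workflow as ordered steps, some of which need approval. `Step` and `run_until_blocked` are names invented for this example, and how approval is actually collected is left to the agent host.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], None]
    needs_approval: bool = False


def run_until_blocked(steps: list[Step], approved: set[str]) -> Optional[str]:
    """Run steps in order and stop before the first unapproved gated step.

    Returns the name of the step waiting for approval, or None when every
    step ran. Simplification: a later call starts again from the first step.
    """
    for step in steps:
        if step.needs_approval and step.name not in approved:
            return step.name
        step.action()
    return None
```

The returned name is what the agent reports back: which step is waiting and, ideally, the exact artifact or copy the human is being asked to approve.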

The Difference Between a Skill and a Tool

An MCP tool, CLI command, or local script does work.

A skill tells the agent how and when to use that work.

This distinction matters because teams often wire many tools into an agent and then wonder why behavior gets worse. A tool catalog gives the agent capability. The workflow still needs judgment.

If the agent has 50 tools and no skill, it must infer the workflow from tool names and conversation context. That works for simple calls. It fails when the right path is “read the spec, validate the account, check duplicates, ask for approval, then call the posting tool.”

The skill carries that order.

The tool should be narrow and honest. The skill should provide the procedure around it. Hooks and CI should verify the parts that need enforcement.

A Practical Skill Design Test

When I review a skill, I ask a few simple questions.

Can the description decide routing without reading the body?

Can a teammate understand the workflow in two minutes?

Does the skill say where the source of truth lives?

Does it tell the agent where to stop?

Does it point to scripts for work that should be deterministic?

Does it say how to verify the result?

Can we test activation, rejection, failure, and output quality?

Does it reduce human review work, or does it only produce nicer-sounding output?

That last question carries the weight. Skills should reduce wrong turns, not make the agent more theatrical.

A Small Example With Hooks

Imagine a team has a recurring workflow: update generated API docs after OpenAPI changes.

A weak instruction says:

Remember to update the API docs after changing OpenAPI.

The skill version gives the agent a path:

---
name: api-doc-refresh
description: Use when OpenAPI files, REST resources, or generated API documentation change.
---

Workflow:

1. Detect changed OpenAPI source files and REST resource files.
2. Run `scripts/generate-api-docs.sh`.
3. Run `scripts/check-api-docs.sh`.
4. If generated docs changed, include them in the diff.
5. If the check fails, fix the source or generator input. Do not hand-edit generated docs.

Guardrails:

- Treat generated documentation as output.
- Do not edit files under `docs/api/generated/` by hand.
- If the generator is missing or fails, report the task as blocked.

Then the repository adds a Git hook, for example as pre-commit:

#!/usr/bin/env bash
# Fail the commit if the staged API docs are out of date.
set -euo pipefail

scripts/check-api-docs.sh --staged

And CI runs:

scripts/generate-api-docs.sh
scripts/check-api-docs.sh --all

Now the workflow has three layers. The skill guides the agent. The hook catches local drift. CI protects the merge boundary.

The agent can still get it wrong. But it gets a clear error, a script name, and a repair path. Agents handle that kind of repair loop well.
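The check itself can stay simple. As a sketch of what a drift check might do, here is a Python version that compares the committed docs directory with a freshly generated copy by content hash. This is an assumed design, not the contents of `scripts/check-api-docs.sh`, and the labels `stale`, `missing`, and `outdated` are invented for the example.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def drift(committed: Path, regenerated: Path) -> list[str]:
    """Describe every difference between committed and regenerated docs."""
    old, new = tree_digest(committed), tree_digest(regenerated)
    report = []
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            report.append(f"stale: {name}")      # committed, no longer generated
        elif name not in old:
            report.append(f"missing: {name}")    # generated, not committed
        elif old[name] != new[name]:
            report.append(f"outdated: {name}")   # content differs
    return report
```

A caller would generate into a temporary directory, call `drift` with `docs/api/generated/` as the committed tree, and exit non-zero when the report is not empty. Each report line names a file, which gives the agent the repair path: rerun the generator and include the result in the diff.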

Skills Are a Maintenance Surface

Once a skill changes generated code, publishes artifacts, schedules posts, edits release files, or shapes review output, it is part of your engineering system.

Treat it that way.

Review skill changes. Lint skill files. Test them with realistic prompts. Version shared skills. Keep references current. Delete stale examples. Pin scripts and tool versions where drift would hurt. Put skill-owned checks into hooks or CI when the output matters.

Also measure behavior, not vibes. Did the skill reduce retry turns? Did it keep the diff smaller? Did it catch missing tests? Did it stop posting to the wrong channel? Did reviewers spend less time reconstructing intent? Did bad outputs fail earlier?

That is a better discussion than “the agent sounded more confident.”

Confidence is cheap. A smaller, verified workflow is better.

Where I Would Start

For a software team starting from zero, I would skip the giant skill library.

Start with one repeated task that already annoys people and has a clear quality gate.

Good first choices:

PR preparation for one repository
bounded refactor for one module
release-note generation with changelog checks
dependency finding triage with scanner output
documentation publishing with link checks
generated API docs refresh

Write the smallest skill that makes the task repeatable. Add one script if the task needs deterministic validation. Add a hook if the mistake is cheap to catch before commit. Add CI if the mistake must not land.

Then run it on real work. The first version will be incomplete. That is fine. The failures worth keeping will tell you what the skill forgot.

This is the part I like about skills. A bad prompt just disappoints you. A bad skill can be improved as an artifact. You can review it, patch it, test it, and ship the next version.

The Actual Point

Agent skills are ordinary files with unusual leverage. They do not replace tests, hooks, CI, or human judgment. They put repeatable workflow knowledge where the agent can load it at the right moment.

Used well, they turn “please be careful” into a path:

read the right source
make the narrow change
run the deterministic check
respect the approval point
fail closed when validation fails
report the result in the shape humans need

That is the automation worth caring about.

The beginner mistake is to make a skill that tells the agent what output should look like. The better version tells the agent how the work moves through the system, which tools own which facts, and what must be true before the result is ready.

At that point the Markdown file is no longer just Markdown. It is a small contract between the human, the agent, and the repository.
