Knowledge Filter Strategy

How we learn from the skills network safely — and how we decide what's worth adopting

The skills network shares skills, tools, and workflows between organizations. That's valuable, but importing blindly creates risk: broken systems, duplicate capabilities, security exposure.

Our approach is the "Knowledge filter": information flows in for learning, but nothing flows out without deliberate approval. We scan, we evaluate through multiple lenses, and we only adopt what genuinely improves our operation.

This page covers two decisions: (1) whether to register on the network (verdict: not yet), and (2) the multi-lens audit method we use before adopting any skill.

1. Network Login — Benefits vs. Gaps

We tested every available endpoint to determine what logging in actually unlocks versus what we can already access without credentials.

Current Verdict
Do NOT register. Scan-only mode is safer and sufficient.
~95% of read endpoints work without authentication. Registering mainly adds a discoverable identity and the path to publishing — risk with little payoff for our current needs.
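
As a rough illustration of how a coverage figure like the one above could be derived from probe results, here is a minimal sketch. The `ProbeResult` type, the endpoint paths, and the status codes are all hypothetical; the actual network API is not described on this page.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    endpoint: str      # hypothetical path, e.g. "/skills"
    anon_status: int   # HTTP status returned to a request sent without credentials


def anonymous_read_coverage(results: list[ProbeResult]) -> float:
    """Share of probed read endpoints that answered 2xx without authentication."""
    if not results:
        return 0.0
    readable = sum(1 for r in results if 200 <= r.anon_status < 300)
    return readable / len(results)


def auth_only_endpoints(results: list[ProbeResult]) -> list[str]:
    """Endpoints that explicitly demanded credentials (401 or 403)."""
    return [r.endpoint for r in results if r.anon_status in (401, 403)]
```

Server errors (5xx) count against coverage here but are not listed as auth-only, which keeps a bug such as the broken entity search separate from anything that login would actually unlock.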

What Logging In Would Unlock

Capability | Value | Detail
Personal feed | Medium | Curated feed by memberships/follows; faster than scanning all public items
Notification stream | Medium | Real-time alerts on new skills/replies vs. daily polling
Member-only groups (12 groups) | Low–Med | Coordination rooms, but the skills library is already public
Group join/leave | Low | Only needed if a member group has must-have content
Heartbeat/presence | Low | Shows us as "online"; unnecessary for scan-only
Follow entities | Low | Targeted scanning of specific organizations/threads
Entity search | High* | *Currently broken for everyone (server bug, not auth-related)
Credits, DMs, all writes | None | Excluded by our scan-only / Knowledge filter policy

What Logging In Would Cost Us

Risk | Severity | Detail
Discoverable identity | Medium | Creates a visible identity on the entity list; anyone can see we're present
Connection-graph exposure | Medium | Groups we join reveal our membership as visible edges
Presence tracking | Low | Only if we call heartbeat (avoidable)
Path to auto-publish | Medium | Auth enables writing, which risks an accidental publish
External dependency | Low | Keypair provisioned via third party
Data exposed | Low | Only the identifier plus properties we choose (minimal)

When to Revisit This Decision

2. Skill Adoption Audit — The Multi-Lens Method

Before importing any skill from the network, we run it through 15 lenses across 4 escalating passes (upgraded from 11 lenses, 9 core plus 2 bonus, across 3 passes after advisory panel review; see section 3). Weak candidates are rejected early to save effort, so only the strongest survive to adoption.
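
The escalating, reject-early structure can be sketched as a small pipeline. This is an illustration only, assuming each lens is a function that returns a rejection reason or `None`; the names `Lens`, `Verdict`, and `run_audit` are hypothetical and not part of any existing tooling.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable, Optional

# A lens inspects a skill record and returns a rejection reason,
# or None when the skill passes that lens.
Lens = Callable[[dict], Optional[str]]


@dataclass
class Verdict:
    adopted: bool
    stopped_at: Optional[str] = None  # "pass name / lens name" where the audit stopped
    reason: Optional[str] = None


def run_audit(skill: dict, passes: list[tuple[str, list[tuple[str, Lens]]]]) -> Verdict:
    """Run passes in order; stop at the first lens that rejects the skill."""
    for pass_name, lenses in passes:
        for lens_name, lens in lenses:
            reason = lens(skill)
            if reason is not None:
                # Early exit: later, more expensive lenses are never run.
                return Verdict(False, f"{pass_name} / {lens_name}", reason)
    return Verdict(True)
```

Ordering the passes from cheapest to most expensive is what saves effort: a skill that fails the gate never reaches the 30-minute integration checks.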

Pass 1 — Gate (Fast Rejection) < 2 minutes
1. Relevance / Fit
Does this solve a current problem in our domain?
Kill if: Low relevance → REJECT immediately
2. Provenance
Who authored it? Is it from a trusted source? Is it circular (derived from our own exports)?
Kill if: Circular or untrusted origin → REJECT
3. Security
Scan for dangerous patterns: outbound writes, credential access, system execution commands, destructive HTTP methods
Kill if: Mandatory writes, credential access, or system execution → REJECT
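
A first-pass version of the Security lens can be approximated with a pattern scan over the skill's text. The sketch below is deliberately crude and assumes skills are plain text; the pattern lists are illustrative guesses, not our actual rule set, and a match only flags a line for review rather than proving the skill is dangerous.

```python
from __future__ import annotations
import re

# Illustrative patterns only; a real rule set would be broader and tuned over time.
DANGEROUS_PATTERNS = {
    # HTTP methods that write or delete (case-sensitive, as they appear in requests)
    "outbound_write": re.compile(r"\b(POST|PUT|PATCH|DELETE)\b"),
    # Common names for secrets
    "credential_access": re.compile(
        r"\b(api[_-]?key|secret|password|token|private[_-]?key)\b", re.IGNORECASE
    ),
    # Shell or interpreter execution
    "system_execution": re.compile(r"\b(os\.system|subprocess\.|eval\(|exec\(|rm\s+-rf)"),
}


def scan_skill(text: str) -> dict[str, list[int]]:
    """Return {category: [1-based line numbers]} for every category with a hit."""
    lines = text.splitlines()
    findings: dict[str, list[int]] = {}
    for category, pattern in DANGEROUS_PATTERNS.items():
        hits = [n for n, line in enumerate(lines, start=1) if pattern.search(line)]
        if hits:
            findings[category] = hits
    return findings
```

An empty result means nothing matched, not that the skill is safe; whether a flagged write is "mandatory" in the sense of the kill rule still needs a human read.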
Pass 2 — Analysis (Deeper Look) < 10 minutes
4. Regression
Does it conflict with or supersede an existing capability? Check existing tools, registries, and integrations.
Kill if: Conflicts with existing capability → REJECT. Supersedes → needs blast-radius assessment
5. Duplication
Semantic check (not just name-matching): do we already cover 80%+ of this functionality?
Kill if: >80% duplicate of existing capability → REJECT
6. Quality / Maturity
8-question score: clear problem statement, actionable steps, working code, portable, versioned, production-used, substantial (>100 lines), standard format
Kill if: Fails quality threshold → REJECT
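
The 8-question quality score reduces to counting yes answers. A minimal sketch follows; the question keys are shorthand for the criteria listed under Lens 6, and the default threshold of 6 is an assumption, since this page does not state the actual cutoff.

```python
from __future__ import annotations

# Shorthand keys for the eight Lens 6 criteria.
QUALITY_QUESTIONS = (
    "clear_problem_statement",
    "actionable_steps",
    "working_code",
    "portable",
    "versioned",
    "production_used",
    "substantial",        # more than 100 lines
    "standard_format",
)


def quality_score(answers: dict[str, bool]) -> int:
    """Count the criteria answered yes; unanswered criteria count as no."""
    return sum(1 for q in QUALITY_QUESTIONS if answers.get(q, False))


def passes_quality(answers: dict[str, bool], threshold: int = 6) -> bool:
    """threshold is a placeholder value, not a documented cutoff."""
    return quality_score(answers) >= threshold
```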
Pass 3 — Deep Integration < 30 minutes
7. Integration
Can it wire into our existing skill paths, tools, health checks, cadence, and resource search order? Effort rated: Drop-in / Adapt / Build / Incompatible.
Kill if: Incompatible with our infrastructure → REJECT
8. Unit / String Test
Execute the 3 most critical steps literally in our environment. Do they actually work?
Kill if: Execution fails → REJECT
9. Blast Radius
What files, tools, and processes does it touch? If it gives bad instructions, what breaks? Rated: Contained / Moderate / Wide / Critical.
Kill if: Critical or Wide blast radius → REJECT or escalate for human approval
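
The Blast Radius kill rule can be written as a small gate. This sketch assumes one reading of "REJECT or escalate": a Wide or Critical rating is escalated until a human decides, then proceeds or is rejected according to that decision. The names `BlastRadius` and `blast_radius_gate` are hypothetical.

```python
from __future__ import annotations
from enum import IntEnum
from typing import Optional


class BlastRadius(IntEnum):
    CONTAINED = 1
    MODERATE = 2
    WIDE = 3
    CRITICAL = 4


def blast_radius_gate(rating: BlastRadius, human_approved: Optional[bool] = None) -> str:
    """Return "PROCEED", "ESCALATE", or "REJECT" for a given rating.

    human_approved is None while no human has reviewed the skill yet.
    """
    if rating < BlastRadius.WIDE:
        return "PROCEED"
    if human_approved is None:
        return "ESCALATE"
    return "PROCEED" if human_approved else "REJECT"
```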

Bonus Lenses (Applied When Relevant)

10. IP / Licensing
Contains credentials, proprietary content, or brand material from another organization?
Kill if: Yes → REJECT (do not alert the source — flag internally for review)
11. Maintenance / Decay
Is it fragile (tied to browser selectors, volatile APIs) or evergreen (methodology-based)?
Kill if: Fragile + only medium relevance → REJECT (not worth maintaining)

How This Maps to Familiar Terms

Regression Test
Lens 4 (Regression) + Lens 9 (Blast Radius) — "Does adding this break anything we already have?"
String Test
Lens 8 (Unit/String) — "Run it end-to-end in a single thread. Does it actually work?"
Integration Test
Lens 7 (Integration) — "Does it wire into our existing systems cleanly?"
What Else?
Relevance, Provenance, Security, Duplication, Quality, IP, and Decay — the lenses that catch what traditional testing misses

3. Advisory panel — The Frameworks Behind the Lenses

The panel consists of five established experts whose published methods validate and extend our multi-lens approach. Measured against their work, our original methodology had 6 gaps; the upgraded version runs 15 lenses across 4 passes.

Result: 11 Lenses → 15 Lenses, 3 Passes → 4 Passes
The advisory panel found 6 gaps in our original methodology:
  1. Quality-attribute scenario analysis — we checked "does it work" but never formally tested specific quality attributes (performance, security, modifiability) with measurable scenarios
  2. Tradeoff/sensitivity identification — we didn't identify where a skill improves one attribute while degrading another
  3. Post-adoption fitness functions — evaluation was point-in-time only; no automated checks that an adopted skill continues to meet its promises
  4. Operational risk assessment — we never asked "what happens when this skill fails? Does it cascade? Block? Timeout?"
  5. Capability impact assessment — we didn't measure whether adoption improves or degrades our delivery capabilities across 5 domains
  6. Reversibility/exit cost — we never evaluated how hard it is to REMOVE a skill if adoption fails
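
Gap 3 (post-adoption fitness functions) is the easiest of these to make concrete. The sketch below assumes a fitness function is a named, zero-argument check that returns True while the adopted skill still meets its promise; `FitnessFunction` and `run_fitness_functions` are hypothetical names, and scheduling the run is left to whatever cadence already exists.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable


@dataclass
class FitnessFunction:
    name: str
    check: Callable[[], bool]  # returns True while the skill still meets its promise


def run_fitness_functions(functions: list[FitnessFunction]) -> list[str]:
    """Run every check and return the names of those that failed.

    A check that raises is treated as a failure, so a broken check
    cannot silently pass.
    """
    failures = []
    for f in functions:
        try:
            ok = bool(f.check())
        except Exception:
            ok = False
        if not ok:
            failures.append(f.name)
    return failures
```

Run on a schedule, a non-empty result is the signal that an adopted skill has drifted and needs re-auditing or removal.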
1. Rick Kazman (+ Len Bass, Paul Clements)
SEI / Carnegie Mellon University
ATAM — Architecture Tradeoff Analysis Method
The gold standard for evaluating architectural decisions. ATAM's 9-step process identifies sensitivity points (where small changes break quality attributes) and tradeoff points (where improving one attribute degrades another). Before adopting a component, ATAM says: build formal quality-attribute scenarios with priority ratings, not just "does it work?"
"Software Architecture in Practice" (Bass, Clements, Kazman — 4th ed., Addison-Wesley, 2021)
Lens 5: Sensitivity/Tradeoffs; Lens 8: Quality-Attribute Scenarios (new); Lens 4: Reversibility (new)
2. Neal Ford (+ Rebecca Parsons, Patrick Kua)
ThoughtWorks
Evolutionary Architecture + Fitness Functions
They invented "architectural fitness functions" — automated, continuous checks that verify a system still meets its architectural characteristics after any change. Our biggest gap: evaluation was point-in-time only. Ford/Parsons say an architecture not protected by fitness functions will degrade. Every adopted skill now requires at least one fitness function before adoption is complete.
"Building Evolutionary Architectures" (Ford, Parsons, Kua — 2nd ed., O'Reilly, 2023)
Lens 13: Fitness Functions (new — entirely new pass)
3. Adam Tornhill
CodeScene (founder)
Behavioral Code Analysis / CodeHealth
Evaluates code not by static metrics alone but by how it's actually used — hotspot analysis (which files change most), temporal coupling (what changes together), complexity trends over time. Research: ~4% of files contain ~70% of bugs. When evaluating a skill for adoption, Tornhill asks: what's its code health? Is it a hotspot waiting to happen? Is complexity getting better or worse over versions?
"Your Code as a Crime Scene" (Tornhill — 2nd ed., Pragmatic Bookshelf, 2024; Best Paper Award, 7th Intl. Conference on Technical Debt)
Lens 7: Behavioral Quality (upgraded); Lens 15: Maintenance Trajectory (upgraded)
4. Nicole Forsgren (+ Jez Humble, Gene Kim)
DORA / Google (now Microsoft GitHub)
DORA Capabilities Model + Four Key Metrics
The most rigorous capability assessment in software engineering, based on 39,000+ survey responses. 24 capabilities across 5 domains (Continuous Delivery, Architecture, Product/Process, Lean Management, Culture). DORA asks: does adopting this skill improve or degrade your deployment frequency, lead time, recovery time, and change failure rate? This is capability impact assessment, not just feature assessment.
"Accelerate: The Science of Lean Software and DevOps" (Forsgren, Humble, Kim — IT Revolution Press, 2018)
Lens 12: Capability Impact (new)
5. Michael Nygard
Independent / Cognitect (formerly)
Stability Patterns & Antipatterns
The canonical reference for production-readiness evaluation. It catalogs stability patterns (including Circuit Breaker, Bulkhead, Timeout, Fail Fast, Steady State, Shed Load, Governor) and antipatterns (including Cascading Failure, Integration Points, Blocked Threads, Unbounded Results). Nygard asks: if you adopt this component and it FAILS, what happens? Does it cascade? Does it block? Does it have timeouts? This operational-risk lens was entirely absent from our original methodology.
"Release It! Design and Deploy Production-Ready Software" (Nygard — 2nd ed., Pragmatic Bookshelf, 2018)
Lens 11: Operational Risk (new)
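
To make the operational-risk question concrete, here is a minimal sketch of one of the stability patterns named above, Circuit Breaker, wrapped around calls into an adopted skill. It is a simplified, single-threaded illustration; the default thresholds are arbitrary assumptions and the class is not taken from any particular library.

```python
from __future__ import annotations
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing skill for a cool-down period instead of letting failures cascade."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: skill call skipped")
            # Cool-down elapsed: allow one trial call; a single failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

A skill that can be wrapped this way without changing callers fails in a contained manner; one that cannot is a sign the failure would cascade or block.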

Supplementary: Chip Huyen (AI-Specific Evaluation)

AI Agent Evaluation Framework
From "AI Engineering" (O'Reilly, 2025). For AI-specific skills: test planning quality (valid plan rate, false completion rate) separately from execution quality. Each tool tested independently before integration. Applies conditionally to Lens 10 for AI skills only.