Knowledge filter Strategy - Skills Network Intelligence

The network shares skills, tools, and workflows between organizations. That's valuable, but importing blindly creates risk: broken systems, duplicate capabilities, security exposure.

Our approach is the "Knowledge filter": information flows in for learning, but nothing flows out without deliberate approval. We scan, we evaluate through multiple lenses, and we only adopt what genuinely improves our operation.

This page covers two decisions: (1) whether to register on the network (verdict: not yet), and (2) the multi-lens audit method we use before adopting any skill.

1. Network Login — Benefits vs. Gaps

We tested every available endpoint to determine what logging in actually unlocks versus what we can already access without credentials.

Current Verdict

Do NOT register. Scan-only mode is safer and sufficient.

~95% of read endpoints work without authentication. Registering mainly adds a discoverable identity and the path to publishing — risk with little payoff for our current needs.
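Scan-only mode can be enforced in code rather than by convention. The sketch below is a minimal guard, assuming a hypothetical base URL (`https://network.example/api`) and an illustrative list of credential headers; neither comes from the network's documentation. It only builds a request object and never sends one.

```python
from typing import Dict, Optional
from urllib.request import Request

# Hypothetical base URL; the real network address is not given on this page.
BASE_URL = "https://network.example/api"

READ_ONLY_METHODS = {"GET", "HEAD"}
# Illustrative header names that would carry credentials.
CREDENTIAL_HEADERS = {"authorization", "cookie", "x-api-key"}


def build_scan_request(
    path: str, method: str = "GET", headers: Optional[Dict[str, str]] = None
) -> Request:
    """Build an unauthenticated, read-only request, or raise ValueError."""
    method = method.upper()
    if method not in READ_ONLY_METHODS:
        raise ValueError(f"scan-only mode forbids {method} requests")
    headers = headers or {}
    leaked = CREDENTIAL_HEADERS & {name.lower() for name in headers}
    if leaked:
        raise ValueError(f"scan-only mode forbids credential headers: {sorted(leaked)}")
    return Request(f"{BASE_URL}/{path.lstrip('/')}", headers=headers, method=method)
```

Routing every outbound call through one builder like this keeps the "nothing flows out" rule in a single reviewable place.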

What Logging In Would Unlock

Capability	Value	Detail
Personal feed	Medium	Curated feed by memberships/follows; faster than scanning all public items
Notification stream	Medium	Real-time alerts on new skills/replies vs. daily polling
Member-only groups (12 groups)	Low–Med	Coordination rooms — but the skills library is already public
Group join/leave	Low	Only needed if a member group has must-have content
Heartbeat/presence	Low	Shows us "online" — unnecessary for scan-only
Follow entities	Low	Targeted scanning of specific organizations/threads
Entity search	High*	Currently broken for everyone (server bug, not auth-related)
Credits, DMs, all writes	None	Excluded by our scan-only / Knowledge filter policy

What Logging In Would Cost Us

Risk	Severity	Detail
Discoverable identity	Medium	Creates a visible identity on the entity list — anyone can see we're present
Connection-graph exposure	Medium	Groups we join reveal our membership as visible edges
Presence tracking	Low	Only if we call heartbeat — avoidable
Path to auto-publish	Medium	Auth enables writing — risk of accidental publish
External dependency	Low	Keypair provisioned via third party
Data exposed	Low	Only identifier + properties we choose — minimal

When to Revisit This Decision

Entity search gets fixed and is auth-gated (currently broken for everyone)
A member-only group has must-have content we can't access otherwise
We shift strategy from scan-only to active participation

2. Skill Adoption Audit — The Multi-Lens Method

Before importing any skill from the network, we run it through 15 lenses across 4 escalating passes (upgraded from 9 core lenses plus 2 bonus lenses across 3 passes after advisory panel review; see section 3). Skills get rejected early to save effort, so only the strongest survive to adoption.
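The early-rejection flow can be sketched as a short-circuiting pipeline. This is a minimal illustration, not our production tooling: the skill is a plain dict, and the three example lenses and their field names (`relevance`, `circular`, `coverage`) are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A lens returns a rejection reason, or None if the skill passes.
Lens = Callable[[dict], Optional[str]]


def relevance(skill: dict) -> Optional[str]:
    return "low relevance" if skill.get("relevance") == "low" else None


def provenance(skill: dict) -> Optional[str]:
    return "circular origin" if skill.get("circular") else None


def duplication(skill: dict) -> Optional[str]:
    return "duplicate capability" if skill.get("coverage", 0.0) >= 0.8 else None


PASSES: List[Tuple[str, List[Tuple[str, Lens]]]] = [
    ("gate", [("relevance", relevance), ("provenance", provenance)]),
    ("analysis", [("duplication", duplication)]),
]


def run_audit(skill: dict, passes=PASSES) -> dict:
    """Run lenses in order and stop at the first rejection."""
    ran: List[str] = []
    for pass_name, lenses in passes:
        for lens_name, lens in lenses:
            ran.append(lens_name)
            reason = lens(skill)
            if reason is not None:
                return {"verdict": "REJECT", "pass": pass_name,
                        "lens": lens_name, "reason": reason, "lenses_run": ran}
    return {"verdict": "SURVIVED", "lenses_run": ran}
```

Because the cheap gate lenses run first, a skill with low relevance costs one check instead of the full audit.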

Pass 1 — Gate (Fast Rejection) < 2 minutes

1. Relevance / Fit

Does this solve a current problem in our domain?

Kill if: Low relevance → REJECT immediately

2. Provenance

Who authored it? Is it from a trusted source? Is it circular (derived from our own exports)?

Kill if: Circular or untrusted origin → REJECT
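One cheap way to catch the circular case is to fingerprint everything we export and compare incoming skills against that set. The sketch below assumes we keep such a set (`exported_fingerprints` is a hypothetical name). It only detects near-verbatim copies; a reworded derivative would still need human review.

```python
import hashlib
from typing import Set


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def is_circular(candidate_text: str, exported_fingerprints: Set[str]) -> bool:
    """True if the candidate matches something we previously exported."""
    return fingerprint(candidate_text) in exported_fingerprints
```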

3. Security

Scan for dangerous patterns: outbound writes, credential access, system execution commands, destructive HTTP methods

Kill if: Mandatory writes, credential access, or system execution → REJECT
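A first-pass security scan can be a simple pattern match over the skill text. The patterns below are illustrative assumptions, not a complete deny-list, and a hit only flags the skill for the reviewer: deciding whether a write is mandatory still takes a human read.

```python
import re
from typing import Dict, List

# Illustrative patterns only; a real deny-list would be broader and reviewed.
DANGEROUS_PATTERNS = {
    "destructive_http": re.compile(r"\b(?:POST|PUT|PATCH|DELETE)\b"),
    "credential_access": re.compile(r"(?i)\b(?:api[_-]?key|password|secret|token)\b"),
    "system_execution": re.compile(r"\b(?:os\.system|subprocess\.|eval\(|exec\(|rm\s+-rf)"),
}


def scan_skill(text: str) -> Dict[str, List[str]]:
    """Return the distinct matches found for each pattern category."""
    findings: Dict[str, List[str]] = {}
    for name, pattern in DANGEROUS_PATTERNS.items():
        hits = sorted({match.group(0) for match in pattern.finditer(text)})
        if hits:
            findings[name] = hits
    return findings
```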

Pass 2 — Analysis (Deeper Look) < 10 minutes

4. Regression

Does it conflict with or supersede an existing capability? Check existing tools, registries, and integrations.

Kill if: Conflicts with existing capability → REJECT. Supersedes → needs blast-radius assessment

5. Duplication

Semantic check (not just name-matching): do we already cover 80%+ of this functionality?

Kill if: 80% or more of the functionality is already covered by an existing capability → REJECT
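Once a reviewer has broken a skill down into the capabilities it provides, the 80% test is simple arithmetic. The sketch below assumes that breakdown already exists as sets of capability labels (the labels are hypothetical); it does not perform the semantic matching itself, which remains a judgment call.

```python
from typing import Set


def coverage(candidate: Set[str], existing: Set[str]) -> float:
    """Fraction of the candidate's capabilities that we already have."""
    if not candidate:
        return 0.0
    return len(candidate & existing) / len(candidate)


def is_duplicate(candidate: Set[str], existing: Set[str], threshold: float = 0.8) -> bool:
    """Apply the 80%+ rule from the Duplication lens."""
    return coverage(candidate, existing) >= threshold
```

For example, a skill offering five capabilities of which we already cover four scores 0.8 and is rejected as a duplicate.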

6. Quality / Maturity

8-question score: clear problem statement, actionable steps, working code, portable, versioned, production-used, substantial (>100 lines), standard format

Kill if: Fails quality threshold → REJECT
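The 8-question score can be recorded as a checklist so that every audit answers the same questions. The field names mirror the list above; the pass mark of 6 is a hypothetical default, since this page does not state the threshold.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class QualityChecklist:
    clear_problem_statement: bool
    actionable_steps: bool
    working_code: bool
    portable: bool
    versioned: bool
    production_used: bool
    substantial: bool  # more than 100 lines
    standard_format: bool

    def score(self) -> int:
        """Count how many of the eight questions were answered yes."""
        return sum(bool(getattr(self, f.name)) for f in fields(self))


def passes_quality(checklist: QualityChecklist, minimum: int = 6) -> bool:
    # `minimum` is an assumed threshold, not one defined by the method.
    return checklist.score() >= minimum
```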

Pass 3 — Deep Integration < 30 minutes

7. Integration

Can it wire into our existing skill paths, tools, health checks, cadence, and resource search order? Effort rated: Drop-in / Adapt / Build / Incompatible.

Kill if: Incompatible with our infrastructure → REJECT

8. Unit / String Test

Execute the 3 most critical steps literally in our environment. Do they actually work?

Kill if: Execution fails → REJECT

9. Blast Radius

What files, tools, and processes does it touch? If it gives bad instructions, what breaks? Rated: Contained / Moderate / Wide / Critical.

Kill if: Critical or Wide blast radius → REJECT or escalate for human approval
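The three Pass 3 kill rules combine into one decision. In the sketch below the rating names come from the lenses above, but the split between rejecting and escalating is an assumption: we treat Critical as an outright reject and Wide as an escalation for human approval.

```python
from enum import Enum
from typing import List


class Effort(Enum):
    DROP_IN = "Drop-in"
    ADAPT = "Adapt"
    BUILD = "Build"
    INCOMPATIBLE = "Incompatible"


class BlastRadius(Enum):
    CONTAINED = "Contained"
    MODERATE = "Moderate"
    WIDE = "Wide"
    CRITICAL = "Critical"


def pass3_decision(effort: Effort, critical_steps_ok: List[bool], radius: BlastRadius) -> str:
    """Return REJECT, ESCALATE, or CONTINUE for the deep-integration pass."""
    if effort is Effort.INCOMPATIBLE:
        return "REJECT"
    if not all(critical_steps_ok):  # any failed string-test step kills the skill
        return "REJECT"
    if radius is BlastRadius.CRITICAL:
        return "REJECT"
    if radius is BlastRadius.WIDE:
        return "ESCALATE"  # assumed: Wide goes to a human rather than auto-reject
    return "CONTINUE"
```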

Bonus Lenses (Applied When Relevant)

10. IP / Licensing

Contains credentials, proprietary content, or brand material from another organization?

Kill if: Yes → REJECT (do not alert the source — flag internally for review)

11. Maintenance / Decay

Is it fragile (tied to browser selectors, volatile APIs) or evergreen (methodology-based)?

Kill if: Fragile + only medium relevance → REJECT (not worth maintaining)

How This Maps to Familiar Terms

Regression Test

Lens 4 (Regression) + Lens 9 (Blast Radius) — "Does adding this break anything we already have?"

String Test

Lens 8 (Unit/String) — "Run it end-to-end in a single thread. Does it actually work?"

Integration Test

Lens 7 (Integration) — "Does it wire into our existing systems cleanly?"

What Else?

Relevance, Provenance, Security, Duplication, Quality, IP, and Decay — the lenses that catch what traditional testing misses

3. Advisory panel — The Frameworks Behind the Lenses

The panel consists of five established experts whose published methods validate and extend our multi-lens approach. Reviewing their work identified 6 gaps in our original methodology; the upgraded version runs 15 lenses across 4 passes.

Result: 11 Lenses → 15 Lenses, 3 Passes → 4 Passes

The advisory panel found 6 gaps in our original methodology:

Quality-attribute scenario analysis — we checked "does it work" but never formally tested specific quality attributes (performance, security, modifiability) with measurable scenarios
Tradeoff/sensitivity identification — we didn't identify where a skill improves one attribute while degrading another
Post-adoption fitness functions — evaluation was point-in-time only; no automated checks that an adopted skill continues to meet its promises
Operational risk assessment — we never asked "what happens when this skill fails? Does it cascade? Block? Timeout?"
Capability impact assessment — we didn't measure whether adoption improves or degrades our delivery capabilities across 5 domains
Reversibility/exit cost — we never evaluated how hard it is to REMOVE a skill if adoption fails

1. Rick Kazman (+ Len Bass, Paul Clements)

SEI / Carnegie Mellon University

ATAM — Architecture Tradeoff Analysis Method

The gold standard for evaluating architectural decisions. ATAM's 9-step process identifies sensitivity points (where small changes break quality attributes) and tradeoff points (where improving one attribute degrades another). Before adopting a component, ATAM says: build formal quality-attribute scenarios with priority ratings, not just "does it work?"

"Software Architecture in Practice" (Bass, Clements, Kazman — 4th ed., Addison-Wesley, 2021)

Lens 5: Sensitivity/Tradeoffs; Lens 8: Quality-Attribute Scenarios (new); Lens 4: Reversibility (new)
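A lightweight way to apply these ideas during an audit is to write each quality-attribute scenario down as a record and to tabulate how each design decision moves each attribute. The sketch below is a simplification for illustration, not the ATAM procedure itself: the H/M/L priority scale and the +1/-1 effect encoding are assumptions, and a decision is flagged as a tradeoff point when it helps at least one attribute while hurting another.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Scenario:
    attribute: str         # e.g. "performance", "security", "modifiability"
    stimulus: str          # what happens to the system
    response_measure: str  # the measurable pass condition
    priority: str          # assumed scale: "H", "M", or "L"


def by_priority(scenarios: List[Scenario]) -> List[Scenario]:
    """Order scenarios so high-priority ones are evaluated first."""
    order = {"H": 0, "M": 1, "L": 2}
    return sorted(scenarios, key=lambda s: order[s.priority])


def tradeoff_points(effects: Dict[str, Dict[str, int]]) -> List[str]:
    """Decisions that improve one attribute (+1) while degrading another (-1)."""
    return [
        decision
        for decision, per_attribute in effects.items()
        if any(v > 0 for v in per_attribute.values())
        and any(v < 0 for v in per_attribute.values())
    ]
```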

2. Neal Ford (+ Rebecca Parsons, Patrick Kua)

Thoughtworks

Evolutionary Architecture + Fitness Functions

They invented "architectural fitness functions" — automated, continuous checks that verify a system still meets its architectural characteristics after any change. Our biggest gap: evaluation was point-in-time only. Ford/Parsons say an architecture not protected by fitness functions will degrade. Every adopted skill now requires at least one fitness function before adoption is complete.

"Building Evolutionary Architectures" (Ford, Parsons, Kua — 2nd ed., O'Reilly, 2023)

Lens 13: Fitness Functions (new — entirely new pass)
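In practice a fitness function can be as small as a named check that runs on a schedule. The sketch below is one possible shape, under two assumptions: a check that raises is counted as a failure, and adoption is treated as complete only when at least one fitness function exists and all of them currently pass.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class FitnessFunction:
    name: str
    check: Callable[[], bool]  # returns True while the skill keeps its promise


def failing(functions: List[FitnessFunction]) -> List[str]:
    """Return the names of fitness functions that fail or raise."""
    failures = []
    for fn in functions:
        try:
            ok = fn.check()
        except Exception:
            ok = False  # a crashing check is treated as a failed check
        if not ok:
            failures.append(fn.name)
    return failures


def adoption_complete(functions: List[FitnessFunction]) -> bool:
    # Assumed rule: at least one fitness function, and none failing.
    return bool(functions) and not failing(functions)
```

Re-running `failing` after every change to an adopted skill turns the one-off audit into a continuous check.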

3. Adam Tornhill

CodeScene (founder)

Behavioral Code Analysis / CodeHealth

Evaluates code not by static metrics alone but by how it's actually used — hotspot analysis (which files change most), temporal coupling (what changes together), complexity trends over time. Research: ~4% of files contain ~70% of bugs. When evaluating a skill for adoption, Tornhill asks: what's its code health? Is it a hotspot waiting to happen? Is complexity getting better or worse over versions?

"Your Code as a Crime Scene" (Tornhill — 2nd ed., Pragmatic Bookshelf, 2024; Best Paper Award, 7th Intl. Conference on Technical Debt)

Lens 7: Behavioral Quality (upgraded); Lens 15: Maintenance Trajectory (upgraded)
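A rough hotspot ranking needs only two numbers per file: how often it changed and how complex it is. The sketch below multiplies the two, which is a simplification we chose for illustration rather than CodeScene's scoring; the inputs are assumed to come from version-control history and a line count used as a complexity proxy.

```python
from typing import Dict, List, Tuple


def rank_hotspots(
    change_counts: Dict[str, int], complexity: Dict[str, int], top: int = 3
) -> List[Tuple[str, int]]:
    """Rank files by change frequency times complexity, highest first."""
    scores = {
        path: changes * complexity.get(path, 0)
        for path, changes in change_counts.items()
    }
    # Sort by score descending, then by path so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]
```

Running the same ranking on successive versions of a skill shows whether its complexity is trending better or worse.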

4. Nicole Forsgren (+ Jez Humble, Gene Kim)

DORA / Google (now Microsoft GitHub)

DORA Capabilities Model + Four Key Metrics

The most rigorous capability assessment in software engineering, based on 39,000+ survey responses. 24 capabilities across 5 domains (Continuous Delivery, Architecture, Product/Process, Lean Management, Culture). DORA asks: does adopting this skill improve or degrade your deployment frequency, lead time, recovery time, and change failure rate? This is capability impact assessment, not just feature assessment.

"Accelerate: The Science of Lean Software and DevOps" (Forsgren, Humble, Kim — IT Revolution Press, 2018)

Lens 12: Capability Impact (new)
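To judge capability impact we can compute the four key metrics for a period before adoption and a period after, then compare. The sketch below assumes a simple deployment record of our own design (the `Deployment` fields are hypothetical) and uses medians for the two time-based metrics.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Deployment:
    lead_time_hours: float                  # commit to production
    failed: bool                            # change caused a failure
    recovery_hours: Optional[float] = None  # time to restore, if it failed


def four_key_metrics(deployments: List[Deployment], period_days: int) -> Dict[str, float]:
    """Summarise deployment frequency, lead time, failure rate, and recovery time."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    recoveries = [d.recovery_hours for d in failures if d.recovery_hours is not None]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(d.lead_time_hours for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "median_recovery_hours": median(recoveries) if recoveries else 0.0,
    }
```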

5. Michael Nygard

Independent / Cognitect (formerly)

Stability Patterns & Antipatterns

The canonical reference for production-readiness evaluation. It catalogs stability patterns (including Circuit Breaker, Bulkheads, Timeouts, Fail Fast, Steady State, Shed Load, Governor) and stability antipatterns (including Cascading Failures, Integration Points, Blocked Threads, Unbounded Result Sets). Nygard asks: if you adopt this component and it FAILS, what happens? Does it cascade? Block? Have timeouts? This operational-risk lens was entirely absent from our original methodology.

"Release It! Design and Deploy Production-Ready Software" (Nygard — 2nd ed., Pragmatic Bookshelf, 2018)

Lens 11: Operational Risk (upgraded)
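One way to keep a failing skill from cascading is to wrap its calls in a circuit breaker. The sketch below is a minimal single-threaded version written for illustration; the class name, defaults, and the injectable `clock` parameter are our own choices, not taken from the book.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is refused."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._max_failures = max_failures
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset period elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on too many failures, or re-open if the trial call failed.
            if self._opened_at is not None or self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result
```

While the breaker is open, callers get an immediate error instead of waiting on a dead dependency, which is the Fail Fast behaviour the lens looks for.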

Supplementary: Chip Huyen (AI-Specific Evaluation)

AI Agent Evaluation Framework

From "AI Engineering" (O'Reilly, 2025). For AI-specific skills: test planning quality (valid plan rate, false completion rate) separately from execution quality. Each tool tested independently before integration. Applies conditionally to Lens 10 for AI skills only.
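Scoring planning and execution separately can be sketched with two rates over a batch of recorded runs. The definitions used here are assumptions for illustration and are not quoted from the book: a plan is valid if it uses only known tools with valid parameters, and a false completion is a run the agent reported as done that an independent check found incomplete.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentRun:
    plan_valid: bool          # plan used only known tools with valid parameters
    reported_complete: bool   # agent claimed the task was done
    verified_complete: bool   # independent check confirmed the task was done


def valid_plan_rate(runs: List[AgentRun]) -> float:
    """Share of runs whose plan was valid."""
    if not runs:
        raise ValueError("no runs recorded")
    return sum(r.plan_valid for r in runs) / len(runs)


def false_completion_rate(runs: List[AgentRun]) -> float:
    """Share of claimed completions that failed verification."""
    claimed = [r for r in runs if r.reported_complete]
    if not claimed:
        return 0.0
    return sum(not r.verified_complete for r in claimed) / len(claimed)
```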