bdistill-knowledge-extraction
Extract structured domain knowledge from in-session AI models or from local open-source models via Ollama. No API key required.
Knowledge Extraction
Extract structured, quality-scored domain knowledge from any AI model — in-session from closed models (no API key) or locally from open-source models via Ollama.
Overview
bdistill turns your AI subscription sessions into a compounding knowledge base. The agent answers targeted domain questions, bdistill structures and quality-scores the responses, and the output accumulates into a searchable, exportable reference dataset.
Adversarial mode challenges the agent's claims — forcing evidence, corrections, and acknowledged limitations — producing validated knowledge entries.
When to Use This Skill
- Use when you need structured reference data on any domain (medical, legal, finance, cybersecurity)
- Use when building lookup tables, Q&A datasets, or research corpora
- Use when generating training data for traditional ML models (regression, classification — NOT competing LLMs)
- Use when you want cross-model comparison on domain knowledge
How It Works
Step 1: Install
pip install bdistill
claude mcp add bdistill -- bdistill-mcp # Claude Code
Step 2: Extract knowledge in-session
/distill medical cardiology # Preset domain
/distill --custom kubernetes docker helm # Custom terms
/distill --adversarial medical # With adversarial validation
Step 3: Search, export, compound
bdistill kb list # Show all domains
bdistill kb search "atrial fibrillation" # Keyword search
bdistill kb export -d medical -f csv # Export as spreadsheet
bdistill kb export -d medical -f markdown # Readable knowledge document
Output Format
Structured reference JSONL, not LLM training data. Each entry is one JSON object (pretty-printed here for readability):
{
  "question": "What causes myocardial infarction?",
  "answer": "Myocardial infarction results from acute coronary artery occlusion...",
  "domain": "medical",
  "category": "cardiology",
  "tags": ["mechanistic", "evidence-based"],
  "quality_score": 0.73,
  "confidence": 1.08,
  "validated": true,
  "source_model": "Claude Sonnet 4"
}
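Because every entry is a plain JSON object, an export can be post-filtered with a few lines of Python. The following is a minimal sketch, assuming the JSONL file holds one object per line with the fields shown in the example entry. The `load_entries` and `filter_entries` helpers, the 0.7 threshold, and the second sample entry are illustrative assumptions, not part of bdistill.

```python
import io
import json


def load_entries(lines):
    """Parse JSONL: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_entries(entries, min_quality=0.7, validated_only=True):
    """Keep entries at or above a quality threshold (the default is an assumption)."""
    return [
        e for e in entries
        if e.get("quality_score", 0.0) >= min_quality
        and (e.get("validated", False) or not validated_only)
    ]


# Sample data standing in for an exported file; the second entry is invented.
SAMPLE = io.StringIO(
    '{"question": "What causes myocardial infarction?", "domain": "medical", '
    '"quality_score": 0.73, "validated": true, "source_model": "Claude Sonnet 4"}\n'
    '{"question": "Hypothetical low-scoring entry", "domain": "medical", '
    '"quality_score": 0.41, "validated": false, "source_model": "qwen3:4b"}\n'
)

entries = load_entries(SAMPLE)
kept = filter_entries(entries)
print(len(entries), len(kept))
```

To run this against a real export, replace `SAMPLE` with an open file handle; the field names should be checked against the actual output first.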
Tabular ML Data Generation
Generate structured training data for traditional ML models:
/schema sepsis | hr:float, bp:float, temp:float, wbc:float | risk:category[low,moderate,high,critical]
Exports as CSV ready for pandas/sklearn. Each row tracks source_model for cross-model analysis.
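A minimal sketch of consuming such an export with the Python standard library. It assumes the CSV header uses the schema's field names (`hr`, `bp`, `temp`, `wbc`, `risk`) plus a `source_model` column; the exact export layout is an assumption, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

FEATURES = ["hr", "bp", "temp", "wbc"]            # float columns from the schema
LABELS = {"low", "moderate", "high", "critical"}  # allowed values for `risk`

# Invented rows standing in for an exported file (assumed column layout).
SAMPLE_CSV = """hr,bp,temp,wbc,risk,source_model
118,85,39.2,17.5,high,Claude Sonnet 4
72,120,36.8,7.1,low,qwen3:4b
131,70,40.1,22.0,critical,qwen3:4b
"""


def load_rows(handle):
    """Return (features, labels, models), rejecting labels outside the schema."""
    X, y, models = [], [], []
    for row in csv.DictReader(handle):
        if row["risk"] not in LABELS:
            raise ValueError(f"unexpected risk label: {row['risk']!r}")
        X.append([float(row[name]) for name in FEATURES])
        y.append(row["risk"])
        models.append(row["source_model"])
    return X, y, models


X, y, models = load_rows(io.StringIO(SAMPLE_CSV))
rows_per_model = Counter(models)  # starting point for cross-model comparison
print(X[0], y[0], dict(rows_per_model))
```

The resulting `X` and `y` lists have the shape that scikit-learn estimators expect for `fit(X, y)`. With pandas installed, `pandas.read_csv` loads the same table in one call.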
Local Model Extraction (Ollama)
For open-source models running locally:
# Install Ollama from https://ollama.com
ollama serve
ollama pull qwen3:4b
bdistill extract --domain medical --model qwen3:4b
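Before starting an extraction it can help to confirm that the model has actually been pulled. The sketch below queries Ollama's `GET /api/tags` endpoint at its default local address, `http://localhost:11434`. The helper names are illustrative and not part of bdistill, and the network call is made only when the script is run directly.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local address


def model_names(payload):
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def is_pulled(model, payload):
    """True if `model` (e.g. "qwen3:4b") appears among the local models."""
    return model in model_names(payload)


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Fetch the list of local models; requires `ollama serve` to be running."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        print("qwen3:4b pulled:", is_pulled("qwen3:4b", fetch_tags()))
    except (OSError, ValueError) as exc:  # connection failure or malformed response
        print("Ollama not reachable:", exc)
```

If the check reports that the model is missing, running `ollama pull qwen3:4b` again is the likely fix.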
Security & Safety Notes
- In-session extraction uses your existing subscription — no additional API keys
- Local extraction runs entirely on your machine via Ollama
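- bdistill adds no external services of its own: in-session extraction stays within your existing model session, and local extraction stays on your machine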
- Output is reference data, not LLM training format
Related Skills
- @bdistill-behavioral-xray: X-ray a model's behavioral patterns
Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.