feat: Clinics CRM SaaS - complete MVP

- Auth: Login/Register with clinic creation
- Dashboard: real KPIs, recharts charts
- Patients: full CRUD with search
- Agenda: FullCalendar, drag-and-drop, reception view
- Medical records: SOAP notes, vital signs, CIE-10
- Billing: invoices with VAT (IVA), SAT CFDI fields
- Inventory: products, stock, movements, alerts
- Settings: clinic, team, service catalog
- Self-hosted Supabase: 18 tables with multi-tenant RLS
- Docker + Nginx for production

Co-Authored-By: claude-flow <ruv@ruv.net>
---
name: "AgentDB Learning Plugins"
description: "Create and train AI learning plugins with AgentDB's 9 reinforcement learning algorithms. Includes Decision Transformer, Q-Learning, SARSA, Actor-Critic, and more. Use when building self-learning agents, implementing RL, or optimizing agent behavior through experience."
---

# AgentDB Learning Plugins

## What This Skill Does

Provides access to 9 reinforcement learning algorithms via AgentDB's plugin system. Create, train, and deploy learning plugins for autonomous agents that improve through experience. Includes offline RL (Decision Transformer), value-based learning (Q-Learning), policy gradients (Actor-Critic), and advanced techniques.

**Performance**: Train models 10-100x faster with WASM-accelerated neural inference.

## Prerequisites

- Node.js 18+
- AgentDB v1.0.7+ (via agentic-flow)
- Basic understanding of reinforcement learning (recommended)

---

## Quick Start with CLI

### Create Learning Plugin

```bash
# Interactive wizard
npx agentdb@latest create-plugin

# Use a specific template
npx agentdb@latest create-plugin -t decision-transformer -n my-agent

# Preview without creating
npx agentdb@latest create-plugin -t q-learning --dry-run

# Custom output directory
npx agentdb@latest create-plugin -t actor-critic -o ./plugins
```

### List Available Templates

```bash
# Show all plugin templates
npx agentdb@latest list-templates

# Available templates:
# - decision-transformer (sequence-modeling RL - recommended)
# - q-learning (value-based learning)
# - sarsa (on-policy TD learning)
# - actor-critic (policy gradient with baseline)
# - curiosity-driven (exploration-based)
```

### Manage Plugins

```bash
# List installed plugins
npx agentdb@latest list-plugins

# Get plugin information
npx agentdb@latest plugin-info my-agent

# Shows: algorithm, configuration, training status
```

---

## Quick Start with API

```typescript
import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Initialize with learning enabled
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/learning.db',
  enableLearning: true, // Enable learning plugins
  enableReasoning: true,
  cacheSize: 1000,
});

// Store a training experience
await adapter.insertPattern({
  id: '',
  type: 'experience',
  domain: 'game-playing',
  pattern_data: JSON.stringify({
    embedding: await computeEmbedding('state-action-reward'),
    pattern: {
      state: [0.1, 0.2, 0.3],
      action: 2,
      reward: 1.0,
      next_state: [0.15, 0.25, 0.35],
      done: false
    }
  }),
  confidence: 0.9,
  usage_count: 1,
  success_count: 1,
  created_at: Date.now(),
  last_used: Date.now(),
});

// Train the learning model
const metrics = await adapter.train({
  epochs: 50,
  batchSize: 32,
});

console.log('Training Loss:', metrics.loss);
console.log('Duration:', metrics.duration, 'ms');
```
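
The snippet above assumes a `computeEmbedding` helper, which the caller supplies. A minimal deterministic stand-in for local experimentation might look like this (hypothetical; a real deployment should use a proper embedding model):

```typescript
// Hypothetical stand-in for computeEmbedding: hashes text into a fixed-size
// vector. Deterministic and dependency-free, but NOT a semantic embedding.
export async function computeEmbedding(text: string, dim = 64): Promise<number[]> {
  const vec = new Array<number>(dim).fill(0);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    vec[(code * 31 + i) % dim] += 1;
  }
  // L2-normalize so cosine similarity behaves sensibly
  const norm = Math.sqrt(vec.reduce((s, v) => s + v * v, 0)) || 1;
  return vec.map((v) => v / norm);
}
```

Swap this for a real embedding model before relying on similarity search quality.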

---

## Available Learning Algorithms (9 Total)

### 1. Decision Transformer (Recommended)

**Type**: Offline Reinforcement Learning
**Best For**: Learning from logged experiences, imitation learning
**Strengths**: No online interaction needed, stable training

```bash
npx agentdb@latest create-plugin -t decision-transformer -n dt-agent
```

**Use Cases**:
- Learn from historical data
- Imitation learning from expert demonstrations
- Safe learning without environment interaction
- Sequence-modeling tasks

**Configuration**:
```json
{
  "algorithm": "decision-transformer",
  "model_size": "base",
  "context_length": 20,
  "embed_dim": 128,
  "n_heads": 8,
  "n_layers": 6
}
```
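
Decision Transformer conditions on returns-to-go, so logged episodes must first be converted into (return-to-go, state, action) sequences. A sketch of that preprocessing step (the `Step` shape is illustrative, not part of the AgentDB API):

```typescript
interface Step { state: number[]; action: number; reward: number; }

// rtg[t] = reward[t] + gamma * rtg[t+1]: the (discounted) return from step t onward.
function toReturnsToGo(episode: Step[], gamma = 1.0): number[] {
  const rtg = new Array<number>(episode.length).fill(0);
  let running = 0;
  for (let t = episode.length - 1; t >= 0; t--) {
    running = episode[t].reward + gamma * running;
    rtg[t] = running;
  }
  return rtg;
}
```

At inference time, the model is prompted with a desired return-to-go and generates actions that aim to achieve it.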

### 2. Q-Learning

**Type**: Value-Based RL (Off-Policy)
**Best For**: Discrete action spaces, sample efficiency
**Strengths**: Proven, simple, works well for small/medium problems

```bash
npx agentdb@latest create-plugin -t q-learning -n q-agent
```

**Use Cases**:
- Grid worlds, board games
- Navigation tasks
- Resource allocation
- Discrete decision-making

**Configuration**:
```json
{
  "algorithm": "q-learning",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1,
  "epsilon_decay": 0.995
}
```
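
The hyperparameters above map onto the tabular update rule Q(s,a) ← Q(s,a) + α·[r + γ·max_a′ Q(s′,a′) − Q(s,a)], with ε-greedy exploration decayed by `epsilon_decay`. A minimal self-contained sketch (illustrative, not AgentDB internals):

```typescript
type QTable = Map<string, number[]>; // state key -> per-action Q-values

// One tabular Q-Learning update for the transition (s, a, r, s').
function qUpdate(
  q: QTable, s: string, a: number, r: number, s2: string,
  nActions: number, lr = 0.001, gamma = 0.99,
): number {
  const row = (key: string): number[] => {
    let v = q.get(key);
    if (!v) { v = new Array(nActions).fill(0); q.set(key, v); }
    return v;
  };
  const qs = row(s);
  const maxNext = Math.max(...row(s2)); // off-policy: assume the greedy next action
  qs[a] += lr * (r + gamma * maxNext - qs[a]);
  return qs[a];
}
```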

### 3. SARSA

**Type**: Value-Based RL (On-Policy)
**Best For**: Safe exploration, risk-sensitive tasks
**Strengths**: More conservative than Q-Learning, better for safety

```bash
npx agentdb@latest create-plugin -t sarsa -n sarsa-agent
```

**Use Cases**:
- Safety-critical applications
- Risk-sensitive decision-making
- Online learning with exploration

**Configuration**:
```json
{
  "algorithm": "sarsa",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1
}
```
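
SARSA differs from Q-Learning in a single term: the target uses Q(s′,a′) for the action the policy actually takes next, rather than the greedy max. That is what makes it on-policy and more conservative under exploration. Sketch (illustrative, not the AgentDB implementation):

```typescript
type QTable = Map<string, number[]>; // state key -> per-action Q-values

// One tabular SARSA update for (s, a, r, s', a').
function sarsaUpdate(
  q: QTable, s: string, a: number, r: number, s2: string, a2: number,
  nActions: number, lr = 0.001, gamma = 0.99,
): number {
  const row = (key: string): number[] => {
    let v = q.get(key);
    if (!v) { v = new Array(nActions).fill(0); q.set(key, v); }
    return v;
  };
  const qs = row(s);
  const nextQ = row(s2)[a2]; // on-policy: value of the actually-chosen next action
  qs[a] += lr * (r + gamma * nextQ - qs[a]);
  return qs[a];
}
```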

### 4. Actor-Critic

**Type**: Policy Gradient with Value Baseline
**Best For**: Continuous actions, variance reduction
**Strengths**: Stable, works for continuous/discrete actions

```bash
npx agentdb@latest create-plugin -t actor-critic -n ac-agent
```

**Use Cases**:
- Continuous control (robotics, simulations)
- Complex action spaces
- Multi-agent coordination

**Configuration**:
```json
{
  "algorithm": "actor-critic",
  "actor_lr": 0.001,
  "critic_lr": 0.002,
  "gamma": 0.99,
  "entropy_coef": 0.01
}
```
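
The two learning rates drive separate updates from the same TD error δ = r + γ·V(s′) − V(s): the critic moves V(s) toward the target, while the actor scales its policy-gradient step by δ (the advantage estimate). A toy tabular sketch of one step (illustrative only; the real actor update multiplies δ by the log-probability gradient):

```typescript
// One actor-critic step for a single (s, r, s') transition with a tabular critic.
function actorCriticStep(
  v: Map<string, number>, s: string, r: number, s2: string,
  actorLogit: number, actorLr = 0.001, criticLr = 0.002, gamma = 0.99,
): { delta: number; actorLogit: number } {
  const vs = v.get(s) ?? 0;
  const vs2 = v.get(s2) ?? 0;
  const delta = r + gamma * vs2 - vs;     // TD error = advantage estimate
  v.set(s, vs + criticLr * delta);        // critic: move V(s) toward the target
  return { delta, actorLogit: actorLogit + actorLr * delta }; // actor: step along the advantage
}
```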

### 5. Active Learning

**Type**: Query-Based Learning
**Best For**: Label-efficient learning, human-in-the-loop
**Strengths**: Minimizes labeling cost, focuses on uncertain samples

**Use Cases**:
- Human feedback incorporation
- Label-efficient training
- Uncertainty sampling
- Annotation cost reduction

### 6. Adversarial Training

**Type**: Robustness Enhancement
**Best For**: Safety, robustness to perturbations
**Strengths**: Improves model robustness, adversarial defense

**Use Cases**:
- Security applications
- Robust decision-making
- Adversarial defense
- Safety testing

### 7. Curriculum Learning

**Type**: Progressive Difficulty Training
**Best For**: Complex tasks, faster convergence
**Strengths**: Stable learning, faster convergence on hard tasks

**Use Cases**:
- Complex multi-stage tasks
- Hard exploration problems
- Skill composition
- Transfer learning

### 8. Federated Learning

**Type**: Distributed Learning
**Best For**: Privacy, distributed data
**Strengths**: Privacy-preserving, scalable

**Use Cases**:
- Multi-agent systems
- Privacy-sensitive data
- Distributed training
- Collaborative learning

### 9. Multi-Task Learning

**Type**: Transfer Learning
**Best For**: Related tasks, knowledge sharing
**Strengths**: Faster learning on new tasks, better generalization

**Use Cases**:
- Task families
- Transfer learning
- Domain adaptation
- Meta-learning

---

## Training Workflow

### 1. Collect Experiences

```typescript
// Store experiences during agent execution
for (let i = 0; i < numEpisodes; i++) {
  const episode = runEpisode();

  for (const step of episode.steps) {
    await adapter.insertPattern({
      id: '',
      type: 'experience',
      domain: 'task-domain',
      pattern_data: JSON.stringify({
        embedding: await computeEmbedding(JSON.stringify(step)),
        pattern: {
          state: step.state,
          action: step.action,
          reward: step.reward,
          next_state: step.next_state,
          done: step.done
        }
      }),
      confidence: step.reward > 0 ? 0.9 : 0.5,
      usage_count: 1,
      success_count: step.reward > 0 ? 1 : 0,
      created_at: Date.now(),
      last_used: Date.now(),
    });
  }
}
```

### 2. Train Model

```typescript
// Train on the collected experiences
const trainingMetrics = await adapter.train({
  epochs: 100,
  batchSize: 64,
  learningRate: 0.001,
  validationSplit: 0.2,
});

console.log('Training Metrics:', trainingMetrics);
// {
//   loss: 0.023,
//   valLoss: 0.028,
//   duration: 1523,
//   epochs: 100
// }
```

### 3. Evaluate Performance

```typescript
// Retrieve similar successful experiences
const testQuery = await computeEmbedding(JSON.stringify(testState));
const result = await adapter.retrieveWithReasoning(testQuery, {
  domain: 'task-domain',
  k: 10,
  synthesizeContext: true,
});

// Evaluate action quality
const suggestedAction = result.memories[0].pattern.action;
const confidence = result.memories[0].similarity;

console.log('Suggested Action:', suggestedAction);
console.log('Confidence:', confidence);
```

---

## Advanced Training Techniques

### Experience Replay

```typescript
// Store experiences in a buffer
const replayBuffer = [];

// Sample a random batch for training
const batch = sampleRandomBatch(replayBuffer, 32);

// Train on the batch
await adapter.train({
  data: batch,
  epochs: 1,
  batchSize: 32,
});
```
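
`sampleRandomBatch` above is left as a placeholder; a uniform without-replacement version might be sketched as:

```typescript
// Uniformly sample up to `batchSize` distinct items from the replay buffer.
// Hypothetical helper using a partial Fisher-Yates selection over a copy.
function sampleRandomBatch<T>(buffer: T[], batchSize: number): T[] {
  const pool = buffer.slice();
  const batch: T[] = [];
  const n = Math.min(batchSize, pool.length);
  for (let i = 0; i < n; i++) {
    const j = Math.floor(Math.random() * (pool.length - i));
    batch.push(pool[j]);
    pool[j] = pool[pool.length - 1 - i]; // swap out so it can't be drawn again
  }
  return batch;
}
```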

### Prioritized Experience Replay

```typescript
// Store experiences with priority (TD error)
await adapter.insertPattern({
  // ... standard fields
  confidence: tdError, // Use TD error as confidence/priority
  // ...
});

// Retrieve high-priority experiences
const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'task-domain',
  k: 32,
  minConfidence: 0.7, // Only high TD-error experiences
});
```
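
To sample proportionally to priority (rather than thresholding on `minConfidence`), classic proportional prioritized replay draws index i with probability p_i^α / Σ p^α. A sketch (illustrative; not an AgentDB API):

```typescript
// Sample one buffer index with probability proportional to priority^alpha.
function samplePrioritized(priorities: number[], alpha = 0.6): number {
  const weights = priorities.map((p) => Math.pow(Math.max(p, 1e-6), alpha));
  const total = weights.reduce((s, w) => s + w, 0);
  let r = Math.random() * total;
  for (let i = 0; i < weights.length; i++) {
    r -= weights[i];
    if (r <= 0) return i;
  }
  return weights.length - 1; // numerical fallback
}
```

`alpha` interpolates between uniform sampling (0) and fully greedy prioritization (1).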

### Multi-Agent Training

```typescript
// Collect experiences from multiple agents
for (const agent of agents) {
  const experience = await agent.step();

  await adapter.insertPattern({
    // ... store experience with agent ID
    domain: `multi-agent/${agent.id}`,
  });
}

// Train a shared model
await adapter.train({
  epochs: 50,
  batchSize: 64,
});
```

---

## Performance Optimization

### Batch Training

```typescript
// Collect a batch of experiences
const experiences = collectBatch(1000);

// Batch insert (500x faster)
for (const exp of experiences) {
  await adapter.insertPattern({ /* ... */ });
}

// Train on the batch
await adapter.train({
  epochs: 10,
  batchSize: 128, // Larger batch for efficiency
});
```

### Incremental Learning

```typescript
// Train incrementally as new data arrives
setInterval(async () => {
  const newExperiences = getNewExperiences();

  if (newExperiences.length > 100) {
    await adapter.train({
      epochs: 5,
      batchSize: 32,
    });
  }
}, 60000); // Every minute
```

---

## Integration with Reasoning Agents

Combine learning with reasoning for better performance:

```typescript
// Train the learning model
await adapter.train({ epochs: 50, batchSize: 32 });

// Use reasoning agents for inference
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'decision-making',
  k: 10,
  useMMR: true, // Diverse experiences
  synthesizeContext: true, // Rich context
  optimizeMemory: true, // Consolidate patterns
});

// Make a decision based on learned experiences + reasoning
const decision = result.context.suggestedAction;
const confidence = result.memories[0].similarity;
```

---

## CLI Operations

```bash
# Create a plugin
npx agentdb@latest create-plugin -t decision-transformer -n my-plugin

# List plugins
npx agentdb@latest list-plugins

# Get plugin info
npx agentdb@latest plugin-info my-plugin

# List templates
npx agentdb@latest list-templates
```

---

## Troubleshooting

### Issue: Training not converging

```typescript
// Reduce the learning rate
await adapter.train({
  epochs: 100,
  batchSize: 32,
  learningRate: 0.0001, // Lower learning rate
});
```

### Issue: Overfitting

```typescript
// Use a validation split
await adapter.train({
  epochs: 50,
  batchSize: 64,
  validationSplit: 0.2, // 20% validation
});

// Enable memory optimization
await adapter.retrieveWithReasoning(queryEmbedding, {
  optimizeMemory: true, // Consolidate patterns to reduce overfitting
});
```

### Issue: Slow training

Enable quantization for faster inference; binary quantization is up to 32x faster.

---

## Learn More

- **Algorithm Papers**: See docs/algorithms/ for detailed papers
- **GitHub**: https://github.com/ruvnet/agentic-flow/tree/main/packages/agentdb
- **MCP Integration**: `npx agentdb@latest mcp`
- **Website**: https://agentdb.ruv.io

---

**Category**: Machine Learning / Reinforcement Learning
**Difficulty**: Intermediate to Advanced
**Estimated Time**: 30-60 minutes