feat: Clinics CRM SaaS - complete MVP

- Auth: Login/Register with clinic creation
- Dashboard: real KPIs, recharts charts
- Patients: full CRUD with search
- Agenda: FullCalendar, drag-and-drop, reception view
- Medical records: SOAP notes, vital signs, CIE-10
- Billing: invoices with VAT (IVA), SAT CFDI fields
- Inventory: products, stock, movements, alerts
- Settings: clinic, team, service catalog
- Self-hosted Supabase: 18 tables with multi-tenant RLS
- Docker + Nginx for production

Co-Authored-By: claude-flow <ruv@ruv.net>
---
name: "AgentDB Learning Plugins"
description: "Create and train AI learning plugins with AgentDB's 9 reinforcement learning algorithms. Includes Decision Transformer, Q-Learning, SARSA, Actor-Critic, and more. Use when building self-learning agents, implementing RL, or optimizing agent behavior through experience."
---

# AgentDB Learning Plugins

## What This Skill Does

Provides access to 9 reinforcement learning algorithms via AgentDB's plugin system. Create, train, and deploy learning plugins for autonomous agents that improve through experience. Includes offline RL (Decision Transformer), value-based learning (Q-Learning), policy gradients (Actor-Critic), and advanced techniques.

**Performance**: Train models 10-100x faster with WASM-accelerated neural inference.

## Prerequisites

- Node.js 18+
- AgentDB v1.0.7+ (via agentic-flow)
- Basic understanding of reinforcement learning (recommended)

---

## Quick Start with CLI

### Create Learning Plugin

```bash
# Interactive wizard
npx agentdb@latest create-plugin

# Use a specific template
npx agentdb@latest create-plugin -t decision-transformer -n my-agent

# Preview without creating
npx agentdb@latest create-plugin -t q-learning --dry-run

# Custom output directory
npx agentdb@latest create-plugin -t actor-critic -o ./plugins
```

### List Available Templates

```bash
# Show all plugin templates
npx agentdb@latest list-templates

# Available templates:
# - decision-transformer (sequence-modeling RL - recommended)
# - q-learning (value-based learning)
# - sarsa (on-policy TD learning)
# - actor-critic (policy gradient with baseline)
# - curiosity-driven (exploration-based)
```

### Manage Plugins

```bash
# List installed plugins
npx agentdb@latest list-plugins

# Get plugin information
npx agentdb@latest plugin-info my-agent

# Shows: algorithm, configuration, training status
```

---

## Quick Start with API

```typescript
import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Initialize with learning enabled
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/learning.db',
  enableLearning: true, // Enable learning plugins
  enableReasoning: true,
  cacheSize: 1000,
});

// Store a training experience
await adapter.insertPattern({
  id: '',
  type: 'experience',
  domain: 'game-playing',
  pattern_data: JSON.stringify({
    embedding: await computeEmbedding('state-action-reward'),
    pattern: {
      state: [0.1, 0.2, 0.3],
      action: 2,
      reward: 1.0,
      next_state: [0.15, 0.25, 0.35],
      done: false
    }
  }),
  confidence: 0.9,
  usage_count: 1,
  success_count: 1,
  created_at: Date.now(),
  last_used: Date.now(),
});

// Train the learning model
const metrics = await adapter.train({
  epochs: 50,
  batchSize: 32,
});

console.log('Training Loss:', metrics.loss);
console.log('Duration:', metrics.duration, 'ms');
```
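
The snippet above assumes a `computeEmbedding` helper, which the caller supplies. A minimal deterministic stand-in for local experimentation might look like this (hypothetical; a real deployment should use a proper embedding model):

```typescript
// Hypothetical stand-in for computeEmbedding: hashes text into a fixed-size
// vector. Deterministic and dependency-free, but NOT a semantic embedding.
export async function computeEmbedding(text: string, dim = 64): Promise<number[]> {
  const vec = new Array<number>(dim).fill(0);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    vec[(code * 31 + i) % dim] += 1;
  }
  // L2-normalize so cosine similarity behaves sensibly
  const norm = Math.sqrt(vec.reduce((s, v) => s + v * v, 0)) || 1;
  return vec.map((v) => v / norm);
}
```

Swap this for a real embedding model before relying on similarity search quality.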

---

## Available Learning Algorithms (9 Total)

### 1. Decision Transformer (Recommended)

**Type**: Offline Reinforcement Learning
**Best For**: Learning from logged experiences, imitation learning
**Strengths**: No online interaction needed, stable training

```bash
npx agentdb@latest create-plugin -t decision-transformer -n dt-agent
```

**Use Cases**:
- Learn from historical data
- Imitation learning from expert demonstrations
- Safe learning without environment interaction
- Sequence-modeling tasks

**Configuration**:
```json
{
  "algorithm": "decision-transformer",
  "model_size": "base",
  "context_length": 20,
  "embed_dim": 128,
  "n_heads": 8,
  "n_layers": 6
}
```
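
Decision Transformer conditions on returns-to-go, so logged episodes must first be converted into (return-to-go, state, action) sequences. A sketch of that preprocessing step (the `Step` shape is illustrative, not part of the AgentDB API):

```typescript
interface Step { state: number[]; action: number; reward: number; }

// rtg[t] = reward[t] + gamma * rtg[t+1]: the (discounted) return from step t onward.
function toReturnsToGo(episode: Step[], gamma = 1.0): number[] {
  const rtg = new Array<number>(episode.length).fill(0);
  let running = 0;
  for (let t = episode.length - 1; t >= 0; t--) {
    running = episode[t].reward + gamma * running;
    rtg[t] = running;
  }
  return rtg;
}
```

At inference time, the model is prompted with a desired return-to-go and generates actions that aim to achieve it.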

### 2. Q-Learning

**Type**: Value-Based RL (Off-Policy)
**Best For**: Discrete action spaces, sample efficiency
**Strengths**: Proven, simple, works well for small/medium problems

```bash
npx agentdb@latest create-plugin -t q-learning -n q-agent
```

**Use Cases**:
- Grid worlds, board games
- Navigation tasks
- Resource allocation
- Discrete decision-making

**Configuration**:
```json
{
  "algorithm": "q-learning",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1,
  "epsilon_decay": 0.995
}
```
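
The hyperparameters above map onto the tabular update rule Q(s,a) ← Q(s,a) + α·[r + γ·max_a′ Q(s′,a′) − Q(s,a)], with ε-greedy exploration decayed by `epsilon_decay`. A minimal self-contained sketch (illustrative, not AgentDB internals):

```typescript
type QTable = Map<string, number[]>; // state key -> per-action Q-values

// One tabular Q-Learning update for the transition (s, a, r, s').
function qUpdate(
  q: QTable, s: string, a: number, r: number, s2: string,
  nActions: number, lr = 0.001, gamma = 0.99,
): number {
  const row = (key: string): number[] => {
    let v = q.get(key);
    if (!v) { v = new Array(nActions).fill(0); q.set(key, v); }
    return v;
  };
  const qs = row(s);
  const maxNext = Math.max(...row(s2)); // off-policy: assume the greedy next action
  qs[a] += lr * (r + gamma * maxNext - qs[a]);
  return qs[a];
}
```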

### 3. SARSA

**Type**: Value-Based RL (On-Policy)
**Best For**: Safe exploration, risk-sensitive tasks
**Strengths**: More conservative than Q-Learning, better for safety

```bash
npx agentdb@latest create-plugin -t sarsa -n sarsa-agent
```

**Use Cases**:
- Safety-critical applications
- Risk-sensitive decision-making
- Online learning with exploration

**Configuration**:
```json
{
  "algorithm": "sarsa",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1
}
```
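
SARSA differs from Q-Learning in a single term: the target uses Q(s′,a′) for the action the policy actually takes next, rather than the greedy max. That is what makes it on-policy and more conservative under exploration. Sketch (illustrative, not the AgentDB implementation):

```typescript
type QTable = Map<string, number[]>; // state key -> per-action Q-values

// One tabular SARSA update for (s, a, r, s', a').
function sarsaUpdate(
  q: QTable, s: string, a: number, r: number, s2: string, a2: number,
  nActions: number, lr = 0.001, gamma = 0.99,
): number {
  const row = (key: string): number[] => {
    let v = q.get(key);
    if (!v) { v = new Array(nActions).fill(0); q.set(key, v); }
    return v;
  };
  const qs = row(s);
  const nextQ = row(s2)[a2]; // on-policy: value of the actually-chosen next action
  qs[a] += lr * (r + gamma * nextQ - qs[a]);
  return qs[a];
}
```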

### 4. Actor-Critic

**Type**: Policy Gradient with Value Baseline
**Best For**: Continuous actions, variance reduction
**Strengths**: Stable, works for continuous/discrete actions

```bash
npx agentdb@latest create-plugin -t actor-critic -n ac-agent
```

**Use Cases**:
- Continuous control (robotics, simulations)
- Complex action spaces
- Multi-agent coordination

**Configuration**:
```json
{
  "algorithm": "actor-critic",
  "actor_lr": 0.001,
  "critic_lr": 0.002,
  "gamma": 0.99,
  "entropy_coef": 0.01
}
```
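
The two learning rates drive separate updates from the same TD error δ = r + γ·V(s′) − V(s): the critic moves V(s) toward the target, while the actor scales its policy-gradient step by δ (the advantage estimate). A toy tabular sketch of one step (illustrative only; the real actor update multiplies δ by the log-probability gradient):

```typescript
// One actor-critic step for a single (s, r, s') transition with a tabular critic.
function actorCriticStep(
  v: Map<string, number>, s: string, r: number, s2: string,
  actorLogit: number, actorLr = 0.001, criticLr = 0.002, gamma = 0.99,
): { delta: number; actorLogit: number } {
  const vs = v.get(s) ?? 0;
  const vs2 = v.get(s2) ?? 0;
  const delta = r + gamma * vs2 - vs;     // TD error = advantage estimate
  v.set(s, vs + criticLr * delta);        // critic: move V(s) toward the target
  return { delta, actorLogit: actorLogit + actorLr * delta }; // actor: step along the advantage
}
```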

### 5. Active Learning

**Type**: Query-Based Learning
**Best For**: Label-efficient learning, human-in-the-loop
**Strengths**: Minimizes labeling cost, focuses on uncertain samples

**Use Cases**:
- Human feedback incorporation
- Label-efficient training
- Uncertainty sampling
- Annotation cost reduction

### 6. Adversarial Training

**Type**: Robustness Enhancement
**Best For**: Safety, robustness to perturbations
**Strengths**: Improves model robustness, adversarial defense

**Use Cases**:
- Security applications
- Robust decision-making
- Adversarial defense
- Safety testing

### 7. Curriculum Learning

**Type**: Progressive Difficulty Training
**Best For**: Complex tasks, faster convergence
**Strengths**: Stable learning, faster convergence on hard tasks

**Use Cases**:
- Complex multi-stage tasks
- Hard exploration problems
- Skill composition
- Transfer learning

### 8. Federated Learning

**Type**: Distributed Learning
**Best For**: Privacy, distributed data
**Strengths**: Privacy-preserving, scalable

**Use Cases**:
- Multi-agent systems
- Privacy-sensitive data
- Distributed training
- Collaborative learning

### 9. Multi-Task Learning

**Type**: Transfer Learning
**Best For**: Related tasks, knowledge sharing
**Strengths**: Faster learning on new tasks, better generalization

**Use Cases**:
- Task families
- Transfer learning
- Domain adaptation
- Meta-learning

---

## Training Workflow

### 1. Collect Experiences

```typescript
// Store experiences during agent execution
for (let i = 0; i < numEpisodes; i++) {
  const episode = runEpisode();

  for (const step of episode.steps) {
    await adapter.insertPattern({
      id: '',
      type: 'experience',
      domain: 'task-domain',
      pattern_data: JSON.stringify({
        embedding: await computeEmbedding(JSON.stringify(step)),
        pattern: {
          state: step.state,
          action: step.action,
          reward: step.reward,
          next_state: step.next_state,
          done: step.done
        }
      }),
      confidence: step.reward > 0 ? 0.9 : 0.5,
      usage_count: 1,
      success_count: step.reward > 0 ? 1 : 0,
      created_at: Date.now(),
      last_used: Date.now(),
    });
  }
}
```

### 2. Train Model

```typescript
// Train on the collected experiences
const trainingMetrics = await adapter.train({
  epochs: 100,
  batchSize: 64,
  learningRate: 0.001,
  validationSplit: 0.2,
});

console.log('Training Metrics:', trainingMetrics);
// {
//   loss: 0.023,
//   valLoss: 0.028,
//   duration: 1523,
//   epochs: 100
// }
```

### 3. Evaluate Performance

```typescript
// Retrieve similar successful experiences
const testQuery = await computeEmbedding(JSON.stringify(testState));
const result = await adapter.retrieveWithReasoning(testQuery, {
  domain: 'task-domain',
  k: 10,
  synthesizeContext: true,
});

// Evaluate action quality
const suggestedAction = result.memories[0].pattern.action;
const confidence = result.memories[0].similarity;

console.log('Suggested Action:', suggestedAction);
console.log('Confidence:', confidence);
```

---

## Advanced Training Techniques

### Experience Replay

```typescript
// Store experiences in a buffer
const replayBuffer = [];

// Sample a random batch for training
const batch = sampleRandomBatch(replayBuffer, 32);

// Train on the batch
await adapter.train({
  data: batch,
  epochs: 1,
  batchSize: 32,
});
```
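
`sampleRandomBatch` above is left as a placeholder; a uniform without-replacement version might be sketched as:

```typescript
// Uniformly sample up to `batchSize` distinct items from the replay buffer.
// Hypothetical helper using a partial Fisher-Yates selection over a copy.
function sampleRandomBatch<T>(buffer: T[], batchSize: number): T[] {
  const pool = buffer.slice();
  const batch: T[] = [];
  const n = Math.min(batchSize, pool.length);
  for (let i = 0; i < n; i++) {
    const j = Math.floor(Math.random() * (pool.length - i));
    batch.push(pool[j]);
    pool[j] = pool[pool.length - 1 - i]; // swap out so it can't be drawn again
  }
  return batch;
}
```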

### Prioritized Experience Replay

```typescript
// Store experiences with priority (TD error)
await adapter.insertPattern({
  // ... standard fields
  confidence: tdError, // Use TD error as confidence/priority
  // ...
});

// Retrieve high-priority experiences
const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'task-domain',
  k: 32,
  minConfidence: 0.7, // Only high TD-error experiences
});
```
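
To sample proportionally to priority (rather than thresholding on `minConfidence`), classic proportional prioritized replay draws index i with probability p_i^α / Σ p^α. A sketch (illustrative; not an AgentDB API):

```typescript
// Sample one buffer index with probability proportional to priority^alpha.
function samplePrioritized(priorities: number[], alpha = 0.6): number {
  const weights = priorities.map((p) => Math.pow(Math.max(p, 1e-6), alpha));
  const total = weights.reduce((s, w) => s + w, 0);
  let r = Math.random() * total;
  for (let i = 0; i < weights.length; i++) {
    r -= weights[i];
    if (r <= 0) return i;
  }
  return weights.length - 1; // numerical fallback
}
```

`alpha` interpolates between uniform sampling (0) and fully greedy prioritization (1).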

### Multi-Agent Training

```typescript
// Collect experiences from multiple agents
for (const agent of agents) {
  const experience = await agent.step();

  await adapter.insertPattern({
    // ... store experience with agent ID
    domain: `multi-agent/${agent.id}`,
  });
}

// Train a shared model
await adapter.train({
  epochs: 50,
  batchSize: 64,
});
```

---

## Performance Optimization

### Batch Training

```typescript
// Collect a batch of experiences
const experiences = collectBatch(1000);

// Batch insert (500x faster)
for (const exp of experiences) {
  await adapter.insertPattern({ /* ... */ });
}

// Train on the batch
await adapter.train({
  epochs: 10,
  batchSize: 128, // Larger batch for efficiency
});
```

### Incremental Learning

```typescript
// Train incrementally as new data arrives
setInterval(async () => {
  const newExperiences = getNewExperiences();

  if (newExperiences.length > 100) {
    await adapter.train({
      epochs: 5,
      batchSize: 32,
    });
  }
}, 60000); // Every minute
```

---

## Integration with Reasoning Agents

Combine learning with reasoning for better performance:

```typescript
// Train the learning model
await adapter.train({ epochs: 50, batchSize: 32 });

// Use reasoning agents for inference
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'decision-making',
  k: 10,
  useMMR: true, // Diverse experiences
  synthesizeContext: true, // Rich context
  optimizeMemory: true, // Consolidate patterns
});

// Make a decision based on learned experiences + reasoning
const decision = result.context.suggestedAction;
const confidence = result.memories[0].similarity;
```

---

## CLI Operations

```bash
# Create a plugin
npx agentdb@latest create-plugin -t decision-transformer -n my-plugin

# List plugins
npx agentdb@latest list-plugins

# Get plugin info
npx agentdb@latest plugin-info my-plugin

# List templates
npx agentdb@latest list-templates
```

---

## Troubleshooting

### Issue: Training not converging

```typescript
// Reduce the learning rate
await adapter.train({
  epochs: 100,
  batchSize: 32,
  learningRate: 0.0001, // Lower learning rate
});
```

### Issue: Overfitting

```typescript
// Use a validation split
await adapter.train({
  epochs: 50,
  batchSize: 64,
  validationSplit: 0.2, // 20% validation
});

// Enable memory optimization
await adapter.retrieveWithReasoning(queryEmbedding, {
  optimizeMemory: true, // Consolidate patterns to reduce overfitting
});
```

### Issue: Slow training

Enable quantization for faster inference; binary quantization is up to 32x faster.

---

## Learn More

- **Algorithm Papers**: See docs/algorithms/ for detailed papers
- **GitHub**: https://github.com/ruvnet/agentic-flow/tree/main/packages/agentdb
- **MCP Integration**: `npx agentdb@latest mcp`
- **Website**: https://agentdb.ruv.io

---

**Category**: Machine Learning / Reinforcement Learning
**Difficulty**: Intermediate to Advanced
**Estimated Time**: 30-60 minutes