diff --git a/docs/superpowers/specs/2026-03-15-saas-transformation-design.md b/docs/superpowers/specs/2026-03-15-saas-transformation-design.md
index 48946fb..a6974d0 100644
--- a/docs/superpowers/specs/2026-03-15-saas-transformation-design.md
+++ b/docs/superpowers/specs/2026-03-15-saas-transformation-design.md
@@ -38,13 +38,29 @@ PostgreSQL Server (max_connections: 300)
 Existing tables (modified):
 - `tenants` — add `database_name` column, remove `schema_name`
 - `users` — no changes
-- `refresh_tokens` — no changes
+- `refresh_tokens` — flush all existing tokens at migration cutover (invalidate all sessions)
 - `fiel_credentials` — no changes
 
 New tables:
 - `subscriptions` — MercadoPago subscription tracking
 - `payments` — payment history
 
+### Prisma schema migration
+
+The Prisma schema (`apps/api/prisma/schema.prisma`) must be updated:
+- Replace `schema_name String @unique @map("schema_name")` with `database_name String @unique @map("database_name")` on the `Tenant` model
+- Add `Subscription` and `Payment` models
+- Run `prisma migrate dev` to generate and apply the migration
+- Update the `Tenant` type in `packages/shared/src/types/tenant.ts`: replace `schemaName` with `databaseName`
+
+### JWT payload migration
+
+The current JWT payload embeds `schemaName`.
+This must change:
+- Update `JWTPayload` in `packages/shared/src/types/auth.ts`: replace `schemaName` with `databaseName`
+- Update token generation in `auth.service.ts`: read `tenant.databaseName` instead of `tenant.schemaName`
+- Update the `refreshTokens` function to embed `databaseName`
+- At migration cutover: flush the `refresh_tokens` table to invalidate all existing sessions (forces re-login)
+
 ### Client DB naming
 
 Formula: `horux_`
 
@@ -85,10 +101,12 @@ class TenantConnectionManager {
 ```
 
 Pool configuration per tenant:
-- `max`: 5 connections
+- `max`: 3 connections (with 2 PM2 cluster instances, this means 6 connections/tenant max; at 50 tenants = 300, matching `max_connections`)
 - `idleTimeoutMillis`: 300000 (5 min)
 - `connectionTimeoutMillis`: 10000 (10 sec)
 
+**Note on PM2 cluster mode:** Each PM2 worker is a separate Node.js process with its own `TenantConnectionManager` instance. With `instances: 2` and `max: 3` per pool, worst case is 50 tenants × 3 connections × 2 workers = 300 connections, which matches `max_connections = 300`. If scaling beyond 50 tenants, either increase `max_connections` or reduce pool `max` to 2.
+
 ### Tenant middleware change
 
 Current: Sets `search_path` on a shared connection.
 
@@ -105,6 +123,20 @@ req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName)
 
 All tenant service functions change from using a shared pool with schema prefix to using `req.tenantPool` with direct table names.
 
+### Admin impersonation (X-View-Tenant)
+
+The current `X-View-Tenant` header support for admin "view-as" functionality is preserved.
+The new middleware resolves the `databaseName` for the viewed tenant:
+
+```typescript
+// If admin is viewing another tenant
+if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
+  const viewedTenant = await getTenantByRfc(req.headers['x-view-tenant']);
+  req.tenantPool = tenantConnectionManager.getPool(viewedTenant.id, viewedTenant.databaseName);
+} else {
+  req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName);
+}
+```
+
 ### Provisioning flow (new client)
 
 1. Admin creates tenant via UI → POST `/api/tenants/`
@@ -115,6 +147,13 @@
 6. Send welcome email with temporary credentials
 7. Generate MercadoPago subscription link
 
+**Rollback on partial failure:** If any step 3-7 fails:
+- Drop the created database if it exists (`DROP DATABASE IF EXISTS horux_`)
+- Delete the `tenants` row
+- Delete the `users` row if created
+- Return an error to the admin identifying the specific step that failed
+- The entire provisioning is wrapped in a try/catch with explicit cleanup
+
 ### PostgreSQL tuning
 
 ```
@@ -162,7 +201,9 @@ When a client uploads their FIEL (.cer + .key + password):
 - Algorithm: AES-256-GCM
 - Key: `FIEL_ENCRYPTION_KEY` environment variable (separate from other secrets)
-- Each file gets its own IV (initialization vector)
+- **Code change required:** `sat-crypto.service.ts` currently derives the key from `JWT_SECRET` via `createHash('sha256').update(env.JWT_SECRET).digest()`. This must be changed to read `FIEL_ENCRYPTION_KEY` from the env schema. The `env.ts` Zod schema must be updated to declare `FIEL_ENCRYPTION_KEY` as required.
+- Each component (certificate, private key, password) is encrypted separately with its own IV and auth tag. The `fiel_credentials` table stores separate `encryption_iv` and `encryption_tag` per row. The filesystem also stores each file independently encrypted.
+- **Code change required:** The current `sat-crypto.service.ts` shares a single IV/tag across all three components. Refactor to encrypt each component independently with its own IV/tag. Store per-component IVs/tags in the DB (add columns: `cer_iv`, `cer_tag`, `key_iv`, `key_tag`, `password_iv`, `password_tag` — or use a JSON column).
 - Password is encrypted, never stored in plaintext
 
 ### Manual decryption CLI
 
@@ -179,7 +220,7 @@ node scripts/decrypt-fiel.js --rfc CAS2408138W2
 - `/var/horux/fiel/` permissions: `700` (root only)
 - Encrypted files are useless without `FIEL_ENCRYPTION_KEY`
-- `metadata.json` is NOT encrypted (contains only non-sensitive info: serial number, validity dates)
+- `metadata.json` is also encrypted (contains the serial number and RFC, which could be used to query SAT's certificate validation service, violating NDA confidentiality requirements)
 
 ### Upload flow
 
@@ -216,6 +257,9 @@ CREATE TABLE subscriptions (
   updated_at TIMESTAMP NOT NULL DEFAULT NOW()
 );
 
+CREATE INDEX idx_subscriptions_tenant_id ON subscriptions(tenant_id);
+CREATE INDEX idx_subscriptions_status ON subscriptions(status);
+
 CREATE TABLE payments (
   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
   tenant_id UUID NOT NULL REFERENCES tenants(id),
@@ -228,6 +272,9 @@
   paid_at TIMESTAMP,
   created_at TIMESTAMP NOT NULL DEFAULT NOW()
 );
+
+CREATE INDEX idx_payments_tenant_id ON payments(tenant_id);
+CREATE INDEX idx_payments_subscription_id ON payments(subscription_id);
 ```
 
 ### Plans and pricing
 
@@ -398,6 +445,10 @@
 Auto-restart on crash. Log rotation via `pm2-logrotate`.
 
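The per-component AES-256-GCM scheme described in the FIEL encryption section above can be sketched with Node's built-in `crypto` module. This is a sketch only: `encryptComponent`, `decryptComponent`, and `EncryptedComponent` are illustrative names, not the actual `sat-crypto.service.ts` API.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Illustrative shape: mirrors the per-component cer_iv/cer_tag style columns.
interface EncryptedComponent {
  ciphertext: Buffer;
  iv: Buffer;  // unique per component, never shared
  tag: Buffer; // GCM auth tag, verified on decrypt
}

function encryptComponent(plaintext: Buffer, key: Buffer): EncryptedComponent {
  const iv = randomBytes(12); // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { ciphertext, iv, tag: cipher.getAuthTag() };
}

function decryptComponent(enc: EncryptedComponent, key: Buffer): Buffer {
  const decipher = createDecipheriv('aes-256-gcm', key, enc.iv);
  decipher.setAuthTag(enc.tag);
  return Buffer.concat([decipher.update(enc.ciphertext), decipher.final()]);
}

// Each FIEL component is encrypted independently with its own IV/tag.
const key = randomBytes(32); // in production: loaded from FIEL_ENCRYPTION_KEY
const cer = encryptComponent(Buffer.from('certificate bytes'), key);
const privKey = encryptComponent(Buffer.from('private key bytes'), key);
const password = encryptComponent(Buffer.from('fiel password'), key);
```

Reusing one IV across components under the same key is what breaks GCM's guarantees; generating a fresh IV per `encryptComponent` call is the point of the refactor.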
 ### Nginx reverse proxy
 
 ```nginx
+# Rate limiting zone definitions (in http block of nginx.conf)
+limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
+limit_req_zone $binary_remote_addr zone=webhooks:10m rate=30r/m;
+
 server {
     listen 80;
     server_name horux360.consultoria-as.com;
@@ -420,6 +471,11 @@
     gzip on;
     gzip_types text/plain application/json application/javascript text/css;
 
+    # Health check (for monitoring)
+    location /api/health {
+        proxy_pass http://127.0.0.1:4000;
+    }
+
     # Rate limiting for public endpoints
     location /api/auth/ {
         limit_req zone=auth burst=5 nodelay;
@@ -438,7 +494,7 @@
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_set_header X-Forwarded-Proto $scheme;
-        client_max_body_size 1G; # For bulk XML uploads
+        client_max_body_size 200M; # Bulk XML uploads (200MB is enough for ~50k XML files)
     }
 
     # Next.js
@@ -452,6 +508,10 @@
 }
 ```
 
+### Health check endpoint
+
+The existing `GET /health` endpoint returns `{ status: 'ok', timestamp }`. PM2 uses this for liveness checks. Nginx can optionally use it for upstream health monitoring.
+
 ### SSL
 
 Let's Encrypt with certbot. Auto-renewal via cron.
 
@@ -473,29 +533,53 @@
 PostgreSQL only on localhost (no external access).
 
 ### Backups
 
-Cron job at 2:00 AM daily:
+Cron job at **1:00 AM** daily (runs before the SAT cron at 3:00 AM, with enough gap to complete):
+
+**Authentication:** Create a `.pgpass` file at `/root/.pgpass` with `localhost:5432:*:postgres:` and `chmod 600`. This allows `pg_dump` to authenticate without inline passwords.
+
 ```bash
 #!/bin/bash
 # /var/horux/scripts/backup.sh
+set -euo pipefail
+
 BACKUP_DIR=/var/horux/backups
 DATE=$(date +%Y-%m-%d)
+DOW=$(date +%u) # Day of week: 1=Monday, 7=Sunday
+DAILY_DIR=$BACKUP_DIR/daily
+WEEKLY_DIR=$BACKUP_DIR/weekly
+
+mkdir -p $DAILY_DIR $WEEKLY_DIR
 
 # Backup central DB
-pg_dump -h localhost -U postgres horux360 | gzip > $BACKUP_DIR/horux360_$DATE.sql.gz
+pg_dump -h localhost -U postgres horux360 | gzip > $DAILY_DIR/horux360_$DATE.sql.gz
 
 # Backup each tenant DB
-for db in $(psql -h localhost -U postgres -t -c "SELECT database_name FROM tenants WHERE active = true"); do
-  pg_dump -h localhost -U postgres $db | gzip > $BACKUP_DIR/${db}_${DATE}.sql.gz
+for db in $(psql -h localhost -U postgres -t -c "SELECT database_name FROM tenants WHERE active = true" horux360); do
+  db_trimmed=$(echo $db | xargs) # trim whitespace
+  pg_dump -h localhost -U postgres "$db_trimmed" | gzip > $DAILY_DIR/${db_trimmed}_${DATE}.sql.gz
 done
 
-# Remove backups older than 7 days
-find $BACKUP_DIR -name "*.sql.gz" -mtime +7 -delete
+# On Sundays, copy to weekly directory
+if [ "$DOW" -eq 7 ]; then
+  cp $DAILY_DIR/*_${DATE}.sql.gz $WEEKLY_DIR/
+fi
 
-# Keep weekly backups (Sundays) for 4 weeks
-# (daily cleanup skips files from Sundays in the last 28 days)
+# Remove daily backups older than 7 days
+find $DAILY_DIR -name "*.sql.gz" -mtime +7 -delete
+
+# Remove weekly backups older than 28 days
+find $WEEKLY_DIR -name "*.sql.gz" -mtime +28 -delete
+
+# Verify backup files are not empty (catch silent pg_dump failures)
+for f in $DAILY_DIR/*_${DATE}.sql.gz; do
+  if [ ! -s "$f" ]; then
+    echo "WARNING: Empty backup file: $f" >&2
+  fi
+done
 ```
 
+**Schedule separation:** Backups run at 1:00 AM, the SAT cron runs at 3:00 AM. With 50 clients, the backup should complete in ~15-30 minutes, leaving ample gap before the SAT sync starts.
+
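The two schedules above could be wired in a single cron file. A minimal sketch, assuming root runs both jobs and an illustrative SAT script path (the design only specifies its 3:00 AM slot):

```
# /etc/cron.d/horux: backups at 1:00 AM, SAT sync at 3:00 AM
0 1 * * * root /var/horux/scripts/backup.sh >> /var/log/horux-backup.log 2>&1
0 3 * * * root /var/horux/scripts/sat-sync.sh >> /var/log/horux-sat.log 2>&1
```

Running the backup as root keeps the `/root/.pgpass` setup above working for `pg_dump`.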
 ### Environment variables (production)
 
 ```
@@ -534,9 +618,12 @@ async function checkPlanLimits(req, res, next) {
   const tenant = await getTenantWithCache(req.user.tenantId); // cached 5 min
   const subscription = await getActiveSubscription(tenant.id);
 
-  // Check subscription is active
-  if (!subscription || subscription.status !== 'authorized') {
-    // Allow read-only access
+  // Admin-impersonated requests bypass the subscription check entirely
+  // (admin needs to complete client setup regardless of payment status).
+  // This must run before the status check below, or a non-GET admin request
+  // against an inactive tenant would be rejected with 403.
+  if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
+    return next();
+  }
+
+  // Allowed statuses: 'authorized' (paid) or 'pending' (grace period for new clients)
+  const allowedStatuses = ['authorized', 'pending'];
+
+  // Check subscription status
+  if (!subscription || !allowedStatuses.includes(subscription.status)) {
+    // Allow read-only access for cancelled/paused subscriptions
     if (req.method !== 'GET') {
       return res.status(403).json({
         message: 'Suscripción inactiva. Contacta soporte para reactivar.'
@@ -544,10 +631,18 @@
     }
   }
 
   next();
 }
 ```
 
+**Grace period:** New clients start with `status: 'pending'` and have full write access (can upload FIEL, upload CFDIs, etc.). Once the subscription moves to `'cancelled'` or `'paused'` (e.g., a failed payment), write access is revoked. Admin can also manually set the status to `'authorized'` for clients who pay by bank transfer.
+
 ### CFDI limit check
 
 Applied on `POST /api/cfdi/` and `POST /api/cfdi/bulk`: