docs: fix blockers and warnings in SaaS design spec
Fixes from spec review:

- BLOCKER: JWT payload migration (schemaName → databaseName)
- BLOCKER: FIEL encryption key separation from JWT_SECRET
- BLOCKER: PM2 cluster pool count (max:3 × 2 workers = 6/tenant)
- BLOCKER: Pending subscription grace period for new clients
- WARNING: Add indexes on subscriptions/payments tables
- WARNING: Fix Nginx rate limit zone definitions
- WARNING: Fix backup auth (.pgpass), retention, and schedule
- WARNING: Preserve admin X-View-Tenant impersonation
- WARNING: Encrypt metadata.json for NDA compliance
- SUGGESTION: Add health check, reduce upload limit, add rollback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -38,13 +38,29 @@ PostgreSQL Server (max_connections: 300)

Existing tables (modified):

- `tenants` — add `database_name` column, remove `schema_name`
- `users` — no changes
- `refresh_tokens` — flush all existing tokens at migration cutover (invalidate all sessions)
- `fiel_credentials` — no changes

New tables:

- `subscriptions` — MercadoPago subscription tracking
- `payments` — payment history

### Prisma schema migration

The Prisma schema (`apps/api/prisma/schema.prisma`) must be updated:

- Replace `schema_name String @unique @map("schema_name")` with `database_name String @unique @map("database_name")` on the `Tenant` model
- Add `Subscription` and `Payment` models
- Run `prisma migrate dev` to generate and apply migration
- Update `Tenant` type in `packages/shared/src/types/tenant.ts`: replace `schemaName` with `databaseName`
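
A minimal sketch of the updated shared type (only the `schemaName` → `databaseName` rename is specified here; the other fields are assumptions for illustration):

```typescript
// Hypothetical updated Tenant type (packages/shared/src/types/tenant.ts).
// Only the schemaName → databaseName rename is from the spec; other fields are illustrative.
interface Tenant {
  id: string;
  rfc: string;
  databaseName: string; // was: schemaName
  active: boolean;
}

const example: Tenant = {
  id: '00000000-0000-0000-0000-000000000000',
  rfc: 'XAXX010101000',
  databaseName: 'horux_xaxx010101000',
  active: true,
};
```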

### JWT payload migration

The current JWT payload embeds `schemaName`. This must change:

- Update `JWTPayload` in `packages/shared/src/types/auth.ts`: replace `schemaName` with `databaseName`
- Update token generation in `auth.service.ts`: read `tenant.databaseName` instead of `tenant.schemaName`
- Update `refreshTokens` function to embed `databaseName`
- At migration cutover: flush `refresh_tokens` table to invalidate all existing sessions (forces re-login)
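
A sketch of the migrated payload shape and token generation, assuming the payload also carries user and tenant IDs (those field names are assumptions, not taken from `auth.ts`):

```typescript
// Hypothetical shape of the migrated JWT payload (packages/shared/src/types/auth.ts).
interface JWTPayload {
  userId: string;
  tenantId: string;
  databaseName: string; // was: schemaName
  role: 'admin' | 'client';
}

// Token generation now reads tenant.databaseName (auth.service.ts).
function buildPayload(
  user: { id: string; role: 'admin' | 'client' },
  tenant: { id: string; databaseName: string }
): JWTPayload {
  return {
    userId: user.id,
    tenantId: tenant.id,
    databaseName: tenant.databaseName,
    role: user.role,
  };
}
```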

### Client DB naming

Formula: `horux_<rfc_normalized>`
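
For illustration, a plausible normalization (lowercasing and stripping non-alphanumerics; the exact rule is an assumption, since only the `horux_<rfc_normalized>` formula is given):

```typescript
// Hypothetical helper: derive a tenant database name from an RFC.
// Normalization rule (lowercase, strip non-alphanumerics) is an assumption.
function tenantDatabaseName(rfc: string): string {
  const normalized = rfc.toLowerCase().replace(/[^a-z0-9]/g, '');
  return `horux_${normalized}`;
}
```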
@@ -85,10 +101,12 @@ class TenantConnectionManager {
```

Pool configuration per tenant:
- `max`: 3 connections (with 2 PM2 cluster instances, this means 6 connections/tenant max; at 50 tenants = 300, matching `max_connections`)
- `idleTimeoutMillis`: 300000 (5 min)
- `connectionTimeoutMillis`: 10000 (10 sec)

**Note on PM2 cluster mode:** Each PM2 worker is a separate Node.js process with its own `TenantConnectionManager` instance. With `instances: 2` and `max: 3` per pool, worst case is 50 tenants × 3 connections × 2 workers = 300 connections, which matches `max_connections = 300`. If scaling beyond 50 tenants, either increase `max_connections` or reduce pool `max` to 2.
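
The budget arithmetic from the note can be made explicit (helper is illustrative):

```typescript
// Worst-case PostgreSQL connection budget across all PM2 workers:
// every tenant pool filled to `max` in every worker process.
function connectionBudget(tenants: number, poolMax: number, pm2Workers: number): number {
  return tenants * poolMax * pm2Workers;
}
```

With `poolMax` reduced to 2, the same 300-connection ceiling accommodates 75 tenants.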

### Tenant middleware change

Current: Sets `search_path` on a shared connection.
@@ -105,6 +123,20 @@ req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName)

All tenant service functions change from using a shared pool with schema prefix to using `req.tenantPool` with direct table names.
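
For example, a service query's text changes shape roughly like this (the `cfdi` table name is illustrative):

```typescript
// Before: shared pool, schema-qualified table name per tenant.
function queryBefore(schemaName: string): string {
  return `SELECT * FROM ${schemaName}.cfdi WHERE id = $1`;
}

// After: per-tenant pool (req.tenantPool), direct table name.
function queryAfter(): string {
  return 'SELECT * FROM cfdi WHERE id = $1';
}
```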
### Admin impersonation (X-View-Tenant)

The current `X-View-Tenant` header support for admin "view-as" functionality is preserved. The new middleware resolves the `databaseName` for the viewed tenant:

```typescript
// If admin is viewing another tenant
if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
  const viewedTenant = await getTenantByRfc(req.headers['x-view-tenant']);
  req.tenantPool = tenantConnectionManager.getPool(viewedTenant.id, viewedTenant.databaseName);
} else {
  req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName);
}
```

### Provisioning flow (new client)

1. Admin creates tenant via UI → POST `/api/tenants/`
@@ -115,6 +147,13 @@ All tenant service functions change from using a shared pool with schema prefix
6. Send welcome email with temporary credentials
7. Generate MercadoPago subscription link

**Rollback on partial failure:** If any step 3-7 fails:
- Drop the created database if it exists (`DROP DATABASE IF EXISTS horux_<rfc>`)
- Delete the `tenants` row
- Delete the `users` row if created
- Return error to admin with the specific step that failed
- The entire provisioning is wrapped in a try/catch with explicit cleanup
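
The rollback pattern above can be sketched as a generic step runner (names and step structure are illustrative, not the actual service code):

```typescript
// Hypothetical provisioning wrapper: run steps in order; on failure,
// undo completed steps in reverse (drop DB, delete rows, …) and rethrow.
type Step = { name: string; run: () => Promise<void>; undo?: () => Promise<void> };

async function provisionTenant(steps: Step[]): Promise<void> {
  const done: Step[] = [];
  for (const step of steps) {
    try {
      await step.run();
      done.push(step);
    } catch (err) {
      for (const s of done.reverse()) {
        if (s.undo) await s.undo();
      }
      throw new Error(`Provisioning failed at step: ${step.name} (${(err as Error).message})`);
    }
  }
}
```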
### PostgreSQL tuning
```
@@ -162,7 +201,9 @@ When a client uploads their FIEL (.cer + .key + password):

- Algorithm: AES-256-GCM
- Key: `FIEL_ENCRYPTION_KEY` environment variable (separate from other secrets)
- Each file gets its own IV (initialization vector)
- **Code change required:** `sat-crypto.service.ts` currently derives the key from `JWT_SECRET` via `createHash('sha256').update(env.JWT_SECRET).digest()`. This must be changed to read `FIEL_ENCRYPTION_KEY` from the env schema. The `env.ts` Zod schema must be updated to declare `FIEL_ENCRYPTION_KEY` as required.
- Each component (certificate, private key, password) is encrypted separately with its own IV and auth tag. The `fiel_credentials` table stores separate `encryption_iv` and `encryption_tag` per row. The filesystem also stores each file independently encrypted.
- **Code change required:** The current `sat-crypto.service.ts` shares a single IV/tag across all three components. Refactor to encrypt each component independently with its own IV/tag. Store per-component IV/tags in the DB (add columns: `cer_iv`, `cer_tag`, `key_iv`, `key_tag`, `password_iv`, `password_tag` — or use a JSON column).
- Password is encrypted, never stored in plaintext
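
A minimal sketch of per-component AES-256-GCM encryption with Node's `crypto` module, matching the one-IV-and-tag-per-component requirement (function names are illustrative, not the actual `sat-crypto.service.ts` API):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Each component (.cer, .key, password) is encrypted independently,
// so IVs and auth tags are never shared across components.
function encryptComponent(key: Buffer, plaintext: Buffer) {
  const iv = randomBytes(12); // fresh IV per component
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { data, iv, tag: cipher.getAuthTag() };
}

function decryptComponent(key: Buffer, enc: { data: Buffer; iv: Buffer; tag: Buffer }) {
  const decipher = createDecipheriv('aes-256-gcm', key, enc.iv);
  decipher.setAuthTag(enc.tag); // GCM authenticates as well as decrypts
  return Buffer.concat([decipher.update(enc.data), decipher.final()]);
}
```

Calling `encryptComponent` three times (certificate, private key, password) yields three independent `{ data, iv, tag }` triples to persist per row.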
### Manual decryption CLI
@@ -179,7 +220,7 @@ node scripts/decrypt-fiel.js --rfc CAS2408138W2
|
||||
|
||||
- `/var/horux/fiel/` permissions: `700` (root only)
- `/var/horux/fiel/` permissions: `700` (root only)
- Encrypted files are useless without `FIEL_ENCRYPTION_KEY`
- `metadata.json` is also encrypted (contains serial number + RFC which could be used to query SAT's certificate validation service, violating NDA confidentiality requirements)
### Upload flow
@@ -216,6 +257,9 @@ CREATE TABLE subscriptions (
  updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_subscriptions_tenant_id ON subscriptions(tenant_id);
CREATE INDEX idx_subscriptions_status ON subscriptions(status);

CREATE TABLE payments (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
@@ -228,6 +272,9 @@ CREATE TABLE payments (
  paid_at TIMESTAMP,
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_payments_tenant_id ON payments(tenant_id);
CREATE INDEX idx_payments_subscription_id ON payments(subscription_id);
```

### Plans and pricing
@@ -398,6 +445,10 @@ Auto-restart on crash. Log rotation via `pm2-logrotate`.
### Nginx reverse proxy

```nginx
# Rate limiting zone definitions (in http block of nginx.conf)
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=webhooks:10m rate=30r/m;

server {
    listen 80;
    server_name horux360.consultoria-as.com;
@@ -420,6 +471,11 @@ server {
    gzip on;
    gzip_types text/plain application/json application/javascript text/css;

    # Health check (for monitoring)
    location /api/health {
        proxy_pass http://127.0.0.1:4000;
    }

    # Rate limiting for public endpoints
    location /api/auth/ {
        limit_req zone=auth burst=5 nodelay;
@@ -438,7 +494,7 @@ server {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        client_max_body_size 200M; # Bulk XML uploads (200MB is enough for ~50k XML files)
    }

# Next.js
@@ -452,6 +508,10 @@ server {
}
```

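The 200M `client_max_body_size` set above is consistent with typical CFDI XML sizes (the ~4 KB average used here is an assumption):

```typescript
// Rough sanity check for the 200M upload limit:
// ~50,000 CFDI XML files at an assumed ~4 KB average each.
const files = 50_000;
const avgBytes = 4 * 1024; // assumption: typical CFDI XML size
const totalMB = (files * avgBytes) / (1024 * 1024);
```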
### Health check endpoint
The existing `GET /health` endpoint returns `{ status: 'ok', timestamp }`. PM2 uses this for liveness checks. Nginx can optionally use it for upstream health monitoring.
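
A minimal sketch of the handler's response shape (Express wiring omitted; the shape matches the `{ status: 'ok', timestamp }` contract above):

```typescript
// Illustrative liveness handler body; actual route wiring may differ.
function healthHandler(): { status: string; timestamp: string } {
  return { status: 'ok', timestamp: new Date().toISOString() };
}
```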
### SSL
Let's Encrypt with certbot. Auto-renewal via cron.
@@ -473,29 +533,53 @@ PostgreSQL only on localhost (no external access).

### Backups

Cron job at **1:00 AM** daily (runs before SAT cron at 3:00 AM, with enough gap to complete):
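
In crontab form (the log path is an assumption; the SAT sync entry is omitted because its command is not given in this section):

```
# m h dom mon dow  command
0 1 * * * /var/horux/scripts/backup.sh >> /var/log/horux-backup.log 2>&1
```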

**Authentication:** Create a `.pgpass` file at `/root/.pgpass` with `localhost:5432:*:postgres:<password>` and `chmod 600`. This allows `pg_dump` to authenticate without inline passwords.

```bash
#!/bin/bash
# /var/horux/scripts/backup.sh
set -euo pipefail

BACKUP_DIR=/var/horux/backups
DATE=$(date +%Y-%m-%d)
DOW=$(date +%u) # Day of week: 1=Monday, 7=Sunday
DAILY_DIR=$BACKUP_DIR/daily
WEEKLY_DIR=$BACKUP_DIR/weekly

mkdir -p "$DAILY_DIR" "$WEEKLY_DIR"

# Backup central DB
pg_dump -h localhost -U postgres horux360 | gzip > "$DAILY_DIR/horux360_$DATE.sql.gz"

# Backup each tenant DB
for db in $(psql -h localhost -U postgres -t -c "SELECT database_name FROM tenants WHERE active = true" horux360); do
  db_trimmed=$(echo "$db" | xargs) # trim whitespace
  pg_dump -h localhost -U postgres "$db_trimmed" | gzip > "$DAILY_DIR/${db_trimmed}_${DATE}.sql.gz"
done

# On Sundays, copy to weekly directory
if [ "$DOW" -eq 7 ]; then
  cp "$DAILY_DIR"/*_"$DATE".sql.gz "$WEEKLY_DIR"/
fi

# Remove daily backups older than 7 days
find "$DAILY_DIR" -name "*.sql.gz" -mtime +7 -delete

# Remove weekly backups older than 28 days
find "$WEEKLY_DIR" -name "*.sql.gz" -mtime +28 -delete

# Verify backup files are not empty (catch silent pg_dump failures)
for f in "$DAILY_DIR"/*_"$DATE".sql.gz; do
  if [ ! -s "$f" ]; then
    echo "WARNING: Empty backup file: $f" >&2
  fi
done
```
**Schedule separation:** Backups run at 1:00 AM, SAT cron runs at 3:00 AM. With 50 clients, backup should complete in ~15-30 minutes, leaving ample gap before SAT sync starts.
### Environment variables (production)

```
@@ -534,9 +618,12 @@ async function checkPlanLimits(req, res, next) {
  const tenant = await getTenantWithCache(req.user.tenantId); // cached 5 min
  const subscription = await getActiveSubscription(tenant.id);

  // Admin-impersonated requests bypass the subscription check
  // (admin needs to complete client setup regardless of payment status).
  // This must run before the status check, or non-GET admin requests
  // against inactive tenants would be rejected first.
  if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
    return next();
  }

  // Allowed statuses: 'authorized' (paid) or 'pending' (grace period for new clients)
  const allowedStatuses = ['authorized', 'pending'];

  // Check subscription status
  if (!subscription || !allowedStatuses.includes(subscription.status)) {
    // Allow read-only access for cancelled/paused subscriptions
    if (req.method !== 'GET') {
      return res.status(403).json({
        message: 'Suscripción inactiva. Contacta soporte para reactivar.'
      });
    }
  }

  next();
}
```
**Grace period:** New clients start with `status: 'pending'` and have full write access (can upload FIEL, upload CFDIs, etc.). Once the subscription moves to `'cancelled'` or `'paused'` (e.g., failed payment), write access is revoked. Admin can also manually set status to `'authorized'` for clients who pay by bank transfer.
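
The resulting write-access rule can be pinned down in a tiny predicate (illustrative):

```typescript
// Does a subscription status permit write access?
// 'authorized' = paid; 'pending' = grace period for new clients;
// anything else (or no subscription) is read-only.
function allowsWrite(status: string | null): boolean {
  return status !== null && ['authorized', 'pending'].includes(status);
}
```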
### CFDI limit check
Applied on `POST /api/cfdi/` and `POST /api/cfdi/bulk`: