docs: fix blockers and warnings in SaaS design spec
Fixes from spec review:

- BLOCKER: JWT payload migration (schemaName → databaseName)
- BLOCKER: FIEL encryption key separation from JWT_SECRET
- BLOCKER: PM2 cluster pool count (max:3 × 2 workers = 6/tenant)
- BLOCKER: Pending subscription grace period for new clients
- WARNING: Add indexes on subscriptions/payments tables
- WARNING: Fix Nginx rate limit zone definitions
- WARNING: Fix backup auth (.pgpass), retention, and schedule
- WARNING: Preserve admin X-View-Tenant impersonation
- WARNING: Encrypt metadata.json for NDA compliance
- SUGGESTION: Add health check, reduce upload limit, add rollback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PostgreSQL Server (max_connections: 300)

Existing tables (modified):

- `tenants` — add `database_name` column, remove `schema_name`
- `users` — no changes
- `refresh_tokens` — flush all existing tokens at migration cutover (invalidate all sessions)
- `fiel_credentials` — no changes

New tables:

- `subscriptions` — MercadoPago subscription tracking
- `payments` — payment history

### Prisma schema migration

The Prisma schema (`apps/api/prisma/schema.prisma`) must be updated:

- Replace `schema_name String @unique @map("schema_name")` with `databaseName String @unique @map("database_name")` on the `Tenant` model (camelCase field mapped to the snake_case column, so application code reads `tenant.databaseName`)
- Add `Subscription` and `Payment` models
- Run `prisma migrate dev` to generate and apply the migration
- Update the `Tenant` type in `packages/shared/src/types/tenant.ts`: replace `schemaName` with `databaseName`
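As a sketch, the model changes might look like the following. The field lists for `Subscription` and `Payment` are abbreviated to columns named in the SQL later in this spec; treat this as illustrative, not the final schema:

```prisma
model Tenant {
  // ...existing fields...
  databaseName String @unique @map("database_name")
}

model Subscription {
  id        String   @id @default(uuid())
  tenantId  String   @map("tenant_id")
  status    String
  updatedAt DateTime @map("updated_at")

  @@map("subscriptions")
}

model Payment {
  id             String    @id @default(uuid())
  tenantId       String    @map("tenant_id")
  subscriptionId String    @map("subscription_id")
  paidAt         DateTime? @map("paid_at")
  createdAt      DateTime  @default(now()) @map("created_at")

  @@map("payments")
}
```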
### JWT payload migration

The current JWT payload embeds `schemaName`. This must change:

- Update `JWTPayload` in `packages/shared/src/types/auth.ts`: replace `schemaName` with `databaseName`
- Update token generation in `auth.service.ts`: read `tenant.databaseName` instead of `tenant.schemaName`
- Update the `refreshTokens` function to embed `databaseName`
- At migration cutover: flush the `refresh_tokens` table to invalidate all existing sessions (forces re-login)
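A sketch of the updated payload shape; only the `schemaName` → `databaseName` rename is prescribed by this spec, so the other fields here are assumptions about the existing payload:

```typescript
// Hypothetical payload shape — only the databaseName field is
// prescribed by the spec; the rest is assumed for illustration.
interface JWTPayload {
  userId: string;
  tenantId: string;
  role: 'admin' | 'client';
  databaseName: string; // was: schemaName
}

// Token generation now reads tenant.databaseName instead of tenant.schemaName.
function buildJwtPayload(
  user: { id: string; tenantId: string; role: 'admin' | 'client' },
  tenant: { databaseName: string }
): JWTPayload {
  return {
    userId: user.id,
    tenantId: user.tenantId,
    role: user.role,
    databaseName: tenant.databaseName,
  };
}
```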
### Client DB naming

Formula: `horux_<rfc_normalized>`
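A minimal sketch of the formula; the exact normalization (lowercase, strip non-alphanumeric characters) is an assumption — the spec only gives the `horux_<rfc_normalized>` template:

```typescript
// Derive a tenant database name from an RFC.
// Normalization rule (lowercase + strip non-alphanumerics) is assumed.
function tenantDatabaseName(rfc: string): string {
  const normalized = rfc.toLowerCase().replace(/[^a-z0-9]/g, '');
  return `horux_${normalized}`;
}

// tenantDatabaseName('CAS2408138W2') → 'horux_cas2408138w2'
```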
```
…
```

Pool configuration per tenant:

- `max`: 3 connections (with 2 PM2 cluster instances this means 6 connections/tenant max; at 50 tenants = 300, matching `max_connections`)
- `idleTimeoutMillis`: 300000 (5 min)
- `connectionTimeoutMillis`: 10000 (10 sec)

**Note on PM2 cluster mode:** Each PM2 worker is a separate Node.js process with its own `TenantConnectionManager` instance. With `instances: 2` and `max: 3` per pool, the worst case is 50 tenants × 3 connections × 2 workers = 300 connections, which matches `max_connections = 300`. If scaling beyond 50 tenants, either increase `max_connections` or reduce pool `max` to 2.
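The per-process pool cache can be sketched as follows. The `Pool` construction is injected as a factory so the caching logic stands alone; the option names match node-postgres, but treat the wiring as illustrative:

```typescript
// Each PM2 worker process holds its own instance of this manager,
// hence the 2× multiplier in the connection math above.
type PoolFactory = (databaseName: string) => unknown;

// Options mirroring the spec: max 3, idle 5 min, connect timeout 10 s.
const POOL_OPTIONS = {
  max: 3,
  idleTimeoutMillis: 300_000,
  connectionTimeoutMillis: 10_000,
};

class TenantConnectionManager {
  private pools = new Map<string, unknown>();

  constructor(private createPool: PoolFactory) {}

  // One pool per tenant, created lazily on first request and reused after.
  getPool(tenantId: string, databaseName: string): unknown {
    let pool = this.pools.get(tenantId);
    if (pool === undefined) {
      pool = this.createPool(databaseName);
      this.pools.set(tenantId, pool);
    }
    return pool;
  }
}
```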
### Tenant middleware change

Current: Sets `search_path` on a shared connection.

New: the middleware attaches a per-tenant pool to the request: `req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName)`.

All tenant service functions change from using a shared pool with a schema prefix to using `req.tenantPool` with direct table names.
### Admin impersonation (X-View-Tenant)

The current `X-View-Tenant` header support for admin "view-as" functionality is preserved. The new middleware resolves the `databaseName` for the viewed tenant:

```typescript
// If an admin is viewing another tenant
if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
  const viewedTenant = await getTenantByRfc(req.headers['x-view-tenant']);
  req.tenantPool = tenantConnectionManager.getPool(viewedTenant.id, viewedTenant.databaseName);
} else {
  req.tenantPool = tenantConnectionManager.getPool(tenant.id, tenant.databaseName);
}
```
### Provisioning flow (new client)

1. Admin creates tenant via UI → POST `/api/tenants/`
…
6. Send welcome email with temporary credentials
7. Generate MercadoPago subscription link

**Rollback on partial failure:** If any of steps 3-7 fails:

- Drop the created database if it exists (`DROP DATABASE IF EXISTS horux_<rfc>`)
- Delete the `tenants` row
- Delete the `users` row if created
- Return an error to the admin identifying the specific step that failed
- The entire provisioning is wrapped in a try/catch with explicit cleanup
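The try/catch-with-cleanup pattern can be sketched as below. Every name here (`ProvisionStep`, `provisionTenant`) is hypothetical — the spec does not name the actual service functions:

```typescript
// Hypothetical step interface: each step does work and knows how to undo it.
interface ProvisionStep {
  name: string;
  run: () => Promise<void>;
  undo: () => Promise<void>;
}

// Run steps in order; on failure, undo completed steps in reverse order
// (e.g. DROP DATABASE IF EXISTS, delete the tenants row, delete the users
// row) and report which step failed, mirroring the rollback list above.
async function provisionTenant(steps: ProvisionStep[]): Promise<void> {
  const done: ProvisionStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      done.push(step);
    } catch (err) {
      for (const completed of done.reverse()) {
        await completed.undo();
      }
      throw new Error(`Provisioning failed at step "${step.name}": ${err}`);
    }
  }
}
```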
### PostgreSQL tuning

```
…
```

When a client uploads their FIEL (.cer + .key + password):

- Algorithm: AES-256-GCM
- Key: `FIEL_ENCRYPTION_KEY` environment variable (separate from other secrets)
- **Code change required:** `sat-crypto.service.ts` currently derives the key from `JWT_SECRET` via `createHash('sha256').update(env.JWT_SECRET).digest()`. This must change to read `FIEL_ENCRYPTION_KEY` from the env schema, and the `env.ts` Zod schema must declare `FIEL_ENCRYPTION_KEY` as required.
- Each component (certificate, private key, password) is encrypted separately with its own IV and auth tag. The `fiel_credentials` table stores separate `encryption_iv` and `encryption_tag` per row. The filesystem also stores each file independently encrypted.
- **Code change required:** The current `sat-crypto.service.ts` shares a single IV/tag across all three components. Refactor to encrypt each component independently with its own IV/tag, and store per-component IVs/tags in the DB (add columns `cer_iv`, `cer_tag`, `key_iv`, `key_tag`, `password_iv`, `password_tag`, or use a JSON column).
- The password is encrypted, never stored in plaintext
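The per-component encryption can be sketched with `node:crypto`. Function names are illustrative, not the actual `sat-crypto.service.ts` API; the point is that each component gets its own IV and auth tag:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

interface EncryptedBlob { ciphertext: Buffer; iv: Buffer; tag: Buffer }

// Encrypt one component (cer, key, or password) with AES-256-GCM.
// A fresh 96-bit IV is generated per call, so no two components share one.
function encryptComponent(plaintext: Buffer, key: Buffer): EncryptedBlob {
  const iv = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { ciphertext, iv, tag: cipher.getAuthTag() };
}

// Decrypt using the per-component IV and auth tag stored alongside it.
function decryptComponent(blob: EncryptedBlob, key: Buffer): Buffer {
  const decipher = createDecipheriv('aes-256-gcm', key, blob.iv);
  decipher.setAuthTag(blob.tag);
  return Buffer.concat([decipher.update(blob.ciphertext), decipher.final()]);
}
```

The stored columns (`cer_iv`, `cer_tag`, etc.) then map one-to-one onto the `iv` and `tag` fields of each blob.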
### Manual decryption CLI

`node scripts/decrypt-fiel.js --rfc CAS2408138W2`

- `/var/horux/fiel/` permissions: `700` (root only)
- Encrypted files are useless without `FIEL_ENCRYPTION_KEY`
- `metadata.json` is also encrypted (it contains the serial number + RFC, which could be used to query SAT's certificate validation service, violating NDA confidentiality requirements)
### Upload flow

…

```sql
CREATE TABLE subscriptions (
  …
  updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_subscriptions_tenant_id ON subscriptions(tenant_id);
CREATE INDEX idx_subscriptions_status ON subscriptions(status);

CREATE TABLE payments (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
  …
  paid_at TIMESTAMP,
  created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_payments_tenant_id ON payments(tenant_id);
CREATE INDEX idx_payments_subscription_id ON payments(subscription_id);
```
### Plans and pricing

…

Auto-restart on crash. Log rotation via `pm2-logrotate`.

### Nginx reverse proxy

```nginx
# Rate limiting zone definitions (in the http block of nginx.conf)
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=webhooks:10m rate=30r/m;

server {
    listen 80;
    server_name horux360.consultoria-as.com;
    # …
    gzip on;
    gzip_types text/plain application/json application/javascript text/css;

    # Health check (for monitoring)
    location /api/health {
        proxy_pass http://127.0.0.1:4000;
    }

    # Rate limiting for public endpoints
    location /api/auth/ {
        limit_req zone=auth burst=5 nodelay;
        # …
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        client_max_body_size 200M; # Bulk XML uploads (200MB is enough for ~50k XML files)
    }

    # Next.js
    # …
}
```
### Health check endpoint

The existing `GET /health` endpoint returns `{ status: 'ok', timestamp }`. PM2 uses this for liveness checks. Nginx can optionally use it for upstream health monitoring.
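The response shape can be sketched as follows; the framework wiring is omitted, and the ISO-string timestamp format is an assumption:

```typescript
// Shape of the documented /health response: { status: 'ok', timestamp }.
function healthPayload(now: Date = new Date()): { status: 'ok'; timestamp: string } {
  return { status: 'ok', timestamp: now.toISOString() };
}
```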
### SSL

Let's Encrypt with certbot. Auto-renewal via cron.
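A renewal cron entry might look like the following; the weekly schedule (chosen to avoid the 1:00 AM backup and 3:00 AM SAT cron) and the reload hook are assumptions, and certbot's packaged systemd timer may already cover this:

```
0 4 * * 1 certbot renew --quiet --deploy-hook "systemctl reload nginx"
```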
PostgreSQL only on localhost (no external access).

### Backups

Cron job at **1:00 AM** daily (runs before the SAT cron at 3:00 AM, with enough gap to complete):

**Authentication:** Create a `.pgpass` file at `/root/.pgpass` with `localhost:5432:*:postgres:<password>` and `chmod 600`. This lets `pg_dump` authenticate without inline passwords.

```bash
#!/bin/bash
# /var/horux/scripts/backup.sh

set -euo pipefail

BACKUP_DIR=/var/horux/backups
DATE=$(date +%Y-%m-%d)
DOW=$(date +%u)  # Day of week: 1=Monday, 7=Sunday
DAILY_DIR=$BACKUP_DIR/daily
WEEKLY_DIR=$BACKUP_DIR/weekly

mkdir -p "$DAILY_DIR" "$WEEKLY_DIR"

# Backup central DB
pg_dump -h localhost -U postgres horux360 | gzip > "$DAILY_DIR/horux360_$DATE.sql.gz"

# Backup each tenant DB
for db in $(psql -h localhost -U postgres -t -c "SELECT database_name FROM tenants WHERE active = true" horux360); do
  db_trimmed=$(echo "$db" | xargs)  # trim whitespace
  pg_dump -h localhost -U postgres "$db_trimmed" | gzip > "$DAILY_DIR/${db_trimmed}_${DATE}.sql.gz"
done

# On Sundays, copy to the weekly directory
if [ "$DOW" -eq 7 ]; then
  cp "$DAILY_DIR"/*_"${DATE}".sql.gz "$WEEKLY_DIR"/
fi

# Remove daily backups older than 7 days
find "$DAILY_DIR" -name "*.sql.gz" -mtime +7 -delete

# Remove weekly backups older than 28 days
find "$WEEKLY_DIR" -name "*.sql.gz" -mtime +28 -delete

# Verify backup files are not empty (catch silent pg_dump failures)
for f in "$DAILY_DIR"/*_"${DATE}".sql.gz; do
  if [ ! -s "$f" ]; then
    echo "WARNING: Empty backup file: $f" >&2
  fi
done
```

**Schedule separation:** Backups run at 1:00 AM, the SAT cron runs at 3:00 AM. With 50 clients, a backup should complete in ~15-30 minutes, leaving an ample gap before the SAT sync starts.
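The crontab entry for the 1:00 AM schedule above might look like this (the log path is an assumption):

```
0 1 * * * /var/horux/scripts/backup.sh >> /var/log/horux-backup.log 2>&1
```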
### Environment variables (production)

```
…
```
```typescript
async function checkPlanLimits(req, res, next) {
  const tenant = await getTenantWithCache(req.user.tenantId); // cached 5 min
  const subscription = await getActiveSubscription(tenant.id);

  // Admin-impersonated requests bypass the subscription check
  // (admin needs to complete client setup regardless of payment status)
  if (req.headers['x-view-tenant'] && req.user.role === 'admin') {
    return next();
  }

  // Allowed statuses: 'authorized' (paid) or 'pending' (grace period for new clients)
  const allowedStatuses = ['authorized', 'pending'];

  // Check subscription status
  if (!subscription || !allowedStatuses.includes(subscription.status)) {
    // Allow read-only access for cancelled/paused subscriptions
    if (req.method !== 'GET') {
      return res.status(403).json({
        message: 'Suscripción inactiva. Contacta soporte para reactivar.'
      });
    }
  }

  next();
}
```

Note that the admin bypass must run before the status check; otherwise a non-GET admin request on an inactive subscription would hit the 403 first.

**Grace period:** New clients start with `status: 'pending'` and have full write access (can upload FIEL, upload CFDIs, etc.). Once the subscription moves to `'cancelled'` or `'paused'` (e.g. after a failed payment), write access is revoked. The admin can also manually set the status to `'authorized'` for clients who pay by bank transfer.
### CFDI limit check

Applied on `POST /api/cfdi/` and `POST /api/cfdi/bulk`: