Hacker Newsnew | past | comments | ask | show | jobs | submit | Raviteja_'s commentslogin

This is Ravi, the developer who built this. Few months back, I was building an AI app and realized my users data was going to OpenAI in plaintext. I searched for a local firewall something that would sit between my app and the LLM provider, inspect everything, and block threats. Nothing open-source existed that was complete. So I built Sentinel Protocol. It works like this:

1. You run one command: npx sentinel bootstrap --profile minimal 2. It starts a proxy at localhost:8787 3. You point your existing OpenAI SDK at localhost:8787 instead 4. Done.

Every prompt and response now goes through 81 security engines on YOUR machine. What makes it different from LLM Guard / Rebuff / NeMo Guardrails: - Runs 100% locally (they mostly require cloud) - 9 npm dependencies (they have 50–200+) Formal verification with TLA+ and Alloy specs Self-healing immune system that learns from attacks Federated threat mesh - share threat signatures with peers. The thing I'm most proud of is the self-healing engine. When it detects a new attack pattern it has never seen, it auto-generates a blocking rule for future attacks. It gets smarter every day.

Happy to answer any questions about the architecture, the security model, or specific engines.


I open sourced Sentinel Protocol today.

52,069 lines of code. 81 security engines. 9 runtime dependencies. MIT.

It's a local proxy that sits between your app and any LLM (OpenAI, Anthropic, Gemini, Ollama) and enforces PII protection, injection defense, output scanning, and a full audit trail completely on your machine. Zero cloud calls for security work.

One command: npx --yes --package sentinel-protocol sentinel bootstrap --profile paranoid --mode enforce --dashboard


I've been quietly building this forlast few months. Today I am open-sourcing it.

One thing that kept bothering me is that teams I worked with were sending raw user input to OpenAI with literally zero filtering. Not even a regex. Users would type their SSN or credit card number, it would go straight to the API, and nobody noticed. There's no safety net here. The model doesn't care. The SDK doesn't care. Your app doesn't care.

So I built Sentinel Protocol. It's a local proxy that sits between your app and any LLM API - OpenAI, Anthropic, Google Gemini, Ollama,etc and enforces security on every request.

What it actually does: On the way in: - Scans for 40+ PII types (SSN, credit card, email, phone, passport, tax ID, AWS keys, API tokens, etc.), blocks critical ones, silently redacts medium ones - Neural injection classifier (built a custom LFRL engine - rule language plus ML scoring) + regex + semantic similarity - layered defense - MCP poisoning detection for agentic apps using tool calls - Loop detection, intent drift tracking, swarm isolation for multi-agent systems - Deception engine that intentionally returns fake responses to detected attackers - Cold start analyzer (heightened sensitivity during first N seconds of warmup)

On the way out: - Output classifier for toxicity, code execution, hallucination signals, unauthorized disclosure - Hallucination tripwire (catches fabricated URLs, nonexistent citations, numeric contradictions in the model's own response) - Real-time PII redaction in SSE/streaming responses — not after the stream, during - Stego exfil detection (zero-width characters, invisible Unicode used to embed data in model output - real attack vector) - Token watermarking with timing-safe verification

Governance: - OWASP LLM Top 10 - all 10 categories covered - MITRE ATLAS threat attribution on every blocked event - JSONL audit log at ~/.sentinel/audit.jsonl (grep-friendly, plain text, yours) - Forensic debugger with full replay capability — change a config, re-run any blocked request against the new settings - AIBOM (AI Bill of Materials) generator for compliance - TLA+ and Alloy formal verification specs included

Numbers: - 52,069 lines of source code - 81 security engines - 139 test suites, 567 tests, 0 failures - 306 linted files, 0 warnings - 9 total runtime dependencies (yes, nine — I kept it tight on purpose) - <5ms p95 proxy overhead - Zero cloud dependency — everything runs on your machine

Start with one command: npx --yes --package sentinel-protocol sentinel bootstrap --profile paranoid --mode enforce --dashboard

Drop-in for any OpenAI SDK — change baseURL to http://127.0.0.1:8787/v1 and add the x-sentinel-target header. That's it.

I wanted to build something that could run in a hospital, a law firm, or a two-person startup with the same trust model: your data doesn't leave your machine.

GitHub: https://github.com/myProjectsRavi/sentinel-protocol npm: sentinel-protocol (v1.2.7, MIT)

Expecting feedback from every developer.


3 Tools you are overpaying for:

- Prompt Compressor ($10/mo)

- PII Redactor ($20/mo)

- Fake Data Gen ($15/mo)

Risk Mirror does all 3 in the browser.

The Text Suite is $0 right now.

No card required. No logs kept.

Try for free. Feedback would be great !


Every time you paste a stack trace into ChatGPT, you might be leaking:

- User session tokens

- Database connection strings

- API keys from env variables

I built Risk Mirror to scan and redact sensitive data BEFORE it touches any AI.

It's deterministic (no AI used for scanning because that would defeat the purpose).

Feedback welcome !


Stop burning $20/mo on Claude credits. I unlocked the Prompt Optimizer for free.

If you're hitting the message cap on Claude/Cursor, you're sending too much fluff. "Please", "Thank you," and verbose context contexts are eating 30% of your token budget.

I built Risk Mirror to mathematically compress prompts (removing filler, preserving logic).

It’s usually a Pro feature, but is completely free for now while I benchmark compression rates.

Free Tools Included:

* Prompt Optimizer (Save 40% tokens)

* Safe Share (Redact PII from LOGS/Text instantly)

* Risk Scanner (Check prompts before pasting)

* Clarity Analyzer (Fix vague inputs)

Grab it before I have to close the free tier


If your Cursor/Claude credits vanish fast, it’s probably prompt bloat.

Polite filler + repeated context + messy JSON = wasted tokens.

Risk Mirror compresses prompts 20–40% without changing meaning.

More credits. Same results.

Try free: https://risk-mirror.vercel.app


Hey all,

I'm the creator of the PII Firewall Edge API (currently on RapidAPI).

I saw a lot of devs struggling to implement safety guardrails correctly—most were just using basic regex or heavy LLMs that hallucinate.

So, I decided to package my API into a full-featured UI/Toolkit called Risk Mirror.

What it does: It sits between your users and your LLM (OpenAI/Anthropic) and strips out sensitive data before it leaves your server.

The Tech (Zero AI Inference): Instead of asking an LLM "is this safe?", I use:

152 PII Types: My custom engine covers everything from US Social Security Numbers to Indian Aadhaar cards and HIPAA identifiers. Shannon Entropy: To detect high-entropy strings (API keys, passwords) that regex misses. Deterministic Rules: 100% consistency. No "maybe." Why use this?

It's Tested: The underlying API engine is already battle-tested. It's Fast: <10ms latency.

Includes a 'Twin Dataset' generator for Data Scientists (redact CSVs securely). Feedback welcome!"


Great catch! Emails with spaces around @ (like "test @ example.com") slip through. This is a classic obfuscation bypass.

The current pattern intentionally matches RFC 5321 compliant emails (no spaces). Adding support for spaced variants creates a trade off. wewould catch more bypass attempts but also increase false positives on text like "send @ 5pm". I'll add this to the roadmap. Appreciate the feedback ! this is exactly the kind of edge case I need to hear about to make my api more better


Quick technical notes for HN:

Why no AI?

The irony of sending PII to an AI model to detect PII is lost on most "privacy" APIs. This is pure algorithmic detection – the same approach your credit card company uses to validate card numbers.

What's validated (not just pattern-matched): - Credit cards → Luhn checksum - Aadhaar → Verhoeff (the algorithm that catches single-digit and transposition errors) - IBAN → Mod 97 (same as banks use) - Singapore NRIC → Mod 11 with offset - Brazilian CPF → Dual Mod 11

Latency breakdown: - Heuristic scan: O(n) single pass for trigger characters (@, -, digits) - Pattern matching: Only runs if triggers found - Validation: Only on pattern matches - Total: 2-5ms for /fast, 5-15ms for /deep

False positive mitigation: - "Order ID: 123-45-6789" won't trigger SSN (negative context) - Timestamps won't match phone patterns (separator requirements) - Random 16-digit numbers won't trigger credit card (Luhn must pass)


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: