diff --git a/techniques/nepenthes/technical_nepenthes.md b/techniques/nepenthes/technical_nepenthes.md
new file mode 100644
index 0000000..8920e16
--- /dev/null
+++ b/techniques/nepenthes/technical_nepenthes.md
@@ -0,0 +1,37 @@
+# Nepenthes — Infinite Tarpit for Content Protection
+
+**Nepenthes** is a lightweight, self-hosted tarpit system designed to trap and waste the resources of non-compliant AI crawlers. Created by Aaron (zadzmo), it generates procedurally infinite, nonsensical web pages filled with Markov chain text and endless links. Compliant crawlers that honor `robots.txt` never see it; aggressive scrapers that ignore `Disallow` rules become trapped in an ever-expanding maze of garbage content.
+
+## Why Nepenthes Matters
+
+Traditional polite mechanisms (`robots.txt`, `ai.txt`, opt-out forms) have proven ineffective against frontier AI labs and their contractors. Nepenthes flips the economic model: instead of the content creator bearing bandwidth and compute costs, the scraper is forced to spend time, bandwidth, and storage on worthless data. A single persistent crawler can be held for hours or days, dramatically raising the marginal cost of unauthorized ingestion.
+
+This directly implements the "tarpit" layer described in Section 4.2 of the primary dissertation and complements the proof-of-work protection provided by Anubis.
+
+## Key Features for Individuals
+
+- **Zero ongoing maintenance** - Once deployed behind a `Disallow` path, it runs autonomously.
+- **Extremely low resource usage** - Designed to serve infinite content with minimal CPU and memory.
+- **Seamless integration** - Works alongside the aggressive-bot UA list in `known-aggressive-bot-user-agents.md`.
+- **Multiple deployment modes** - Docker, Python source, or static file generation (Quixotic style).
+- **Open source** - Transparent and auditable.
+
+## How It Fits the Defense Stack
+
+1. **Anubis** (`anubis.md`) - First filter (PoW challenge for suspicious clients).
+2. **Nepenthes** (this document) - Second filter for any crawler that bypasses or ignores the PoW.
+3. **Active denial techniques** (decompression bombs, malformed content, slowloris) - Third layer for persistent offenders.
+4. **UA reference list** (`known-aggressive-bot-user-agents.md`) - Shared intelligence used by all layers.
+
+Nepenthes is the natural next step after Anubis. It ensures that even if a sophisticated scraper eventually solves the proof-of-work, it still pays a heavy ongoing cost.
+
+## Official Resources
+
+- Project: https://zadzmo.org/code/nepenthes/
+- Coverage: Ars Technica, "AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt" (28 Jan 2025)
+
+## Recommended Starting Point
+
+Place Nepenthes behind any path listed in `Disallow` (e.g., `/tarpit/`, `/garbage/`). Only non-compliant user-agents will ever reach it. When combined with Anubis, the two tools form a powerful, low-cost passive perimeter that returns control to the individual creator.
+
+*Nepenthes is the cornerstone tarpit technology in the passive defense layer. All other techniques in this repository are designed to work alongside it.*
\ No newline at end of file