Add Redis storage backend #3407

@vdusek

Description

Motivation

Crawlee Python already ships a Redis storage backend (RedisStorageClient) using redis[hiredis]. It covers all three storage types — Dataset, KeyValueStore, and RequestQueue — with Lua scripts for atomic operations and support for both exact (set-based) and probabilistic (Bloom filter) request deduplication.

A Redis backend in JS Crawlee would enable distributed crawling, shared state across processes/machines, and better scalability for high-throughput workloads.

Scope

  • Implement a Redis-based storage client for JS Crawlee covering Dataset, KeyValueStore, and RequestQueue.
  • Port the Lua scripts for atomic fetch, add, and stale request reclaim operations.
  • Support configurable deduplication strategies (Redis sets vs Bloom filters).
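The semantics the ported scripts would need to preserve can be sketched as a single-process, in-memory model. This is only an illustration under stated assumptions: `Deduplicator`, `ExactDeduplicator`, `InMemoryQueueModel`, `QueuedRequest`, and the lease-based reclaim are hypothetical names and design choices, not existing Crawlee APIs and not necessarily how the Python `RedisStorageClient` works. In a Redis implementation, each method body would presumably become one Lua script so that it executes atomically.

```typescript
// Hypothetical names throughout; none of these are Crawlee APIs.
interface Deduplicator {
  /** Returns true if the key was new and is now recorded. */
  addIfNew(key: string): boolean;
}

// Exact strategy, analogous to a Redis set (SADD reports whether a member was new).
// A probabilistic strategy (Bloom filter) could implement the same interface.
class ExactDeduplicator implements Deduplicator {
  private readonly seen = new Set<string>();

  addIfNew(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

interface QueuedRequest {
  uniqueKey: string;
  url: string;
}

interface Lease {
  request: QueuedRequest;
  leaseUntil: number; // timestamp in ms after which the request counts as stale
}

class InMemoryQueueModel {
  private readonly pending: QueuedRequest[] = [];
  private readonly inProgress = new Map<string, Lease>();
  private readonly dedup: Deduplicator;
  private readonly leaseMs: number;

  constructor(dedup: Deduplicator, leaseMs: number) {
    this.dedup = dedup;
    this.leaseMs = leaseMs;
  }

  /** Add: enqueue only if the unique key has not been seen before. */
  add(request: QueuedRequest): boolean {
    if (!this.dedup.addIfNew(request.uniqueKey)) return false;
    this.pending.push(request);
    return true;
  }

  /** Fetch: move the head of the pending list to in-progress under a lease. */
  fetchNext(now: number): QueuedRequest | null {
    const request = this.pending.shift();
    if (request === undefined) return null;
    this.inProgress.set(request.uniqueKey, { request, leaseUntil: now + this.leaseMs });
    return request;
  }

  /** Mark handled: release the lease so the request is never reclaimed. */
  markHandled(uniqueKey: string): boolean {
    return this.inProgress.delete(uniqueKey);
  }

  /** Reclaim: return requests whose lease has expired to the pending list. */
  reclaimStale(now: number): number {
    let reclaimed = 0;
    this.inProgress.forEach((lease, key) => {
      if (lease.leaseUntil <= now) {
        this.inProgress.delete(key);
        this.pending.push(lease.request);
        reclaimed++;
      }
    });
    return reclaimed;
  }
}
```

Keeping deduplication behind its own interface means the queue logic would not need to change when switching between the exact and the probabilistic strategy.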

Blockers

Python reference

Metadata

Assignees

No one assigned

    Labels

    feature: Issues that represent new features or improvements to existing features.
    t-tooling: Issues with this label are in the ownership of the tooling team.
