RWKV Work Summary (as of July 2025)
Overview
You are building a multi-agent AI cluster using RWKV-7 models. Your focus is on low-resource, efficient models that support long-term memory, recursion, and modular specialization. You’re using RWKV due to its linear-time, constant-space RNN-style architecture, making it ideal for edge devices and scalable, recursive reasoning.
System Architecture
- Cluster Composition:
  - Node 0 (192.168.1.30): central file share and primary comms.
  - Node 1 (192.168.1.31): 26 CPUs, 32 GB RAM, NVIDIA 2060 GPU.
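Since MPICH handles inter-node message passing, the nodes above would typically be listed in a hostfile for its Hydra launcher. The per-node process counts below are assumptions for illustration, not measured capacities:

```
# hypothetical MPICH hostfile (hosts.txt): host:processes
192.168.1.30:1
192.168.1.31:4
```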
- Model Deployment:
  - RWKV-7 is the focus; RWKV-65M is used as the fallback base.
  - Tokenizer sourced from `rwkv_pip_package/src/rwkv/rwkv_tokenizer.py`.
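The RWKV-7 primary / RWKV-65M fallback choice can be sketched as a path check at node startup. The helper name and weight paths here are hypothetical, not the cluster's actual layout:

```python
import os

def pick_model_path(primary="models/rwkv-7.pth", fallback="models/rwkv-65m.pth"):
    """Return the first weights file present on this node.

    Prefers the RWKV-7 weights; falls back to RWKV-65M (paths are illustrative).
    """
    for path in (primary, fallback):
        if os.path.exists(path):
            return path
    raise FileNotFoundError("no RWKV weights found on this node")
```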
- Environment:
  - Ubuntu 24.04
  - Isolated Python `venv` environments
  - MPICH used for message-passing instead of OpenMPI
  - Excluded: `fail2ban`, `ufw`, `htop`, `nvtop` (lean install)
  - `tmux` used for persistent sessions
  - Flask + Jinja2 for Web UI, styled as a dark terminal theme
Model Logic and Behavior
- RWKV Purpose:
  - Serves as the default fallback model per node
  - Meant to operate autonomously, evolving adaptively
  - Agents self-assign roles based on logs, runtime behavior, and symbolic drift
- Tokenizer & Inference:
  - Pipeline inference scripts were developed, including debugging steps for:
    - Tokenizer loading
    - Input/output shape validation
    - CUDA and driver conflicts (e.g., `nvidia-smi` checks)
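The self-assignment behavior described under RWKV Purpose could be sketched as a simple log-scoring rule. The role names and keywords below are hypothetical placeholders, not the cluster's actual drift logic:

```python
from collections import Counter

# Hypothetical role keywords; the real symbolic-drift logic is still being designed.
ROLE_KEYWORDS = {
    "indexer": ("parse", "index", "export"),
    "monitor": ("error", "cuda", "driver"),
    "responder": ("prompt", "reply", "generate"),
}

def self_assign_role(log_lines):
    """Pick the role whose keywords appear most often in recent log lines."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for role, words in ROLE_KEYWORDS.items():
            counts[role] += sum(word in lowered for word in words)
    role, _ = counts.most_common(1)[0]
    return role
```

A real version would presumably weight runtime behavior as well, not just keyword frequency.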
Deployment Enhancements
You designed and iterated on an install and diagnostics script, which:
- Parses and validates virtual environment paths
- Confirms NVIDIA GPU visibility
- Logs and verifies each installation step
- Supports AutoFix steps during deployment
- Includes version/timestamped cluster checkpoint JSONs for traceability
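Two of those steps, GPU visibility and the timestamped checkpoint JSON, can be sketched as below. The function names and the checkpoint schema are assumptions, not the script's actual code:

```python
import json
import shutil
import subprocess
from datetime import datetime, timezone

def gpu_visible():
    """Confirm NVIDIA GPU visibility the same way the diagnostics do: via nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return False
    return subprocess.run(["nvidia-smi"], capture_output=True).returncode == 0

def write_checkpoint(path, node, steps):
    """Write a version/timestamped cluster checkpoint JSON for traceability."""
    checkpoint = {
        "version": 1,  # assumed schema version
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "node": node,
        "steps": steps,  # e.g. {"venv": "ok", "gpu": "ok"}
    }
    with open(path, "w") as fh:
        json.dump(checkpoint, fh, indent=2)
    return checkpoint
```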
Special Features and Goals
- Recursive Memory Engine (planned): a decentralized memory graph for long-term token tracking.
- Idle-cycle Evolution: RWKV agents mutate and specialize during low usage.
- Persistent Indexing: AI cluster parses and learns from your ChatGPT export logs to form a personal memory base.
- Multi-Agent Design: RWKV nodes are treated as evolving digital organisms with emerging roles.
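Idle-cycle evolution could look something like the following; the mutation rule (jitter a per-agent specialization weight, then renormalize) is purely illustrative and not the planned ruleset:

```python
import random

def mutate_weights(weights, rate=0.05, rng=None):
    """During idle cycles, nudge each specialization weight by a small random amount.

    weights: dict mapping specialization name -> non-negative weight.
    Returns a new dict renormalized to sum to 1.0.
    """
    rng = rng or random.Random()
    mutated = {k: max(0.0, v + rng.uniform(-rate, rate)) for k, v in weights.items()}
    total = sum(mutated.values()) or 1.0  # guard against all-zero weights
    return {k: v / total for k, v in mutated.items()}
```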
Achievements
- Successful installation of RWKV-7 and tokenizer
- Inference test pipeline created
- Diagnostic script with error tracing implemented
- MPICH integrated for inter-node comms
- Web UI design completed
- RWKV-65M deployed as fallback on each node
Upcoming/Planned
- Final integration of persistent memory graph
- RWKV agent mutation ruleset (drift logic)
- Multi-agent coordination and specialization engine
- Dynamic RPC/ZeroMQ communication layer
- Connection of model logs to training behaviors
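Until the ZeroMQ layer lands, inter-node messages could be framed with a simple length-prefixed JSON scheme over raw sockets. This is a stdlib placeholder sketch, not the planned protocol:

```python
import json
import struct

def pack_msg(obj):
    """Serialize a message as a 4-byte big-endian length prefix plus a JSON payload."""
    payload = json.dumps(obj).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def unpack_msg(data):
    """Inverse of pack_msg: read the length prefix, then decode that many bytes."""
    (length,) = struct.unpack(">I", data[:4])
    return json.loads(data[4:4 + length].decode("utf-8"))
```

The length prefix lets a receiver read exactly one message from a stream, which is the main thing ZeroMQ's framing would otherwise provide here.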