With the number of new subnets being added, it can be hard to keep information current across all of them, so some data here may be slightly out of date from time to time.

Subnet 96

FLOCK OFF

Key stats: Emissions · Recycled · Recycled (24h) · Registration Cost · Active Validators · Active Miners · Active Dual Miners/Validators

ABOUT

What exactly does it do?

FLock OFF is a new Bittensor subnet (UID 96) built by FLock.io to crowd‑source high‑quality training data for small AI models on edge devices. Its core purpose is to incentivize the creation of compact, high‑signal datasets and use them to improve on‑device model performance. Miners compete to create high-quality datasets, and validators evaluate them using LoRA training, with rewards paid in Bittensor’s TAO token based on dataset performance. The FLock team emphasizes that this approach solves the challenge of making datasets “small in size but massive in knowledge,” ideal for federated learning on resource‑limited devices. In effect, FLock OFF serves as an open, permissionless data marketplace for Bittensor: anyone can join as a miner (data contributor) or validator (evaluator) to help build an ultra‑compact, high‑quality training corpus.

FLock OFF’s value proposition is to create a crowd‑sourced dataset tailored for small language models (SLMs) running on phones or IoT hardware. By rewarding the utility of the data (measured by downstream loss) rather than raw compute, it encourages contributors to curate novel, useful examples. The network aims for an “ultra-high-quality dataset that maximizes knowledge within a fixed size limit,” suitable for supervised fine-tuning or preference optimization of small language models. This aligns with the broader FLock vision of making edge AI efficient and privacy‑preserving, leveraging federated learning and blockchain coordination.

PURPOSE

What exactly is the 'product/build'?

FLock OFF follows Bittensor’s standard miner–validator architecture. Participation is fully permissionless – “anyone can join as a miner or validator” – and coordinated on-chain via Bittensor’s subtensor protocol. Key roles and steps include:

Miner (Data Contributor): Curate a high‑quality dataset (e.g., conversational JSONL records), then upload it to Hugging Face for storage and versioning. The miner’s client registers the dataset’s metadata (commit ID, dataset ID) on-chain with a small TAO fee. The goal is to produce data that, when used for training, yields lower loss than competitors. (Miners need only minimal compute – no GPU – and a Hugging Face account.) A minimal sketch of this submission flow appears after this list.

Validator (Evaluator): Periodically, validators scan the chain for active miners and download each submitted dataset via the Hugging Face API. For each dataset, the validator fine-tunes a fixed base model using LoRA (Low-Rank Adaptation) and measures its performance on a standard test set. In FLock OFF, the base model is Qwen 2.5B Instruct (an open LLM from Alibaba), and the LoRA config uses rank 16, alpha 32, and dropout 0.1, trained for a few epochs. After training, the validator computes a win rate or adjusted score for each dataset (effectively the inverse of its loss) and submits these weights back to the chain, which determines how much TAO reward each miner earns. (Validators require heavy GPUs – e.g. an NVIDIA 4090 with 24 GB VRAM – to run these training jobs.)

Rewards & Protocol: Instead of paying simply for compute time, FLock OFF distributes TAO tokens according to dataset utility. As the GitHub docs state, “Rewards (in TAO) are distributed based on dataset performance, not raw compute power”. In practice, miners whose data leads to lower validation loss gain higher “win rates,” increasing their on-chain weight and future earnings. The network enforces deduplication and fairness via identical training settings: duplicates or poor datasets are penalized. All activity (dataset commits, validation results) is recorded on Bittensor’s blockchain, ensuring transparency and reproducibility. This design creates an incentive loop: miners strive to collect novel, informative examples, and validators enforce quality via on‑chain weight updates.

Open Participation: As one newsletter summarized, FLock OFF is “an open network for training small AI models at the edge. Anyone can join as a miner or validator to help improve model performance”. There are no whitelists or permissions required beyond having a Bittensor wallet (coldkey/hotkey) to register on the subnet. This permissionless nature fits Bittensor’s ethos, making FLock OFF a true community-driven federated learning experiment.
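
To make the miner flow concrete, here is a minimal, hedged sketch of those three steps using the huggingface_hub and bittensor libraries. It is not the official miner.py: the repo name, the JSONL record shape, and the use of subtensor.commit for on-chain metadata registration are illustrative assumptions.

```python
# Illustrative miner flow (not the official miner.py): package a JSONL
# dataset, push it to Hugging Face, and commit a pointer on-chain.
# Repo name, record shape, and commit payload format are assumptions.
import json

import bittensor as bt
from huggingface_hub import HfApi

NETUID = 96  # FLock OFF subnet UID

# 1. Write conversational records in JSONL form (one JSON object per line).
records = [
    {"conversations": [
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA is a parameter-efficient fine-tuning method..."},
    ]},
]
with open("data.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# 2. Upload to a Hugging Face dataset repo; the returned commit pins an
#    exact, reproducible version of the data.
api = HfApi()
commit = api.upload_file(
    path_or_fileobj="data.jsonl",
    path_in_repo="data.jsonl",
    repo_id="your-username/flock-off-dataset",  # hypothetical repo
    repo_type="dataset",
)

# 3. Register the dataset pointer on-chain so validators can find it.
#    The exact payload schema used by FLock OFF is an assumption here.
wallet = bt.wallet(name="miner", hotkey="default")
subtensor = bt.subtensor(network="finney")
subtensor.commit(wallet, NETUID, f"your-username/flock-off-dataset:{commit.oid}")
```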
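
The exact win-rate formula is not spelled out here beyond “effectively the inverse of the loss,” so the following sketch uses a softmax over negative losses purely to illustrate the lower-loss, higher-weight mapping; the real scoring in validator.py may differ.

```python
# Hedged sketch of turning per-dataset eval losses into normalized weights.
# A softmax over negative loss is only one way to express
# "lower loss -> higher weight"; FLock OFF's actual formula may differ.
import numpy as np

def losses_to_weights(losses: dict[int, float], temperature: float = 0.1) -> dict[int, float]:
    """Map miner UID -> eval loss to miner UID -> normalized weight."""
    uids = list(losses)
    scores = np.array([-losses[u] for u in uids]) / temperature
    scores -= scores.max()  # numerical stability before exponentiation
    probs = np.exp(scores) / np.exp(scores).sum()
    return dict(zip(uids, probs.tolist()))

weights = losses_to_weights({12: 1.84, 37: 1.79, 55: 2.10})
# A validator would then submit these weights on-chain (see set_weights below).
```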

 

Product & Tools

The product of FLock OFF is essentially the federated data pipeline itself. All code is open‑source on GitHub (FLock-io/FLock-subnet). The main deliverables and tools include:

FLock OFF Python Package: Participants run the provided CLI tools (miner.py and validator.py) from the GitHub package. These automate dataset upload, on-chain registration, model training, and weight submission. For example, the miner script packages data.jsonl, pushes it to the user’s HF repo, and commits metadata. The validator script syncs the metagraph, selects miners, downloads datasets, fine-tunes the model, and updates scores. Detailed instructions (installation via uv pip, environment setup, command usage) are documented in the repo.

Hugging Face Integration: FLock OFF leverages Hugging Face as its data storage and version control system. Each miner’s dataset is a Hugging Face repo (with a Git-like commit), which guarantees reproducibility and easy access. Validators fetch datasets via the Hugging Face Hub API. In short, miners use HF repos for “storage” and “versioning,” and validators download data the same way one would pull a Git commit. This removes the need for a custom data backend and taps into the existing ML ecosystem. (A download sketch appears after this list.)

Dataset Output: The ultimate “product” is the aggregated high-quality dataset created by all miners. FLock OFF is explicitly designed to yield an “ultra-high-quality dataset” that is compact and knowledge-dense. This crowd‑sourced corpus can then be used for fine-tuning small models or for direct deployment on edge hardware. The FLock team notes that the goal is a dataset ideal for supervised fine-tuning (SFT) and direct preference optimization (DPO) of SLMs. In summary, FLock OFF provides an end‑to‑end pipeline: data collection (by miners) → storage/versioning (HF) → evaluation (validators) → reward distribution.
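
Because each dataset is pinned to a Hugging Face commit, a validator can fetch an exact snapshot with the Hub API. A small sketch, with a hypothetical repo ID and commit:

```python
# Sketch of pinning and fetching a miner's dataset at an exact commit via
# the Hugging Face Hub API; repo_id and revision are hypothetical values
# that would be read from the miner's on-chain metadata.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="some-miner/flock-off-dataset",  # from on-chain metadata
    filename="data.jsonl",
    repo_type="dataset",
    revision="abc123def456",  # the commit ID registered on-chain
)
print(f"Dataset snapshot downloaded to {path}")
```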

No separate web UI or API beyond this pipeline is currently advertised. The community can monitor the subnet via standard Bittensor explorers (e.g. tao.app) or the GitHub activity. All elements (code, data) are public, aligning with FLock’s vision of an open, crowd-sourced AI training framework.

 

Technical Architecture

Under the hood, FLock OFF is a Python-based federated learning application built on the Bittensor stack. Key technical components include:

Software Stack: The FLock OFF repository is a Python project (with a pyproject.toml) that uses PyTorch for model training, the Hugging Face Transformers library (for Qwen and LoRA), and the Bittensor SDK for blockchain integration. Validators install CUDA/cuDNN and Torch to run GPU training. The project uses Astral’s uv (a fast Python package and environment manager) for dependencies. In short, it’s a standard PyTorch/Transformers pipeline orchestrated by Python scripts.

Model Architecture: The fixed base model is Qwen 2.5B Instruct, an open large language model from Alibaba. Validators apply LoRA (Low-Rank Adaptation) to this model for fast fine-tuning. The LoRA configuration uses rank 16, alpha 32, and dropout 0.1, and fine-tunes all linear layers. Training is lightweight (batch size 2, gradient accumulation 4, 2 epochs, 4096-token context) so that even large base models can be adapted on a 24 GB GPU. Loss on a fixed eval set determines dataset quality. (This setup reflects FLock’s focus on parameter-efficient fine-tuning for edge models; a configuration sketch appears after this list.)

Hardware & Infra: Validators need high-end GPUs: NVIDIA RTX 4090 (24 GB) is recommended; even an RTX 3060 (12 GB) is the minimum. They also need ~50 GB disk, 16 GB RAM, and a multi-core CPU. Miners, by contrast, need only a normal CPU (~8 GB RAM, ~10 GB storage) and a HuggingFace token. In practice, validators are run by participants (including possibly FLock’s own nodes) on public clouds or personal hardware. Data is stored and accessed via HuggingFace’s distributed storage, so no centralized server is required.

Blockchain Integration: FLock OFF runs on Bittensor’s mainnet (the subtensor network). Participants must register a Bittensor wallet (coldkey) and hotkey on the subnet to transact. The miner/validator scripts internally call subtensor commands to submit metadata and weight updates on-chain. All economics (TAO token rewards, emission rates) follow Bittensor’s subnet rules. (A registration sketch appears after this list.)
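
The stated hyperparameters map directly onto the peft and transformers APIs. The sketch below assumes the Qwen/Qwen2.5-1.5B-Instruct checkpoint and peft’s "all-linear" target-module shorthand, neither of which is confirmed by this page:

```python
# Minimal sketch of the stated LoRA/evaluation settings (rank 16, alpha 32,
# dropout 0.1, batch size 2, grad accumulation 4, 2 epochs). The checkpoint
# name and target_modules value are assumptions, not from the FLock OFF repo.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Assumed checkpoint; the page says only "Qwen 2.5B Instruct".
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

lora_cfg = LoraConfig(
    r=16,               # LoRA rank
    lora_alpha=32,      # scaling factor
    lora_dropout=0.1,
    target_modules="all-linear",  # fine-tune all linear layers (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Effective batch size 8 (2 x 4), 2 epochs; the 4096-token context would be
# enforced at tokenization time.
args = TrainingArguments(
    output_dir="lora-eval",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=2,
)
```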
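
A hedged sketch of that on-chain lifecycle with the bittensor SDK, assuming the standard burned_register and set_weights calls (wallet names and weight values are placeholders):

```python
# Hedged sketch of the on-chain lifecycle: register a hotkey on subnet 96,
# then (as a validator) publish scores as weights. Wallet names and the
# weight values are placeholders, not FLock OFF's actual outputs.
import bittensor as bt

NETUID = 96
wallet = bt.wallet(name="my-coldkey", hotkey="my-hotkey")
subtensor = bt.subtensor(network="finney")  # Bittensor mainnet

# One-time registration on the subnet (burns a small TAO fee).
subtensor.burned_register(wallet=wallet, netuid=NETUID)

# Validators periodically publish normalized scores for miner UIDs.
subtensor.set_weights(
    wallet=wallet,
    netuid=NETUID,
    uids=[12, 37, 55],
    weights=[0.5, 0.3, 0.2],
)
```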

In summary, FLock OFF is a fully open-source PyTorch/Transformers pipeline on Bittensor, using LoRA and HuggingFace for efficiency and distribution. Its dependencies are all public ML libraries and Bittensor’s own networking code, making the system transparent and extensible.

 

WHO

Team Info

FLock OFF is developed by FLock.io, a London-based decentralized AI startup. Public records identify key team members and partners:

Jiahao Sun – Founder and CEO of FLock.io. An Oxford-educated AI researcher, Jiahao spearheads FLock’s vision for privacy-preserving federated learning on-chain.

Yifan Xie, EngD – VP of Developer Relations at FLock.io. Holds a doctorate in data science and leads community/technical outreach for FLock.

Vatsal Shah – Machine Learning Specialist at FLock.io. Focuses on federated ML research and model training methodologies.

Sameeha Rehman – Chief of Staff at FLock.io.

Nsikakabasi (Thomas) – Project Developer at FLock.io. Works on blockchain and protocol integration.

 

FUTURE

Roadmap

Dec 2024: FLock community vote approved building a Bittensor subnet. This internal decision (by token holders or community poll) officially green-lit the project.

May 2, 2025: Season 1 mining began at 9:06 AM EST, as announced on FLock’s blog. (Season 1 will run until the emissions schedule depletes or a new season is declared.)

Ongoing (Summer 2025): The subnet is actively collecting data and running LoRA evaluations. Community members are joining as miners/validators. FLock is soliciting high-quality data contributions, emphasizing domains needed for edge AI.

Looking ahead, public statements indicate two main goals:

Complete the Dataset: FLock OFF aims to produce a final dataset release once the competition is complete. The FLock team describes the outcome as a “best dataset for SFT and DPO… open, evolving, and crowd-sourced by an aligned community”. This suggests they will curate and publish the aggregated data for broader use (possibly through HuggingFace or their own platform).

Train SLMs on the Data: A long-term ambition is to use FLock OFF’s dataset to build an open-source small language model that can rival proprietary models. The blog states: “Our long-term goal is to build an open-source SLM that can surpass GPT-4.1-nano and other efficient closed models.” In other words, after completing data collection on Bittensor, FLock plans to fine-tune or train SLMs (potentially on its own infrastructure or via future subnets) using the FLock OFF data. This aligns with FLock’s broader roadmap (e.g. releasing new models, FL Alliance expansion).

While FLock.io’s public 2025 roadmap does not detail the subnet, it emphasizes growth in edge AI and federated learning through new models and community tools. FLock OFF fits into this plan as the data foundation. No timelines for “Season 2” have been announced, but FLock will likely declare further seasons or fork new subnets as needed. In summary, FLock OFF’s history is short but eventful: a community-initiated project in late 2024, launched in mid-2025, now collecting data on-chain. Its immediate status is an ongoing competition (Season 1). Future plans revolve around consolidating the data and leveraging it to advance FLock’s vision of a decentralized edge-AI ecosystem.

 

NEWS

Announcements
