With the number of new subnets being added, it can be hard to keep information current across all of them, so some details may be slightly out of date from time to time.
Bittensor Subnet 9 (Pretraining) is a specialized subnet of the Bittensor network designed to incentivize the open training of large language models (“foundation models”) on a massive web dataset. In this subnet, AI miners (participants training models) are rewarded with the native token TAO for producing the best pretrained models on the Falcon Refined Web dataset – a web-scale corpus on the order of hundreds of millions of pages. The subnet functions as a continuous benchmark competition: all miners train models of a given architecture on the same data, and those achieving the lowest language modeling loss on random data samples earn the highest rewards. In essence, Subnet 9’s purpose is to crowdsource the pre-training of state-of-the-art (SOTA) AI models in a decentralized way, rewarding participants for collectively pushing model performance on an open dataset.
This incentivized pre-training mechanism serves two key goals: (1) Produce high-quality pretrained models that can serve as foundations for downstream tasks in the Bittensor ecosystem, and (2) Demonstrate decentralized AI training – showing that multiple independent actors can coordinate (via crypto incentives) to train models that rival those developed by large centralized labs. By making pretraining into a competitive, open marketplace, Subnet 9 aims to unlock “the true use case that blockchains have been searching for” – i.e. the creation of machine intelligence as a communal effort. This is viewed as a crucial contribution to Bittensor’s vision of a universal AI network, turning decentralized compute and incentives into tangible improvements in AI capabilities.
Subnet 9’s design revolves around two roles: Miners who train and upload models, and Validators who evaluate those models. The process works as follows:
Competition dynamics: Subnet 9 runs in epochs (e.g. 360 blockchain blocks per epoch in one design), during which validators count how often each miner’s model achieves the lowest loss on a given batch compared to its peers. Each such “win” increments the miner’s score, and at epoch end rewards are proportional to wins (with a base score of +1 applied). To incentivize miners to reach better models quickly, an “epsilon” advantage is given: the miner with the overall lowest loss in the prior epoch is treated as if its loss were slightly lower (multiplied by ε < 1) in the next epoch’s comparisons. This gives a small head start to the top model, simulating a “winner’s momentum.” Overall, this mechanism encourages miners to continuously improve their models and outdo each other in a fair, transparent race for lower loss.
Example: In practice, miners might start by training a ~700M-parameter Transformer on the dataset. Validators sample, say, 22 random text excerpts (“pages”) per evaluation step and compute each model’s loss. If Miner A’s model yields the lowest loss on a given excerpt (after any ε adjustment), Miner A gets a win for that batch. After many such samples, suppose Miner A has 100 wins, Miner B has 80 wins, and others fewer – these win counts translate into the weights that validators submit. If Miner A’s model consistently outperforms all others, Miner A receives the largest share of the TAO reward for that epoch. This setup effectively gamifies the pretraining process: miners are in a continual competition to push their model’s accuracy ahead of their peers’.
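To make the scoring concrete, here is a minimal Python sketch of the win-counting and epsilon adjustment described above. The function and variable names are hypothetical, and the actual validator logic in the Subnet 9 repository differs in detail; this only illustrates the mechanism.

```python
# Minimal sketch of the win-counting logic described above (hypothetical names;
# the real Subnet 9 validator code differs in detail).
EPSILON = 0.995  # assumed advantage factor for the previous epoch's best model

def adjusted_loss(loss: float, is_prior_winner: bool) -> float:
    """Apply the epsilon advantage to the reigning best model."""
    return loss * EPSILON if is_prior_winner else loss

def tally_wins(batch_losses: dict[str, list[float]], prior_winner: str) -> dict[str, int]:
    """batch_losses maps miner UID -> per-batch losses (same batch order for all miners)."""
    miners = list(batch_losses)
    n_batches = len(next(iter(batch_losses.values())))
    wins = {uid: 0 for uid in miners}
    for i in range(n_batches):
        best = min(miners, key=lambda uid: adjusted_loss(batch_losses[uid][i], uid == prior_winner))
        wins[best] += 1
    return wins

# Example: Miner A wins both batches after the epsilon adjustment.
losses = {"A": [2.10, 2.05], "B": [2.12, 2.04]}
print(tally_wins(losses, prior_winner="A"))  # {'A': 2, 'B': 0}
```

Dividing each miner’s win count by the total number of wins then yields the relative weight a validator would report for that epoch.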
Technical Architecture and Tooling
Model Architecture & Dataset: To ensure a fair competition, Subnet 9 fixes the model structure and training data across miners. Initially, the subnet focused on GPT-2 style causal language models, using a uniform architecture and size for all participants. Over time, it has expanded to support newer Transformer architectures (for example, LLaMA, Falcon, Mistral, GPT-J/NeoX, BART and others) under controlled parameter limits. The allowed model classes are defined in the subnet’s code (e.g. a list of HuggingFace Transformer models that miners can choose from) and may update at specific block heights to introduce larger or different architectures. For instance, as of 2024 the subnet allowed models roughly in the 700M parameter range and then opened tiers for ~3B, ~7B, and ~14B models (each tier having its own competition). All models are trained on the Falcon RefinedWeb dataset – a cleaned web crawl of about 900 million pages (~3 trillion tokens) developed by TII/UAE for the Falcon LLM project. This dataset provides near-infinite random samples of text, ensuring miners never run out of training data. The training sequence length is typically 2048 or 4096 tokens to match modern LLM standards.
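As a rough illustration of how a participant might pull random pages from this dataset, the sketch below uses the Hugging Face `datasets` library in streaming mode. The dataset ID `tiiuae/falcon-refinedweb`, the GPT-2 tokenizer, and the sampling loop are assumptions for the example; the subnet’s own code implements its own page-sampling logic.

```python
# Sketch: streaming random pages from Falcon RefinedWeb via Hugging Face `datasets`.
# Illustrative only; Subnet 9's repository handles page sampling itself.
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

SEQ_LEN = 2048  # typical training/evaluation sequence length mentioned above

for i, page in enumerate(dataset.shuffle(seed=42, buffer_size=10_000)):
    tokens = tokenizer(page["content"], truncation=True, max_length=SEQ_LEN, return_tensors="pt")
    # ... feed `tokens` to the model for a training or evaluation step ...
    if i >= 21:  # e.g. 22 random pages per evaluation step, as described above
        break
```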
Frameworks & Infrastructure: The subnet leverages popular AI frameworks for its implementation. Miners run on PyTorch with models implemented via the Hugging Face Transformers library (the allowed architectures correspond to classes in transformers like GPT2LMHeadModel, LlamaForCausalLM, etc.). Training optimizations such as FlashAttention and mixed precision (bfloat16) are used to speed up training and reduce memory, especially for larger models. The codebase (open-sourced on GitHub) provides scripts for miners to periodically save and upload model weights to Hugging Face Hub and for validators to download those weights for evaluation. Each miner/validator node runs a Bittensor client that interacts with the Subtensor blockchain (the Bittensor chain) for registering the node and writing/reading metadata. Miners must have a Bittensor wallet with a registered hotkey (identity) to participate.
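A minimal sketch of the miner side of this workflow, assuming Hugging Face Transformers: instantiate one of the permitted architectures, train it, and push the checkpoint to the Hub so validators can fetch it. The configuration values and repository ID are illustrative rather than the subnet’s required settings, and the real miner scripts also write model metadata to the chain.

```python
# Sketch: instantiating an allowed architecture and publishing a checkpoint.
# Config values and the repo ID are placeholders, not Subnet 9's required settings.
import torch
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=2048,
    num_hidden_layers=24,
    num_attention_heads=16,
    max_position_embeddings=4096,
)
model = LlamaForCausalLM(config).to(torch.bfloat16)  # bf16 to reduce memory, as noted above

# ... training loop over RefinedWeb samples goes here ...

# Publish the weights so validators can download and evaluate them.
model.push_to_hub("my-hf-user/sn9-pretrain-checkpoint")  # hypothetical repository ID
```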
Because model files can be large (potentially many gigabytes), miners need sufficient disk space (50+ GB recommended) and only one model upload per ~20 minutes is allowed by the chain to prevent spamming. This rate-limit means miners typically train for a while and only publish when a significant improvement (lower loss) is reached, controlled by a configurable loss threshold trigger for uploads. Validators run in a loop, grabbing new model versions and scoring them on batches of text. They use a small batch size (often 1) and may evaluate up to a certain number of batches per dataset per cycle (e.g. 50) to balance speed and thoroughness. The evaluation results (losses) are reported and also logged to a public Weights & Biases dashboard (an official project page at wandb.ai) for community transparency. This allows anyone to see how each miner’s model is doing in terms of loss curves, etc., fostering an open research environment.
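The validator side can be pictured with the following sketch: download a miner’s published checkpoint and average its causal language-modeling loss over a handful of sampled text batches. Model IDs, batch counts, and sequence length are placeholders rather than the subnet’s exact parameters.

```python
# Sketch of a validator-style evaluation loop (placeholder settings, not SN9's exact ones):
# fetch a miner's checkpoint and average its causal-LM loss over a few text batches.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def evaluate(repo_id: str, batches: list[str], max_len: int = 2048) -> float:
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    model.eval()
    losses = []
    with torch.no_grad():
        for text in batches:  # batch size 1, as described above
            enc = tokenizer(text, truncation=True, max_length=max_len, return_tensors="pt")
            out = model(**enc, labels=enc["input_ids"])  # HF computes the causal-LM loss
            losses.append(out.loss.item())
    return sum(losses) / len(losses)
```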
Network parameters: Subnet 9 is a permissionless subnet – as of 2025, registration is open for anyone who stakes a small amount of TAO and has the computational resources to participate. It supports up to 256 miners and 256 validators (a limit set by Bittensor’s design) – though the active counts have been lower (e.g. ~21 miners and 11 validators at one point in 2024). Each miner and validator is identified by a UID on-chain. The subnet’s on-chain UID = 9, and it emits a certain fraction of the overall TAO inflation (on the order of ~0.9% of TAO emissions dedicated to this subnet, as per mid-2024 data). The consensus mechanism (Yuma Consensus) uses both the submitted weights and each validator’s stake to finalize a global weight for each miner. This means validators with more staked TAO (or delegated stake) carry more influence, aligning incentives for them to evaluate honestly (since they have “skin in the game”). The reward payout is continuous; effectively at every block, a portion of newly minted TAO is distributed to Subnet 9 participants in proportion to their weights. By aligning economic rewards with model quality, the subnet’s architecture creates a self-sustaining training loop where better models earn more currency, and that currency can in turn attract more compute and talent to improve the models.
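Yuma Consensus itself involves additional steps such as weight clipping, but its core intuition, a stake-weighted aggregation of the weights each validator submits, can be pictured with the toy example below. This is purely illustrative and not the chain’s actual implementation.

```python
# Toy illustration only: Yuma Consensus includes clipping and other safeguards,
# but the core idea is a stake-weighted combination of validators' miner weights.
import numpy as np

def stake_weighted_weights(validator_weights: np.ndarray, stakes: np.ndarray) -> np.ndarray:
    """validator_weights: (n_validators, n_miners), rows sum to 1; stakes: (n_validators,)."""
    stake_share = stakes / stakes.sum()
    combined = stake_share @ validator_weights  # stake-weighted average per miner
    return combined / combined.sum()            # normalize to a distribution

weights = np.array([[0.6, 0.4], [0.5, 0.5]])
stakes = np.array([3000.0, 1000.0])
print(stake_weighted_weights(weights, stakes))  # more stake -> more influence
```

In this toy example the validator holding 75% of the stake pulls the combined weights toward its own ranking, which is the “skin in the game” effect described above.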
Recent Updates and Achievements
Subnet 9 launched in late 2023 (the team refers to it as a “living experiment which began in November 2023”) and has progressed rapidly, with official updates from 2024 highlighting several noteworthy milestones.
Looking ahead into 2025, the expectation is that Subnet 9 will continue to iterate rapidly. Official communications hint at upcoming larger-model milestones (possibly 30B or 70B parameter ranges if collaborative training becomes viable), and deeper integration with other subnets (so that, for example, a model pretrained on SN9 can be seamlessly fine-tuned on a specialized subnet like a Q&A or coding subnet). The tone of recent announcements is optimistic: the team often emphasizes that these are still “early stages” and that every few months the network is hitting a new level (either in model quality or new features). By making all developments public and publishing research (including negative results or challenges) openly, the contributors aim to attract more AI developers to join the effort. The TAO token incentives continue to be a unique draw – as miners who contribute significant computing power can earn a stake in the network’s currency, which itself has grown in market value as the project gains prominence. This feedback loop of technical progress and economic incentive suggests that Subnet 9’s most impactful updates are still to come. As one community summary put it, “Pretraining is the perfect case study to prove decentralized AI’s potential – turning decentralization’s advantages into tangible benefits”, and Subnet 9’s ongoing evolution is closely watched as a bellwether for the Bittensor ecosystem at large.
The development of Subnet 9 is a collaboration between the core Bittensor team and community contributors organized under the banner of Macrocosmos, an open-source AI research lab building on Bittensor. Macrocosmos manages Subnet 9 (“SN9”) operations and research in coordination with the Opentensor core team; its contributors designed the incentive mechanism and code for SN9, and they actively run dashboards and publish research updates. According to the August 2024 Subnet 9 Pretraining Whitepaper, the authors specifically thank Const, Fish, Sid, Rustic, Alan, Rodrigo, Will, and Steffen as part of the Subnet 9 team.
Will Squires – CEO and Co-Founder
Will has dedicated his career to navigating complexity, spanning from designing and constructing significant infrastructure to spearheading the establishment of an AI accelerator. With a background in engineering, he made notable contributions to transport projects such as Crossrail and HS2. Will’s expertise led to an invitation to serve on the Mayor of London’s infrastructure advisory panel and to lecture at UCL’s Centre for Advanced Spatial Analysis (CASA). He was appointed by AtkinsRéalis to develop an AI accelerator, which expanded to encompass over 60 staff members globally. At XYZ Reality, a company specializing in augmented reality headsets, Will played a pivotal role in product and software development, focusing on holographic technology. Since 2023, Will has provided advisory services for the Opentensor Foundation, contributing to the launch of Revolution.
Steffen Cruz – CTO and Co-Founder
Steffen earned his PhD in subatomic physics from the University of British Columbia, Canada, focusing on developing software to enhance the detection of extremely rare events (10^-7). His groundbreaking research contributed to the identification of novel exotic states of nuclear matter and has been published in prestigious scientific journals. As the founding engineer of SolidState AI, he pioneered innovative techniques for physics-informed machine learning (PIML). Steffen was subsequently appointed as the Chief Technology Officer of the Opentensor Foundation, where he played a pivotal role as a core developer of Subnet 1, the foundation’s flagship subnet. In this capacity, he enhanced the adoption and accessibility of Bittensor by authoring technical documentation, tutorials, and collaborating on the development of the subnet template.
Michael Bunting – CFO
Before joining Macrocosmos, Mike spent 12 years in investment banking, where he guided clients through major strategic and financial transitions across more than £1 billion in international M&A and capital raising deals. Most recently serving as a Director at Piper Sandler, he brings deep experience in advising high-growth startups on strategy, business planning, funding pathways, and corporate governance. Mike has also worked closely with multinational corporations and prominent financial investors throughout his career.
Elena Nesterova – Head of Delivery
Volodymyr Truba – Senior Machine Learning Engineer
Alma Schalèn – Head of Product Design
Felix Quinque – Machine Learning Lead
Dmytro Bobrenko – Machine Learning/AI Lead
Alan Aboudib – Machine Learning Lead
Alex Williams – People & Talent Manager
Chris Zacharia – Communications Lead
Brian McCrindle – Senior Machine Learning Engineer
Lawrence Hunt – Frontend Engineer
Nicholas Miller – Senior Software Engineer
Kalei Brady – Data Scientist
Szymon Fonau – Machine Learning Engineer
Monika Stankiewicz – Executive Assistant
Amy Chai – Junior Machine Learning Engineer
Giannis Evagorou – Senior Software Engineer
Richard Wardle – Junior Software Engineer
Kai Morris – Content & Community Specialist
Lewis Sword – Junior Software Engineer
Subnet 9 is still in active development, and the team has outlined an ambitious roadmap to expand its capabilities in the August 2024 whitepaper and subsequent updates.
In summary, the future of Subnet 9 involves scaling up (bigger models, more modalities), fine-tuning the game mechanics for fairness and efficiency, broadening evaluation metrics, and integrating the subnet’s output into real-world AI pipelines. All of these steps aim to continuously push the frontier of what a decentralized AI network can do, with the endgame being a network that can autonomously train SOTA models as effectively as (or in novel ways better than) a centralized tech company could.
Huge thanks to Keith Singery (aka Bittensor Guru) for all of his fantastic work in the Bittensor community. Make sure to check out his other video/audio interviews by clicking HERE.
Steffen Cruz, previously the CTO of the Opentensor Foundation, has joined forces with his longtime friend Will Squires to establish Macrocosmos. Leading subnets 1, 9, 13, 25 and 37, this team is actively shaping the future of Bittensor and stands as one of the most influential entities within the ecosystem.
In this second video, they spend much of the episode covering Subnet 13’s rebranding to “Gravity”, the team’s prediction of a Trump victory, and how the project has managed to build a team of PhDs and machine learning professionals around Bittensor.
Keep ahead of the Bittensor exponential development curve…
Subnet Alpha is an informational platform for Bittensor Subnets.
This site is not affiliated with the Opentensor Foundation or TaoStats.
The content provided on this website is for informational purposes only. We make no guarantees regarding the accuracy or currency of the information at any given time.
Subnet Alpha is created and maintained by The Realistic Trader. If you have any suggestions or encounter any issues, please contact us at [email protected].
Copyright 2024