With the number of new subnets being added, it can be hard to keep information up to date across all of them, so this data may occasionally be slightly stale.

Subnet 105

SoundsRight

ABOUT

What exactly does it do?

SoundsRight, designated as Subnet 105, is dedicated to the research and development of non-proprietary speech enhancement models. As more of our daily lives revolve around consuming online content, there is growing emphasis on high-quality audio. Speech enhancement is a complex field that involves tasks like separating desired speech from background noise, which requires training sophisticated models capable of distinguishing between different audio components under various circumstances.

The fundamental challenge that SoundsRight addresses is that much of speech enhancement technology is currently hidden behind paywalls, despite all necessary components for open-source innovation being readily available. SoundsRight aims to spearhead open-source speech enhancement technology through daily fine-tuning competitions, making high-quality audio processing more accessible to the broader community.

PURPOSE

What exactly is the 'product/build'?

SoundsRight operates as a specialized subnet within the Bittensor decentralized ecosystem, focusing exclusively on speech enhancement technology. The subnet creates a competitive environment where participants (miners) develop and fine-tune speech enhancement models, which are then evaluated by validators to determine the best-performing solutions.

The core function of SoundsRight is to facilitate daily fine-tuning competitions for speech enhancement models. These competitions currently focus on two primary tasks:

  • Denoising: Removing unwanted background noise from speech recordings while preserving the quality and intelligibility of the desired speech.
  • Dereverberation: Reducing or eliminating the echo and reverberation effects that occur when audio is recorded in spaces with reflective surfaces.
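
To illustrate what a denoising model must accomplish, here is a toy spectral-subtraction denoiser in Python using numpy. This is a classical baseline shown for illustration only, not a model from the subnet; the frame length, noise-estimation window, and subtraction factor are arbitrary choices:

```python
import numpy as np

def spectral_gate_denoise(signal, frame_len=256, noise_frames=4, factor=1.5):
    """Toy spectral-subtraction denoiser: estimate a per-bin noise floor from
    the first few (assumed noise-only) frames, then subtract a multiple of it
    from every frame's magnitude spectrum."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.fft.rfft(frames, axis=1)
    noise_floor = np.abs(spectra[:noise_frames]).mean(axis=0)
    mag, phase = np.abs(spectra), np.angle(spectra)
    mag = np.maximum(mag - factor * noise_floor, 0.0)  # gate bins near the floor
    cleaned = np.fft.irfft(mag * np.exp(1j * phase), n=frame_len, axis=1)
    return cleaned.reshape(-1)

# Example: a noise-only lead-in followed by a 440 Hz tone in white noise
rng = np.random.default_rng(0)
lead = 0.3 * rng.standard_normal(1024)           # frames used for the noise estimate
t = np.arange(4096) / 16000.0                    # 16 kHz, matching the competitions
noisy = np.concatenate(
    [lead, np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(4096)]
)
out = spectral_gate_denoise(noisy)
```

Modern learned models vastly outperform this kind of fixed rule, which is exactly why the subnet benchmarks fine-tuned models rather than classical filters.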

 

Each competition follows a winner-takes-all format, which incentivizes miners to submit their absolute best models rather than multiple variations. This format, combined with the validation mechanism, deters miner factions by making model duplication unviable.

 

How SoundsRight Works

The SoundsRight subnet operates through a well-defined workflow involving miners, validators, HuggingFace (as a model repository), the Bittensor blockchain, and the subnet’s website. Here’s a detailed breakdown of how the system functions:

Miner-Validator Architecture

There are two main entities in the subnet:

  1. Miners: These participants upload fine-tuned speech enhancement models to HuggingFace. Miners are responsible for developing and continuously improving speech enhancement models that can effectively perform denoising or dereverberation tasks.
  2. Validators: These entities benchmark the models and determine which miners’ models perform best. Validators generate fresh datasets daily, download models from HuggingFace, verify model ownership, run benchmarks, and assign scores based on performance metrics.

 

Competition Workflow

The daily competition process follows these steps:

  1. Miners fine-tune speech enhancement models and upload them to HuggingFace.
  2. Validators generate fresh benchmarking datasets so that models cannot simply overfit to a fixed test set.
  3. Validators send synapse requests for model information to the Bittensor chain.
  4. The Bittensor chain returns a synapse containing the model information.
  5. Validators reference model metadata and confirm model ownership.
  6. Validators download models from HuggingFace.
  7. Validators benchmark models on locally generated datasets.
  8. Validators report benchmarking results to the subnet website.
  9. The subnet website constructs competition leaderboards.
  10. Validators set weights for miners based on performance.
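
The steps above can be sketched as a single validator cycle. Every callable below is a hypothetical stand-in for the subnet's real components, injected as a parameter so the control flow stands on its own:

```python
def run_daily_competition(miners, generate_dataset, fetch_metadata,
                          verify_ownership, download_model, benchmark):
    """Sketch of one daily validator cycle (steps 2-10 above)."""
    dataset = generate_dataset()              # step 2: fresh data each day
    results = {}
    for miner in miners:
        meta = fetch_metadata(miner)          # steps 3-5: query the chain and
        if not verify_ownership(meta):        # reject unverified/duplicated models
            continue
        model = download_model(meta)          # step 6: pull from HuggingFace
        results[miner] = benchmark(model, dataset)  # step 7: local benchmark
    if not results:
        return {}
    # steps 8-10: winner-takes-all weighting from benchmark scores
    winner = max(results, key=results.get)
    return {m: (1.0 if m == winner else 0.0) for m in results}
```

The winner-takes-all rule at the end is what makes submitting many near-identical model variants pointless: only the single best score earns weight.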

 

This continuous cycle ensures that models are constantly being improved and evaluated on fresh data, driving innovation in speech enhancement technology.

 

Technical Architecture

The SoundsRight subnet is built on the Bittensor ecosystem, with a technical architecture designed to facilitate the competition and evaluation process efficiently.

Repository Structure

The codebase is organized into several key directories:

  • soundsright/: Main code directory containing the core implementation
  • base/: Contains base classes and utilities for the subnet
  • core/: Core functionality of the subnet
  • neurons/: Implementation of validator and miner neurons
  • benchmarking/: Code for benchmarking models
  • data/: Data handling utilities
  • models/: Model definitions and implementations
  • templates/: Template files
  • utils/: Utility functions

 

Core Components

  • BaseNeuron Class: Handles base operations for both miner and validator neurons.
  • Validator Neurons: Responsible for benchmarking models and setting weights.
  • Miner Neurons: Responsible for developing and uploading models.
  • Configuration System: Manages paths, logging, and other parameters.
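
A minimal sketch of how these components might relate. Beyond the documented roles (a shared base class, validator weight-setting, miner model uploads), all names and method signatures here are assumptions, not the subnet's actual code:

```python
class BaseNeuron:
    """Shared behavior for miners and validators: holds configuration
    (paths, logging, and other parameters)."""
    def __init__(self, config):
        self.config = config

class MinerNeuron(BaseNeuron):
    def upload_model(self, repo_id):
        """Publish a fine-tuned model to a HuggingFace repo (stubbed)."""
        return f"uploaded:{repo_id}"

class ValidatorNeuron(BaseNeuron):
    def set_weights(self, scores):
        """Winner-takes-all weights derived from benchmark scores."""
        winner = max(scores, key=scores.get)
        return {uid: float(uid == winner) for uid in scores}
```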

 

Technical Implementation

The implementation uses Python with dependencies including:

  • argparse for command-line arguments
  • bittensor for blockchain integration
  • numpy for numerical operations
  • Various file system and path handling utilities
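
As an illustration of the argparse-based entry point, here is a parser in that style. The flag names and defaults are assumptions for the sketch, not the subnet's real command-line interface:

```python
import argparse

def parse_args(argv=None):
    """Illustrative CLI parser for a subnet neuron (names are hypothetical)."""
    parser = argparse.ArgumentParser(description="SoundsRight neuron (sketch)")
    parser.add_argument("--netuid", type=int, default=105,
                        help="Subnet UID on the Bittensor network")
    parser.add_argument("--neuron-type", choices=["miner", "validator"],
                        default="validator",
                        help="Which neuron role to run")
    parser.add_argument("--log-level", default="INFO",
                        help="Logging verbosity")
    return parser.parse_args(argv)
```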

 

Competition Metrics

The subnet currently hosts competitions at a 16 kHz sample rate, with plans to expand to 48 kHz competitions in upcoming updates. The benchmarking metrics used include:

  • PESQ (Perceptual Evaluation of Speech Quality): 15% of total weights for denoising, 15% for dereverberation
  • ESTOI (Extended Short-Time Objective Intelligibility): 12.5% of total weights for denoising, 12.5% for dereverberation
  • SI-SDR (Scale-Invariant Signal-to-Distortion Ratio): 7.5% of total weights for denoising, 7.5% for dereverberation
  • SI-SAR (Scale-Invariant Signal-to-Artifacts Ratio): 7.5% of total weights for denoising, 7.5% for dereverberation
  • SI-SIR (Scale-Invariant Signal-to-Interference Ratio): 7.5% of total weights for denoising, 7.5% for dereverberation
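
The published weight split can be combined into a single score per miner. Only the percentages come from the list above; the assumption that each raw metric has already been normalized to [0, 1] is mine, since the normalization scheme is not documented here:

```python
# Per-task metric weights from the competition description; since the split
# is identical for denoising and dereverberation, each task's metrics sum
# to 50% of the total.
METRIC_WEIGHTS = {
    "PESQ": 0.15, "ESTOI": 0.125,
    "SI-SDR": 0.075, "SI-SAR": 0.075, "SI-SIR": 0.075,
}

def combined_score(denoising, dereverberation):
    """Combine normalized per-metric results for both tasks into one score."""
    return sum(
        w * denoising[metric] + w * dereverberation[metric]
        for metric, w in METRIC_WEIGHTS.items()
    )
```

A model scoring perfectly on every metric in both tasks would reach a combined score of 1.0.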

 

These metrics ensure comprehensive evaluation of model performance across different aspects of speech enhancement quality.

 

WHO

Team Info

Based on the GitHub repository, the project is maintained by the @synapsec-ai/subnet-owners team. Contributors visible in the GitHub interface include:

  • m4k1-dev
  • ceterum1

FUTURE

Roadmap

The subnet uses semantic versioning (Major.Minor.Patch) with specific implications for each release type:

Major Releases (X.0.0):

  • May include breaking changes
  • Updates are mandatory for all subnet users
  • The weights_version hyperparameter is adjusted immediately after release
  • Major releases are communicated at least 1 week in advance
  • Registration may be disabled for up to 24 hours

 

Minor Releases (0.X.0):

  • May include breaking changes
  • If breaking changes are included, updates are announced at least 48 hours in advance
  • Otherwise, a minimum of 24-hour notice is given
  • Updates are mandatory for all subnet users
  • Registration may be disabled for up to 24 hours

 

Patch Releases (0.0.X):

  • Do not contain breaking changes
  • Updates are not mandatory unless they include hotfixes for scoring or penalty algorithms
  • Releases without changes to scoring or penalty algorithms are pushed without prior notice
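
The release policy above boils down to a small decision rule. This is a sketch of the policy as stated; in practice the subnet enforces mandatory updates through the weights_version hyperparameter rather than a function like this:

```python
def is_mandatory(current, release, hotfix=False):
    """Per the policy above: major and minor releases are mandatory for all
    subnet users; patch releases are mandatory only when they carry hotfixes
    for the scoring or penalty algorithms."""
    cur = tuple(int(p) for p in current.split("."))
    new = tuple(int(p) for p in release.split("."))
    if new[:2] != cur[:2]:   # major or minor version changed
        return True
    return hotfix            # patch bump: mandatory only for hotfixes
```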

 

 

Version Milestones

SoundsRight v1.0.0

  • Register on testnet
  • 16 kHz competitions for denoising and dereverberation tasks

 

SoundsRight v1.1.0

  • Register on mainnet

 

SoundsRight v2.0.0

  • TTS generation upgrade
  • 48 kHz competitions for denoising and dereverberation tasks

 

SoundsRight v3.0.0

  • More utilities provided to miners and validators
  • Validator performance dashboards

 

SoundsRight v4.0.0

  • Complete subnet overhaul to a monetized API

 

Long-term Vision

The current goal for the subnet is to facilitate open-source research and development of state-of-the-art speech enhancement models. The documentation acknowledges that there is potential to create far more open-source work in this field.
The ultimate goal of the subnet is to create a monetized product in the form of an API. To make that product as competitive as possible, however, the subnet's first goal is to build a large body of work for miners to draw inspiration from.

 
