With the number of new subnets being added, it can be hard to keep information current across all of them, so some data may be slightly out of date from time to time.

Subnet 78

Vocence

Alpha Price: Value
Market Cap: Value
Neurons: Value
Registration Cost: Value
TAO Liquidity: Value
Alpha in Pool: Value
Total Alpha Supply: Value
% Alpha Staked: Value

ABOUT

What exactly does it do?

Vocence is a voice-intelligence subnet on Bittensor. Its core job is to create an open market where miners build and serve voice models, while validators measure which systems best follow natural-language voice prompts and produce the highest-quality speech. Official materials describe the broader scope as covering Prompt-based Text-to-Speech, Speech-to-Text, Speech-to-Speech, voice cloning, Text-to-Music, and voice agents, while the site and terms pages frame Vocence as the “voice layer for decentralised intelligence” and a decentralised voice synthesis and voice-cloning network built on Bittensor.

In practice, the current live implementation is narrower than the full vision. The public repository and homepage are clear that the present operational focus is PromptTTS: miners generate speech from a piece of text plus an instruction describing desired vocal traits, and validators score the result on content correctness, audio quality, and prompt adherence. The homepage also explicitly says that Vocence is “currently supporting PromptTTS only,” even though the wider whitepaper and docs describe future support for STT, STS, cloning, TTM, and voice agents. That means the subnet today is best understood as a prompt-controlled speech-generation competition with a broader multimodal roadmap already defined.

The subnet’s role inside the broader Bittensor model follows the standard miner-validator pattern, but specialised for voice. In Bittensor terms, miners produce the digital commodity and validators measure it; in Vocence’s implementation, miners publish and deploy voice models, while validators generate test cases, call miner endpoints, score the results, and set weights on-chain. Vocence’s README describes this as a decentralised marketplace where miners compete on open model performance, validators run a shared evaluation pipeline, and rewards are distributed according to measurable improvements.

One important nuance is that the governance language and the current operational flow are not identical. The whitepaper says Vocence supports permissionless miner and validator participation across its voice domains, but the current validator setup documented in the public repo still requires the team to grant access to miner deployments, provide the owner API endpoint, and supply storage credentials for the shared evaluation system. So, the stated long-term design is open participation, while the currently documented operations still include controlled onboarding for validators.

PURPOSE

What exactly is the 'product/build'?

The “product” is really two connected things. First, there is the subnet itself: a Bittensor incentive mechanism for training, deploying, evaluating, and rewarding open voice models. Second, there is an application layer around that subnet: Vocence Studio, a developer API, a monitoring dashboard, and the associated operational tooling. Public product pages describe Studio as a place to create speech, design voices or characters from prompts, clone voices, and generate music, while the pricing documentation adds a developer API layer and an enterprise tier with private quotas and tailored billing. At the same time, the homepage notes that PromptTTS is the currently supported live capability.

On the miner side, Vocence standardises how models are packaged and exposed. Miners host their model logic in a Hugging Face repository, render a canonical wrapper template, and deploy that package as a private deployment on Chutes. The Hugging Face repo must include a miner.py file implementing a Miner class with warmup() and generate_wav(instruction, text) methods, plus a chute_config.yml file and optionally a vocence_config.yaml file. The public miner guide also shows that the live runtime interface is intentionally simple: a GET /health endpoint for status and a POST /speak endpoint that accepts {"instruction": "…", "text": "…"} and returns WAV audio. The owner service performs a wrapper-integrity hash check against the canonical template, so miners can swap in different TTS engines, but they cannot arbitrarily modify the serving wrapper.
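The packaging contract above can be illustrated with a minimal stub. This is a sketch only: the real serving wrapper is owned by the subnet and hash-checked, so only the Miner class body is miner-controlled, and the silent audio emitted here is a placeholder for an actual TTS engine.

```python
import io
import wave

class Miner:
    """Sketch of the documented Miner contract: warmup() plus
    generate_wav(instruction, text) returning WAV bytes."""

    def warmup(self) -> None:
        # A real miner would load model weights here so the first
        # /speak request is served quickly.
        self.ready = True

    def generate_wav(self, instruction: str, text: str) -> bytes:
        # Placeholder: emit 0.1 s of 16 kHz mono silence instead of
        # running a real TTS engine, so the return type matches.
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wav:
            wav.setnchannels(1)      # mono
            wav.setsampwidth(2)      # 16-bit samples
            wav.setframerate(16000)  # 16 kHz
            wav.writeframes(b"\x00\x00" * 1600)
        return buf.getvalue()
```

In the live system, the canonical wrapper would expose these bytes through the POST /speak endpoint after calling warmup() once at startup.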

On the validator and owner side, the build is more elaborate. The CLI documentation shows four distinct operational roles: validators can run the full validator or split it into separate sample-generation and weight-setting services; owners can run the HTTP API and/or the corpus downloader; miners can deploy and commit models; and query commands can list committed miners. The owner-side API exposes participants, evaluations, metrics, blocklist, validators, and status endpoints, while the owner service also runs a downloader that pulls source audio from LibriVox, clips it, and uploads those clips into a shared corpus bucket on Hippius. That means the product is not just “miners and validators”; it is a coordinated system with model registry, scoring storage, evaluation orchestration, and a central metadata API.

The evaluation pipeline is the most technically distinctive part of the build. Validators take a real source-audio clip from the shared corpus bucket, derive a structured task specification from that clip using OpenAI GPT-4o audio, then ask miners to synthesise speech that matches both the spoken text and the requested vocal style. The extracted specification includes transcription, gender, pitch, speed, age group, emotion, tone, and accent. That spec is flattened into a canonical /speak payload with a text field and a pipe-delimited instruction field describing vocal attributes. After a miner returns audio, the validator runs two further GPT-4o-audio evaluations in parallel: one pointwise extraction pass on the miner output and one pairwise naturalness comparison between the source clip and the generated clip. This means Vocence is not simply ranking raw speech quality; it is ranking how faithfully a model reproduces prompted vocal traits as judged against real source audio.
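The flattening step described above can be sketched roughly as follows; the field names, trait ordering, and the exact `trait: value` rendering inside the pipe-delimited instruction are assumptions for illustration.

```python
def build_speak_payload(spec: dict) -> dict:
    """Flatten an extracted task spec into the canonical /speak payload:
    a text field plus a pipe-delimited instruction of vocal traits.
    Field names here are hypothetical."""
    traits = ["gender", "pitch", "speed", "age_group", "emotion", "tone", "accent"]
    instruction = " | ".join(f"{t}: {spec[t]}" for t in traits if t in spec)
    return {"instruction": instruction, "text": spec["transcription"]}
```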

Scoring is explicit and relatively transparent. Script correctness is scored through a word-error-rate style comparison; pitch, speed, and age group are treated as ordinal buckets; gender, emotion, tone, and accent are exact categorical matches; and naturalness is a head-to-head judge decision. The weighted score sums to 1.0, with script at 0.30, naturalness at 0.15, and most of the core prompt traits at 0.10 or 0.05. A generated sample counts as a “win” only if it meets a default PASS_THRESHOLD of 0.9. The docs also note that judge temperature is fixed at 0.0 and the naturalness comparison order is randomised to reduce position bias, which shows the team is trying to make the evaluation both deterministic and resistant to simple artefacts.
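As a sketch of that weighted scoring: the script weight (0.30), naturalness weight (0.15), and the 0.9 pass threshold come from the docs, while the per-trait split across the remaining 0.55 is an illustrative assumption within the stated 0.10/0.05 band.

```python
# Only script (0.30) and naturalness (0.15) are documented exactly;
# the other per-trait weights are an assumed split summing to 1.0.
WEIGHTS = {
    "script": 0.30, "naturalness": 0.15,
    "gender": 0.10, "emotion": 0.10, "tone": 0.10, "accent": 0.10,
    "pitch": 0.05, "speed": 0.05, "age_group": 0.05,
}
PASS_THRESHOLD = 0.9  # default pass threshold per the docs

def weighted_score(trait_scores: dict) -> float:
    """Each trait score is assumed to be in [0, 1]; missing traits score 0."""
    return sum(w * trait_scores.get(k, 0.0) for k, w in WEIGHTS.items())

def is_win(trait_scores: dict) -> bool:
    """A sample counts as a win only if the weighted score passes."""
    return weighted_score(trait_scores) >= PASS_THRESHOLD
```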

Weight setting is now global rather than purely local. Each validator still generates its own samples and uploads results to its own Hippius bucket, but final ranking pulls from all active validator buckets it can read. A validator is treated as active if it has submitted recent evaluation data, with a default 24-hour activity window. The most recent scoring window is read from each active bucket, miner win rates are aggregated using stake-weighted scoring with sqrt(stake) influence, and only miners with more than 40 evaluations in at least 3 validator buckets become globally eligible. Winners must also beat earlier eligible miners by a default threshold margin of 0.02. If those conditions are not met, validators burn by assigning weight 1.0 to UID 0. Tie-breaks then proceed deterministically by global win rate, validator count, weighted evaluation volume, earlier commit block, and finally hotkey ordering. In other words, the build is explicitly trying to force honest validators into the same answer from shared evidence.
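The stake-weighted aggregation and eligibility gate might look roughly like this sketch; the data shapes, and the reading of "more than 40 evaluations in at least 3 validator buckets" as a total-evaluation count spread across buckets, are assumptions.

```python
import math
from collections import defaultdict

MIN_EVALS = 40       # a miner must exceed this many evaluations in total
MIN_VALIDATORS = 3   # ...spread across at least this many buckets

def global_win_rates(buckets: dict) -> dict:
    """buckets maps validator -> (stake, {miner: (wins, evals)}).
    Returns stake-weighted global win rates for eligible miners,
    with each validator's influence scaled by sqrt(stake)."""
    num = defaultdict(float)
    den = defaultdict(float)
    seen = defaultdict(set)
    totals = defaultdict(int)
    for validator, (stake, results) in buckets.items():
        influence = math.sqrt(stake)
        for miner, (wins, evals) in results.items():
            if evals == 0:
                continue
            num[miner] += influence * (wins / evals)
            den[miner] += influence
            seen[miner].add(validator)
            totals[miner] += evals
    return {
        m: num[m] / den[m]
        for m in num
        if totals[m] > MIN_EVALS and len(seen[m]) >= MIN_VALIDATORS
    }
```

The margin check against earlier eligible miners, the burn to UID 0, and the deterministic tie-breaks would then be applied on top of these rates.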

The codebase itself reflects that architecture. The public GitHub repository is split into adapters, domain, engine, gateway, pipeline, ranking, registry, and shared modules, plus CLI entry points. The dependency manifest indicates a stack built around bittensor==9.12.2, huggingface-hub, minio, openai, audiojudge, fastapi, uvicorn, asyncpg, sqlalchemy[asyncio], alembic, and the Chutes SDK. The docs further show that the owner API is an HTTP service, validators can be run as a single vocence serve process or as split generator/validator services, and the default Docker image is built for both linux/amd64 and linux/arm64. Operationally, validators are encouraged to run through Docker Compose with Watchtower, which polls for new images and automatically restarts the validator to stay in sync with releases.

The public-facing application layer extends beyond the raw subnet. Studio is already marketed as the interface for creating speech, custom voices, cloned voices, and music, while the pricing docs state that developer API access requires at least one successful Premium purchase. Those same docs say API keys are rate-limited to 4 requests per minute per key, and the homepage exposes a Custom/Enterprise tier with private quotas, tailored billing, and full API support for product and platform use cases. So even though the evaluated subnet logic is currently PromptTTS-first, the build around it is already being positioned as both an end-user creative interface and a commercial developer platform.

WHO

Team Info

Public team information is limited. Across the official site, whitepaper, docs, and legal pages reviewed here, there is branding, documentation, pricing, and contact information, but no official team page listing founders, executives, engineers, or advisors by name. The clearest direct public contact surfaced in the official materials is the email address [email protected], and the site footer identifies the brand simply as Vocence.

What can be verified operationally is that the docs refer repeatedly to a “Vocence team” that grants validator access to miner deployments on Chutes, provides the owner API endpoint, and shares the Hippius bucket keys needed for the evaluation workflow. Separately, the public GitHub repository currently shows a single visible contributor in the repo overview, which means the public code footprint is much easier to verify than the real-world team roster. The contributor handle shown publicly is concil859856, but the repo does not map that account to a real name or published biography.

Because the official sources reviewed for this report do not publish founder biographies, employee lists, or a formal leadership page, there is not enough verifiable public information to provide a named team roster beyond the operational references above. The safest accurate summary is that the organisation behind Vocence is publicly visible as a brand and operator, but not yet publicly documented as a named leadership team in the materials reviewed here.

FUTURE

Roadmap

The official roadmap is clearly broader than the currently deployed PromptTTS implementation. The homepage describes the journey as building a “Voice Intelligence Layer” from foundation to ecosystem expansion across PromptTTS, STT, STS, voice cloning, TTM, and voice agents, and the whitepaper exposes at least three roadmap phases: Q1 Foundation, Q2 Scaling and Robustness, and Q3 Ecosystem Expansion. Publicly retrievable roadmap text did not expose a further phase beyond those three during this review.

The Q1 Foundation phase is public and concrete. The whitepaper snippet lists subnet launch on Bittensor, an official website and monitoring dashboard, and a baseline PromptTTS evaluation pipeline focused on voice quality validation, voice trait accuracy, content correctness, and environmental consistency. That matches the present public state: there is a website, an analytics/dashboard layer, a live GitHub repo, and a PromptTTS-centred validator pipeline. In effect, Q1 is the phase where Vocence stands up the subnet, defines the scoring logic, and establishes the core user-facing surfaces around it.

The Q2 phase is named Scaling and Robustness. The publicly surfaced whitepaper text says this phase expands the voice-trait and environmental taxonomy, introduces a baseline evaluation pipeline for STT and voice cloning on the subnet, and starts collecting datasets for model training from validator results submitted by miners and generated by the validator task-generation and evaluation pipeline. That is an important signal about how Vocence expects to grow: not just by adding modalities, but by using the subnet’s own evaluation exhaust as a data flywheel for future model improvement.

The Q3 phase is Ecosystem Expansion. The surfaced roadmap text says Vocence plans to launch Voice STS and TTM pipelines on the subnet and begin competition there, while also expanding the platform with Voice AI Agents that integrate with the multimodal models developed through the subnet. In other words, the roadmap does not stop at “better text-to-speech”; it explicitly points toward broader multimodal voice systems and application-layer agents that sit on top of the subnet’s model stack.

The public codebase changelog gives a second, more implementation-focused roadmap that is already partly completed. Version 0.1.0, dated 28 February 2025, introduced the initial voice-intelligence subnet with PromptTTS, validator sample generation, weight-setting logic, miner push/commit commands, the centralised owner API, LibriVox corpus ingestion, and unified configuration defaults for mainnet subnet 78. Version 0.1.1, dated 19 March 2026, added a base miner, refined scoring, improved CI/CD and Watchtower behaviour, and fixed subtensor connection stability. Version 0.1.2, dated 20 March 2026, added global consensus scoring across validator buckets, active-validator discovery through the owner API, live subnet graph activity support, global scoring snapshots for the dashboard, tougher global eligibility rules, and more deterministic cross-validator convergence.

There are also public rollout milestones outside the changelog. The official X account announced that SN78 was about to go live, a public LinkedIn relay described Vocence as launched on Bittensor as a decentralised voice-intelligence subnet, and later ecosystem recaps stated that Vocence Studio was live and that the project had unveiled a new website. Taken together, those public signals suggest that the rollout path moved from subnet launch preparation, to public launch/relaunch visibility, to application-layer product availability through Studio.
