Skip to main content
For AI labs & data teams·African languages

The verified panel your Pidgin model deserves

BVN-verified Nigerian voice contributors for ASR, TTS, RLHF, and translation pairs. Five African languages. Same-day delivery. Identity locked at the bank level — not at the IP address.

Curated panel · early-access pricing · pilot decks under NDA on request.

Language coverage

Five languages, ~420M speakers total

Underserved by Prolific (Nigeria excluded), Scale AI (no Pidgin), and Surge (English-only by default).

Pidgin
Most-spoken West African creole · 75M+ speakers
Yoruba
South-West Nigeria · 45M+ speakers
Igbo
South-East Nigeria · 30M+ speakers
Hausa
Northern Nigeria + Niger · 70M+ speakers
Nigerian English
Accented English · 200M+ speakers

What we collect for you

Voice, text, preference signals — all native to the languages we speak.

ASR / speech-to-text

Short clean reads + spontaneous narration. We tag dialect + age band + state for stratified training splits.

TTS voice modelling

Same speaker, 100+ utterances, identity-locked via BVN so you know the voice belongs to one verified human.

RLHF / preference data

Bilingual annotators rate AI outputs across Pidgin/English code-switching. Useful when your foundation model is English-trained.

Translation pairs

Sentence-level English ↔ Pidgin/Yoruba/Igbo/Hausa, source-tagged with regional variants.

Why this panel is different

Bank-grade identity. Same-day routing.

Most data marketplaces verify with an email and a captcha. Every 9jatesters contributor passes Paystack's BVN match against the central Nigerian banking system — the same KYC tier Nigerian fintechs use for moving money. Real people. Real accents. Verifiable provenance for your dataset.

BVN-verified identities

Every contributor's identity is matched against their Nigerian bank's KYC. No farms, no Mechanical-Turk-style proxies.

Same-day routing

Curated panel across every region of Nigeria — South-West, South-East, South-South, North-Central, and Northern. Pilot jobs route within hours of brief acceptance; volume scales as the panel grows.

Per-prompt human review

Two-pass QA: automated transcript-back via Gemini, then admin review. Failed clips never reach the buyer.

Direct contracting

We bill in USD via Stripe (or NGN via Paystack — your call). Standard data-licence agreement included.

Pilot

Start small. Validate the data.

Our standard pilot is 500 voice recordings in your chosen language, delivered as labelled .wav + JSON manifest within 72 hours. Pricing transparent — you only pay for clips that pass our QA pass + your spot-check.

Standard pilotMost popular
$1.20/ approved clip

Volume + multi-language discounts on contract. NDAs welcome.

  • 16-kHz mono .wavPlus a JSON manifest with prompt text, duration, contributor id (anonymised by default).
  • Demographic stratificationAge band, state, gender, occupation — useful for fair ASR benchmarks.
  • 72-hour SLA500 clips in three days. Bigger jobs ship in batched deliveries.