Data Collection — Real-World Clinical Doctor–Patient Conversations (Odia & Bengali)
Earn in 3 Easy Steps
Participate in the Project
Fill in your details and confirm your ability to collect real-world clinical audio recordings from OPD/IPD settings.
Record & Prepare the Data
Capture one audio file per consultation inside hospitals/clinics, then prepare the transcript and metadata for each encounter.
Submit for QC → Get Paid
Upload the audio files, transcripts, and metadata for each encounter. After QC approval, payments are processed (rate/payment terms as shared during onboarding).
Project Details
We’re running a fully compliant, end-to-end data acquisition and processing workflow to collect real-world clinical doctor–patient conversations across selected geographies and specialties.
Data Modality
- Real-world clinical audio recordings
Conversation Topics / Specialties
Cardiology | Oncology
Collection Methodology
- Field data collection inside hospitals/clinics
- Per-encounter audio capture (1 file per consultation)
Eligible Recording Categories (What you can submit)
Submit encounters that fall into one of the categories below:
- OPD – Cardiology (Odia / Bengali)
- OPD – Oncology (Odia / Bengali)
- IPD – Cardiology (Odia / Bengali)
- IPD – Oncology (Odia / Bengali)
Volume Targets
Total Required Dataset
| Language | Total Hours | OPD Split | IPD Split | Specialties Split |
|---|---|---|---|---|
| Odia | 70 hrs | 50 hrs | 20 hrs | 35 hrs Cardiology + 35 hrs Oncology |
| Bengali | 40 hrs | 30 hrs | 10 hrs | 20 hrs Cardiology + 20 hrs Oncology |
| Total | 110 hrs | 80 hrs | 30 hrs | 55 hrs Cardiology + 55 hrs Oncology |
Additional volume constraints
- Max 3 hours usable audio per doctor
- Ensure district and facility diversity
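As a rough aid for tracking progress against the targets above, the sketch below totals collected hours per language and care setting and lists any doctor who has gone over the 3-hour cap. The record keys (`doctor_id`, `duration_s`, `language`, `care_setting`) are hypothetical names chosen for illustration, not a prescribed schema.

```python
import math
from collections import defaultdict

# Target hours per (language, care setting), copied from the table above.
TARGET_HOURS = {
    ("Odia", "OPD"): 50, ("Odia", "IPD"): 20,
    ("Bengali", "OPD"): 30, ("Bengali", "IPD"): 10,
}
MAX_HOURS_PER_DOCTOR = 3  # cap on usable audio per doctor


def summarize(encounters):
    """Return (hours still needed per target, doctors over the cap)."""
    per_doctor = defaultdict(float)
    collected = defaultdict(float)
    for e in encounters:
        hours = e["duration_s"] / 3600
        per_doctor[e["doctor_id"]] += hours
        collected[(e["language"], e["care_setting"])] += hours
    over_cap = sorted(d for d, h in per_doctor.items() if h > MAX_HOURS_PER_DOCTOR)
    remaining = {k: max(0.0, t - collected[k]) for k, t in TARGET_HOURS.items()}
    return remaining, over_cap


def min_doctors(total_hours, cap=MAX_HOURS_PER_DOCTOR):
    """Smallest number of doctors that can supply total_hours under the cap."""
    return math.ceil(total_hours / cap)
```

By the same arithmetic, `min_doctors(110)` gives 37: covering the full 110 hours under the 3-hour cap takes at least 37 doctors. If that differs from the doctor count you were given, confirm it during onboarding.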
Key Requirements
Participants / Sources
- 25–30 doctors, across Cardiology & Oncology
- Encounters must be ≥60 seconds
- ≥60% new patients, ≤40% follow-ups
- Natural mixed (code-switched) speech is allowed
- Avoid overusing any single facility
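Before submitting a batch, the duration and patient-mix rules above can be checked with a few lines of code. This is a minimal sketch that assumes the new/follow-up share is counted by number of encounters (the brief does not say whether it is by count or by hours); the keys `encounter_id`, `duration_s`, and `is_new_patient` are hypothetical.

```python
def check_batch(encounters, min_seconds=60, min_new_share=0.60):
    """Flag encounters that are too short and check the new-patient share."""
    too_short = [e["encounter_id"] for e in encounters
                 if e["duration_s"] < min_seconds]
    # Only encounters that meet the minimum length count toward the mix.
    usable = [e for e in encounters if e["duration_s"] >= min_seconds]
    new_count = sum(1 for e in usable if e["is_new_patient"])
    new_share = new_count / len(usable) if usable else 0.0
    return {
        "too_short": too_short,
        "new_share": new_share,
        "mix_ok": new_share >= min_new_share,
    }
```

A batch with `mix_ok` set to `False` needs more new-patient encounters (or fewer follow-ups) before it meets the ≥60% / ≤40% split.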
Technical Specifications
- Audio format: WAV (PCM), 16 kHz, 16-bit, mono
- Each encounter must include: 1 audio file + 1 transcript + 1 metadata entry
- No compressed audio (for example, MP3 or AAC)
- Device must clearly capture both doctor & patient
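The format requirements above can be verified before upload. The sketch below uses Python's standard `wave` module to confirm that a file is uncompressed PCM WAV at 16 kHz, 16-bit, mono, and at least 60 seconds long. The function name `check_wav` is an illustration only, and the check says nothing about whether both speakers are actually audible; that still needs a listen.

```python
import wave


def check_wav(path, min_seconds=60.0):
    """Return a list of problems found in a WAV file (empty list = passed)."""
    try:
        with wave.open(str(path), "rb") as wf:
            rate = wf.getframerate()
            width = wf.getsampwidth()      # bytes per sample
            channels = wf.getnchannels()
            comptype = wf.getcomptype()    # "NONE" for uncompressed PCM
            frames = wf.getnframes()
    except (wave.Error, EOFError) as exc:
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if comptype != "NONE":
        problems.append(f"compressed audio ({comptype})")
    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
    if width != 2:
        problems.append(f"sample width is {width * 8}-bit, expected 16-bit")
    if channels != 1:
        problems.append(f"{channels} channels, expected mono")
    duration = frames / rate if rate else 0.0
    if duration < min_seconds:
        problems.append(f"duration is {duration:.1f} s, expected at least {min_seconds:.0f} s")
    return problems
```

Calling `check_wav("encounter.wav")` on a compliant file returns an empty list; anything else is a list of human-readable reasons the file would likely fail QC.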
Metadata (Per Encounter)
- Encounter ID
- Language
- Specialty
- Care setting
- State
- Recording date
- Noise level
- Sensitive content flag
- Recording method
- Speaker count
- Consent
- Quality flags
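To make the per-encounter entry concrete, here is a minimal sketch of one metadata record and a completeness check. The snake_case key names, the value types, and every value in `EXAMPLE` are placeholders chosen for illustration; only the allowed values for language, specialty, and care setting come from this brief. Use the actual template shared during onboarding.

```python
# Field names mirror the metadata list above (key spelling is an assumption).
REQUIRED_FIELDS = [
    "encounter_id", "language", "specialty", "care_setting", "state",
    "recording_date", "noise_level", "sensitive_content_flag",
    "recording_method", "speaker_count", "consent", "quality_flags",
]

# Values this brief states explicitly; other fields are not constrained here.
ALLOWED = {
    "language": ("Odia", "Bengali"),
    "specialty": ("Cardiology", "Oncology"),
    "care_setting": ("OPD", "IPD"),
}

# Placeholder record, for illustration only.
EXAMPLE = {
    "encounter_id": "ENC-0001",
    "language": "Odia",
    "specialty": "Cardiology",
    "care_setting": "OPD",
    "state": "Odisha",
    "recording_date": "2025-01-15",
    "noise_level": "low",
    "sensitive_content_flag": False,
    "recording_method": "handheld recorder",
    "speaker_count": 2,
    "consent": True,
    "quality_flags": [],
}


def validate_metadata(entry):
    """Return a list of problems with one metadata entry (empty = complete)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry or entry[field] in ("", None):
            problems.append(f"missing field: {field}")
    for field, allowed in ALLOWED.items():
        value = entry.get(field)
        if value not in ("", None) and value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems
```

Running `validate_metadata(EXAMPLE)` returns an empty list; dropping a field or using a language outside Odia/Bengali produces a message naming the field.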
Do’s
- Record real medical encounters (≥60 seconds)
- Ensure clear two-way capture (doctor + patient)
- Maintain facility and district diversity
- Provide audio + transcript + metadata per encounter
- Confirm and document consent for every encounter
Don’ts
- Don’t submit compressed audio
- Don’t exceed 3 hours of usable audio per doctor
- Don’t rely on a single clinic/hospital repeatedly (avoid facility overuse)
- Don’t submit encounters without a transcript and a metadata entry