Generate 10K product reviews, 50K training pairs, or 100K sentiment samples. Define a schema, write a prompt, inject variants — get structured JSONL back.
58+ Variant Lists
6.4M Wikipedia Topics
1M Max Batch Size
From schema design to batch delivery.
Define your output structure visually. Every row matches your exact format — strings, numbers, arrays, enums.
58+ built-in variant lists plus 6.4M Wikipedia topics. Weighted distributions, subset trimming, custom lists.
Write templates with {VARIABLES} or describe what you want in plain English. Auto-detected schemas and mappings.
Submit up to 1M requests per batch. Automatic splitting, progress tracking, retry logic.
Download as JSONL or CSV. Parquet coming soon. Ready for fine-tuning, RAG, or evaluation.
Track token usage, costs, and job progress in real time. Confidence scoring on every row.
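To make the template and variant features above concrete, here is a minimal Python sketch of how {VARIABLES} could be filled from weighted, trimmed variant lists. It is an illustration only, not Sonset's implementation: the list names, weights, template text, and the helpers `trim`, `fill`, and `build_prompts` are all hypothetical.

```python
import random
import re

# Hypothetical variant lists (value -> weight); not Sonset's built-in lists.
VARIANTS = {
    "PRODUCT": {"headphones": 3, "backpack": 1, "kettle": 1},
    "TONE": {"enthusiastic": 1, "neutral": 2, "critical": 1},
}

# Hypothetical prompt template using {VARIABLES}.
TEMPLATE = "Write a {TONE} review of a {PRODUCT}."


def trim(variants, keep):
    """Subset trimming: keep only the `keep` highest-weighted values."""
    top = sorted(variants.items(), key=lambda kv: kv[1], reverse=True)[:keep]
    return dict(top)


def fill(template, variant_lists, rng):
    """Replace each {NAME} placeholder with a weighted random pick."""
    def pick(match):
        options = variant_lists[match.group(1)]
        values = list(options)
        weights = list(options.values())
        return rng.choices(values, weights=weights, k=1)[0]

    return re.sub(r"\{([A-Z_]+)\}", pick, template)


def build_prompts(n, seed=0):
    """Build n prompts from the template; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    lists = {name: trim(values, 2) for name, values in VARIANTS.items()}
    return [fill(TEMPLATE, lists, rng) for _ in range(n)]


if __name__ == "__main__":
    for prompt in build_prompts(5):
        print(prompt)
```

Trimming each list to its two heaviest values means "kettle" and "critical" never appear, while the remaining values are drawn in proportion to their weights.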
2 Export Formats
58+ built-in variant lists, plus 6.4 million English Wikipedia topics. Create your own too.
Every English Wikipedia article title is available as a variant source.
Four steps from idea to dataset.
Set the shape of your data. Fields like "prompt", "response", "intent" — whatever you need.
Use {VARIABLES} that map to variant lists. Or describe what you want and let Sonset build it.
Select from 58+ built-in lists, 6.4M Wikipedia topics, or create your own. Weight and trim them.
Sonset batches everything, polls for completion, and delivers your dataset as JSONL or CSV.
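The four steps above can be pictured with a short Python sketch that splits a large request list into batches, checks each returned row against a schema, and serializes the result as JSONL or CSV. Everything here is an assumption made for illustration: the `SCHEMA` fields and the helpers `split_batches`, `is_valid`, `to_jsonl`, and `to_csv` are hypothetical, and no Sonset API is called.

```python
import csv
import io
import json

# Hypothetical schema: field name -> expected Python type,
# or a set of allowed strings for an enum field.
SCHEMA = {
    "prompt": str,
    "response": str,
    "rating": int,
    "intent": {"praise", "complaint"},
}


def split_batches(items, max_size):
    """Split a list of requests into consecutive batches of at most max_size."""
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


def is_valid(row, schema):
    """Return True if the row has exactly the schema's fields with matching values."""
    if set(row) != set(schema):
        return False
    for field, rule in schema.items():
        value = row[field]
        if isinstance(rule, set):
            # Enum field: value must be one of the allowed strings.
            if not (isinstance(value, str) and value in rule):
                return False
        elif not isinstance(value, rule):
            return False
    return True


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def to_csv(rows, fields):
    """Serialize rows as CSV with a header row, in the given field order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    rows = [
        {"prompt": "p1", "response": "r1", "rating": 5, "intent": "praise"},
        {"prompt": "p2", "response": "r2", "rating": 1, "intent": "unknown"},
    ]
    valid = [row for row in rows if is_valid(row, SCHEMA)]
    print(to_jsonl(valid), end="")
    print(to_csv(valid, list(SCHEMA)), end="")
```

In this sketch the second row is dropped because "unknown" is not an allowed `intent` value, so only schema-conforming rows reach the export step.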
Pay per row generated. Buy credits and use them when you need them. No subscriptions.
Fast, efficient generation for most use cases.
Higher accuracy for complex schemas and nuanced data.
Maximum intelligence for the most demanding datasets.
Create your free account and start generating structured datasets.