04 · Phytoset-10M Corpus
The data behind the latent code.
A unified corpus of curated, field-collected, and synthetic mixed-disease samples. Augmentations include hyperspectral channel jitter, weather-stress overlays, and CutMix-style co-pathology synthesis.
Recent Records
showing 6 of 619,932
| ID | Crop | Label | Region | Date |
|---|---|---|---|---|
| QX-9942-B | Potato | Phytophthora · K-deficiency | Andhra Pradesh | 2024.11.18 |
| QX-9941-A | Soybean | Cercospora Sojina | Mato Grosso, BR | 2024.11.18 |
| QX-9940-D | Wheat | Septoria · Healthy (mixed) | Punjab | 2024.11.17 |
| QX-9939-C | Tomato | Mosaic Virus | Almería, ES | 2024.11.17 |
| QX-9938-B | Maize | Northern Leaf Blight | Iowa, USA | 2024.11.16 |
| QX-9937-A | Rice | Bacterial Leaf Streak | Mekong Delta | 2024.11.16 |
Source Composition
| Source | Samples | Share |
|---|---|---|
| PlantVillage | 54,303 | 38% |
| PlantDoc field | 2,598 | 6% |
| Quoryn-Field | 412,109 | 41% |
| Synthetic mixed | 151,022 | 15% |
Augmentation Pipeline
- RandomResizedCrop · 0.6–1.0
- Hyperspectral channel jitter
- CutMix co-pathology blend (p=0.3)
- Weather-stress overlay (sun · drought)
- Domain randomization · field vs lab
- MixUp on latent activations
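As a rough illustration of two of the steps above, the sketch below shows one way a CutMix-style co-pathology blend and MixUp on latent activations could look in NumPy. It is not the pipeline's actual code: the function names `cutmix_copathology` and `mixup_latent`, the Beta parameter `alpha`, the multi-hot label encoding, and the area-weighted soft-label rule are all assumptions made for this example. Only the application probability `p=0.3` comes from the list.

```python
import numpy as np

def cutmix_copathology(img_a, label_a, img_b, label_b, rng, p=0.3, alpha=1.0):
    """Paste a rectangular patch of img_b into img_a with probability p.

    img_*: float arrays of shape (H, W, C) with identical shapes.
    label_*: multi-hot disease vectors (hypothetical encoding).
    Returns the mixed image and a soft label weighted by pixel area.
    """
    if rng.random() >= p:
        # Augmentation skipped: return the first sample unchanged.
        return img_a.copy(), label_a.astype(np.float64)

    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)  # target share of pixels kept from img_a
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))

    # Random box centre; the box is clipped to the image bounds.
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x0, x1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mixed = img_a.copy()
    mixed[y0:y1, x0:x1] = img_b[y0:y1, x0:x1]

    # Recompute lambda from the clipped box so labels match the real area.
    lam = 1.0 - ((y1 - y0) * (x1 - x0)) / (h * w)
    return mixed, lam * label_a + (1.0 - lam) * label_b

def mixup_latent(z, labels, rng, alpha=0.2):
    """MixUp over a batch of latent activations z with shape (N, D)."""
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(z))  # partner sample for each row
    mixed_z = lam * z + (1.0 - lam) * z[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_z, mixed_labels
```

A call such as `cutmix_copathology(leaf_a, y_a, leaf_b, y_b, np.random.default_rng(0))` would leave roughly 70% of samples untouched and blend the rest. Weighting the soft label by the pasted area is one common choice for a mixed sample like the "Septoria · Healthy (mixed)" record above; a hard multi-label target would be an equally plausible alternative.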