Statistical overview of our three medical datasets: entry counts, average instruction, input, and output lengths, and a comparison of the most frequent terms.
Dataset | Entries | Avg Instruction | Avg Input | Avg Output | Input-to-Output Ratio |
---|---|---|---|---|---|
General Medical | 1,867 | 4.0 words | 17.10 words | 33.10 words | 0.52 |
Evaluation Medical | 3,324 | 46.0 words | 30.05 words | 2.67 words | 11.25 |
GenMedGPT-5k | 3,088 | 15.0 words | 23.04 words | 44.76 words | 0.51 |
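
The ratio column appears to be the average input length divided by the average output length. That reading is an assumption, since the page does not state the formula, but it reproduces the listed values once rounded to two decimals. A minimal sketch, using illustrative names (`AVG_WORDS`, `input_to_output_ratio`) that do not come from the source:

```python
# Average word counts copied from the table: (avg input, avg output).
AVG_WORDS = {
    "General Medical": (17.10, 33.10),
    "Evaluation Medical": (30.05, 2.67),
    "GenMedGPT-5k": (23.04, 44.76),
}


def input_to_output_ratio(avg_input: float, avg_output: float) -> float:
    """Assumed definition: average input words / average output words."""
    return round(avg_input / avg_output, 2)


# Recompute the ratio for every dataset in the table.
ratios = {
    name: input_to_output_ratio(avg_in, avg_out)
    for name, (avg_in, avg_out) in AVG_WORDS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

A ratio below 1 means outputs are longer than inputs on average, as in General Medical and GenMedGPT-5k. The Evaluation Medical ratio is far higher because its outputs average only 2.67 words.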
syndrome (8.94%)
disease (6.37%)
carcinoma (3.00%)
infection (2.46%)
tumor (1.98%)

characterized (high frequency)
type (high frequency)
disease (high frequency)
syndrome (high frequency)
cells (high frequency)

General Medical Dataset
Evaluation Medical Dataset
GenMedGPT-5k Dataset
Pain Management Focus
pain: 34.62% (vs. 0.37% in General Medical, 3.09% in Evaluation)

General Medical Conditions
syndrome: 8.94% (General), 4.87% (Evaluation), 1.42% (GenMedGPT)
disease: 6.37% (General), 6.17% (Evaluation), 2.81% (GenMedGPT)
blood: 1.50% (General), 3.22% (Evaluation), 3.33% (GenMedGPT)

Specialized Terms