Releases: ncbi-nlp/MedCalc-Bench
version-1.2
- Fixed bug in APACHE II, CCI, and Caprini Calculators
- Fixed the upper and lower limits for the dosage calculators (calculator IDs 24 and 49)
- Added the correct specific question for Calculator ID 24 to the one-shot example in run.py
- Replaced notes in the test set that are not clinically eligible for the calculator with items from the training set, and removed the training set. This is not exhaustive: some notes may still not be the best fit for a calculator, but the Relevant Entities and ground truth value can be determined from the patient note, so this should not affect LLMs' ability to do the task.
- Re-annotated the one-shot examples and test examples in the CSV
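The upper and lower limits mentioned above can be read as a tolerance range around a calculator's ground truth value. The following is a minimal sketch of such a check, assuming the range is inclusive on both ends. The function name `within_limits` and the numbers are hypothetical and are not taken from run.py or the dataset.

```python
def within_limits(answer: float, lower: float, upper: float) -> bool:
    """Return True if answer lies in the inclusive range [lower, upper].

    Assumption: both limits are inclusive. The benchmark's own
    evaluation code may treat the boundaries differently.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return lower <= answer <= upper


# Hypothetical values for illustration only; not taken from the dataset.
print(within_limits(12.4, lower=11.78, upper=13.02))  # True
print(within_limits(14.0, lower=11.78, upper=13.02))  # False
```

A check like this shows why incorrect limits matter: if either bound is wrong, a correct answer can be scored as incorrect, or the reverse.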
version-1.1
MedCalc-Bench-v1.1 has the following changes:
- Fixed all calculator implementation mistakes except those in Caprini, APACHE II, and CCI (these were found after the release)
- Rounded values to 5 decimal places
- Adjusted the Relevant Entities to reflect the time of admission, prior to treatment
- Added notes to FENa, MeldNa, Framingham Risk, and Homa-IR so that each has 20 notes
- Added training instances for all calculators
- Removed previous synthetic notes and replaced with new ones
- Replaced notes for calculators so that they are better matches for, and more relevant to, their respective calculators
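As an illustration of the rounding change listed above, here is a minimal sketch using Python's built-in `round`. The helper name `round_answer` is hypothetical, and the release notes do not say which rounding routine the dataset was built with, so treat this only as one plausible reading.

```python
def round_answer(value: float, places: int = 5) -> float:
    """Round a computed value to a fixed number of decimal places.

    Assumption: Python's built-in round() is used. The dataset's own
    tooling may round differently at exact ties.
    """
    return round(value, places)


# Hypothetical value for illustration only.
print(round_answer(1.234567891))  # 1.23457
```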
version-1.0
Important: This dataset is kept for reproducibility purposes only. Please use the most up-to-date version, 1.2, which is the most revised and corrected dataset available.
Since this release, we have fixed 12 calculator implementations, ensured the Relevant Entities section best matches what is specified in each patient note, and replaced notes with ones that are better fits for a given calculator, to make the dataset more applicable to real-life situations.
Because of the number of changes, we consider this dataset highly unreliable and cannot recommend using it for anything other than reproducibility purposes.