Closed
Feature
Relates to
Discovered in the course of
ATM we have only 2 (phew!) dandisets for which DOI records failed to be produced:
❯ grep -B1 '<!DOCTYPE html>' dandi.bib | grep DANDISET
# DANDISET 000029
# DANDISET 000252
edit: updated as of 20241101 -- no new missing bib entries, but here is a summary of the other errors we have: datacite errors (against the 4.3 version of the datacite schema we use, which is not the final 4.3) and metadata errors (against the current pydantic model)
❯ grep msg dandi_datacite.meta-errors.json | sort | uniq -c
4 "msg": "Field required",
3 "msg": "Input should be 'Anatomy'",
3 "msg": "Input should be 'Disorder'",
9 "msg": "String should match pattern '^[a-zA-Z0-9-]+:[a-zA-Z0-9-/\\._]+$'",
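To make the last message concrete, here is a minimal sketch of what that pattern accepts. The pattern is copied from the message above with the JSON escaping (the doubled backslash) undone; the sample values are hypothetical and not taken from any dandiset, and Python's `re` is used only for illustration.

```python
import re

# Pattern from the validation message above, with JSON escaping undone.
IDENTIFIER_PATTERN = re.compile(r"^[a-zA-Z0-9-]+:[a-zA-Z0-9-/\._]+$")


def is_valid_identifier(value: str) -> bool:
    """Return True if `value` has the `prefix:local-part` shape the pattern requires."""
    return IDENTIFIER_PATTERN.match(value) is not None


# Hypothetical sample values, chosen only to exercise the pattern.
for sample in ["UBERON:0001870", "UBERON 0001870", "0001870"]:
    print(sample, is_valid_identifier(sample))
```

The second and third samples fail because the pattern requires a colon between a non-empty prefix and a non-empty local part.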
❯ grep message dandi_datacite.datacite-errors.json | sort | uniq -c
16 "message": "'IsPublishedIn' is not one of ['IsCitedBy', 'Cites', 'IsSupplementTo', 'IsSupplementedBy', 'IsContinuedBy', 'Continues', 'IsDescribedBy', 'Describes', 'HasMetadata', 'IsMetadataFor', 'HasVersion', 'IsVersionOf', 'IsNewVersionOf', 'IsPreviousVersionOf', 'IsPartOf', 'HasPart', 'IsReferencedBy', 'References', 'IsDocumentedBy', 'Documents', 'IsCompiledBy', 'Compiles', 'IsVariantFormOf', 'IsOriginalFormOf', 'IsIdenticalTo', 'IsReviewedBy', 'Reviews', 'IsDerivedFrom', 'IsSourceOf', 'IsRequiredBy', 'Requires', 'IsObsoletedBy', 'Obsoletes']",
90 "message": "'ROR' is not one of ['ISNI', 'GRID', 'Crossref Funder ID', 'Other']",
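The `grep ... | sort | uniq -c` tallies above can also be reproduced structurally rather than line by line. This is only a sketch: the layout of the `*-errors.json` files is not shown here, so the function simply walks whatever nesting it finds, and the `sample` data below is a hypothetical miniature, not real output.

```python
from collections import Counter


def count_values(node, key, counts=None):
    """Recursively tally every string stored under `key` anywhere in `node`."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, str):
                counts[v] += 1
            else:
                count_values(v, key, counts)
    elif isinstance(node, list):
        for item in node:
            count_values(item, key, counts)
    return counts


# Hypothetical miniature input; the real file layout is an assumption.
sample = {
    "000001": [{"loc": ["a"], "msg": "Field required"}],
    "000002": [{"msg": "Field required"}, {"msg": "Input should be 'Anatomy'"}],
}

for message, n in count_values(sample, "msg").most_common():
    print(n, message)
```

For the real files one would pass the result of `json.load(...)` and use `"msg"` or `"message"` as the key, matching the two commands above.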
❯ grep 'No valid' dandi.bib
# No valid BibTeX for 000029/0.210712.1903. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.230317.1553. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.231017.1955. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.231017.1959. Starts with <!DOCTYPE html>
# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>
# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>
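A small sketch of how the comment lines above could be grouped per dandiset, collapsing repeats such as the duplicated 000252 line. The function name `failed_versions` is made up for illustration; the regular expression assumes only the line format visible in the output above.

```python
import re
from collections import defaultdict

# Matches the comment lines shown above, e.g.
# "# No valid BibTeX for 000029/0.210712.1903. Starts with <!DOCTYPE html>"
NO_BIBTEX = re.compile(r"^# No valid BibTeX for (\d+)/(\S+?)\. Starts with")


def failed_versions(lines):
    """Map dandiset ID -> sorted list of distinct versions lacking valid BibTeX."""
    found = defaultdict(set)
    for line in lines:
        m = NO_BIBTEX.match(line)
        if m:
            found[m.group(1)].add(m.group(2))
    return {dandiset: sorted(versions) for dandiset, versions in found.items()}


# The lines from the grep output above.
lines = [
    "# No valid BibTeX for 000029/0.210712.1903. Starts with <!DOCTYPE html>",
    "# No valid BibTeX for 000029/0.230317.1553. Starts with <!DOCTYPE html>",
    "# No valid BibTeX for 000029/0.231017.1955. Starts with <!DOCTYPE html>",
    "# No valid BibTeX for 000029/0.231017.1959. Starts with <!DOCTYPE html>",
    "# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>",
    "# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>",
]
for dandiset, versions in failed_versions(lines).items():
    print(dandiset, len(versions), versions)
```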
Status update
With the script from
we get
- for 000029
  - dandi.000029/0.210712.1903 -- good now. It was failing to create a pydantic instance (if we disable pre-validation) ATM, since we started to mandate an email for ContactPerson. ...
  - the other 3 lack `doi` in the assigned metadata... oh well -- we will just forget about it...
  - we also ran into a "new" issue, "Fake DOI was injected into published versions of various dandisets, needs a fix" #2342, for which we provide a workaround only for those in 000029, to fix it at least in datacite....
- for 000252 -- it is just a single DOI, and we reported a bug to datacite via email (per their request): there is a https://doi.datacite.org/dois/10.48324%2Fdandi.000252%2F0.230408.2207 but there is no https://doi.org/10.48324/000252/0.230408.2207 . Now fixed at datacite -- it was lacking the `dandi.` prefix there! Maybe it was even our fault somehow?
  - More on this from datacite support: "Additionally, I have confirmed that all other DOIs in the DARTLIB.DANDI repository resolve as expected. We also have added additional monitoring since this DOI was created to detect potential issues."
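Since the 000252 problem came down to a missing `dandi.` prefix, here is a minimal sketch of building both URLs from the same DOI string so the two cannot drift apart. It assumes DOIs have the shape `10.48324/dandi.<dandiset>/<version>`, as in the datacite URL above; the function names are hypothetical and nothing here contacts the network.

```python
from urllib.parse import quote

DOI_PREFIX = "10.48324"  # prefix seen in the URLs above


def dandi_doi(dandiset_id: str, version: str) -> str:
    """Assumed DOI shape: <prefix>/dandi.<dandiset>/<version>."""
    return f"{DOI_PREFIX}/dandi.{dandiset_id}/{version}"


def resolver_url(doi: str) -> str:
    """URL at the doi.org resolver."""
    return f"https://doi.org/{doi}"


def datacite_url(doi: str) -> str:
    """URL of the record page at datacite; slashes in the DOI are percent-encoded."""
    return f"https://doi.datacite.org/dois/{quote(doi, safe='')}"


doi = dandi_doi("000252", "0.230408.2207")
print(resolver_url(doi))
print(datacite_url(doi))
```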