Observation facility lists from various origins and in various formats.
Supported lists:
List | Format |
---|---|
AAS | HTML |
IAU-MPC | HTML |
IMCCE/Quaero | JSON |
NAIF | HTML |
NASA/PDS | XML |
NSSDC | HTML |
SPASE | JSON |
WikiData | RDF |
Types of facilities: Spacecraft, Observatories, Telescopes, Investigations, Airborne platforms.
Downloads data for the facility lists and saves it in a unified output ontology. The script performs entity typing with an LLM and tries to retrieve geographical information for every entity. A type_confidence and a location_confidence value are added to every entity, depending on how that information was retrieved. The first run may take some time, but all data is saved in the cache for subsequent runs.
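The caching behaviour described above can be illustrated with a minimal sketch. It is not the code used by update.py: the function name `cached_fetch`, its parameters and the on-disk layout are assumptions made for this example, and the download itself is injected as a callable so that the sketch runs without network access.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, fetch: Callable[[str], bytes],
                 cache_dir: Path, use_cache: bool = True) -> bytes:
    """Return the content for `url`, reusing a cached copy when allowed.

    Hypothetical helper: `fetch` stands in for the real download step.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe file name.
    cache_file = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if use_cache and cache_file.exists():
        return cache_file.read_bytes()  # fast path: later runs
    data = fetch(url)                   # slow path: first run or cache disabled
    cache_file.write_bytes(data)        # refresh the cache for the next run
    return data


if __name__ == "__main__":
    calls = []

    def fake_fetch(url: str) -> bytes:
        # Stand-in for an HTTP download; records how often it is called.
        calls.append(url)
        return b"<html>facility list</html>"

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "cache"
        cached_fetch("https://example.org/list", fake_fetch, cache)
        cached_fetch("https://example.org/list", fake_fetch, cache)
        cached_fetch("https://example.org/list", fake_fetch, cache, use_cache=False)
        print(len(calls))  # 2: the second call was served from the cache
```

Passing `use_cache=False` mirrors the effect of a no-cache switch: the source is fetched again and the cached copy is overwritten.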
python update.py [options]
Option | Description |
---|---|
`-l`, `--lists` | Name(s) of the lists to extract data from. Default is `all`. Available options: `all` or specific list names from `ExtractorLists.EXTRACTORS_BY_NAMES`. Multiple lists can be provided. |
`-i`, `--input-ontology` | Optional input ontology file (`.ttl`). Data from this ontology will be merged with newly extracted data. Useful for running the script in multiple steps. |
`-o`, `--output-ontology` | Output ontology file name. Default is `output.ttl`. |
`-c`, `--no-cache` | If set, disables caching and forces re-download and version comparison. |
python update.py -l aas pds -i wikidata.ttl -o all_entities.ttl
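To make the option semantics concrete, here is a minimal `argparse` sketch that reproduces the documented flags and defaults. It is an illustration only: the actual parser in update.py may be organised differently, and `build_parser` is a name chosen for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options of update.py."""
    parser = argparse.ArgumentParser(prog="update.py")
    # nargs="+" lets several list names follow a single -l flag.
    parser.add_argument("-l", "--lists", nargs="+", default=["all"],
                        help="Name(s) of the lists to extract data from.")
    parser.add_argument("-i", "--input-ontology", default=None,
                        help="Optional input ontology file (.ttl).")
    parser.add_argument("-o", "--output-ontology", default="output.ttl",
                        help="Output ontology file name.")
    parser.add_argument("-c", "--no-cache", action="store_true",
                        help="Disable caching and force re-download.")
    return parser


if __name__ == "__main__":
    argv = "-l aas pds -i wikidata.ttl -o all_entities.ttl".split()
    args = build_parser().parse_args(argv)
    print(args.lists, args.input_ontology, args.output_ontology, args.no_cache)
```

With the command shown above, the two list names end up in one Python list, and the cache stays enabled because `-c` was not given.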
Entity matching tool. It first performs external ID linking, then follows a merging strategy configuration file (default: conf/merging_strategy.conf). It then generates a full mapping, computes the discriminant criteria, and computes the other scores on the remaining candidate pairs.
LLM validation then uses an LLM to accept or reject candidate pairs. The mapped data is saved as synonym set objects, along with the corresponding SSSOM mapping ontology. The execution time depends on the scores used in the merging strategy (sentence-cosine-similarity and llm-embedding take longer because they need to encode the entities) and on the validation LLM.
python merge.py -i input_ontology.ttl [options]
Option | Description |
---|---|
`-i`, `--input-ontologies` | (Required) One or more input ontologies (`.ttl`) to process. |
`-o`, `--output-dir` | Output directory to save the final merged ontology and the SSSOM mapping ontology. Default is a timestamped folder. |
`-l`, `--limit` | (Optional) Limit the number of entities per source to speed up testing. Only the top N entities from each list will be compared (NxN). |
`-s`, `--merging-strategy` | Path to the merging strategy config file. Default is `conf/merging_strategy.conf`. |
`-d`, `--direct-validation` | Skip manual review. Candidate matches will be validated automatically based on scores and logic. |
`--human-validation` | Enable human-in-the-loop disambiguation after scoring. This disables LLM-based validation. |
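The matching steps described in this section (external ID linking, discriminant criteria, then scoring of the remaining pairs, optionally limited to the top N entities per source) can be sketched as follows. This is a simplified stand-in, not the logic of merge.py: the entity fields (`label`, `type`, `ext_id`), the `match` function, the 0.5 threshold and the identifier `ID-1` are assumptions, and a bag-of-words cosine similarity replaces the sentence or LLM embeddings mentioned above.

```python
import math
from collections import Counter
from itertools import product


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two labels."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def match(source_a, source_b, limit=None, threshold=0.5):
    """Return (linked, candidates) pairs between two entity lists."""
    a, b = source_a[:limit], source_b[:limit]  # limit=None keeps everything
    linked, candidates = [], []
    for x, y in product(a, b):                 # N x N comparison
        # Step 1: pairs sharing an external identifier are linked directly.
        if x.get("ext_id") and x.get("ext_id") == y.get("ext_id"):
            linked.append((x["label"], y["label"], 1.0))
            continue
        # Step 2: discriminant criterion (assumed): types must agree.
        if x["type"] != y["type"]:
            continue
        # Step 3: score the remaining pairs and keep the plausible ones.
        score = cosine(x["label"], y["label"])
        if score >= threshold:
            candidates.append((x["label"], y["label"], round(score, 2)))
    return linked, candidates


if __name__ == "__main__":
    list_a = [
        {"label": "Hubble Space Telescope", "type": "spacecraft", "ext_id": "ID-1"},
        {"label": "Mars Express", "type": "spacecraft", "ext_id": None},
    ]
    list_b = [
        {"label": "HST", "type": "spacecraft", "ext_id": "ID-1"},
        {"label": "Mars Express Orbiter", "type": "spacecraft", "ext_id": None},
        {"label": "Mars Express", "type": "investigation", "ext_id": None},
    ]
    linked, candidates = match(list_a, list_b)
    print(linked)      # [('Hubble Space Telescope', 'HST', 1.0)]
    print(candidates)  # [('Mars Express', 'Mars Express Orbiter', 0.82)]
```

In this sketch the candidate pairs would then go to a validation step (automatic, LLM-based or human), which is deliberately left out here.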
This activity is a joint effort of the EPN-VESPA, IVOA and IPDA projects.
This work has also been supported by: the Europlanet 2020 Research Infrastructure project, which received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 654208; the Europlanet 2024 Research Infrastructure project, which received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 871149; the FAIR-IMPACT project, which received funding from the European Commission's Horizon Europe Research and Innovation programme under grant agreement No 101057344; and an OPAL cascading grant from the OSCARS project, which received funding from the European Commission's Horizon Europe Research and Innovation programme under grant agreement No 101129751.