@@ -104,8 +104,16 @@ \section{Introduction}
104104section~\ref {sect:quantities }, while the mapping between our columns and the
105105VAMDC-XSAMS Data Model is given in section~\ref {sect:mapping }.
106106
107+ During the development of the standard, a major problem in molecular
108+ spectroscopy turned out to be species nomenclature. The core LineTAP
109+ table sidesteps this problem by identifying species using IUPAC standard
110+ InChIs, a choice unpopular with many practitioners. To facilitate the
111+ use of colloquial species designations (``ethyl alcohol''), this
112+ specification also defines, in section~\ref{sect:speciestable}, a
113+ \textit{species table} associating common names and sum formulas with InChIs.
114+
107115When accessed using the Table Access Protocol TAP
108- \citep {2019ivoa.spec.0927D }, the table can be queried using the
116+ \citep {2019ivoa.spec.0927D }, the tables can be queried using the
109117expressive SQL-derived query language ADQL, while query results are
110118available in the VOTable format, easily readable by VO client
111119applications. Line databases accessible in this way can be registered
@@ -220,6 +228,13 @@ \subsection{Credit}
220228repository of line data, it should be as simple as possible for users to
221229give credit to the contributors of line data.
222230
231+ \subsection {Resolution of Molecule Designation }
232+ \label {uc:resolution }
233+
234+ A researcher wants to find lines for a molecule they have long been
235+ calling ``Methyl Mercaptan'' or designating by a pseudo-structural
236+ formula like \verb|CH3SHv=0|.
237+
223238
224239\subsection {Non-Use Cases }
225240
@@ -235,6 +250,7 @@ \subsection{Non-Use Cases}
235250\end {itemize }
236251
237252
253+
238254\begin {table }[hpt]
239255\hskip -0.05\linewidth
240256\begin {tabular }{p{0.43\linewidth }cp{0.5\linewidth }}
@@ -280,7 +296,7 @@ \subsection{Non-Use Cases}
280296\end {table }
281297
282298
283- \section {Spectral Line Data }\label {sect:quantities }
299+ \section {Spectral Lines Table }\label {sect:quantities }
284300
285301Table~\ref {tab:ltcols } gives the columns that make up the LineTAP
286302relational model. Implementations MUST have all columns given in this
@@ -379,12 +395,53 @@ \section{Spectral Line Data}\label{sect:quantities}
379395
380396\end {itemize }
381397
398+ \section {Species Table }\label {sect:speciestable }
399+ \label {ref:speciestable }
400+
401+ The species table is used to facilitate the referencing of molecules. As
402+ there are many summary formulas and colloquial molecule names for common
403+ species (and more than one species may correspond to a given summary
404+ formula and even colloquial name), the resolution of such identifiers to
405+ InChIs is generally non-trivial.
382406
383- \section {Protocol }
384- \label {sect:protocol }
385- \subsection {Queries: LineTAP }
407+ LineTAP's species table maps common names and summary formulas to
408+ InChIs. Data providers publishing molecule data should populate it
409+ to the best of their knowledge. It is
410+ explicitly possible to associate multiple names with a single InChI.
411+ There is no explicit relationship between a species table and LineTAP
412+ tables on a given service, i.e., the presence of a species in the
413+ species table is not a guarantee that data on it is available from any
414+ table in the service.
415+
416+ In most cases, the InChIKey alone is enough to reference a molecule. The InChI
417+ column is present in this table so that users can confirm that the
418+ returned molecule is the one they are searching for.
419+
420+ \begin {table }[hpt]
421+ \hskip -0.05\linewidth
422+ \begin {tabular }{p{0.43\linewidth }cp{0.5\linewidth }}
423+ \sptablerule
424+ \textbf {Name [Unit] } \ucd {UCD}&\textbf {Type }&\textbf {Description }\\
425+ \sptablerule
426+ % GENERATED: python3 make-species-table.py
427+ \texttt{inchikey} \hfil\break\ucd{} & text & \raggedright InChIKey of this species\tabularnewline
428+ \rowsep
429+ \texttt{inchi} \hfil\break\ucd{} & text & \raggedright InChI of this species\tabularnewline
430+ \rowsep
431+ \texttt{name} \hfil\break\ucd{} & text & \raggedright A common name of this species\tabularnewline
432+ \rowsep
433+ \texttt{formula} \hfil\break\ucd{} & text & \raggedright Chemical formula of this species in some free-ish notation\tabularnewline
434+ \rowsep
435+ \texttt{source\_id} \hfil\break\ucd{} & text & \raggedright VAMDC identifier of the origin of this mapping\tabularnewline
386436
387- \subsection {User-defined functions }
437+ % /GENERATED
438+ \sptablerule
439+ \end {tabular }
440+ \caption {The columns that make up the Species Table. }
441+ \label {tab:spcols }
442+ \end {table }
443+
444+ \section {ADQL User-defined functions }
388445\label {sect:udfs }
389446
390447LineTAP services MUST implement the \texttt{ivo\_specconv} user defined
@@ -541,6 +598,24 @@ \subsubsection{Characterising a Service's Data Holdings}
541598GROUP BY inchi
542599\end {lstlisting }
543600
601+ \subsubsection {Searching With Trivial Molecule Names }
602+
603+ Searching with trivial names as discussed in use
604+ case~\ref{uc:resolution} will often be a two-step process in which clients
605+ ask the researcher which InChI corresponds to the species they
606+ are looking for. In simple cases, however, a single joined query
607+ suffices.
608+
609+ % please-run-a-test
610+ \begin{lstlisting}[language=SQL]
611+ SELECT
612+ *
613+ FROM casa_lines.line_tap
614+ JOIN species.main AS s USING (inchikey)
615+ WHERE s.name='Methylidynium'
616+ \end{lstlisting}
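+
+ The two-step variant can be sketched as follows. This is only an
+ illustration: it assumes the same table names as the joined query, and
+ the InChIKey in the second query is a placeholder for whatever the
+ researcher selects. First, the client lists the candidate species for a
+ trivial name:
+
+ \begin{lstlisting}[language=SQL]
+ SELECT inchikey, inchi, name, formula
+ FROM species.main
+ WHERE name='Methyl Mercaptan'
+ \end{lstlisting}
+
+ Once the researcher has picked one of the returned rows, the client
+ queries the line table with the chosen InChIKey:
+
+ \begin{lstlisting}[language=SQL]
+ SELECT *
+ FROM casa_lines.line_tap
+ WHERE inchikey='<InChIKey chosen by the researcher>'
+ \end{lstlisting}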
617+
618+
544619\section {Mapping from VAMDCXSAMS }
545620\label {sect:mapping }
546621
@@ -665,16 +740,13 @@ \section{LineTAP and the VO Registry}
665740
666741\subsection {Registering LineTAP-conforming Tables }
667742
668- LineTAP tables are registered using VODataService \citep {2021ivoa.spec.1102D }
743+ LineTAP line tables are registered using VODataService \citep {2021ivoa.spec.1102D }
669744tablesets, where the table utype is set to
670- $$\hbox{\verb|ivo://ivoa.net/std/linetap#table-1.0|}.$$
745+ $$\hbox{\verb|ivo://ivoa.net/std/linetap#lines-1.0|}.$$
671746
672- The tableset is normally contained in a VODataService \xmlel {CatalogService}
673- record with a TAP capability, and this capability normally is an auxiliary
674- capability as per DDC \citep {2019ivoa.spec.0520D }. For one-table
675- services a full TAPRegExt \citep {2012ivoa.spec.0827D } capability is also
676- allowed; other resource types can be used for registration as
677- appropriate.
747+ The tableset is contained in a VODataService \xmlel {CatalogResource}
748+ record with a TAP auxiliary capability
749+ as per DDC \citep {2019ivoa.spec.0520D }.
678750
679751Further capabilities, for instance for full VAMDC or legacy SLAP
680752services, may be given in the same record.
@@ -714,7 +786,7 @@ \subsection{Registering LineTAP-conforming Tables}
714786 <name>toss.ivoa_lines</name>
715787 <title>TOSS</title>
716788 <description> The LineTAP version of...</description>
717- <utype>ivo://ivoa.net/std/linetap#table-1.0</utype>
789+ <utype>ivo://ivoa.net/std/linetap#lines-1.0</utype>
718790 ...
719791</table>
720792\end {lstlisting }
@@ -726,6 +798,12 @@ \subsection{Registering LineTAP-conforming Tables}
726798and is thus to be expected in most registrations of this type. Clients
727799are advised to use the resource description for full text searches.
728800
801+ Species tables are registered in exactly the same way, except that their
802+ utype is
803+ $$\hbox{\verb|ivo://ivoa.net/std/linetap#species-1.0|}.$$
804+ Data providers should only register line and species tables in one
805+ resource record if the species table really has the same metadata
806+ (description, author, source, etc.) as the line table.
729807
730808\subsection {Discovering LineTAP services }
731809
@@ -738,35 +816,34 @@ \subsection{Discovering LineTAP services}
738816would return TAP access URLs and the table names:
739817
740818\begin{lstlisting}[language=SQL]
741- SELECT DISTINCT table_name, access_url
819+ SELECT table_name, access_url
742820FROM rr.res_table
743821 NATURAL JOIN rr.capability
744822 NATURAL JOIN rr.interface
745823WHERE
746- table_utype LIKE 'ivo://ivoa.net/std/linetap#table-1.%'
824+ table_utype LIKE 'ivo://ivoa.net/std/linetap#lines-1.%'
747825 AND standard_id LIKE 'ivo://ivoa.net/std/tap%'
748826 AND intf_role='std'
827+ AND res_type='vs:catalogresource'
749828\end{lstlisting}
750829
751- The \texttt {DISTINCT } in the main query is a rough filter that removes
752- entries duplicated because their tables are registred both in the main
753- TAP record and in an auxiliary capability.
754-
755830The \texttt{LIKE} pattern in the utype match makes sure minor version
756831increments do not prevent service discovery; by IVOA versioning rules,
757832all LineTAP services of major version 1 can be operated by all LineTAP
758833clients of version 1. We do not constrain the version of the TAP
759834service. Clients may want to adapt the TAP discovery pattern to match
760835their specific needs.
761836
762-
837+ With the utype adapted, this query works analogously for species tables.
763838
764839\appendix
765- \section {Changes from Previous Versions }
840+ \section {Changes from WD-2023-03-23 }
766841
767- No previous versions yet.
768- % these would be subsections "Changes from v. WD-..."
769- % Use itemize environments.
842+ \begin {itemize }
843+ \item Added the species table.
844+ \item Changed the line table utype to \dots lines-1.0 (previously
845+ \dots table-1.0).
846+ \end {itemize }
770847
771848
772849\bibliography {ivoatex/ivoabib,ivoatex/docrepo, localrefs}