Commit b5db592

Implementation of InstitutionMetrics
1 parent 22fd32e commit b5db592

6 files changed

+757
-1
lines changed

docs/reference.rst

Lines changed: 1 addition & 0 deletions
@@ -81,6 +81,7 @@ SciVal
 :maxdepth: 1
 
 reference/scival/AuthorMetrics.rst
+reference/scival/InstitutionMetrics.rst
 reference/scival/PublicationLookup.rst
 
 
Lines changed: 301 additions & 0 deletions
@@ -0,0 +1,301 @@
+pybliometrics.scival.InstitutionMetrics
+=======================================
+
+`InstitutionMetrics()` implements the `SciVal Institution Metrics API <https://dev.elsevier.com/documentation/SciValInstitutionAPI.wadl>`_.
+
+It accepts one or more SciVal Institution IDs as the main argument and retrieves various performance metrics for the specified institutions.
+
+.. currentmodule:: pybliometrics.scival
+.. contents:: Table of Contents
+    :local:
+
+Documentation
+-------------
+
+.. autoclass:: InstitutionMetrics
+    :members:
+    :inherited-members:
+
+Examples
+--------
+
+You initialize the class with one or more SciVal Institution IDs. The argument can be a single ID, a list of IDs, or a comma-separated string of IDs.
+
+.. code-block:: python
+
+    >>> import pybliometrics
+    >>> from pybliometrics.scival import InstitutionMetrics
+    >>> pybliometrics.scival.init()
+    >>> institution_metrics = InstitutionMetrics("309021")
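The three accepted argument forms (single ID, list of IDs, comma-separated string) can be pictured with a small stand-in helper. This is only an illustrative sketch: `normalize_ids` is a hypothetical name that is not part of pybliometrics, and the library's own handling of the argument may differ.

```python
from typing import Iterable, Union


def normalize_ids(ids: Union[int, str, Iterable[Union[int, str]]]) -> str:
    """Hypothetical helper: reduce any accepted form to one comma-separated string."""
    if isinstance(ids, (int, str)):
        # A single ID, or a string that may already be comma-separated
        parts = str(ids).split(",")
    else:
        # A list (or any other iterable) of IDs
        parts = [str(i) for i in ids]
    # Strip stray whitespace and drop empty fragments
    return ",".join(p.strip() for p in parts if p.strip())


print(normalize_ids("309021"))
print(normalize_ids([309050, 309076]))
print(normalize_ids("309050, 309076"))
```

All three calls yield a plain comma-separated string, which is the shape one would expect to hand to a single API request.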
+
+You can obtain basic information just by printing the object:
+
+.. code-block:: python
+
+    >>> print(institution_metrics)
+    InstitutionMetrics for 1 institution(s):
+    - Humboldt University of Berlin (ID: 309021)
+
+Many properties expose different types of metrics. To list the institutions the object covers, use `institutions`:
+
+.. code-block:: python
+
+    >>> institution_metrics.institutions
+    [Institution(id=309021, name='Humboldt University of Berlin', uri='Institution/309021')]
+
+**Individual Metric Properties**
+
+Each metric property returns a list of `MetricData` namedtuples with the structure: `(entity_id, entity_name, metric, metric_type, year, value, percentage, threshold)` where `entity_id` and `entity_name` refer to the institution.
+
+.. code-block:: python
+
+    >>> institution_metrics.CitationCount
+    [MetricData(entity_id=309021, entity_name='Humboldt University of Berlin', metric='CitationCount',
+    metric_type=None, year='all', value=368527, percentage=None, threshold=None)]
+
+    >>> institution_metrics.CollaborationImpact
+    [MetricData(entity_id=309021, entity_name='Humboldt University of Berlin', metric='CollaborationImpact',
+    metric_type='Institutional collaboration', year='all', value=8.610204, percentage=None, threshold=None),
+    MetricData(entity_id=309021, entity_name='Humboldt University of Berlin', metric='CollaborationImpact',
+    metric_type='International collaboration', year='all', value=22.430689, percentage=None, threshold=None),
+    MetricData(entity_id=309021, entity_name='Humboldt University of Berlin', metric='CollaborationImpact',
+    metric_type='National collaboration', year='all', value=9.935493, percentage=None, threshold=None),
+    MetricData(entity_id=309021, entity_name='Humboldt University of Berlin', metric='CollaborationImpact',
+    metric_type='Single authorship', year='all', value=3.187361, percentage=None, threshold=None)]
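Because each property returns plain namedtuples, the results can be filtered and compared with ordinary Python. The sketch below does not call the API: it defines a local stand-in `MetricData` with the field layout described above and re-types the `CollaborationImpact` values from the output shown, so the stand-in type and the variable names are assumptions made for illustration.

```python
from collections import namedtuple

# Local stand-in with the documented field layout (not imported from pybliometrics)
MetricData = namedtuple(
    "MetricData",
    "entity_id entity_name metric metric_type year value percentage threshold",
)

name = "Humboldt University of Berlin"
# Values copied from the CollaborationImpact output shown above
collaboration_impact = [
    MetricData(309021, name, "CollaborationImpact", "Institutional collaboration", "all", 8.610204, None, None),
    MetricData(309021, name, "CollaborationImpact", "International collaboration", "all", 22.430689, None, None),
    MetricData(309021, name, "CollaborationImpact", "National collaboration", "all", 9.935493, None, None),
    MetricData(309021, name, "CollaborationImpact", "Single authorship", "all", 3.187361, None, None),
]

# Map each collaboration type to its value
impact_by_type = {m.metric_type: m.value for m in collaboration_impact}

# Pick the collaboration type with the highest impact
top = max(collaboration_impact, key=lambda m: m.value)
print(top.metric_type, top.value)
```

With real data, `collaboration_impact` would simply be `institution_metrics.CollaborationImpact`.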
+
+**Available Metric Properties**:
+
+- `AcademicCorporateCollaboration`
+- `AcademicCorporateCollaborationImpact`
+- `CitationCount`
+- `CitationsPerPublication`
+- `CitedPublications`
+- `Collaboration`
+- `CollaborationImpact`
+- `FieldWeightedCitationImpact`
+- `OutputsInTopCitationPercentiles`
+- `PublicationsInTopJournalPercentiles`
+- `ScholarlyOutput`
+
+.. note::
+    **Unified Data Structure**: InstitutionMetrics uses a unified `MetricData` structure with `entity_id` and `entity_name` fields. For institutions, these fields contain the institution ID and institution name respectively. This structure is compatible with `AuthorMetrics` and other SciVal metric classes, enabling consistent data analysis across different entity types.
+
+**Getting All Metrics at Once**
+
+You can retrieve all available metrics in a single list using the `all_metrics` property:
+
+.. code-block:: python
+
+    >>> all_data = institution_metrics.all_metrics
+    >>> len(all_data)
+    28
+    >>> # Convert to pandas DataFrame for analysis
+    >>> import pandas as pd
+    >>> df = pd.DataFrame(all_data)
+    >>> df.head()
+
+
+.. raw:: html
+
+    <div style="overflow-x:auto; border:1px solid #ddd; padding:10px;">
+    <style scoped>
+    .dataframe tbody tr th:only-of-type {
+    vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+    vertical-align: top;
+    }
+
+    .dataframe thead th {
+    text-align: right;
+    }
+    .dataframe{
+    font-size: 12px;
+    }
+    </style>
+    <table border="1" class="dataframe">
+    <thead>
+    <tr style="text-align: right;">
+    <th></th>
+    <th>entity_id</th>
+    <th>entity_name</th>
+    <th>metric</th>
+    <th>metric_type</th>
+    <th>year</th>
+    <th>value</th>
+    <th>percentage</th>
+    <th>threshold</th>
+    </tr>
+    </thead>
+    <tbody>
+    <tr>
+    <th>0</th>
+    <td>309021</td>
+    <td>Humboldt University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>all</td>
+    <td>1015.000000</td>
+    <td>4.469594</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>1</th>
+    <td>309021</td>
+    <td>Humboldt University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>No academic-corporate collaboration</td>
+    <td>all</td>
+    <td>21694.000000</td>
+    <td>95.530410</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>2</th>
+    <td>309021</td>
+    <td>Humboldt University of Berlin</td>
+    <td>AcademicCorporateCollaborationImpact</td>
+    <td>Academic-corporate collaboration</td>
+    <td>all</td>
+    <td>59.104435</td>
+    <td>NaN</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>3</th>
+    <td>309021</td>
+    <td>Humboldt University of Berlin</td>
+    <td>AcademicCorporateCollaborationImpact</td>
+    <td>No academic-corporate collaboration</td>
+    <td>all</td>
+    <td>14.222181</td>
+    <td>NaN</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>4</th>
+    <td>309021</td>
+    <td>Humboldt University of Berlin</td>
+    <td>Collaboration</td>
+    <td>Institutional collaboration</td>
+    <td>all</td>
+    <td>980.000000</td>
+    <td>4.320000</td>
+    <td>NaN</td>
+    </tr>
+    </tbody>
+    </table>
+    </div>
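If pandas is not at hand, the flat list returned by `all_metrics` can also be grouped with the standard library. The following is a minimal sketch that assumes a local stand-in `MetricData` type and re-types the five rows shown in the table above; it does not query the API, and missing table cells are represented as `None` here.

```python
from collections import defaultdict, namedtuple

# Local stand-in with the documented field layout (not imported from pybliometrics)
MetricData = namedtuple(
    "MetricData",
    "entity_id entity_name metric metric_type year value percentage threshold",
)

name = "Humboldt University of Berlin"
# The five rows shown in the table above, re-typed as stand-in records
rows = [
    MetricData(309021, name, "AcademicCorporateCollaboration", "Academic-corporate collaboration", "all", 1015.0, 4.469594, None),
    MetricData(309021, name, "AcademicCorporateCollaboration", "No academic-corporate collaboration", "all", 21694.0, 95.530410, None),
    MetricData(309021, name, "AcademicCorporateCollaborationImpact", "Academic-corporate collaboration", "all", 59.104435, None, None),
    MetricData(309021, name, "AcademicCorporateCollaborationImpact", "No academic-corporate collaboration", "all", 14.222181, None, None),
    MetricData(309021, name, "Collaboration", "Institutional collaboration", "all", 980.0, 4.32, None),
]

# Group the flat list by metric name
grouped = defaultdict(list)
for row in rows:
    grouped[row.metric].append(row)

for metric, items in grouped.items():
    print(metric, len(items))
```

With real data, `rows` would be `institution_metrics.all_metrics`, and each group would hold every `metric_type` entry for that metric.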
+
+
+**Multiple Institutions**
+
+You can analyze several institutions at once and, by setting `by_year=True`, retrieve the metrics broken down by year:
+
+.. code-block:: python
+
+    >>> multi_institutions = InstitutionMetrics([309050, 309076], by_year=True)
+    >>> print(multi_institutions)
+    InstitutionMetrics for 2 institution(s):
+    - Technical University of Berlin (ID: 309050)
+    - Heidelberg University  (ID: 309076)
+    >>> # Get all metrics for all institutions
+    >>> df = pd.DataFrame(multi_institutions.all_metrics)
+    >>> df.head()
+
+.. raw:: html
+
+    <div style="overflow-x:auto; border:1px solid #ddd; padding:10px;">
+    <style scoped>
+    .dataframe tbody tr th:only-of-type {
+    vertical-align: middle;
+    }
+
+    .dataframe tbody tr th {
+    vertical-align: top;
+    }
+
+    .dataframe thead th {
+    text-align: right;
+    }
+    .dataframe{
+    font-size: 12px;
+    }
+    </style>
+    <table border="1" class="dataframe">
+    <thead>
+    <tr style="text-align: right;">
+    <th></th>
+    <th>entity_id</th>
+    <th>entity_name</th>
+    <th>metric</th>
+    <th>metric_type</th>
+    <th>year</th>
+    <th>value</th>
+    <th>percentage</th>
+    <th>threshold</th>
+    </tr>
+    </thead>
+    <tbody>
+    <tr>
+    <th>0</th>
+    <td>309050</td>
+    <td>Technical University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>2024</td>
+    <td>282.0</td>
+    <td>7.770736</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>1</th>
+    <td>309050</td>
+    <td>Technical University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>2020</td>
+    <td>285.0</td>
+    <td>7.740358</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>2</th>
+    <td>309050</td>
+    <td>Technical University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>2021</td>
+    <td>250.0</td>
+    <td>6.529120</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>3</th>
+    <td>309050</td>
+    <td>Technical University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>2022</td>
+    <td>249.0</td>
+    <td>6.709782</td>
+    <td>NaN</td>
+    </tr>
+    <tr>
+    <th>4</th>
+    <td>309050</td>
+    <td>Technical University of Berlin</td>
+    <td>AcademicCorporateCollaboration</td>
+    <td>Academic-corporate collaboration</td>
+    <td>2023</td>
+    <td>253.0</td>
+    <td>6.693122</td>
+    <td>NaN</td>
+    </tr>
+    </tbody>
+    </table>
+    </div>
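The rows in the table above are not in chronological order (2024 comes first), so a time series usually needs an explicit sort. The sketch below re-types those five rows into a local stand-in `MetricData` type and sorts them by year. Whether the library returns `year` as a string or an integer is not stated here, so the sort key converts with `int()` to cover both cases; that conversion is an assumption.

```python
from collections import namedtuple

# Local stand-in with the documented field layout (not imported from pybliometrics)
MetricData = namedtuple(
    "MetricData",
    "entity_id entity_name metric metric_type year value percentage threshold",
)

tu = "Technical University of Berlin"
metric = "AcademicCorporateCollaboration"
kind = "Academic-corporate collaboration"
# Rows copied from the table above, in the order shown there
rows = [
    MetricData(309050, tu, metric, kind, "2024", 282.0, 7.770736, None),
    MetricData(309050, tu, metric, kind, "2020", 285.0, 7.740358, None),
    MetricData(309050, tu, metric, kind, "2021", 250.0, 6.529120, None),
    MetricData(309050, tu, metric, kind, "2022", 249.0, 6.709782, None),
    MetricData(309050, tu, metric, kind, "2023", 253.0, 6.693122, None),
]

# Sort chronologically; int() accepts both "2020" and 2020
series = sorted(rows, key=lambda r: int(r.year))
years = [int(r.year) for r in series]

for r in series:
    print(r.year, r.value, r.percentage)
```

With a DataFrame, the equivalent step would be a sort on the `year` column before plotting or differencing.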
+
+
+Downloaded results are cached to speed up subsequent analyses, so the cached information may become outdated. To refresh cached results, set `refresh=True`, or pass an integer, which is interpreted as the maximum allowed number of days since the last modification date. For example, to refresh all cached results older than 100 days, set `refresh=100`. Use `institution_metrics.get_cache_file_mdate()` to obtain the date of last modification, and `institution_metrics.get_cache_file_age()` to determine the number of days since the last modification.
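The refresh rule described above can be read as a small decision function. This is a hypothetical sketch of that rule, not pybliometrics code: `needs_refresh` and `age_days` are illustrative names, and the strict "older than" comparison is an assumption taken from the wording of the paragraph, so the library's exact boundary handling may differ.

```python
from typing import Union


def needs_refresh(refresh: Union[bool, int], age_days: float) -> bool:
    """Hypothetical reading of the documented rule (not the library's implementation)."""
    # bool is a subclass of int in Python, so it must be checked first
    if isinstance(refresh, bool):
        return refresh
    # Integer: refresh only when the cached file is older than `refresh` days
    return age_days > refresh


print(needs_refresh(True, 3))     # refresh requested unconditionally
print(needs_refresh(100, 150))    # cache older than the 100-day limit
print(needs_refresh(100, 20))     # cache still recent enough
```

In practice, `age_days` would correspond to the value reported by `institution_metrics.get_cache_file_age()`.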

pybliometrics/scival/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 from pybliometrics.utils import *
 
 from pybliometrics.scival.author_metrics import *
+from pybliometrics.scival.institution_metrics import *
 from pybliometrics.scival.publication_lookup import *
