Add bandstructure support for qe #55

ndaelman-hu · 2025-04-28T15:49:35Z

Migrated here from nomad-coe/electronic-parsers#279

coveralls · 2025-04-29T13:55:56Z

Pull Request Test Coverage Report for Build 15002848715

Details

1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.004%) to 82.315%

Totals
Change from base Build 14262472577:	0.004%
Covered Lines:	3733
Relevant Lines:	4535

💛 - Coveralls

ndaelman-hu · 2025-04-29T14:39:16Z

Here are some notes on how this implementation should handle Fermi level searches based on discussions with Elena Molteni:

Band structure calculations typically come in (2 or) 3 steps:

1. PWSCF module (Self-consistent Calculation): self-consistent calculation at high, grid-like k-point sampling. Reports: Fermi-level (optional), highest occupied level (optional), no. valence electrons, no. valence orbitals.
2. PWSCF module (Band Structure Calculation): non-self-consistent calculation defined along the k-path of interest. Reports: eigenvalues (multiplicity applied) per k-point, k-point weights.
3. BANDS module: symmetry analysis of the (electronic) structure. Reports: symmetry group labels, symmetry labels + eigenvalues + multiplicities per orbital.

Match the calculation type based on step 3, and enrich the band structure with information from step 1.
Some form of band gap (with the highest occupied energy) is necessary for aligning / visualizing the band structure. Symmetry information is used to decompose the k-path into segments.

Regarding the level alignment, there is a cascade of sources to check:

1. "Fermi energy": printed when the smearing is defined in the input (https://www.quantum-espresso.org/faq/faq-self-consistency/#6.7).
2. "highest occupied level": printed when the no. bands is included in the input (https://www.quantum-espresso.org/faq/faq-self-consistency/#6.7).
3. Otherwise, QE estimates the necessary no. bands based on the "number of electrons" (always reported in step 1). This is either a) just the necessary no. of occupied bands for insulators, or b) the no. occupied bands + 20% (minimum of 4 additional bands) for metals (https://www.quantum-espresso.org/Doc/INPUT_PW.html#nbnd).

In the case of option 3, make sure to correct for the orbital occupation: "Note that in spin-polarized calculations the number of k-point, not the number of bands per k-point, is doubled".
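
A minimal sketch of this cascade, not the code in this PR. The regular expressions are modeled on typical pw.x output lines and should be treated as assumptions, as should the function name `resolve_reference_energy` and the layout of `eigenvalues` (one ascending list of energies in eV per k-point). The third branch assumes a non-spin-polarized insulator with doubly occupied bands, so it does not yet apply the spin-polarized correction mentioned above.

```python
import math
import re

# Assumed patterns, modeled on typical pw.x output lines.
FERMI_RE = re.compile(r"the Fermi energy is\s+(-?\d+\.\d+)\s+ev", re.IGNORECASE)
HOMO_RE = re.compile(
    r"highest occupied(?:, lowest unoccupied)? level \(ev\):\s+(-?\d+\.\d+)",
    re.IGNORECASE,
)
NELEC_RE = re.compile(r"number of electrons\s+=\s+(\d+\.\d+)")


def resolve_reference_energy(scf_output, eigenvalues=None):
    """Return (energy in eV, source label), or (None, None) if nothing applies."""
    # 1. Fermi energy, printed when smearing is defined.
    match = FERMI_RE.search(scf_output)
    if match:
        return float(match.group(1)), "fermi_energy"

    # 2. Highest occupied level, printed when the no. bands is set.
    match = HOMO_RE.search(scf_output)
    if match:
        return float(match.group(1)), "highest_occupied_level"

    # 3. Fall back to counting electrons (assumes doubly occupied bands).
    match = NELEC_RE.search(scf_output)
    if match and eigenvalues:
        n_occupied = math.ceil(float(match.group(1)) / 2)
        if 0 < n_occupied <= min(len(bands) for bands in eigenvalues):
            highest = max(bands[n_occupied - 1] for bands in eigenvalues)
            return highest, "electron_count"

    return None, None
```

Returning the source label alongside the energy lets the caller log which branch of the cascade was used, which helps when the alignment looks off.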

ndaelman-hu · 2025-04-29T14:44:59Z

I found an edge case in BANDS calculations (step 3), where none of the k-point information is displayed, reporting instead

                    xk=(   0.00000,   0.00000,   1.00000  )

     zone border point and non-symmorphic group 
     symmetry decomposition not available

In these cases, step 2 still proceeds as usual. An alternative strategy would be to parse the step 2 output instead.
Segments would then be determined via a) a symmetry analysis package (e.g. spglib, pymatgen), or b) by computing whether the points align. Option b would also cover cases where the analysis fails for reasons similar to those in QE BANDS. Note that a lack of symmetry labels would leave the band structure unlabelled.
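
Option b could look roughly like the sketch below. It is an illustration under assumptions, not part of this PR: `split_into_segments` is a hypothetical helper, k-points are plain coordinate tuples, and a new segment starts whenever the step direction between consecutive points changes. Discontinuous jumps in the k-path are not treated specially here.

```python
from math import sqrt


def _direction(start, end):
    """Unit vector from start to end, or None if the points coincide."""
    delta = [e - s for s, e in zip(start, end)]
    norm = sqrt(sum(c * c for c in delta))
    return None if norm == 0 else [c / norm for c in delta]


def split_into_segments(kpoints, tol=1e-6):
    """Group k-point indices into straight-line segments.

    Neighbouring segments share their corner point, so the last index of
    one segment equals the first index of the next.
    """
    if len(kpoints) < 2:
        return [list(range(len(kpoints)))] if kpoints else []

    segments = [[0, 1]]
    current = _direction(kpoints[0], kpoints[1])
    for i in range(2, len(kpoints)):
        step = _direction(kpoints[i - 1], kpoints[i])
        aligned = (
            current is not None
            and step is not None
            and all(abs(a - b) <= tol for a, b in zip(current, step))
        )
        if aligned:
            segments[-1].append(i)
        else:
            # Direction changed: open a new segment at the corner point.
            segments.append([i - 1, i])
            current = step
    return segments
```

For a path that runs along x and then turns along y, e.g. `[(0, 0, 0), (0.25, 0, 0), (0.5, 0, 0), (0.5, 0.25, 0), (0.5, 0.5, 0)]`, this yields two segments that share index 2.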

ndaelman-hu · 2025-04-29T19:54:03Z

@ladinesa Pls note that targeting BANDS rather than the Band Structure Calculation PWSCF means that an empty entry is created for the latter. This entry will bear a FAILURE status. Any recommendations on how to handle it?

Should I delete it from this parser?

ladinesa · 2025-04-29T20:52:16Z

I am not sure why the other entry is empty. Is this for the overall workflow?

ndaelman-hu · 2025-04-30T11:33:27Z

Pls take a look at my comment above. Step 2 remains mostly vacant, as I don't process it. It is picked up by the quantumespresso/parser.py. This is a consequence of splitting off the parsing of the band structure into workflows.

ladinesa · 2025-04-30T11:41:34Z

I do not see any problem if it is parsed by another parser. The important thing is that you are able to link it with what is parsed by the band parser in a separate workflow entry.

ndaelman-hu · 2025-04-30T11:53:29Z

It is picked up by the mainfile scan of the quantumespresso/parser.py. The entry itself is hardly populated.
Barring the edge cases of faulty symmetry analysis or minimalistic band structure workflows, there's no need for this file.

Let me be clear: our new QE parser should likely target step 2 and perform the symmetry analysis itself.
These edge cases only became clear towards the end of the current implementation that targets step 3. I do not consider that it merits refactoring here.

Ways to handle the empty entry:

1. Improve the mainfile scan. Idk how to, as the distinguishing line is a ways down.
2. Have the QE BANDS parser delete it. This presupposes that the QE parser always runs first.
3. Leave it be, maybe adding a warning.
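
For the first option, one possibility is to stream the file until the distinguishing line shows up instead of only inspecting a fixed header. The sketch below is a generic illustration under assumptions: `find_marker` is a hypothetical helper, the marker string is only an example, and this says nothing about how the existing mainfile scan is actually configured.

```python
def find_marker(lines, marker, max_lines=None):
    """Return the 1-based number of the first line containing marker, else None.

    `lines` can be any iterable of strings, such as an open file handle,
    so the input is read lazily and scanning stops at the first hit.
    """
    for line_number, line in enumerate(lines, start=1):
        if max_lines is not None and line_number > max_lines:
            break
        if marker in line:
            return line_number
    return None
```

With an open file this would be used as `find_marker(handle, "Band Structure Calculation", max_lines=5000)`. The `max_lines` cap keeps the cost bounded for large outputs that never contain the marker.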

ladinesa · 2025-04-30T13:22:02Z

Well, it is an entry matched by the parser. If it has no output, it does not matter, as the file contains no info. The other parser should not delete other entries.

ndaelman-hu · 2025-04-30T13:39:12Z

Largely, yes*. Then pls feel free to proceed with the review.

*: I think being specific here matters, so that's why I'm reiterating. The file in step 2 contains less information than the one in step 3. There are 2 exceptions to that statement:

1. The data producer skipped step 3.
2. The symmetry analysis failed for some k-points in step 3. Then their matching eigenvalues are not printed. These remain present in step 2, though.

ladinesa · 2025-04-30T15:40:49Z

workflowparsers/quantum_espresso_bands/parser.py

+        # Find PWSCF files and parse
+        try:
+            if (pwscf_file := self._find_files()) is not None:
+                self.pwscf_parser.mainfile = pwscf_file


This is already matched and parsed by the Quantum ESPRESSO parser for PWSCF; you simply have to load the entry if you need to access info from the archive. This means that the BANDS parser should be executed after PWSCF. Set level to 2.

I'm fine with this approach. I ended up going with a minimalistic parser here, since this is what the other QE workflow parsers did too.
I can apply this, if you're fine with deviating from the old approach.

The other QE workflow parsers also do not match and parse PWSCF files; look at the parsers init file. There is no deviation afaik. The only thing missing is connecting the entries to PWSCF. We should do this in PWSCF, so we do the child archive generation in only one place.

Should you or I see to adding this to the QE parser then?

I can also add it, no problem. Just do the bands parsing in this PR then.

That would be appreciated, as you have a better view on what you want and how to handle child archives.
I will take a look during the review or after then.

Just to be clear: I remove all population of run, except for bandstructures, correct? Or do I still copy over e.g. method sections?

Yes, just populate the band structure quantities. As for the method and system: if you do not see them in the BANDS mainfile, or if they are the same as in PWSCF, then do not parse them and simply link them with the PWSCF entry.

method and system are not in the BANDS file. I'll remove any setting of them here then.
I will only link properties like Fermi level, necessary for the band structure.

ladinesa · 2025-04-30T15:42:59Z

workflowparsers/quantum_espresso_bands/parser.py

+            )
+        self._process_data()
+
+    def _process_data(self):


just put this under parse function

ladinesa · 2025-04-30T15:43:46Z

workflowparsers/quantum_espresso_bands/parser.py

+
+        if self.pwscf_parser.results:
+            self._create_system_section(sec_run)
+        self._create_method_section(sec_run)


perhaps also the method

what do you mean here?

I was referring to doing the same as in the previous comment, i.e. simply linking the method section of the PWSCF entry. But if they are different, then you parse it separately.

ndaelman-hu · 2025-05-07T13:30:44Z

@ladinesa I'm setting the missing band structure information from the QE parser (after the workflow setup).
Is it possible to defer triggering of the result normalizer for the bands entry till after it has its information set?

ladinesa · 2025-05-07T13:38:18Z

There is a parser function, after_normalization, where you can call the results normalizer again. But no: since we are using the MatchingParserInterface wrapper, there is no way to inject it. You can write a wrapper function in MatchingParserInterface for this.

ndaelman added 3 commits April 28, 2025 14:37

- Add first template based on 276-add-bandstructure-support-for-qe …

ce97cfb

…on electronic parsers + Claude Sonnet 3.7 - Update main file discovery for all quantum espresso workflows

- Add missing _create_calculation_section method

8a8e59a

- Correct formatting issues

- Set up entry point

b73f4a1

- Correct unit mapping - TODO: fix remaining bugs

ndaelman-hu mentioned this pull request Apr 28, 2025

276 add bandstructure support for qe nomad-coe/electronic-parsers#279

Closed

ndaelman added 5 commits April 29, 2025 12:29

- Rename workflowparsers/quantum_espresso_bands/parsers.py to `work…

c6a9ab1

…flowparsers/quantum_espresso_bands/parser.py` - Correct double assignment energy units

Fix mypy errors

4689b2a

Apply ruff

cc435c4

Make customk type definition cross-version

d0e51ca

Change import package for TypeAlias

e27c5f9

ndaelman added 6 commits April 29, 2025 18:54

Make file scanning logic leaner

05bdeae

Repartition file scanning logic

c1a83b0

- Rewrite file handling

05657ec

- Remove superfluous file handlers

Add warning for failed symmetry analysis

b133f46

Apply ruff

9b7f4a3

Add README file

3bff816

ndaelman-hu requested a review from ladinesa April 29, 2025 19:49

ladinesa reviewed Apr 30, 2025

ndaelman added 4 commits May 5, 2025 13:40

Replace raise by return

93e17b2

- Remove PWSCF text parser

8d3b334

- Rename BANDS text parser to `MainfileParser`

Remove _find_files in favor of get_mainfile_keys

961eebe

- Add data extraction from SCF entry

d748e23

- Remove `_process_data` function header (not the code block) - TODO: incorporate `_extract_reference_energy`

- Remove superfluous get_mainfile_keys

6a8f46c

- Remove references to old NOMAD search