PyDarshan job summary cleanup #910

@shanedsnyder

An in-person review of the current Darshan job summary reports at Argonne turned up the items below. It would be nice to address as many of them as we can before doing a release.

First some high-level things up for discussion:

  • drop the blue I/O performance estimate text box, and instead add a per-module table summarizing key details of the module (including total files/datasets(H5D)/variables(PNETCDF_VAR) accessed, total bytes read/written, and I/O performance estimate)
    • this way, each module section starts off with a high-level summary table that describes key details of its I/O, and we get rid of the awkward text object we are using now
  • disable DXT heatmap plot generation by default, print a warning if DXT data is available but HEATMAP data isn't, and add a CLI option to force DXT usage
    • for some logs, it's probably just too expensive to generate DXT heatmaps by default, so this protects against that
    • this would require us to either push a little on the CLI work in "WIP, ENH: rich-click CLI" #809 or use something simple temporarily?
  • after dust settles, potentially reorganize plots/tables in per-module sections?
    • nothing concrete to suggest just yet (Tyler -0.5 on this)
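
As a starting point for the "something simple temporarily" option, the DXT heatmap behavior described above could be sketched with plain `argparse`. This is only an illustration of one reading of the proposal: the `--force-dxt` flag name and the `choose_heatmap_source` helper are hypothetical and are not part of PyDarshan; the only names taken from Darshan are the `HEATMAP`, `DXT_POSIX`, and `DXT_MPIIO` module names.

```python
import argparse
import warnings

# Darshan module names that carry DXT trace data.
DXT_MODULES = {"DXT_POSIX", "DXT_MPIIO"}


def build_parser():
    """Minimal CLI sketch; the --force-dxt flag name is hypothetical."""
    parser = argparse.ArgumentParser(description="job summary sketch")
    parser.add_argument("log_path", help="path to a Darshan log file")
    parser.add_argument(
        "--force-dxt",
        action="store_true",
        help="generate heatmaps from DXT data (may be expensive)",
    )
    return parser


def choose_heatmap_source(modules, force_dxt=False):
    """Return "HEATMAP", "DXT", or None for a list of module names.

    Assumed policy: DXT is used only when explicitly forced; otherwise
    HEATMAP data is used when present, and a warning is emitted when
    only DXT data is available.
    """
    mods = set(modules)
    has_dxt = bool(mods & DXT_MODULES)
    if force_dxt and has_dxt:
        return "DXT"
    if "HEATMAP" in mods:
        return "HEATMAP"
    if has_dxt:
        warnings.warn(
            "DXT data is available but HEATMAP data is not; "
            "pass --force-dxt to generate heatmaps from DXT."
        )
    return None
```

With this policy, a log that only has DXT data produces no heatmap and a single warning unless the flag is passed, which keeps the expensive path opt-in.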

Smaller things:

  • Captions are hard to read (light colored text, also italicized)
  • Add "% runtime" y-axis label to right-side of I/O cost plot
  • Offset autolabeling text at 45 degrees in access size histogram plot and POSIX access pattern plot as is done in I/O op count graph
  • Don't autolabel 0 values, to further reduce clutter in the plots
  • If not too complicated, can we add more buffer to the top of these plots to avoid autolabels overlapping with the plot boundaries?
  • We should set the lower y-axis limit to 0 for the access histogram and op count plots to ensure no negative range is shown.
  • Erroneous "type" row in File Count Summary table
  • drop scientific notation entirely (partially addressed in MAINT: integer r/w vals for data access #812?), and rely on humanize to generate sensible human readable byte strings?
  • File access by category table font is really small
  • As pointed out in ENH: PyDarshan job summary print warning and bail on logs with no data #907, add the missing space to our warning string for logs with no data
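
The autolabel items above (45 degree offset, skipping 0 values, extra headroom, and a lower y-limit of 0) could be combined in a single helper along these lines. This is a rough matplotlib sketch, not the existing PyDarshan plotting code: the `autolabel` function, its `headroom` parameter, and the sample bin labels and counts are all made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt


def autolabel(ax, rects, rotation=45, headroom=0.2):
    """Label non-zero bars and leave room above the tallest one.

    `headroom` is a hypothetical knob: the fraction of the tallest
    bar's height to add as empty space at the top of the axes.
    """
    labels = []
    for rect in rects:
        height = rect.get_height()
        if height == 0:
            continue  # skip 0 values to reduce clutter
        labels.append(
            ax.annotate(
                f"{height:g}",
                xy=(rect.get_x() + rect.get_width() / 2, height),
                xytext=(0, 3),  # small vertical offset in points
                textcoords="offset points",
                ha="left",
                va="bottom",
                rotation=rotation,
            )
        )
    tallest = max((r.get_height() for r in rects), default=0)
    # Pin the lower limit at 0 and add a buffer above the tallest bar.
    top = tallest * (1 + headroom) if tallest > 0 else 1
    ax.set_ylim(bottom=0, top=top)
    return labels


# Made-up access size bins and counts, for illustration only.
fig, ax = plt.subplots()
counts = [120, 0, 45, 3]
bars = ax.bar(["0-100", "101-1K", "1K-10K", "10K-100K"], counts)
labels = autolabel(ax, bars)
```

Scaling the upper limit from the tallest bar is one simple way to get the buffer; a fixed 20% may still be too little for long rotated labels, so the value would need tuning against real reports.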
