@Bhoomika2905 (Collaborator) commented Nov 11, 2025

Description

This PR adds comprehensive support for violin plots in py-maidr, enabling automatic registration of violin plots created with seaborn for MAIDR's accessibility pipeline. Violin plots are now registered with both BOX and SMOOTH (KDE) layers, providing full accessibility features including highlighting, navigation, and data extraction.

Type of Change

  • Bug fix
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Pull Request

Description

  • Automatic Registration: Violin plots created with seaborn.violinplot() are automatically registered with MAIDR when the module is imported
  • Dual Layer Support: Each violin plot registers two layers:
    • BOX Layer: Box plot statistics (min, max, median, Q1, Q3) calculated from raw data
    • SMOOTH Layer: KDE (density distribution) curves for the violin shapes
  • Group/Category Handling: Properly handles single and multiple violin plots with group names
  • Element Highlighting: Box plot elements (min, max, median, IQR) can be highlighted for accessibility
  • Density Calculation: For SMOOTH layers, calculates density as horizontal width at each y-value
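The "density as horizontal width" idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `density_by_height` is a hypothetical helper, and it assumes the violin outline is available as a list of (x, y) vertices that mirror each other around the category position. The two sample vertices are copied from the HR curve in the payload posted later in this thread.

```python
from collections import defaultdict


def density_by_height(vertices, ndigits=9):
    """Map each y-value to the horizontal width of the outline at that height.

    Hypothetical helper: vertices are (x, y) pairs on the violin outline.
    y-values are rounded so mirrored left/right vertices fall in one bucket.
    """
    xs_at_y = defaultdict(list)
    for x, y in vertices:
        xs_at_y[round(y, ndigits)].append(x)
    # Width = rightmost x minus leftmost x among vertices sharing a height.
    return {y: max(xs) - min(xs) for y, xs in sorted(xs_at_y.items())}


# Mirrored pair of vertices for the "HR" violin (category position x = 0).
outline = [
    (0.0014108565360845687, 53.28579084694925),
    (-0.0014108565360845687, 53.28579084694925),
]
widths = density_by_height(outline)
```

For this pair the width equals the `density` value recorded for the same point in the payload, which suggests density there is twice the distance from the violin's center line.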

Changes Made

New Files

  • maidr/patch/violinplot.py: Complete implementation of violin plot support
    • ViolinBoxPlot class: Custom plot class for violin/box plots
    • calculate_box_stats_from_violin_data(): Calculates box plot statistics from violin data
    • create_violin_box_elements(): Creates box plot elements (lines, rectangles) for highlighting
    • create_violin_box_data(): Creates MAIDR-compatible box plot data structure
    • sns_violin(): Wrapper function that patches seaborn.violinplot()
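As a rough illustration of what computing box statistics from raw data can look like, here is a standard-library sketch. The function name `box_stats` is hypothetical, and the choices below are assumptions rather than the PR's behavior: min and max are taken as the data extremes, quartiles use linear interpolation, and the outlier lists are left empty. The returned keys follow the field names visible in the BOX layer payload later in this thread.

```python
import statistics


def box_stats(values, fill):
    """Return a five-number summary for one group (hypothetical helper).

    Assumes at least two observations; statistics.quantiles requires that.
    """
    data = sorted(values)
    # "inclusive" treats the data as the full population and interpolates
    # linearly between neighboring observations.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "fill": fill,
        "min": data[0],
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "max": data[-1],
        "lowerOutliers": [],  # assumption: no outlier detection in this sketch
        "upperOutliers": [],
    }


# Made-up sample values, only to exercise the function.
stats = box_stats([60.0, 66.0, 69.0, 72.0, 79.0], fill="HR")
```

If whiskers were meant to stop at 1.5 times the IQR, `min` and `max` would need to be clipped and the outlier lists filled in; the description above does not say which convention is used.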

Modified Files

  • maidr/core/plot/regplot.py: Enhanced SmoothPlot class to support violin plots
    • Added violin_fill parameter for category/group names
    • Added density calculation for PolyCollection (violin) plots

Screenshots (if applicable)


Checklist

  • I have read the Contributor Guidelines.
  • I have performed a self-review of my own code and ensured it follows the project's coding standards.
  • I have tested the changes locally following ManualTestingProcess.md, and all tests related to this pull request pass.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation, if applicable.
  • I have added appropriate unit tests, if applicable.

Additional Notes

@Bhoomika2905 Bhoomika2905 marked this pull request as draft November 11, 2025 07:27
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds support for violin plots with embedded box plot statistics in the MAIDR system. It enables both KDE (violin shape) and box plot layers to be extracted from seaborn violin plots, making them accessible for visualization.

  • Implements ViolinBoxPlot class for rendering box plot statistics within violin plots
  • Adds automatic box plot statistics calculation from raw violin data
  • Registers both SMOOTH (KDE) and BOX layers for violin plots with proper element tracking

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 18 comments.

File: Description
  • maidr/patch/violinplot.py: New file implementing violin plot patching with box plot extraction, element creation, and KDE registration
  • maidr/patch/__init__.py: Adds violinplot module to patch imports
  • maidr/core/plot/regplot.py: Extends SmoothPlot to support violin density calculations and category naming
Comments suppressed due to low confidence (4)

maidr/patch/violinplot.py:241

  • Variable _whisker_width is not used.
        _whisker_width = 0.1  # kept for potential future customization

maidr/patch/violinplot.py:275

  • Variable _whisker_height is not used.
        _whisker_height = 0.1  # kept for potential future customization

maidr/patch/violinplot.py:478

  • Variable _sorted_tick_positions is not used.
                _sorted_tick_positions = [pos for pos, _ in position_group_pairs]

maidr/patch/violinplot.py:483

  • Variable _group_positions is not used.
                _group_positions = {group: idx for idx, group in enumerate(unique_groups)}

…cumentation and examples for better usability
@Bhoomika2905 Bhoomika2905 requested a review from nk1408 November 16, 2025 19:46
@Bhoomika2905 Bhoomika2905 changed the title from "Feat boxplot in violinplot part three" to "feat: Implement violin plot support with dual-layer registration" Nov 16, 2025
@nk1408 (Collaborator) commented Nov 18, 2025

@Bhoomika2905, could you please address these issues?

Layer separation violation: ViolinBoxPlot is in maidr/patch/violinplot.py but should be in maidr/core/plot/. Patches should contain only wrapper functions.

Factory pattern bypass: The code calls FigureManager._get_maidr() directly and appends plots manually. It should use FigureManager.create_maidr() like all other patches.

Missing factory registration: ViolinBoxPlot is not registered in MaidrPlotFactory.

Looking at the create_violin_box_elements() function (lines 632-638): the box plot lines and rectangles are not part of the original seaborn.violinplot() output. They are added afterward by manipulating the axes directly. Ideally we shouldn't add these; we should instead extract the selectors from the box plot that is already rendered.

The violin patch should:

  • Detect both BOX and SMOOTH layers (it already does this)
  • Use a dedicated class for each:
    • ViolinBoxPlot for box statistics
    • ViolinDensityPlot for density boundaries (not SmoothPlot)
  • Register both via the factory pattern
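The shape of this request can be sketched in isolation. Everything below is a hypothetical stand-in: `PlotFactory` is not MaidrPlotFactory, the registry keys are invented, and the two plot classes are empty placeholders. It only illustrates a thin patch wrapper asking a factory for both layers instead of constructing and appending plots itself.

```python
from typing import Callable, Dict


class ViolinBoxPlot:
    """Placeholder for the box-statistics layer of a violin plot."""

    layer_type = "box"

    def __init__(self, ax):
        self.ax = ax


class ViolinDensityPlot:
    """Placeholder for the density-boundary layer of a violin plot."""

    layer_type = "smooth"

    def __init__(self, ax):
        self.ax = ax


class PlotFactory:
    """Hypothetical registry-based factory (not the real MaidrPlotFactory)."""

    _registry: Dict[str, Callable] = {}

    @classmethod
    def register(cls, key, plot_cls):
        cls._registry[key] = plot_cls

    @classmethod
    def create(cls, key, ax):
        if key not in cls._registry:
            raise ValueError(f"no plot class registered for {key!r}")
        return cls._registry[key](ax)


# Registration lives next to the core plot classes, not in the patch module.
PlotFactory.register("violin_box", ViolinBoxPlot)
PlotFactory.register("violin_density", ViolinDensityPlot)


def register_violin_layers(ax):
    """What a patch wrapper would reduce to: request both layers by key."""
    return [PlotFactory.create(key, ax) for key in ("violin_box", "violin_density")]


layers = register_violin_layers(ax=object())  # object() stands in for an Axes
```

With this split, the patch module holds only the wrapper, and adding or swapping a layer class is a change to the registry rather than to the wrapper.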

@nk1408 (Collaborator) commented Nov 18, 2025

{
  "id": "97d64303-3c2d-442f-9021-a706b62d65db",
  "subplots": [
    [
      {
        "id": "bf39d757-8eaf-4116-996d-7049248f3432",
        "layers": [
          {
            "id": "0c4fb753-8ced-4a60-802b-2f1b8a6a9dc4",
            "type": "box",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              {
                "fill": "HR",
                "min": 60.06215542699554,
                "q1": 66.99547164751773,
                "q2": 69.36521854110143,
                "q3": 72.02976026006031,
                "max": 79.26139092254469,
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "fill": "Finance",
                "min": 61.56860149290671,
                "q1": 69.36037633789698,
                "q2": 75.58875018962785,
                "q3": 78.76719313363918,
                "max": 92.242694787397,
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "fill": "Sales",
                "min": 67.84914448005435,
                "q1": 76.06733875626571,
                "q2": 80.58617445391272,
                "q3": 84.22662468742521,
                "max": 93.88795140004105,
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "fill": "IT",
                "min": 56.50441710276077,
                "q1": 62.73192268592567,
                "q2": 65.20062946490535,
                "q3": 67.73600290559601,
                "max": 73.75921173287068,
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "fill": "Marketing",
                "min": 73.49039417632207,
                "q1": 80.54158181689796,
                "q2": 84.6205021580351,
                "q3": 88.25725049698185,
                "max": 95.30373962440993,
                "lowerOutliers": [],
                "upperOutliers": []
              }
            ],
            "orientation": "vert",
            "selectors": [
              {
                "min": "g[id='maidr-01f0e23a-aa72-4afb-9180-c19df6dea841'] > path",
                "max": "g[id='maidr-99473808-7c80-4a4e-8e3c-f41c06577dee'] > path",
                "q2": "g[id='maidr-e5c47a67-1636-4019-9bf4-71e9ee634f36'] > path",
                "iq": "g[id='maidr-d6baea4e-7ace-467b-b4e2-2f8af5696f77'] > path",
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "min": "g[id='maidr-860f0f69-d737-4cc4-ac9d-247c1b9f21b6'] > path",
                "max": "g[id='maidr-11e6d47b-d47e-47a9-a6e2-022991c678f5'] > path",
                "q2": "g[id='maidr-7324f000-7b59-4e87-bc71-8b449fc76563'] > path",
                "iq": "g[id='maidr-879bb0a7-58c4-4c2d-8bc6-f189465b8beb'] > path",
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "min": "g[id='maidr-7e470a9d-6d57-4377-a0a6-706680103fde'] > path",
                "max": "g[id='maidr-bb674111-833e-41ec-ba10-636f5c0f5e93'] > path",
                "q2": "g[id='maidr-de7895d2-fbdc-45f7-97c6-392f7e9704dd'] > path",
                "iq": "g[id='maidr-599716fa-62c7-4eee-a7f0-39c2cc7b74e0'] > path",
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "min": "g[id='maidr-a2879f30-ae0a-4593-8876-5943522b6b34'] > path",
                "max": "g[id='maidr-3047c442-d8e8-4978-abc5-ed57f113d209'] > path",
                "q2": "g[id='maidr-656a35d2-db89-490a-bca6-54c7b4bde10c'] > path",
                "iq": "g[id='maidr-94fbbad4-fe63-4fa3-a2e4-0c78e0e8cd0e'] > path",
                "lowerOutliers": [],
                "upperOutliers": []
              },
              {
                "min": "g[id='maidr-a0c3463b-ef34-426f-bc48-dde3943d3194'] > path",
                "max": "g[id='maidr-fbb9e193-031f-433f-8eac-9659b627f853'] > path",
                "q2": "g[id='maidr-b658ad26-7ff0-48f7-926d-f93dc9274477'] > path",
                "iq": "g[id='maidr-9123435c-cc6f-4bfb-890a-84069821d2ed'] > path",
                "lowerOutliers": [],
                "upperOutliers": []
              }
            ]
          },
          {
            "id": "39eb73d3-a4bc-46b8-9c7a-e9b9cfd79804",
            "type": "smooth",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              [
                {
                  "x": 0.0014108565360845687,
                  "y": 53.28579084694925,
                  "svg_x": 90.97659763843185,
                  "svg_y": 291.06,
                  "density": 0.0028217130721691374,
                  "fill": "HR"
                },
                {
                  "x": -0.0014108565360845687,
                  "y": 53.28579084694925,
                  "svg_x": 90.75140236156818,
                  "svg_y": 291.06,
                  "density": 0.0028217130721691374,
                  "fill": "HR"
                },
               
                {
                  "x": 0.002377663101203325,
                  "y": 82.87687455514673,
                  "svg_x": 91.05375653678085,
                  "svg_y": 155.72014800647446,
                  "density": 0.00475532620240665,
                  "fill": "HR"
                },
                {
                  "x": 0.002377663101203325,
                  "y": 82.87687455514673,
                  "svg_x": 91.05375653678085,
                  "svg_y": 155.72014800647446,
                  "density": 0.00475532620240665,
                  "fill": "HR"
                }
              ]
            ],
            "selectors": [
              "g[id='maidr-0f3bdd22-204e-4721-baa6-15034e80ff2f'] > defs > path"
            ]
          },
          {
            "id": "fc5fa96c-f171-441b-9f74-6cfb43cb3b87",
            "type": "smooth",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              [
               
                {
                  "x": 1.0016653731717322,
                  "y": 99.356458518933,
                  "svg_x": 170.8049101020896,
                  "svg_y": 80.3479691242306,
                  "density": 0.00333074634346453,
                  "fill": "Finance"
                },
                {
                  "x": 1.0016653731717322,
                  "y": 99.356458518933,
                  "svg_x": 170.8049101020896,
                  "svg_y": 80.3479691242306,
                  "density": 0.00333074634346453,
                  "fill": "Finance"
                }
              ]
            ],
            "selectors": [
              "g[id='maidr-456c079f-642b-4c08-92d5-f8cb57ea1d77'] > defs > path"
            ]
          },
          {
            "id": "8596471f-9b59-4481-8a6e-6f6544982f8d",
            "type": "smooth",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              [
                {
                  "x": 2.0014180550671354,
                  "y": 55.3724664749426,
                  "svg_x": 250.59317213879797,
                  "svg_y": 281.516234268538,
                  "density": 0.002836110134270653,
                  "fill": "Sales"
                },
                {
                  "x": 1.9985819449328648,
                  "y": 55.3724664749426,
                  "svg_x": 250.36682786120207,
                  "svg_y": 281.516234268538,
                  "density": 0.002836110134270653,
                  "fill": "Sales"
                },
                
                {
                  "x": 1.9985820961678409,
                  "y": 108.2963184285713,
                  "svg_x": 250.36683993096307,
                  "svg_y": 39.460000000000086,
                  "density": 0.002835807664318013,
                  "fill": "Sales"
                },
                {
                  "x": 2.001417903832159,
                  "y": 108.2963184285713,
                  "svg_x": 250.5931600690369,
                  "svg_y": 39.460000000000086,
                  "density": 0.002835807664318013,
                  "fill": "Sales"
                },
                {
                  "x": 2.001417903832159,
                  "y": 108.2963184285713,
                  "svg_x": 250.5931600690369,
                  "svg_y": 39.460000000000086,
                  "density": 0.002835807664318013,
                  "fill": "Sales"
                }
              ]
            ],
            "selectors": [
              "g[id='maidr-f726a959-849a-499f-9860-2d2db81664b6'] > defs > path"
            ]
          },
          {
            "id": "0cb5e2c2-dfd7-4d93-b14d-49bebeb4349f",
            "type": "smooth",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              [
                {
                  "x": 3.0015787545708026,
                  "y": 53.68868019580923,
                  "svg_x": 330.41399724478663,
                  "svg_y": 289.21731706948697,
                  "density": 0.003157509141605175,
                  "fill": "IT"
                },
                {
                  "x": 2.9984212454291974,
                  "y": 53.68868019580923,
                  "svg_x": 330.16200275521334,
                  "svg_y": 289.21731706948697,
                  "density": 0.003157509141605175,
                  "fill": "IT"
                },
                
                {
                  "x": 3.0054027152293,
                  "y": 76.34377421109482,
                  "svg_x": 330.71917989701996,
                  "svg_y": 185.6003930945546,
                  "density": 0.01080543045859983,
                  "fill": "IT"
                },
                {
                  "x": 2.996169469930023,
                  "y": 76.57494863982222,
                  "svg_x": 329.98229305617525,
                  "svg_y": 184.54307754378993,
                  "density": 0.007661060139954046,
                  "fill": "IT"
                },
                {
                  "x": 3.003830530069977,
                  "y": 76.57494863982222,
                  "svg_x": 330.5937069438247,
                  "svg_y": 184.54307754378993,
                  "density": 0.007661060139954046,
                  "fill": "IT"
                },
                {
                  "x": 3.003830530069977,
                  "y": 76.57494863982222,
                  "svg_x": 330.5937069438247,
                  "svg_y": 184.54307754378993,
                  "density": 0.007661060139954046,
                  "fill": "IT"
                }
              ]
            ],
            "selectors": [
              "g[id='maidr-1160e529-8e5d-4f29-bc1a-ce8533a41907'] > defs > path"
            ]
          },
          {
            "id": "c68f21a9-3257-4747-9e33-e53015abac5c",
            "type": "smooth",
            "title": "Performance Score Distribution Across 5 Departments",
            "axes": {
              "x": "Department",
              "y": "Performance Score"
            },
            "data": [
              [
                {
                  "x": 4.002416293338993,
                  "y": 69.25560748949077,
                  "svg_x": 410.2888395387984,
                  "svg_y": 218.01932862483974,
                  "density": 0.004832586677986139,
                  "fill": "Marketing"
                },
                {
                  "x": 3.997583706661007,
                  "y": 69.25560748949077,
                  "svg_x": 409.90316046120165,
                  "svg_y": 218.01932862483974,
                  "density": 0.004832586677986139,
                  "fill": "Marketing"
                },
                {
                  "x": 4.002416293338993,
                  "y": 69.25560748949077,
                  "svg_x": 410.2888395387984,
                  "svg_y": 218.01932862483974,
                  "density": 0.004832586677986139,
                  "fill": "Marketing"
                },
                {
                  "x": 4.001500768117909,
                  "y": 104.62919072910749,
                  "svg_x": 410.2157733019541,
                  "svg_y": 56.23223196625612,
                  "density": 0.003001536235817337,
                  "fill": "Marketing"
                }
              ]
            ],
            "selectors": [
              "g[id='maidr-43e1cc86-b696-4c0c-8cee-903a2f50a23d'] > defs > path"
            ]
          }
        ]
      }
    ]
  ]
}

@Bhoomika2905, I think there is an issue with the payload construction. I see only separate box and smooth layers, which may be the reason for the architecture violations. You might have to introduce a separate violin class that processes these together.
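One way to read "processing these together" is a single layer that pairs each group's box statistics with its density curve. The sketch below is only an assumption about what that could look like: `build_violin_layer` and the `"violin"` layer type are hypothetical and not part of any existing MAIDR schema. The sample values are trimmed from the payload above.

```python
def build_violin_layer(box_records, density_curves):
    """Pair box statistics and density points by their shared `fill` name.

    Hypothetical helper: box_records is a list of dicts with a "fill" key,
    density_curves is a list of point lists, one list per violin.
    """
    curves_by_fill = {}
    for curve in density_curves:
        if curve:  # skip empty curves, which carry no group name
            curves_by_fill[curve[0]["fill"]] = curve
    return {
        "type": "violin",  # hypothetical layer type, labeled as an assumption
        "data": [
            {
                "fill": box["fill"],
                "box": box,
                "density": curves_by_fill.get(box["fill"], []),
            }
            for box in box_records
        ],
    }


box_records = [
    {"fill": "HR", "min": 60.06215542699554, "q2": 69.36521854110143,
     "max": 79.26139092254469},
]
density_curves = [
    [{"y": 53.28579084694925, "density": 0.0028217130721691374, "fill": "HR"}],
]
layer = build_violin_layer(box_records, density_curves)
```

Keying on `fill` keeps the two data sources aligned even if the curves arrive in a different order than the box records.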
