wkliao
Collaborator

@wkliao wkliao commented Dec 14, 2022

Resolve the segmentation fault in the ncmpi_put_vars_int_all and ncmpi_get_vars_int_all wrappers. Below are the gdb traces.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f1a2aea2907 in ncmpi_put_vars_int_all (ncid=0, varid=0, start=0x19e08c0, count=0x198fbf0, stride=0x1991c00, 
    buf=0x7f1a25eec010) at ./darshan-pnetcdf-api.c:11151
11151	                    common_access_vals[1+i] = count[i];

#0  0x00007f2f784f393e in ncmpi_put_vars_int_all (ncid=0, varid=0, start=0x1c7b8c0, count=0x1c2abf0, stride=0x1c2cc00, 
    buf=0x7f2f7353d010) at ./darshan-pnetcdf-api.c:11153
11153	                        common_access_vals[1+i+PNETCDF_VAR_MAX_NDIMS] = stride[i];

#0  0x00007fa89f55307f in ncmpi_get_vars_int_all (ncid=0, varid=0, start=0xca38c0, count=0xc52bf0, stride=0xc54c00, 
    buf=0x7fa89a58b010) at ./darshan-pnetcdf-api.c:13080
13080	                    common_access_vals[1+i] = count[i];
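
All three traces fault on writes into common_access_vals indexed by the dimension counter i. As a minimal sketch of how such a write can overrun a fixed-size array, and one way to guard it, the C below keeps only the trailing dimensions of a high-dimensional variable. It is an illustration under stated assumptions, not the actual Darshan patch: MAX_NDIMS (assumed to be 5) stands in for PNETCDF_VAR_MAX_NDIMS, and record_access, vals, and start_ndx are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cap on recorded dimensions; stand-in for PNETCDF_VAR_MAX_NDIMS. */
#define MAX_NDIMS 5
/* Slot 0 holds the access size, followed by MAX_NDIMS counts and MAX_NDIMS strides. */
#define VALS_LEN (1 + 2 * MAX_NDIMS)

/*
 * Hypothetical helper: record the access size plus the count/stride values
 * of at most the last MAX_NDIMS dimensions. Indexing vals[1 + i] directly
 * with i up to ndims - 1 would write past the array when ndims > MAX_NDIMS,
 * so the dimension index is rebased by start_ndx first.
 */
void record_access(int64_t vals[VALS_LEN], int ndims,
                   const int64_t *count, const int64_t *stride,
                   int64_t access_size)
{
    int start_ndx = (ndims > MAX_NDIMS) ? ndims - MAX_NDIMS : 0;
    int i;

    memset(vals, 0, VALS_LEN * sizeof(vals[0]));
    vals[0] = access_size;

    for (i = start_ndx; i < ndims; i++) {
        int slot = i - start_ndx;          /* rebased index, 0 .. MAX_NDIMS-1 */
        assert(slot >= 0 && slot < MAX_NDIMS);
        vals[1 + slot] = count[i];
        if (stride != NULL)                /* vara-style calls pass no stride */
            vals[1 + slot + MAX_NDIMS] = stride[i];
    }
}
```

With a 7-dimensional variable, this sketch records dimensions 2 through 6 in slots 0 through 4 and never touches memory beyond vals[VALS_LEN - 1].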

@shanedsnyder

Is this example publicly available, so that I could use it to reproduce the fault? If not, do you know exactly what triggers it? For example, is it any variable with more than 5 dimensions?

@wkliao
Collaborator Author

wkliao commented Dec 14, 2022

The seg fault occurred when I built PnetCDF with the option --enable-burst_buffering,
which runs test/burst_buffer/highdim.c, a test that uses a variable with a large dimension size of 1024.

@shanedsnyder

Is there anything else special I need to do to trigger the seg fault? That test case currently runs fine for me on the Darshan main branch.

Are you running the test just via make check on your laptop? Or are you running the test directly with different parameters?

FYI, I'm using the PnetCDF main branch for testing. I could also try a specific version if that's what you're using.

@wkliao
Collaborator Author

wkliao commented Dec 14, 2022

I ran 'make check' and 'make ptest' on a local Red Hat server.
I suggest always using an official release of PnetCDF, as in #874.

@shanedsnyder

Confirmed on my end with the 1.12.3 release.

The proposed changes look right to me. I spot-checked a few examples that use the vara/vars interfaces, and the affected counters look as expected, so all looks good here. Thanks!

@shanedsnyder shanedsnyder merged commit 5cab3ef into darshan-hpc:main Dec 15, 2022
@shanedsnyder shanedsnyder added this to the 3.4.2 milestone Dec 15, 2022
@wkliao wkliao deleted the fix_start_ndx branch December 16, 2022 00:14