vcf_to_dataframe not able to extract columns that read_vcf method can #298

Open
philt31 opened this issue Jan 8, 2020 · 6 comments

@philt31

philt31 commented Jan 8, 2020

Extracting the same column with vcf_to_dataframe gives an error, as shown in the screenshot. Is this the correct behaviour (in which case, how do I extract the WIT column), or is it a bug?

Screenshot 2020-01-08 at 09 59 42

@hardingnj
Collaborator

Thanks for the report. Generally the Google group might be a better place for a quick response (users tend to respond quicker than devs!): https://groups.google.com/forum/#!forum/scikit-allel

I can't see anything obviously wrong in your code. It may be some confusion around column naming. Can you show the output of witty_df.head()?

@philt31
Author

philt31 commented Jan 9, 2020

Screenshot 2020-01-09 at 09 50 46

The output of witty_df.head() is shown above.

@alimanfoo
Contributor

Hi @philt31, apologies, vcf_to_dataframe does not support extracting calldata fields, only variants fields. Any calldata fields will be ignored. This should be added to the docstring.
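As a possible workaround in the meantime, a calldata field can be read with read_vcf and turned into a DataFrame by hand. The following is only a rough sketch: the helper names calldata_to_dataframe and load_calldata are made up for illustration, and it assumes the field (WIT in the report above) comes back as a 2D array of shape (variants, samples), which I have not checked for that file.

```python
import numpy as np
import pandas as pd


def calldata_to_dataframe(callset, field):
    """Build a variants x samples DataFrame from one 2D calldata array.

    `callset` is a dict shaped like the one returned by allel.read_vcf,
    with keys 'samples', 'variants/CHROM', 'variants/POS' and
    'calldata/<field>'. This helper is hypothetical, not part of scikit-allel.
    """
    values = np.asarray(callset[f"calldata/{field}"])
    if values.ndim != 2:
        # Assumption: one value per variant per sample (e.g. GQ, DP).
        raise ValueError(f"expected a 2D array for {field}, got {values.ndim}D")

    # Sample names may be bytes or str depending on how they were read.
    samples = [s.decode() if isinstance(s, bytes) else str(s)
               for s in callset["samples"]]

    # Index rows by (CHROM, POS) so each row can be traced back to a variant.
    index = pd.MultiIndex.from_arrays(
        [callset["variants/CHROM"], callset["variants/POS"]],
        names=["CHROM", "POS"],
    )
    return pd.DataFrame(values, index=index, columns=samples)


def load_calldata(path, field):
    """Read one calldata field from a VCF file and return it as a DataFrame."""
    import allel  # imported lazily so the helper above works without scikit-allel

    callset = allel.read_vcf(
        path,
        fields=["samples", "variants/CHROM", "variants/POS", f"calldata/{field}"],
    )
    return calldata_to_dataframe(callset, field)
```

Something like load_calldata("my.vcf", "WIT") would then give one row per variant and one column per sample, where "my.vcf" stands in for your own file.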

@nightscape

I just ran into this as well. @alimanfoo, has this just not been implemented yet, or is there a specific reason it cannot be done?

@katmaumue

katmaumue commented May 3, 2020

There seems to be an open PR for this.

@alimanfoo
Contributor

Hi all, just to say that PR #252 seems OK but is lacking tests, and I think we'd need some tests before it could be merged. It's also not completely clear that it handles all cases properly; in particular, there can be both 2D calldata arrays (e.g., GQ) and 3D calldata arrays (e.g., GT). Apologies, I can't work on this myself right now, but if this functionality is useful to folks here then I'd encourage someone to review PR #252 with the above in mind, and/or take PR #252 and develop it further.
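To illustrate the 2D versus 3D point, here is a rough sketch of how both shapes could be flattened into DataFrame columns. This is not what PR #252 does (I have not based it on that code); the function name flatten_calldata and the column naming scheme (field_sample for 2D, field_sample_k for 3D) are assumptions chosen only for the example.

```python
import numpy as np
import pandas as pd


def flatten_calldata(name, values, samples):
    """Flatten one calldata array into a DataFrame with one row per variant.

    2D input (variants, samples), e.g. GQ   -> columns '<name>_<sample>'
    3D input (variants, samples, k), e.g. GT -> columns '<name>_<sample>_<k>'
    The naming scheme is hypothetical, not taken from scikit-allel.
    """
    values = np.asarray(values)
    if values.ndim not in (2, 3):
        raise ValueError(f"unsupported calldata shape {values.shape} for {name}")
    if values.shape[1] != len(samples):
        raise ValueError("second dimension must match the number of samples")

    if values.ndim == 2:
        columns = [f"{name}_{s}" for s in samples]
        flat = values
    else:
        n_variants, n_samples, depth = values.shape
        # C-order reshape keeps each sample's values adjacent, which matches
        # the column order generated here (sample outer, k inner).
        columns = [f"{name}_{s}_{k + 1}" for s in samples for k in range(depth)]
        flat = values.reshape(n_variants, n_samples * depth)

    return pd.DataFrame(flat, columns=columns)


# Example with two variants and two samples.
gq = np.array([[30, 40], [50, 60]])                      # 2D, like GQ
gt = np.array([[[0, 1], [1, 1]], [[0, 0], [-1, -1]]])    # 3D, like GT
gq_df = flatten_calldata("GQ", gq, ["a", "b"])
gt_df = flatten_calldata("GT", gt, ["a", "b"])
```

Tests for a real implementation would want to cover at least these two shapes, plus missing calls (the -1 values in the GT example) and a sample count mismatch.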
