[Feature Request]: Allow filtering empty "scale" entries #2493

Closed
richlv opened this issue Dec 27, 2023 · 12 comments

@richlv

richlv commented Dec 27, 2023

Description

JASP allows filtering data in many different ways (https://jasp-stats.org/2018/06/27/how-to-filter-your-data-in-jasp/), and in some cases it's as easy as clicking on entries (as for nominal data). Currently no "quick" filtering is available for "scale" data; filtering out records with empty fields could be useful.

Purpose

Easily filter out records that have missing "scale" data.

Use-case

A study has some participants dropping out. While they are kept in the original dataset, further analysis is to be performed with them excluded. This is relevant for analysis that does not directly involve the empty field.

Is your feature request related to a problem?

No response

Is your feature request related to a JASP module?

No response

Describe the solution you would like

When double-clicking the header of a "scale" item, show two entries, "empty" and "not empty", with checkmarks, similar to the "Label Editor" for ordinal and nominal items.

Describe alternatives that you have considered

Custom filtering formulas, separate dataset.

Additional context

While sometimes this can be achieved by applying a custom filter ("item">0), this is not as discoverable. It would also be more tricky for items that can be negative.

@tomtomme
Member

Can't you use "Preferences => Data => Missing value list"?

Our new beta even has this functionality per variable in the "edit data" mode.
See: https://jasp-stats.org/2023/12/29/introducing-jasp-0-18-2-beta-process-survival-analysis-and-more/

But I may misunderstand the problem you want to solve.

@richlv
Author

richlv commented Dec 29, 2023

"Missing value list" seems to be different, but maybe I'm not fully grasping it.
A more specific example:

  • There's a dataset that has demographic data like age, and some metric.
  • Metric is missing for some participants that dropped out.
  • I'd like to perform some analysis on demographic data without modifying the data file.

@tomtomme
Member

tomtomme commented Dec 30, 2023

I do not see why you would need to modify your data, if some cells are empty.
If some metric measurements are missing, JASP calculates the analysis for the rest. You don't have to do anything. Other variables like age should not be affected by the missing values in the other variables, except for multivariate analyses that need all those variables (where you would then need to impute those missing values, and that functionality is indeed missing in JASP).

About grasping the missing value list:
Consider the possibility that you want to code missing values in a special way instead of just leaving an empty cell. E.g.:

  • a participant who does not want to answer a question could be coded via "-99"
  • a measurement error because of a malfunctioning measurement device could be coded via "-88"
  • etc.

Those numbers are chosen so that they could not occur as real values under any circumstances. You would then put those numbers (-88, -99) in the aforementioned "missing values list" so that JASP can recognize them as values that should not be used when calculating the mean etc. Those values would still be counted as missing, e.g. in the frequency or descriptives table.
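
A rough sketch of the same idea in plain R, outside JASP: the sentinel codes are turned into `NA` so that they stop distorting statistics but can still be counted as missing. The vector `scores` and its values are made up for illustration, and this only mimics what the "missing values list" is described as doing above; it is not how JASP implements it.

```r
# Hypothetical measurements: -99 = refused to answer, -88 = device malfunction.
scores <- c(12, -99, 15, -88, 9)
missing_codes <- c(-88, -99)

# Recode the sentinel values to NA so they are excluded from statistics.
clean <- replace(scores, scores %in% missing_codes, NA)

mean(scores)               # naive mean, distorted by the sentinel codes
mean(clean, na.rm = TRUE)  # 12, the mean of the three valid measurements
sum(is.na(clean))          # 2, the number of values counted as missing
```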

With this being said, can you now be more specific about what is missing for your use case?

@richlv
Author

richlv commented Jan 1, 2024

Sorry if I'm misunderstanding, will try to rephrase this.
Let's say I have a dataset with age, gender and some experimental metric like time.
Some participants filled in the demographic data, but dropped out of the experiment - thus they are present in the dataset, but the "time" metric is empty.
These participants should be excluded from analysis, but kept in the dataset. If some descriptive analysis is performed that does not involve "time" (like showing age split by gender), the invalid entries are included (because data for all participants contains the demographic values).
While I can add a filter like "time > 0" in this case to achieve the desired result, it would be great to have something easier and more generic - some easy way to filter out entries where one column is empty.

@tomtomme
Member

tomtomme commented Jan 1, 2024

I think I get it now :D
But I have no idea how this could be made easier than just setting the filter. Or even deleting those persons completely in data edit mode.

Do you have something in mind? Could you make a mockup of the way this should work and look?

@JorisGoosen
Contributor

Since 0.18.2 you can have empty values per column, also in scalar.

I think this would do what you want right?

@JorisGoosen
Contributor

Or you can just do !is.na(colName) no?
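
To illustrate why `!is.na(colName)` is more generic than a threshold filter such as `time > 0`, here is a minimal sketch in plain R. The data frame `participants` and its columns are hypothetical, and `subset()` is used only to imitate a row filter outside JASP; inside JASP, only the logical expression itself would go into the R filter.

```r
# Hypothetical dataset: participants 3 and 5 dropped out, so `change` is empty (NA).
# `change` can legitimately be zero or negative.
participants <- data.frame(
  age    = c(23, 31, 45, 28, 52),
  gender = c("f", "m", "f", "m", "f"),
  change = c(1.5, -0.7, NA, 0, NA)
)

# A threshold filter also drops valid values that are zero or negative.
by_threshold <- subset(participants, change > 0)

# !is.na() keeps every row that has a value, whatever its sign.
completed <- subset(participants, !is.na(change))

nrow(by_threshold)   # 1
nrow(completed)      # 3
mean(completed$age)  # mean age of participants who completed the experiment
```

The demographic summary in the last line involves only `age`, yet it is restricted to the rows where `change` is present, which is the use case described earlier in the thread.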

@richlv
Author

richlv commented Jan 3, 2024

> I think I get it now :D But I do have no idea how this could be made easier than just setting the filter. Or even deleting those persons completely in data edit mode.
>
> Do you have something in mind? Could you make a mockup of the way this should work and look?

Deletion is pretty much keeping two files, if the raw data is to be preserved (if I understood this correctly).

As a user I wouldn't mind having something this simple ;)
image

The second thing I tried was the filter builder like so.
image

Tried to enter "" and NULL :)

Then I also tried this, but no luck either.
image

> Since 0.18.2 you can have empty values per column, also in scalar.
>
> I think this would do what you want right?

Sorry, not sure how this would work - is that the mapping for empty/NaN values? If so, it might not help in this case.

> Or you can just do !is.na(colName) no?

Oh neat, I tried this in the R filter, and it does seem to work as desired.
Would it be feasible to expose it for less advanced users like me in some click-through way?

@JorisGoosen
Contributor

JorisGoosen commented Jan 3, 2024

> Oh neat, I tried this in the R filter, and it does seem to work as desired.
> Would it be feasible to expose it for less advanced users like me in some click-through way?

Well, I think this could be a nice and easy feature to add.

We could add a "filter out missing values" checkbox to the "Missing values" tab of the columninfo editor.
This would then be for all columntypes.

@richlv
Author

richlv commented Jan 3, 2024

That sounds excellent. Perhaps the wording or a tooltip could be "Filter out <rows|entries> with missing values in this column", to avoid potential confusion.

@JorisGoosen
Contributor

Well that is a bit too long to fit there but it can go into a tooltip for sure

@richlv
Author

richlv commented May 12, 2025

It looks like a lot of great improvements went in, but a GUI-based method to filter empty values might have been missed - at least I cannot find anything like that in the latest JASP version.
As I stumbled upon a need to filter again (this time, to keep only the empty values), I had to search and eventually landed on this same issue, where I lifted the R filter syntax from :)
Created a followup issue at #3438, as I could not reopen this one - and also included the need to filter only for empty values.
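
For the inverse case mentioned here, keeping only the rows with empty values, dropping the negation should be enough in plain R. This is a sketch on a made-up data frame; that `is.na(colName)` behaves the same way in JASP's R filter is an assumption based on the `!is.na(colName)` result reported above.

```r
# Hypothetical dataset: participants 2 and 4 have no recorded time.
participants <- data.frame(
  id   = 1:4,
  time = c(12.4, NA, 9.8, NA)
)

# Keep only the rows where `time` is empty.
dropped_out <- subset(participants, is.na(time))

dropped_out$id  # 2 4
```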
