BorutaPy selects different features in different iterations #121

@VEZcoding

First of all, thanks for the package; I've been using it a lot in my work.

I came across something strange. As I understand it, Boruta is meant to select every feature that is important.

I have a dataframe of 200 observations and 2000 features. If I shuffle the order of the features (columns) in the dataframe, Boruta (with a random forest classifier) returns a different set of important features.

Also, if I run Boruta on the 200 observations with only the first 1000 features, it selects a list of n important features. But if I add the other 1000 features to the mix, Boruta selects a different set, and the features it picked from the first 1000 are no longer among them.

So why does Boruta keep selecting different features if it should always select the best ones? How can the best features change when only the order of the columns changes?
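
For reference, here is a minimal sketch of how selections from two runs could be compared by feature name rather than by column position, so that a reshuffle alone is not mistaken for a different selection. The helper names (`selected_names`, `jaccard`), the column names, and the masks are all hypothetical placeholders: the masks only stand in for a selector's boolean mask (such as BorutaPy's `support_`) and are not real results.

```python
def selected_names(columns, support):
    """Return the set of feature names whose mask entry is True."""
    if len(columns) != len(support):
        raise ValueError("columns and support must have the same length")
    return {name for name, keep in zip(columns, support) if keep}


def jaccard(a, b):
    """Jaccard overlap of two sets; 1.0 means identical selections."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical example: the same four features in two column orders.
cols_run1 = ["f1", "f2", "f3", "f4"]
mask_run1 = [True, False, True, False]  # placeholder mask, not real output

cols_run2 = ["f3", "f4", "f1", "f2"]    # same features, shuffled order
mask_run2 = [True, False, True, False]  # placeholder mask, not real output

sel1 = selected_names(cols_run1, mask_run1)
sel2 = selected_names(cols_run2, mask_run2)
overlap = jaccard(sel1, sel2)

print(sorted(sel1), sorted(sel2), overlap)
```

An overlap well below 1.0 after mapping masks back to names would indicate that the selection itself changed, not just the positions of the selected columns.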
