Description
I can't reproduce the TSM features provided on Google Drive using your pre-trained weights and the RU-LSTM code.
I've already read the relevant issues: #1 #6 #11
What am I missing?
My procedure
- Extract frames from a video:
ffmpeg -i recordings/nusar-2021_action_both_9063-c14a_9063_user_id_2021-02-17_101116/C10095_rgb.mp4 -r 30 frame_%10d.jpg
- Construct the TSM model without the shift operation and load your pre-trained weights.
- Replace the last dropout layer with nn.Identity(), as mentioned: setattr(model.base_model, model.base_model.last_layer_name, nn.Identity())
- Process input images through your modified transformer.
- Compute the L2 distance between the extracted and the provided features using np.linalg.norm().
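For reference, the comparison in the last step can be sketched roughly as below. This is a simplified illustration, not my exact script: the function name compare_features and the toy vectors are placeholders, and the relative error and cosine similarity are extra checks added here to help tell a small numerical drift apart from a real preprocessing mismatch.

```python
import numpy as np


def compare_features(extracted, provided):
    """Compare two feature vectors.

    Returns (l2, rel, cos): the L2 distance, the L2 distance relative to
    the norm of the provided vector, and the cosine similarity.
    """
    a = np.asarray(extracted, dtype=np.float64).ravel()
    b = np.asarray(provided, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")

    l2 = float(np.linalg.norm(a - b))          # absolute L2 distance
    rel = l2 / float(np.linalg.norm(b))        # scale-independent error
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return l2, rel, cos


# Toy placeholder vectors (not real TSM features).
l2, rel, cos = compare_features([0.0, 5.0], [3.0, 4.0])
print(f"L2={l2:.4f}  relative={rel:.4f}  cosine={cos:.4f}")
```

A relative error near zero and a cosine similarity near one would suggest the pipelines agree up to floating-point noise. A large relative error usually points to a difference in frame decoding, resizing, normalization, or channel order.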
Example results
The features are different 😢
nusar-2021_action_both_9011-a01_9011_user_id_2021-02-01_153724/C10095_rgb/C10095_rgb_0000001406.jpg: 7.957793712615967
nusar-2021_action_both_9011-a01_9011_user_id_2021-02-01_153724/C10095_rgb/C10095_rgb_0000001452.jpg: 8.35152816772461
nusar-2021_action_both_9011-a01_9011_user_id_2021-02-01_153724/C10095_rgb/C10095_rgb_0000003927.jpg: 13.770241737365723
nusar-2021_action_both_9011-a01_9011_user_id_2021-02-01_153724/C10095_rgb/C10095_rgb_0000003983.jpg: 8.230810165405273
nusar-2021_action_both_9011-a01_9011_user_id_2021-02-01_153724/C10095_rgb/C10095_rgb_0000006291.jpg: 7.16341495513916
Background
I'd like to make sure my feature extraction code is correct. I ran it with your pre-trained TSM and compared the extracted features with the ones you provide on Google Drive. If the two match, the code is correct, and I can then use it to extract features from other datasets. The situation is similar to #6.