
Duplicates returned when listing a node's files #10086

Open
@aaronwolen


This was originally reported in ropensci/osfr#150 by @doomlab.

The reprex here shows that the same file, 421_Lu.pdf, is returned twice when listing files in the Local IRB directory within this project.

I've confirmed that the duplicate entries are coming from the OSF API, across different pages of results:

#!/usr/bin/env bash

set -e

TOKEN="$OSF_PAT"
NODE="ycn7z"
ID="6113d75ae3801305b39612a8"
LIMIT=2  # number of result pages to retrieve

# Retrieve name and path attributes from JSON response
JQ_FILTER='.data[].attributes | "\(.name) \(.path)"'

for i in $(seq 1 "$LIMIT"); do
  echo "Retrieving page $i"
  curl --silent \
    "https://api.osf.io/v2/nodes/$NODE/files/osfstorage/$ID/?page=$i" \
    -H "Authorization: Bearer $TOKEN" \
    -H 'Accept: application/vnd.api+json' \
    -H 'Content-Type: application/json; charset=utf-8' \
    | jq "$JQ_FILTER"
done

## Retrieving page 1
## "97_Pfuhl.pdf /6163f0e5fd5b230191983824"
## "1897_Parker.pdf /616440dfc5565801d34b71bf"
## "1698_Butt.pdf /616513bbc5565802014b9ae6"
## "1970_Pavlović.pdf /617436dae572ea00b13a7285"
## "1560_Irrazabal.pdf /618281a0a30f8100cdaa071d"
## "1867_Oner.pdf /6184db04bfb47d00a3ef50dd"
## "169_Montefinese.pdf /6186148c25f90a004a0f6aa6"
## "87_Vaughn.docx /619548800b0c1e01a27fdae5"
## "35_Stewart.pdf /6197ca37ef62980009f5c789"
## "421_Lu.pdf /6161fcd9fd5b2301429849b3"                  <-- copy 1
##
## Retrieving page 2
## "423_Arriaga.pdf /619d017da83c2001650e8e53"
## "761_Papadatou-Pastou.pdf /619df2886977cd010f496498"
## "712_Davis.pdf /61a7d30d4d4ce5018476e569"
## "1574_Al-Hoorie.pdf /61b89ac6da0b1b0488d05546"
## "206_Ergiyen.pdf /61cc42f3da632006e1fe6f4a"
## "437_Peker.pdf /61fc2630370e6c002bf3d6cc"
## "104_Stieger.pdf /620e3a2511da1c05cdf57647"
## "238_Martínez.pdf /620f7666d9b6cf0144b90449"
## "1052_Parzuchowski.pdf /6220fbccc064270378d90ce5"
## "421_Lu.pdf /6161fcd9fd5b2301429849b3"                  <-- copy 2

The WaterButler IDs (the path values) of the two entries are identical, so both refer to the same file rather than to two files that share a name. This looks like a bug.
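To check a listing for repeats without reading the output by eye, the paths can be piped through `sort` and `uniq -d`. The following is only a rough sketch: `find_duplicate_paths` and `list_pages` are hypothetical helper names, not part of OSF or osfr, and `list_pages` assumes `curl` and `jq` are installed and that `OSF_PAT` holds a valid token.

```shell
#!/usr/bin/env bash

# Print every path that occurs more than once on stdin.
# Input: one "name path" pair per line; the path is taken to be the last
# whitespace-separated field, so file names containing spaces still work.
find_duplicate_paths() {
  awk '{ print $NF }' | sort | uniq -d
}

# Hypothetical helper: fetch pages 1..N of a folder listing and emit
# "name path" lines. Assumes OSF_PAT is set in the environment.
list_pages() {
  local node="$1" id="$2" pages="$3" i
  for i in $(seq 1 "$pages"); do
    curl --silent \
      "https://api.osf.io/v2/nodes/$node/files/osfstorage/$id/?page=$i" \
      -H "Authorization: Bearer $OSF_PAT" \
      | jq -r '.data[].attributes | "\(.name) \(.path)"'
  done
}

# Usage (requires network access):
#   list_pages ycn7z 6113d75ae3801305b39612a8 2 | find_duplicate_paths
```

If the API still returns the two pages shown above, the only path printed should be `/6161fcd9fd5b2301429849b3`.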

Let me know if you need any more information.
