Questionnaires
- download the ZIP file
The first task - filling out the entry questionnaire - requires attentively reading
- the data section of the article
- any documentation in the supplementary data zip, in particular the README
- check carefully if any data is provided in the ZIP archive - the presence of a ZIP archive does NOT imply that it contains data.
- When it does have data, read carefully what the README says BEFORE running any programs. If not all data are provided, distinguish, as we have tried to do so far, between "input data" (often not provided) and "analysis data" (sometimes provided).
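The check described above can be done without extracting or running anything. The following is a minimal sketch, assuming the supplementary archive is a standard ZIP file; the function name `summarize_archive` and the list of data-file extensions are illustrative assumptions, not part of the questionnaire, and the list is certainly not exhaustive.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed, non-exhaustive set of extensions that usually indicate data files.
DATA_EXTENSIONS = {".dta", ".csv", ".tsv", ".xls", ".xlsx", ".sas7bdat", ".rds", ".mat"}


def summarize_archive(zip_source):
    """Return (readme_files, data_files) listed in the archive.

    Only the archive's table of contents is read; nothing is extracted or run.
    """
    readmes, data_files = [], []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            path = PurePosixPath(name)
            if path.name.lower().startswith("readme"):
                readmes.append(name)
            elif path.suffix.lower() in DATA_EXTENSIONS:
                data_files.append(name)
    return readmes, data_files
```

An empty `data_files` list only suggests that no data is included; the README remains the authoritative description of what the archive contains.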
- First, we distinguish between Private data and Administrative data. Private data comes from a private source, while administrative data may come from a government institution (e.g., U.S. Census, BLS, etc.), an intergovernmental organization (e.g., OECD), or a similar institution.
- "Private, commercial" entity refers to a private company.
- "Private, other" may refer for example to a survey designed and administered by the researchers themselves, who also own the data.
- "Administrative, national" refers to data coming from one or more national sources (e.g., U.S. Census)
- "Administrative, regional" refers to data coming from a state, a province, etc.
- "Administrative, local" refers to data coming from a city, a school district, etc.
- A data curator with a well-defined, non-preferential data access policy would be classified under "formal access".
- If the author personally promises to provide access to the data and would engage a third party to provide access in a well-defined fashion, we classify the data access as "informal access, commitment".
- If the author simply promises to work with the replicator, without being able or willing to guarantee access, we classify the data access as "informal access, no commitment".
We saw a few DOIs reported as
doi://10.1257/app.20150234 or app.20160121
Both of those are wrong for the second field of the Entry Questionnaire. Please code them as
10.1257/app.20150234 and 10.1257/app.20160121
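As an illustration of the expected format, here is a minimal sketch of how the two faulty forms above could be normalized. The helper name `normalize_doi` is hypothetical, and the sketch assumes that a bare suffix such as app.20160121 belongs to the 10.1257 prefix seen in the examples.

```python
import re

# Prefix taken from the examples above; assumed to apply to bare suffixes.
AEA_PREFIX = "10.1257"


def normalize_doi(raw):
    """Return a DOI in the plain 'prefix/suffix' form expected by the questionnaire."""
    value = raw.strip()
    # Drop scheme or resolver prefixes such as "doi://", "doi:" or "https://doi.org/".
    value = re.sub(
        r"^(https?://(dx\.)?doi\.org/|doi:(//)?)", "", value, flags=re.IGNORECASE
    )
    # A bare suffix lacks the registrant prefix, so prepend the assumed one.
    if not value.startswith("10."):
        value = f"{AEA_PREFIX}/{value}"
    return value
```

For example, `normalize_doi("doi://10.1257/app.20150234")` returns `10.1257/app.20150234`, and `normalize_doi("app.20160121")` returns `10.1257/app.20160121`.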
Q: If an article doesn't provide justification for why it doesn't include datasets, but you visit the database described in the article and it requires registration to access the database, then should we mark the datasets as missing data (no justification) or proprietary data in the entry questionnaire?
A: If you found the data, and the registration is trivial (i.e., after registration you can download the data), then the data is not missing, and the entry questionnaire should list the URL or DOI for the dataset.
- If the ZIP file does not contain ANY data, then "OnlineDataProvided = No"
- If the author provides a link, and you can download the data, then check the box "Other data download site provided" under "DataAbsence". You can later provide the URL for the input data in the section "Input Data Not Provided" (this URL will be DIFFERENT from that of the ZIP file on the AEA website).
- If the author simply names an institution or something similar, make some effort to see whether the data can be downloaded. If it is not obvious how, set "InputDataAvailability1 = No", make notes in InputDataOtherNotes1, and record your search efforts in the REPLICATION.txt.
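The three rules above can be summarized as a small decision sketch for the case where the ZIP file contains no data. The function name and the `InputDataURL` key are hypothetical; the other field names are those quoted above, and the real questionnaire may structure these answers differently.

```python
def entry_answers_without_zip_data(download_url=None, search_notes=""):
    """Suggest entry-questionnaire answers when the ZIP file contains no data.

    download_url: a link from which you actually managed to download the data.
    search_notes: what you tried when no working link could be found.
    """
    answers = {"OnlineDataProvided": "No"}
    if download_url:
        answers["DataAbsence"] = "Other data download site provided"
        # Hypothetical key: stands in for the URL field of "Input Data Not Provided".
        answers["InputDataURL"] = download_url
    else:
        answers["InputDataAvailability1"] = "No"
        answers["InputDataOtherNotes1"] = search_notes
    return answers
```

Whatever goes into `search_notes` should also be recorded in the REPLICATION.txt.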
Q: So I made a mistake with a past entry questionnaire, and realize that I based my answers on data for the wrong article. Can I just resubmit a new entry questionnaire?
A: Yes, please do resubmit, and let us know which DOI we should remove the faulty entry for.
Q: If we successfully run the code for an article, but some of the numbers don't match between the code-generated tables and the article tables, does that count as a full or partial replication?
A: This is a judgement call, to some extent. If the numbers don't match EXACTLY (i.e., there is some difference in the second or third decimal), then treat it as a "successful replication", but note the slight discrepancy in the comments of the EXIT questionnaire. If the numbers are really off, then you should mark it as a partial, or a non-replication, depending on how many tables are at fault (if some tables match, then it's a partial, if none match, then a non-replication).
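The rule of thumb in this answer can be sketched as follows. This is only an illustration: the function name, the input layout, and the tolerance of 0.01 are assumptions, and deciding what counts as "really off" remains a judgement call.

```python
def classify_replication(tables, tolerance=0.01):
    """Classify a replication from published vs. reproduced numbers.

    tables maps a table name to a list of (published, reproduced) pairs.
    tolerance is an assumed cutoff for a "slight" discrepancy.
    """
    if not tables:
        raise ValueError("at least one table is needed")
    matching = 0
    for pairs in tables.values():
        # A table matches when every number is within the tolerance.
        if all(abs(published - reproduced) <= tolerance for published, reproduced in pairs):
            matching += 1
    if matching == len(tables):
        return "successful"
    if matching > 0:
        return "partial"
    return "non-replication"
```

Even when the result is "successful", any slight discrepancies should still be noted in the comments of the EXIT questionnaire.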
Q: The author provides Analysis, Input and Temporary datasets. How should the Temporary datasets be classified? The README states that these are datasets created during the data cleaning process. Additionally, each of these categories contains many datasets, both as Excel and Stata files. How should I report this in the framework of the entry questionnaire?
A: Lump the datasets together as Analysis and Input, and mark both Stata and Excel as format. You can probably ignore the intermediate (Temporary) datasets: move them out of the way, and use them as a check that the data cleaning works (i.e., if you run the programs that read the Input and write the Intermediate, you should get the same datasets as the author provided). You do not need to describe the Temporary datasets in the Entry questionnaire.
Q: If the authors do not provide programs to replicate the appendix results but everything else can be replicated does that count as a full or partial replication?
A: If you can replicate the main article but not the appendix, we classify that as "full", but do make a note of it in the free notes at the end of the questionnaire.
Please follow this scale when answering this question:
- The article possesses all the desired features that ensure replicability. Datasets are provided and their use is public. The documentation and the program metadata are clear and complete. Negligible changes might be required to run the programs (e.g., path redirection).
- The article possesses most of the desired features that ensure replicability. Datasets are provided and their use is public. The documentation and the program metadata are present but the programs might need a few changes to run cleanly.
- The article replication may present some difficulties. Datasets are provided and their use is public, but the documentation is incomplete and unclear. Substantial changes might be needed to run the programs.
- The article replication may present substantial difficulties and/or additional steps are required to recover the datasets used by the authors. Datasets may not be provided but their use is public or available on request.
- The article is not replicable. The datasets are not provided and their access is private or restricted. The programs are not provided. The documentation is absent or incomprehensible.