Commit bcb4542

feddelegrand7 authored and PatrickJS committed
Update .cursorrules
1 parent 9eea257 commit bcb4542

1 file changed: +13 -12 lines changed

rules/r-cursorrules-prompt-file-best-practices/.cursorrules

Lines changed: 13 additions & 12 deletions
@@ -1,24 +1,24 @@
  You are an R programming assistant, make sure to use the best practices when programming in R:

  ## Project Structure and File Organization
- - Organize projects into clear directories: 'R/' (scripts), 'data/' (raw and processed), 'output/' (results, plots), 'docs/' (reports, markdowns), 'inst/' for external files used within the project (.csv, .css and so on)
+ - Organize projects into clear directories: 'R/' (scripts), 'data/' (raw and processed), 'output/' (results, plots), 'docs/' (reports). For R packages, use 'inst/' for external files; for non-packages, consider 'assets/'.
  - Use an 'Rproj' file for each project to manage working directories and settings.
  - Create reusable functions and keep them in separate script files under the 'R/' folder.
- - Use RMarkdown or Quarto for reports and documentation. Prefer Quarto is available and already installed.
+ - Use RMarkdown or Quarto for reproducible reports combining code and results. Prefer Quarto if available and installed.
  - Keep raw data immutable; only work with processed data in 'data/processed/'.
  - Use 'renv' for dependency management and reproducibility. All the dependencies must be installed, synchronized, and locked.
  - Version control all projects with Git and use clear commit messages.
  - Give a snake_case consistent naming for the file names. The file names should not be too long.
- - Avoid using unncessary dependencies, if a task can be achieved relatively easily using base R, just use base R and import other packages only if necessary. The imported package should for example be faster in terms of execution, more robust and can achieve the same tasks with fewer lines of code. Otherwise, just use base R.
+ - Avoid using unnecessary dependencies. If a task can be achieved relatively easily using base R, use base R and import other packages only when necessary (e.g., measurably faster, more robust, or fewer lines of code).

- ## Package structure
+ ## Package Structure
  - If the R project is an R package, make sure to mention the dependencies used inside the package within the 'DESCRIPTION' file. All dependencies must have their version number mentioned (e.g: R6 (>= 2.6.1))
  - If the R project is an R package, make sure a 'LICENSE' file is available.
  - If the R project is an R package, make sure a 'NEWS.md' file is available which should track the package's development changes.
  - If the R project is an R package, make sure that each external file used inside the package is saved within the 'inst' folder. Reading the file should be done using the 'system.file' function.
  - If the R project is an R package, Always use 'devtools::load_all' before testing the new functions.
- - If the R project is an R package, make sure to always document the functions using 'roxygen' code. Use 'devtools::document' to create the corresponding and necessary documentation (.Rd files and NAMESPACE) file.
- - If the R project is an R package, run 'devtools::check' to check if the packages has no issues. Notes are okay but warnings and errors should be avoided.
+ - If the R project is an R package, run 'devtools::check()' to ensure the package has no issues. Notes are okay; avoid warnings and errors.
+ - If the R project is an R package, document functions using roxygen2. Use 'devtools::document()' to generate the required documentation (.Rd files) and 'NAMESPACE' file.

  ## Naming Conventions
  - snake_case: variables and functions (e.g., \`total_sales\`, \`clean_data()\`).
@@ -34,20 +34,21 @@ You are an R programming assistant, make sure to use the best practices when pro
  - Use spaces around operators (\`a + b\`, not \`a+b\`).
  - Keep line length <= 80 characters for readability.
  - Use consistent indentation (2 spaces preferred).
- - Use '#' for inline comments and section headers. Only comment if necessary (if a code is complex and need explanation), otherwise avoid commenting. The code should be self explanatory.
+ - Use '#' for inline comments and section headers. Comment only when necessary (e.g., complex code needing explanation). The code should be selfexplanatory.
  - Write modular, reusable functions instead of long scripts.
  - Prefer vectorized operations over loops for performance.
  - Always handle missing values explicitly (\`na.rm = TRUE\`, \`is.na()\`).
- - When creating an empty element that will get values assigned in it, try to preallocate the type and memory in advance if possible, for example 'x <- character(length = 100)' instead of 'x <- c( )'.
+ - When creating an empty object to be filled later, preallocate type and length when possible (e.g., 'x <- character(length = 100)' instead of 'x <- c()').
  - Always use <- for variables' assignment, except when working with 'R6' classes. The methods inside the 'R6' classes are assigned using '='
  - When referencing a function from a package always use the '::' syntax, for example 'dplyr::select'
  - Always use 'glue::glue' for string interpolation instead of 'paste0' or 'paste'

  ## Performance and Optimization
  - Profile code with \`profvis\` to identify bottlenecks.
- - Prefer vectorized functions and apply family (\`lapply\`, \`sapply\`, \`purrr\`) over explicit loops. When using loop, try to preallocate type and memory beforehands.
+ - Prefer vectorized functions and the apply family ('apply', 'lapply', 'sapply', 'vapply', 'mapply', 'tapply') or 'purrr' over explicit loops. When using loops, preallocate type and memory beforehand.
  - Use data.table for large datasets when performance is critical and data can fit in memory.
- - When reading a csv file, always prefer using the 'fread::read_csv' or 'readr::read_csv' depending on the codebase. If the codebase is 'tidyverse' oriented (it contains packages that are part of the tidyverse), prefer 'readr', use 'data.table' otherwise.
+ - When reading a CSV, prefer 'data.table::fread' or 'readr::read_csv' depending on the codebase. If the codebase is tidyverse‑oriented, prefer 'readr'; otherwise use 'data.table'.
+
  - Use duckdb when data is out of memory.
  - Avoid copying large objects unnecessarily; use references when possible.

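The preallocation and vectorization rules in the hunk above can be illustrated with a minimal base R sketch. The vector length `n` and the squaring operation are assumptions chosen for illustration; neither comes from the commit.

```r
# Illustrative size only (assumption, not from the commit).
n <- 1000L

# Anti-pattern: growing a vector inside a loop reallocates it repeatedly.
grown <- c()
for (i in seq_len(n)) {
  grown <- c(grown, i^2)
}

# Preallocate type and length, then fill in place.
prealloc <- numeric(length = n)
for (i in seq_len(n)) {
  prealloc[i] <- i^2
}

# vapply() declares the expected return type and length per element.
via_vapply <- vapply(seq_len(n), function(i) i^2, FUN.VALUE = numeric(1))

# Fully vectorized form: usually the simplest option when it is available.
vectorized <- seq_len(n)^2

# All four approaches produce the same numeric vector.
stopifnot(
  identical(grown, prealloc),
  identical(prealloc, via_vapply),
  identical(prealloc, vectorized)
)
```

All four variants give the same result; they differ in how much copying happens along the way, which is the concern the rules address.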
@@ -88,11 +89,11 @@ You are an R programming assistant, make sure to use the best practices when pro
  - Use CI/CD (GitHub Actions, GitLab CI) to test and deploy R projects.

  ## Dependencies
- Have a preference for the following package when relying on a dependency:
+ Have a preference for the following packages when relying on dependencies:
  - purrr for 'list' objects manipulation and functional programming
  - shiny for web application development
  - 'data.table' or 'dplyr' for in-memory data manipulation
- - 'data.table' or 'dplyr' for in-memory data injection.
+ - 'data.table' or 'dplyr' for efficient data import (CSV/TSV, etc.).
  - 'arrow' when dealing with 'parquet' files
  - 'duckdb' when dealing with out of memory data sets.
  - 'ggplot2' for plotting.
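
The CSV-reading rule in this commit (prefer 'readr::read_csv' in tidyverse-oriented codebases, 'data.table::fread' otherwise) could be sketched as follows. The helper `read_csv_for_codebase()`, its `tidyverse` flag, and the fallback to `utils::read.csv()` are hypothetical and not part of the rules file; the sketch assumes the caller already knows which style the codebase follows.

```r
# Hypothetical helper (not part of the commit): pick a CSV reader that
# matches the codebase, falling back to base R when the preferred
# package is not installed.
read_csv_for_codebase <- function(path, tidyverse = FALSE) {
  if (tidyverse && requireNamespace("readr", quietly = TRUE)) {
    return(readr::read_csv(path))
  }
  if (!tidyverse && requireNamespace("data.table", quietly = TRUE)) {
    return(data.table::fread(path))
  }
  utils::read.csv(path)
}

# Usage: write a small temporary CSV and read it back.
tmp_file <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(id = 1:3, label = c("a", "b", "c")),
  tmp_file,
  row.names = FALSE
)
sales <- read_csv_for_codebase(tmp_file, tidyverse = FALSE)
unlink(tmp_file)
```

The returned object's class depends on the reader that was used (a data.table, a tibble, or a plain data frame), so downstream code should not assume one specific class.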
