
Conversation

abinayagoudjandhyala
Contributor

@abinayagoudjandhyala abinayagoudjandhyala commented Sep 22, 2025

Pull Request for PyVerse 💡

Requesting to submit a pull request to the PyVerse repository.


Issue Title

Add Airbnb Data Analysis Project under datascience folder

  • I have provided the issue title.

Info about the Related Issue

What's the goal of the project?
To contribute an Airbnb Data Analysis Project under the datascience folder, providing a comprehensive analysis of Airbnb listing data using Tableau, Excel, and visual presentations. The aim is to highlight pricing trends, availability, and location patterns in the Airbnb rental market.

  • I have described the aim of the project.

Name

Please mention your name.
Abinaya Goud Jandhyala

  • I have provided my name.

GitHub ID

Please mention your GitHub ID.
abinayagoudjandhyala

  • I have provided my GitHub ID.

Email ID

Please mention your email ID for further communication.
[email protected]

  • I have provided my email ID.

Identify Yourself

Mention in which program you are contributing (e.g., WoB, GSSOC, SSOC, SWOC).
GSSOC 2025

  • I have mentioned my participant role.

Closes

Enter the issue number that will be closed through this PR.
Closes: #1799

  • I have provided the issue number.

Describe the Add-ons or Changes You've Made

Give a clear description of what you have added or modified.
Added a complete Airbnb Data Analysis project featuring Tableau visualizations, Excel datasets, and a summary presentation. The project offers insights into average prices per bedroom, pricing by zip code, weekly revenue patterns, and listing breakdowns. A README details key insights and suggestions for future enhancements, such as integrating review/amenities data and ML predictive analysis.

  • I have described my changes.

Type of Change

Select the type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, local variables)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Describe how your changes have been tested.
The files were checked for integrity and correctness. The Tableau workbook was loaded to verify the interactivity and consistency of its visualizations. The Excel dataset was reviewed for completeness and for accurate linking to the Tableau workbook. The README was checked for clear instructions and a reproducible setup.

  • I have described my testing process.

Checklist

Please confirm the following:

  • My code follows the guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly wherever it was hard to understand.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added things that prove my fix is effective or that my feature works.
  • Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

  • Documentation
    • Added comprehensive end-user documentation for the Airbnb Data Analysis project, including overview, included assets, objectives, analysis focus areas (average price per bedroom, price distribution by zip code, weekly revenue trends, listing types), key insights, setup/usage instructions, and potential future enhancements. This improves discoverability and onboarding for readers. No functional behavior or data logic was changed.

Contributor

coderabbitai bot commented Sep 22, 2025

Walkthrough

Added a new README.md for the Airbnb Data Analysis project describing project overview, included artifacts (presentation and Tableau workbook), objectives, insights, setup/usage instructions, and potential future enhancements. No code or public APIs were modified.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Documentation: `Data_Science/Airbnb-Data-Analysis/README.md` | Introduced README detailing project scope, files (AirbnbProject.pptx, TableauProject.twbx), objectives, analysis focus areas, setup steps, key insights, and future enhancements. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

Contributor, Status: Review Ongoing 🔄, level2, gssoc25

Suggested reviewers

  • TheChaoticor
  • UTSAVS26

Poem

A hop, a skip, a README appears—
Charts in Tableau, insights to cheer!
Bedrooms, zips, and weekly flow,
Carrots crunch as trends all show. 🥕
A slide, a twbx—neatly aligned,
Thump-thump! says bunny: “Data refined.”

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues Check | ⚠️ Warning | The PR objectives state it will add a Tableau workbook, an Excel dataset, a presentation, and a README under /datascience/Airbnb-Data-Analysis/ to satisfy issue [#1799], but the provided raw_summary shows only README.md was added and explicitly notes no data or workbook files were included; therefore the required deliverables (TableauProject.twbx, AIRBNB PROJECT.pptx, and the Excel dataset) are missing from the PR. This mismatch between the PR description and the actual file changes means the PR does not currently meet the linked issue's coding/delivery requirements. Because the key artifacts referenced in the issue are not present, the check cannot pass. | Please add and commit the missing deliverables (TableauProject.twbx, AIRBNB PROJECT.pptx, and the Excel dataset) to /datascience/Airbnb-Data-Analysis/, verify the Tableau workbook loads against the included dataset, update README.md to reference the exact filenames and paths, and then update the PR description so this check can be re-run and issue [#1799] can be closed. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "Add Airbnb Data Analysis Project under datascience folder #1799" concisely and accurately describes the primary change (adding the Airbnb Data Analysis project to the datascience directory), so a teammate scanning history can understand the main intent; the inclusion of the issue number is minor noise but not misleading. The title is specific, focused on the main change, and avoids vague language. It does not attempt to list files or extraneous details. |
| Out of Scope Changes Check | ✅ Passed | Based on the raw_summary the only change in this PR is README.md under datascience/Airbnb-Data-Analysis/ and no source code, config, or other directories were modified, so there are no apparent out-of-scope or unrelated changes; the PR also notes no exported/public declarations were changed. The modifications are limited to project documentation and thus align with the linked issue's scope. If additional files are later added, their inclusion should be rechecked for scope. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |


👋 Thank you for opening this pull request! We're excited to review your contribution. Please give us a moment, and we'll get back to you shortly!

Feel free to join our community on Discord to discuss more!


✅ PR validation passed! Syncing labels and assignees from the linked issue...

@github-actions github-actions bot added the Contributor, gssoc25, level1, and Status: Review Ongoing 🔄 labels on Sep 22, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (4)
Data_Science/Airbnb-Data-Analysis/README.md (4)

17-22: Define metrics precisely (weekly revenue, zip source, bed_type status, date range).

Document formulas and data fields to remove ambiguity and ease validation.

Apply:

 ## Key Insights
 - **Average Price Per Bedroom**: Larger properties generally have higher average prices, with steep increases for properties with 4+ bedrooms.
 - **Price by Zip Code**: Some zip codes show significantly higher prices, possibly indicating more desirable or premium areas.
 - **Revenue Trends**: Weekly revenue peaks during certain seasons, reflecting high-travel demand periods.
 - **Common Listing Types**: Real beds are the most frequent, and properties with 1-2 bedrooms make up the majority of listings.
+
+## Methodology & Metric Definitions
+- Date range analyzed: <insert coverage window>.
+- Weekly revenue: <define formula, e.g., price × (7 − weekly_availability) or based on calendar data>; specify fields used.
+- Zip code source: <state field or geocoding method>. Note any imputation rules.
+- Bed type and bedroom count fields: <list exact field names>; note if `bed_type` exists in this dataset/version.
+- Outlier handling: <winsorization/filtering rules, if any>.
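As a concrete illustration of the kind of definition requested above, here is a minimal sketch of the example weekly revenue formula, price × (7 − weekly_availability). The function and parameter names are hypothetical, and counting every unavailable night as booked is an assumption; the actual workbook may compute revenue differently.

```python
def weekly_revenue(price_per_night: float, available_nights: int) -> float:
    """Estimate weekly revenue as nightly price times booked nights.

    Assumption: every night that is not available counts as booked.
    Nights blocked by the host would inflate this estimate.
    """
    if not 0 <= available_nights <= 7:
        raise ValueError("available_nights must be between 0 and 7")
    booked_nights = 7 - available_nights
    return price_per_night * booked_nights


# A listing priced at 120 per night with 2 open nights in the week.
print(weekly_revenue(120.0, 2))  # 600.0
```

Stating the formula this explicitly in the README would let a reviewer recompute a few rows by hand and compare them with the Tableau view.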

28-33: Note privacy when adding reviews and outline basic ML evaluation.

Minor content tweak to future work to set expectations.

Apply:

 ## Future Enhancements
 Potential future improvements include:
 - Adding More Data Features: Incorporating additional features like reviews or amenities.
-- Predictive Analysis: Using machine learning to predict pricing trends based on data.
+- Predictive Analysis: Using machine learning to predict pricing trends based on data; report RMSE/MAE and train/test split methodology.
 - Interactive Web Dashboard: Creating a web-based, interactive dashboard for public access.
+- Privacy: If incorporating reviews/amenities, ensure text is anonymized and complies with the dataset license.
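For the evaluation reporting suggested above, a minimal sketch of a seeded train/test split with MAE and RMSE in plain Python might look like the following. The helper names are hypothetical and no particular model or library is implied; a real analysis would more likely use an established ML package.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows with a fixed seed and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Two held-out prices and two hypothetical predictions.
print(mae([100, 200], [110, 180]))   # 15.0
print(rmse([100, 200], [110, 180]))
```

Reporting the seed and the test fraction alongside the two metrics is what makes the split methodology reproducible.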

23-27: Provide a non‑Tableau fallback (exported images/PDF or public link).

This widens accessibility for reviewers without Tableau Desktop.

Apply:

 - Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
 - Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.
+- No Tableau? Include/export key dashboards as PNG/PDF in `docs/` or provide a public Tableau link for viewing.

17-22: Qualify insight statements with dataset scope to avoid overgeneralization.

Add time/place context so claims aren’t read as universal truths.

Apply:

-- **Average Price Per Bedroom**: Larger properties generally have higher average prices, with steep increases for properties with 4+ bedrooms.
-- **Price by Zip Code**: Some zip codes show significantly higher prices, possibly indicating more desirable or premium areas.
-- **Revenue Trends**: Weekly revenue peaks during certain seasons, reflecting high-travel demand periods.
-- **Common Listing Types**: Real beds are the most frequent, and properties with 1-2 bedrooms make up the majority of listings.
+- **Average Price Per Bedroom** (in this dataset/timeframe): Larger properties show higher average prices, with steeper increases for 4+ bedrooms.
+- **Price by Zip Code** (study region): Certain zip codes trend higher, indicating premium areas.
+- **Revenue Trends**: Weekly revenue peaks during specific seasons in the analyzed window.
+- **Common Listing Types**: Real beds are most frequent, and 1–2 bedroom listings dominate this dataset.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45a0fbe and bc96083.

⛔ Files ignored due to path filters (1)
  • Data_Science/Airbnb-Data-Analysis/AIRBNB PROJECT.pptx is excluded by !**/*.pptx
📒 Files selected for processing (1)
  • Data_Science/Airbnb-Data-Analysis/README.md (1 hunks)
🔇 Additional comments (1)
Data_Science/Airbnb-Data-Analysis/README.md (1)

1-1: Confirm directory casing: Data_Science vs datascience.

Found: ./Data_Science/Airbnb-Data-Analysis. PR references "/datascience/...". Confirm intended casing and update paths or rename to avoid duplicate/ambiguous paths on case‑insensitive filesystems.
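One way to check for the ambiguity described here is to group paths by their case-folded form. The sketch below uses a hypothetical helper name and only catches paths that differ by letter case alone. `Data_Science` and `datascience` also differ by an underscore, so they would not collide on a case-insensitive filesystem, although the PR text and the repository path still disagree.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of paths that are identical except for letter case."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


paths = [
    "Data_Science/README.md",
    "data_science/README.md",  # hypothetical duplicate differing only by case
    "docs/a.md",
]
print(case_collisions(paths))
```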

Comment on lines +3 to +4
## Overview
This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.

⚠️ Potential issue

Add data source, license, and non‑affiliation/privacy notes.

Provenance and licensing are required for compliance and reuse; add a short disclaimer on Airbnb affiliation and PII handling.

Apply:

 ## Overview
 This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.
+
+## Data Source & Licensing
+- Source: <dataset name/provider, URL, retrieval date>.
+- License: <dataset license>; confirm that redistribution of the workbook and excerpts complies.
+- Privacy: dataset must not include precise addresses, emails, phone numbers, or other PII. Any such fields should be removed or generalized.
+- Non‑affiliation: This project is for educational purposes and is not affiliated with or endorsed by Airbnb.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 3-4, add brief
provenance, licensing and privacy/non‑affiliation statements: specify the
original data source (e.g., "Dataset obtained from Airbnb public listings via
[source/link] on [date]" or the exact file name), include a license or reuse
terms (e.g., CC BY‑NC or link to dataset's license) and a short non‑affiliation
disclaimer ("This project is not affiliated with Airbnb"), and add a PII/privacy
note explaining any personal data removal/anonymization steps and that no
attempt was made to contact hosts/guests. Keep each note one or two sentences
and place them under a new "Data Sources and License" or "Provenance and
Privacy" section near the top of the README.
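A rough first pass at the PII check mentioned above could scan the dataset's column headers for suspicious names. This is only a sketch: the hint list and the column names are hypothetical, and a header scan cannot detect personal data hidden inside free-text fields.

```python
# Substrings that often indicate personal data in a column name (assumed list).
PII_HINTS = ("email", "phone", "address")


def flag_pii_columns(columns):
    """Return column names containing any of the PII hint substrings."""
    return [c for c in columns if any(hint in c.lower() for hint in PII_HINTS)]


# Hypothetical header row from a listings export.
print(flag_pii_columns(["price", "host_email", "zipcode", "Phone Number"]))
```

Flagged columns would then be removed or generalized before the workbook or dataset is redistributed.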

Comment on lines +6 to +9
## Project Files
- **AirbnbProject.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.


⚠️ Potential issue

🧩 Analysis chain

Include the Excel dataset and reconcile file names; avoid spaces/case drift.

  • Project files list omits the Excel dataset mentioned in the PR objectives.
  • Filenames differ from the PR text (e.g., “AIRBNB PROJECT.pptx” vs “AirbnbProject.pptx”). Normalize and match the actual committed files to prevent broken references across OSes.

Apply this minimal update (adjust names to whatever is actually in the repo):

 ## Project Files
-- **AirbnbProject.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
-- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.
+- **Airbnb-Project.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
+- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.
+- **data/Airbnb_Listings.xlsx**: Source dataset used by the workbook (if not embedded in the TWBX).

Run this to verify filenames and the folder that actually exists:


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
for d in Data_Science/Airbnb-Data-Analysis datascience/Airbnb-Data-Analysis; do
  if [ -d "$d" ]; then
    echo "Contents of $d:"
    ls -al "$d"
  fi
done

Length of output: 377


Normalize filenames in README and add the missing Excel dataset

  • Repo (Data_Science/Airbnb-Data-Analysis) contains: AIRBNB PROJECT.pptx and TableauProject.twbx; README lists AirbnbProject.pptx and mentions an Excel dataset that is not present.
  • Action (choose one):
    • Rename Data_Science/Airbnb-Data-Analysis/"AIRBNB PROJECT.pptx" → Data_Science/Airbnb-Data-Analysis/AirbnbProject.pptx (remove spaces/case drift) and add the source dataset at data/Airbnb_Listings.xlsx if it's not embedded in the TWBX.
    • OR update README to reference the exact committed filenames (Data_Science/Airbnb-Data-Analysis/"AIRBNB PROJECT.pptx", TableauProject.twbx) and explicitly state whether the source dataset is embedded in the TWBX or where the Excel file is located.
  • Use consistent, space-free filenames for cross-OS compatibility.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 6–9, the README
mentions "AirbnbProject.pptx" and an Excel dataset that do not match the repo
contents (the repo contains "AIRBNB PROJECT.pptx" and TableauProject.twbx and
the Excel file is missing or embedded); fix by either (A) renaming the repo file
to AirbnbProject.pptx (remove spaces/change case) and adding the source dataset
at Data_Science/Airbnb-Data-Analysis/data/Airbnb_Listings.xlsx if it is not
embedded in the TWBX, or (B) update the README to reference the exact committed
filenames ("AIRBNB PROJECT.pptx" and "TableauProject.twbx") and explicitly state
whether the source dataset is embedded in the TWBX or where the Excel file can
be found; ensure filenames use consistent, space-free naming if you choose
rename, and commit the README and any file renames/additions together.
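The mismatch between README references and committed files can also be detected mechanically. The following sketch assumes that filenames in the README are written in bold (`**name.ext**`), as they are in the excerpt above; the helper name is hypothetical.

```python
import re


def missing_references(readme_text, committed_files):
    """List bold filenames in the README that are absent from committed_files."""
    referenced = re.findall(r"\*\*([^*]+\.[A-Za-z0-9]+)\*\*", readme_text)
    committed = set(committed_files)
    return [name for name in referenced if name not in committed]


readme = (
    "- **AirbnbProject.pptx**: PowerPoint presentation.\n"
    "- **TableauProject.twbx**: Tableau workbook.\n"
)
committed = ["AIRBNB PROJECT.pptx", "TableauProject.twbx"]
print(missing_references(readme, committed))  # ['AirbnbProject.pptx']
```

The comparison is exact and case-sensitive on purpose, since a reference that only works on a case-insensitive filesystem is the problem being flagged.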

⚠️ Potential issue

🧩 Analysis chain

Track large binaries with Git LFS to keep repo lean.

TWBX/PPTX (and XLSX) often exceed normal size thresholds; ensure they’re LFS‑tracked.

Use this check:


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
echo "Potential large binaries:"
python - <<'PY'
import os
exts={'.twbx','.twb','.pptx','.xlsx','.xls','.csv'}
for root,_,files in os.walk('.'):
  for f in files:
    if os.path.splitext(f)[1].lower() in exts:
      p=os.path.join(root,f)
      try:
        sz=os.path.getsize(p)
      except OSError:
        continue
      if sz>=5*1024*1024:
        print(f"{p}\t{sz/1024/1024:.1f} MB")
PY
echo
echo ".gitattributes entries referencing LFS:"
[ -f .gitattributes ] && rg -n 'filter=lfs' .gitattributes || echo "No .gitattributes found"

Length of output: 1760


Track large binaries with Git LFS and migrate existing large files.

No .gitattributes found; the repo contains multiple files >=5MB. Add a .gitattributes and LFS-track binary/artifact patterns (e.g. *.twbx, *.pptx, *.xlsx, *.xls and large *.csv), then migrate the existing large files into LFS or move them to external storage (git lfs migrate import --include="path,...").
Notable large files found:

  • Data_Science/Airbnb-Data-Analysis/TableauProject.twbx — 5.1 MB
  • Machine_Learning/Book Recommendation System/books.csv — 73.9 MB
  • Deep_Learning/toxicity insult analyser/jigsaw-toxic-comment-classification-challenge/train.csv/train.csv — 65.6 MB
  • Deep_Learning/toxicity insult analyser/jigsaw-toxic-comment-classification-challenge/test.csv/test.csv — 57.6 MB
  • Machine_Learning/Fake news detection project/dataset/Fake.csv — 59.9 MB
  • Machine_Learning/Fake news detection project/dataset/True.csv — 51.1 MB
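For reference, the `.gitattributes` entries that mark a pattern as LFS-tracked have the form `<pattern> filter=lfs diff=lfs merge=lfs -text`. The sketch below generates such lines for the extensions named above; the helper name is hypothetical, and patterns containing spaces would need escaping, which this sketch does not handle.

```python
def lfs_attribute_lines(patterns):
    """Build .gitattributes lines that route the given patterns through Git LFS."""
    return [f"{pattern} filter=lfs diff=lfs merge=lfs -text" for pattern in patterns]


# Extensions taken from the review comment; large CSVs are left out here
# because tracking every *.csv may not be desirable.
print("\n".join(lfs_attribute_lines(["*.twbx", "*.pptx", "*.xlsx", "*.xls"])))
```

Adding these lines only affects new commits; files already in history still need the migration step mentioned in the comment.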

Comment on lines +23 to +27
## Setup and Usage
To interact with the project and explore the data:
- Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
- Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.


⚠️ Potential issue

Specify Tableau Desktop version and whether data is embedded; add relinking steps.

Without this, opening the TWBX may fail or silently point to the wrong data.

Apply:

 ## Setup and Usage
-To interact with the project and explore the data:
-- Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
-- Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.
+To interact with the project and explore the data:
+- Prerequisite: Tableau Desktop (specify tested version, e.g., 2023.3 or later).
+- Open **TableauProject.twbx** in Tableau Desktop.
+  - If the workbook expects an external Excel file (not embedded), go to Data > [Data Source] > Edit Connection and relink to `data/Airbnb_Listings.xlsx`.
+  - If an extract is used, Data > [Data Source] > Extract > Refresh after relinking.
+- Review the presentation: open **Airbnb-Project.pptx** for summarized findings.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 23 to 27, the TWBX
instructions lack Tableau version and data-link guidance; update the README to
state the minimum Tableau Desktop version used (e.g., Tableau Desktop 2023.1)
and explicitly note whether the TWBX contains embedded/extracted data or
references external files, then add concise relinking steps: open Tableau
Desktop, go to Data > Replace Data Source (or Data > Extract > Refresh) if the
workbook points to external CSV/Hyper files, and provide the expected relative
path(s) to the data files in the repo (or note that data is embedded so no
relinking is needed).
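Whether the data is packaged can be checked without Tableau Desktop, because a `.twbx` file is a zip archive containing the workbook and any bundled data files. A minimal sketch, with a hypothetical helper name and an assumed list of data file extensions:

```python
import zipfile


def packaged_data_files(twbx_path, data_exts=(".xlsx", ".xls", ".csv", ".hyper", ".tde")):
    """List entries inside a .twbx archive that look like bundled data files."""
    with zipfile.ZipFile(twbx_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(tuple(data_exts))
        ]


# Usage (path taken from the README under review):
# print(packaged_data_files("Data_Science/Airbnb-Data-Analysis/TableauProject.twbx"))
```

An empty result suggests the workbook points at an external file, in which case the relinking steps above apply.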

