- B.Tech in Computer Science & Engineering (CGPA 9.11)
- Software Engineer at Anura Infotech (CRIF India): Big Data, ETL, PySpark
- Currently learning: Advanced Data Engineering, Cloud, MERN Stack
- Open to collaborating on Big Data, Python, Cloud, and Open Source projects
- Check out all my projects: GitHub Repositories
- Reach me here: Email
- Built & optimized ETL pipelines using PySpark, Hadoop, HBase, Oracle SQL
- Processed 1M+ loan records across Consumer, Commercial & MFI domains
- Automated ingestion pipelines using Shell Scripts + Azkaban, improving processing speed by 20%
- Designed stored procedures for high-accuracy data ingestion into Oracle DWH & HBase
- Ensured 98%+ data quality through validation, reporting & reconciliation
- Collaborated with BA / QA / Dev teams for requirement analysis, BRDs, and CRs
- Performed functional & performance testing of ETL workflows using JIRA & QTest
- Debugged production issues & worked on API integrations
- Automated testing workflows, reducing manual effort by 30%
- Worked with Git, CI/CD, and JIRA, and collaborated with QA & production teams
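
A data-quality figure like the 98%+ mentioned above is typically the share of loaded records that pass validation, checked alongside a source-to-target reconciliation. Here is a minimal sketch in plain Python. The field names (`loan_id`, `amount`, `domain`), the rules, and the function names are hypothetical and are not taken from the production pipelines.

```python
# Hypothetical validation rules; the real pipelines' rules are not shown here.
VALID_DOMAINS = {"Consumer", "Commercial", "MFI"}


def is_valid(record):
    """Return True if a loan record passes the illustrative checks."""
    amount = record.get("amount")
    return (
        bool(record.get("loan_id"))
        and isinstance(amount, (int, float))
        and amount > 0
        and record.get("domain") in VALID_DOMAINS
    )


def quality_report(source_rows, loaded_rows, threshold=0.98):
    """Compute the validation pass rate and reconcile loaded IDs against source IDs."""
    total = len(loaded_rows)
    passed = sum(1 for r in loaded_rows if is_valid(r))
    pass_rate = passed / total if total else 0.0

    # Reconciliation: IDs present in the source but absent from the target.
    source_ids = {r["loan_id"] for r in source_rows if r.get("loan_id")}
    loaded_ids = {r["loan_id"] for r in loaded_rows if r.get("loan_id")}

    return {
        "pass_rate": pass_rate,
        "missing_in_target": sorted(source_ids - loaded_ids),
        "meets_threshold": pass_rate >= threshold,
    }
```

Calling `quality_report(source_rows, loaded_rows)` returns a small dictionary that could feed a daily report. The 0.98 default threshold only mirrors the figure quoted above.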
GitHub: https://github.com/theharshkonda/aws-etl-pipeline-apache-spark
- Built a fully serverless ETL pipeline using AWS Glue (PySpark), S3, Athena
- Automated schema detection using Glue Crawlers
- Converted CSV to Parquet for high-performance querying
- Designed to run within the AWS Free Tier while keeping the architecture scalable
GitHub: https://github.com/theharshkonda/Dhanvantari
- Tech: React Native, Node.js, Express, MySQL, AWS
- Features: Ayurvedic chatbot, mental-health support, doctor booking
- Managed backend workflows & clinical decision data
Python • SQL • Java • JavaScript • TypeScript • PySpark
Apache Spark • Hadoop • HBase • Parquet • ETL Pipelines • Data Validation
AWS Glue • S3 • Athena • Lambda • EC2 • RDS • IAM • Docker
React • React Native • HTML • CSS
Node.js • Express.js • Flask
Git • Linux • Jenkins • Airflow • JIRA • Confluence
