Data Engineer | AWS | PySpark | Real-time Pipelines | Cloud Automation | GenAI Enthusiast
I'm a Data Engineer with over 3 years of experience building and optimizing scalable, cloud-native data pipelines. My work has focused on high-performance ETL/ELT workflows, cloud infrastructure automation, and real-time data access that supports analytics and ML use cases.
I have hands-on experience designing robust systems using:
- AWS Services like Glue, Lambda, Step Functions, DMS, Kinesis, S3, Redshift, CloudWatch
- Big Data technologies like Apache Kafka, PySpark, Delta Lake, Airflow, and Ab Initio
- Languages & Frameworks: Python, SQL, Flask, Shell, LangChain, PyTorch
I am also passionate about AI/ML and LLMs, and actively explore ways to integrate them into data engineering workflows.
- 3rd Place Winner, Barclays GenAI Hackathon (Regional Level)
- Built a real-time data streaming pipeline using Kafka, Python, and AWS S3
- Contributed to DaFE (Data Forge Engine), a cloud-native, low-code processing platform
- Automated AWS DMS, EC2 cost-optimization workflows, and CI/CD config pipelines
Languages: Python, SQL, Java, Shell
Cloud & DevOps: AWS (Glue, Lambda, S3, DMS, DynamoDB, Athena, Step Functions, CloudWatch), Jenkins, GitLab, Docker
Data Engineering: PySpark, Airflow, Kafka, Ab Initio, Delta Lake, ETL/ELT, Streaming, Data Governance
Storage: PostgreSQL, MongoDB
AI/ML Tools: PyTorch, LangChain, Hugging Face, LLMs, NLP
- Built a real-time ingestion pipeline with Apache Kafka, Python, and AWS S3
- Automated metadata detection using Glue Crawlers + Athena for serverless querying
- Developed an open-domain retrieval-augmented generation (RAG) model for Tamil using fine-tuned RoBERTa and XLM
- Implemented dense vector indexing with Milvus and deployed APIs with Flask
Bachelor of Engineering (Computer Science)
SSN College of Engineering, Chennai, India (2018–2022)
CGPA: 7.79 / 10
- Email: [email protected]
- LinkedIn: linkedin.com/in/aswin07
- GitHub: github.com/AswiN-7
"Data isn't just numbers; it's a story waiting to be understood. Let's build systems that tell it better."
