Data Engineering • ETL • SQL • Docker • AWS

Building reliable data pipelines
that turn raw data into decisions.

I’m Yashodeep Basnet, an aspiring data engineer looking for an internship. I build ETL systems, clean schemas, APIs, and analytics workflows with production-style discipline.

ETL: Pipeline systems
SQL: Schema + analytics
Docker: Reproducible deploy
Available for Internships
$ python main.py --run pipeline
✓ Extract: 3 sources (CSV/Excel/JSON)
✓ Transform: cleaned + standardized
✓ Load: PostgreSQL (stores/products/sales)
✓ Analytics: revenue, trends, top products
Python • PostgreSQL • Docker • AWS • FastAPI

Featured Projects

Production-style projects focused on data pipelines, indexing systems, and analytics.

Retail Sales ETL Pipeline
ETL • PostgreSQL

Automated ETL pipeline that ingests multi-format data (CSV, Excel, JSON), cleans it, models schemas, and loads the result into PostgreSQL for analytics and dashboards.

Python • Pandas • PostgreSQL • Docker
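
To illustrate the extract, transform, load, and analytics stages named above, here is a minimal sketch. It is not the project's actual code. It assumes hypothetical inline sample data, uses only the Python standard library, swaps in an in-memory SQLite database in place of PostgreSQL, and leaves out the Excel source. The column names and the `revenue_by_store` helper are made up for the example.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw inputs; real sources would be files on disk.
CSV_TEXT = "store,product,amount\n Kathmandu ,Tea,120.50\nPokhara,coffee,80\n"
JSON_TEXT = '[{"store": "kathmandu", "product": "Coffee", "amount": "99.5"}]'


def extract():
    """Read rows from each source into plain dictionaries."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    rows.extend(json.loads(JSON_TEXT))
    return rows


def transform(rows):
    """Trim whitespace, normalize casing, and cast amounts to float."""
    return [
        (r["store"].strip().title(), r["product"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """Create the sales table if needed and insert the cleaned records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (store TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def revenue_by_store(conn):
    """Analytics step: total revenue per store, highest first."""
    query = "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY 2 DESC"
    return conn.execute(query).fetchall()


# SQLite in memory stands in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
print(revenue_by_store(conn))
```

Keeping each stage as its own function means a source or the target database can be replaced without touching the other stages.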
Coventry Academic Search Engine
IR • TF-IDF

Vertical search engine that crawls academic publications, builds an inverted index, and ranks results with weighted TF-IDF, served through a Flask API.

Python • Web Crawling • Inverted Index • Flask
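
The inverted index and weighted TF-IDF ranking mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's real implementation: the three-document corpus is hypothetical, the weighting scheme (title terms count double via `TITLE_WEIGHT`) is a guess at what "weighted" could mean, and tokenization is a plain whitespace split.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-corpus standing in for crawled publications.
DOCS = {
    "p1": {"title": "Deep learning for image search", "abstract": "neural networks rank images"},
    "p2": {"title": "Graph databases", "abstract": "query optimization for graph search"},
    "p3": {"title": "Soil chemistry", "abstract": "nitrogen levels in farmland"},
}
TITLE_WEIGHT = 2.0  # assumption: a title match counts double


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each term to {doc_id: weighted term frequency}."""
    index = defaultdict(dict)
    for doc_id, fields in docs.items():
        weighted = Counter()
        for term in tokenize(fields["title"]):
            weighted[term] += TITLE_WEIGHT
        for term in tokenize(fields["abstract"]):
            weighted[term] += 1.0
        for term, tf in weighted.items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by summed tf * idf over the query terms."""
    scores = Counter()
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the corpus
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]


index = build_index(DOCS)
print(search(index, len(DOCS), "graph search"))
```

Because lookups go term by term through the index, a query only touches documents that contain at least one of its terms.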
Automated News Classifier
NLP • Pipeline

Crawls news articles and classifies them into Business, Entertainment, and Health with a scalable text pipeline. Achieved an F1-score above 0.97 across classes.

Python • Scikit-learn • Naive Bayes • Crawling
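
For a sense of how Naive Bayes assigns a category to a headline, here is a small sketch. It is a from-scratch multinomial Naive Bayes in plain Python, used instead of Scikit-learn so the example has no dependencies. The six training headlines are hypothetical, and the sketch says nothing about the project's real features or its reported score.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set; a real corpus would come from the crawler.
TRAIN = [
    ("stocks rally as markets close higher", "Business"),
    ("bank profits rise after merger", "Business"),
    ("actor wins award for new film", "Entertainment"),
    ("singer announces world tour", "Entertainment"),
    ("doctors recommend daily exercise", "Health"),
    ("new vaccine reduces flu risk", "Health"),
]


def train(samples):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "bank stocks rise"))
print(predict(model, "new film award"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.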
Fake News Detection API
ML • API

Processed 45,000+ articles with NLP cleaning and feature engineering. Trained and compared models, then deployed an inference API for real-time classification.

Python • NLTK • Scikit-learn • Flask

Skills Snapshot

Core strengths for data engineering internships.

Data Engineering

  • ETL/ELT pipelines (batch-style)
  • Data cleaning, standardization
  • Data modeling & schema design
  • API integration (REST)

Databases & Querying

  • PostgreSQL, MySQL, SQLite
  • Indexing & query optimization basics
  • Analytics queries (KPIs, trends)
  • Relational design principles

Tools & Deployment

  • Docker (containerized pipelines)
  • Git/GitHub (clean version control)
  • AWS fundamentals (S3/EC2)
  • Flask / FastAPI for services

Journey

Education & focus areas aligned with data engineering.

Master’s — Data Science & Computational Intelligence (Ongoing)

Softwarica College (Coventry University). Focus: data engineering, cloud systems, ML for analytics.

Bachelor’s — Software Engineering (GPA 4.0/4.0)

NAMI College (University of Northampton). Strong base in backend systems and databases.

Let’s build something reliable.

Want a data engineering intern who can ship ETL pipelines, write SQL, and deploy systems cleanly? Message me and I’ll respond fast.

ETL Pipelines • PostgreSQL • Docker • APIs