archit@portfolio: ~

archit@portfolio:~$ whoami

Archit Konde

Machine Learning Engineer

open to opportunities

archit@portfolio:~$

$ cat about.txt

I'm a Machine Learning Engineer with an MEng in Electrical & Computer Engineering from the University of Windsor.

I build and test AI systems end to end, from writing retrieval algorithms by hand to deploying production-ready pipelines. My current focus is on LLMs, RAG architectures, and the engineering that makes AI reliable in production.

Currently based in Waterloo, ON — open to ML/AI engineering and software engineering roles.

$ cat experience.log
publication

Fundamentals of Neural Networks

International Journal for Research in Applied Science & Engineering Technology (IJRASET)

  • Authored a research paper examining the architectural and mathematical foundations of neural networks
  • Covered activation functions, backpropagation, gradient descent, and network topologies
volunteer

Data Analyst

City of Windsor — Affordable Transit Program

  • Analyzed transit ridership data across multiple routes to surface usage patterns and support service planning decisions
  • Cleaned and restructured raw operational datasets; delivered summary reports used by program coordinators for planning cycles
$ cat education.txt
MEng

Electrical & Computer Engineering

University of Windsor — Canada

B.E.

Computer Engineering

University of Mumbai — India

$ cat skills.json
"languages": [
  "Python", "SQL"
],
"ml_deep_learning": [
  "PyTorch", "Hugging Face Transformers", "scikit-learn", "NumPy", "pandas"
],
"llm_rag": [
  "OpenAI API", "Anthropic API", "vector search", "BM25", "cross-encoder re-ranking"
],
"tools": [
  "Git", "Docker", "pytest", "Hugging Face Spaces"
]
$ ls ./projects/
01 // live

supportops_ai_monitor.py

Built an LLM-powered ticket triage system using GPT-4o-mini for automated classification and priority scoring via structured multi-step tool-calling. Conditional routing logic, SQLite persistence, observability dashboard. Docker deployment, CI passing.

python openai sqlite docker streamlit pytest
02 // live

rag_from_scratch.py

Complete RAG pipeline built in pure Python — custom chunker, Okapi BM25, NumPy vector store, hybrid retrieval via Reciprocal Rank Fusion, cross-encoder re-ranking. 87 unit tests. No LangChain, no LlamaIndex. Hybrid + Rerank achieved MRR 1.0.

python pytorch transformers numpy bm25 pytest
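
The hybrid retrieval step can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the project's actual code: the function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a project value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from a lexical (BM25) and a vector retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only ranks, it needs no score normalization between BM25 and cosine similarity, which is one reason it is a common choice for hybrid retrieval.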
03 // 0.9995 CV

triagegeist_solution.py

Emergency triage acuity prediction on 80k clinical ED records. Pushed accuracy from 0.891 to 0.9995 CV via TF-IDF scaling on chief complaint text + 3-tier hybrid: deterministic lookup (99.4%), glaucoma-specific binary classifier, LightGBM fallback. $10k Kaggle competition.

python lightgbm scikit-learn tf-idf pandas streamlit
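
The 3-tier routing idea can be sketched roughly as follows. Everything here is a hypothetical stand-in: the field name `chief_complaint`, the toy lookup table, the label values, and the callables that replace the real binary classifier and LightGBM model. It only shows the control flow of "lookup first, specialist second, general model last".

```python
def predict_acuity(record, lookup, is_special_case, special_model, fallback_model):
    """Route one record through three tiers; return (label, tier_name)."""
    # Tier 1: deterministic lookup on the normalized chief complaint.
    key = record["chief_complaint"].strip().lower()
    if key in lookup:
        return lookup[key], "lookup"
    # Tier 2: a narrow classifier for one hard sub-population.
    if is_special_case(record):
        return special_model(record), "specialist"
    # Tier 3: a general-purpose model for everything else.
    return fallback_model(record), "fallback"


# Toy stand-ins (assumptions, not the competition models or labels).
lookup = {"chest pain": 2}
is_eye_case = lambda r: "eye" in r["chief_complaint"].lower()
specialist = lambda r: 3
fallback = lambda r: 4

for complaint in [" Chest Pain ", "eye pressure", "fever"]:
    result = predict_acuity(
        {"chief_complaint": complaint}, lookup, is_eye_case, specialist, fallback
    )
    print(complaint.strip(), "->", result)
```

Returning the tier name alongside the label makes it easy to measure what share of predictions each tier handles.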
04 // live

insurance_reshopping_predictor.py

Predicts whether you’d benefit from re-shopping your car insurance — ML trained on 381K real insurance profiles. Data quality first: 8-check validation pipeline with SQL-style audit queries, LightGBM classification, SHAP waterfall explanations, counterfactual tips.

python lightgbm shap streamlit scikit-learn pandas
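
A validation pipeline built from SQL-style audit queries might look something like the sketch below. The table name, column names, thresholds, and the four checks are all hypothetical; the project's actual 8 checks are not listed here, so this only illustrates the pattern of "one query per check, each returning a count of offending rows".

```python
import sqlite3

# Each check is a query that counts rows violating one rule (hypothetical rules).
CHECKS = {
    "null_age": "SELECT COUNT(*) FROM profiles WHERE age IS NULL",
    "age_out_of_range": "SELECT COUNT(*) FROM profiles WHERE age < 18 OR age > 100",
    "duplicate_ids": (
        "SELECT COUNT(*) FROM "
        "(SELECT id FROM profiles GROUP BY id HAVING COUNT(*) > 1)"
    ),
    "negative_premium": "SELECT COUNT(*) FROM profiles WHERE annual_premium < 0",
}


def run_audit(conn):
    """Run every check and return {check_name: violation_count}."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


# Tiny in-memory dataset with one deliberate violation per check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER, age INTEGER, annual_premium REAL)")
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?, ?)",
    [(1, 34, 1200.0), (2, None, 900.0), (2, 17, 950.0), (3, 45, -10.0)],
)

report = run_audit(conn)
for name, count in report.items():
    print(f"{name}: {count} violation(s)")
```

Keeping each rule as a named query makes the audit report readable and lets a failing check block training before any model sees the data.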
05 // live

ragops.py

Production RAG service built on FastAPI and PostgreSQL/pgvector — persistent vector database, offline evaluation harness with Precision@k/Recall@k/MRR, CI regression gate, Docker Compose dev environment.

python fastapi postgresql pgvector docker pytest
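
The offline evaluation metrics named above have standard definitions, sketched here in plain Python. The function names, the sample document IDs, and the `passes_gate` helper with its tolerance are assumptions for illustration, not the service's actual harness.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in retrieved[:k] if d in relevant) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in relevant) / len(relevant)


def mean_reciprocal_rank(runs):
    """Average of 1/rank of the first relevant hit over (retrieved, relevant) pairs."""
    total = 0.0
    for retrieved, relevant in runs:
        for rank, d in enumerate(retrieved, start=1):
            if d in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(runs)


def passes_gate(current, baseline, tolerance=0.01):
    """Hypothetical CI gate: fail if the metric drops more than `tolerance`."""
    return current >= baseline - tolerance


retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
p2 = precision_at_k(retrieved, relevant, k=2)   # 0.5
r2 = recall_at_k(retrieved, relevant, k=2)      # 0.5
mrr = mean_reciprocal_rank([(retrieved, relevant), (["d9", "d4"], {"d9"})])  # 0.75
print(p2, r2, mrr, passes_gate(mrr, baseline=0.75))
```

In a CI job, a check like `passes_gate` would compare the fresh metric against a stored baseline and fail the build on a regression.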
$ ls ./blog/
MAR 2026 // learning writeup

RAGOps API — Production RAG with FastAPI and pgvector

Upgrading a from-scratch RAG pipeline to a real API. Persistent vector storage with pgvector, layered FastAPI architecture, offline evaluation with Precision@k/Recall@k/MRR, and CI regression gating. The gap between notebook and production.

fastapi pgvector docker production ML
MAR 2026 // learning writeup

Insurance Re-Shopping Predictor — Data Quality First

Why data quality matters more than model accuracy in insurance ML. 8-check validation pipeline with SQL-style audit queries, LightGBM on 381K profiles, SHAP explainability, and honest limitations of training on Indian market data for North American predictions.

lightgbm shap data quality streamlit
MAR 2026 // learning writeup

Building SupportOps AI Monitor — What I Learned

Architecture decisions, bugs found, deployment challenges, and honest answers to questions I couldn’t answer at first. Covers the deprecation of applymap() in favor of map() in pandas 2.1+, SQLite persistence in ephemeral containers, Streamlit’s PWA limitations, and more.

python streamlit openai sqlite
MAR 2026 // learning writeup

Building RAG From Scratch — A Complete Technical Deep-Dive

Every algorithm in a RAG pipeline derived from first principles. BM25 with Robertson-Spärck Jones IDF, mean pooling math, cosine similarity as dot product, Reciprocal Rank Fusion, cross-encoder re-ranking, and evaluation metrics. No LangChain, no LlamaIndex.

python transformers numpy information retrieval
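
As a rough illustration of the BM25 part, here is a minimal pure-Python scorer. It is a sketch under stated assumptions, not the writeup's code: it uses the smoothed, non-negative IDF variant ln((N - n + 0.5) / (n + 0.5) + 1) of the Robertson-Spärck Jones weight, the parameters k1=1.5 and b=0.75 are conventional defaults, and the toy corpus is made up.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in the corpus against a tokenized query."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF; the +1 inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


corpus = [
    ["bm25", "ranks", "documents"],
    ["vectors", "rank", "documents"],
    ["cats", "sleep"],
]
scores = bm25_scores(["bm25", "documents"], corpus)
print(scores)
```

The first document matches both query terms, the second only the more common one, and the third none, so the scores come out in that order.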
MAR 2026 // competition writeup

Triagegeist — From 0.891 to 0.9995 CV Accuracy

How text beat everything else in emergency triage prediction. TF-IDF scaling experiments, error analysis tracing every mistake to a single diagnosis, and a 3-tier hybrid that routes 99.4% of predictions through a lookup table. $10k Kaggle competition.

lightgbm tf-idf clinical NLP kaggle
$ cat contact.txt

I'm actively looking for opportunities in AI/ML engineering and research. If you're working on something interesting — let's talk.