Summary
Greetings! I'm Ujjwal, and I extend a warm welcome to my website.
As a seasoned data scientist with over 10 years of professional experience, I've delved into projects encompassing big data analytics, predictive modeling, machine learning, deep learning, and natural language processing. I completed my education at the Indian Institute of Technology (IIT) Kharagpur in 2013. During my leisure time, I enjoy participating in Kaggle competitions.
If you come across any collaboration opportunities, don't hesitate to get in touch!
Work Experience
2024 - Present | MSCI | Vice President
I am a member of the Data Extraction team, tasked with developing LLM-based Retrieval-Augmented Generation (RAG) pipelines that fetch data and information from financial documents.
2023 - 2024 | HERE Maps | Lead Data Scientist
I was a member of the Map Observables team, tasked with constructing Self-Driving Maps for BMW's Urban Cruise Control. My work involved:
Tackling global-scale challenges by harnessing petabytes of data to create high-definition maps for autonomous driving. I improved key performance indicators such as False Positives, False Negatives, and Accuracy by more than 50% compared with legacy systems.
Applying machine learning algorithms, including XGBoost models, to integrate data observations from diverse input sources such as dashcams and overhead imagery. This process makes it possible to deduce the accurate location and attributes of road signs.
Crafting innovative graph-based solutions to counteract positional observation drift from drive-based data sources used in map content. This implementation resulted in a notable reduction of False Positives by around 5%, surpassing the performance of radial search-based clustering.
Constructing a question-answering engine using LLAMA over extensive product and data requirement documents for data validations. This tool empowers users to efficiently search through these documents, extracting details and significantly enhancing productivity.
2021 - 2023 | Gojek Tech | Senior Data Scientist
I was a member of the Care Tech team, where I leveraged machine learning, deep learning, and natural language processing techniques to extract insights and facilitate automation. This involved analyzing customer service interactions across diverse channels such as email, in-app requests, chat, Twitter, and more. My work involved:
Facilitating AI/ML-driven intent detection through the implementation of multilingual NLP models. I developed intent classification models based on XLM-RoBERTa to support various languages, including Bahasa and English, achieving an accuracy rate exceeding 80%. Additionally, I deployed these models into production using TorchScript and MLflow.
Constructing named entity recognition (NER) models based on IndoBERT, utilizing open-source IndoNLU datasets. These models were designed to identify entities such as food, quantity, date, and chit-chat within text utterances.
Enhancing the search experience for help center articles by incorporating tags to encompass semantic diversity in search queries. I implemented a TF-IDF and Logistic Regression pipeline to extract pertinent keywords for each article, contributing to an improved search functionality.
Establishing a pipeline for issue discovery to identify emerging themes in service tickets and app reviews. I implemented topic modeling using the pyLDAvis and BERTopic libraries. Additionally, I trained sentence transformer models using SetFit for better results.
2018 - 2021 | American Express | Senior Data Scientist
I was part of the data science team working on the Natural Language Understanding (NLU) layer of the AskAmex chatbot. My work involved:
Training transformer-based models (such as BERT, DistilBERT, and RoBERTa) for intent classification. I removed label noise from training datasets using various robust machine learning techniques, which led to a 5% increase in prediction accuracy.
Building human-in-the-loop (HITL) pipelines for collecting labeled data at a minimal cost. I used weak supervision and active learning strategies to filter relevant data points for annotation. I built various interactive tools to help data labelers work efficiently. I introduced best practices and quality checks in the annotation pipelines to ensure high-quality output.
Collaborating with product teams to improve customer experience. I built interactive tools to visualize the performance of servicing journeys. These tools helped identify the edge cases that often lead to automation failures. I introduced tracking of sentiment-level KPIs (in addition to automation) to capture channel performance holistically.
2017 - 2018 | American Express | Data Scientist
I was part of the data science team working on an offer recommendation engine for the mobile app and website. My work involved:
Building factorization machine models to predict click-through rate. I built Spark-based feature engineering pipelines to process terabytes of clickstream data for training these models. The models were part of the final stacked ensemble that was deployed in production.
Optimizing impression caps on offers to drive higher overall engagement on the channel. I built XGBoost models to analyse the sensitivity of click-through rate with respect to impressions. I used the partial dependence plots from these models to identify the impression cap that maximised the F-beta score.
2015 - 2017 | American Express | Senior Data Analyst
I was part of the modeling team working on up-sell and cross-sell targeting via email campaigns. My work involved:
Building artificial neural network-based models. These were binary classification models which predicted the probability of an existing customer taking up a more premium product. These models replaced the legacy logistic regression models by delivering better performance while simultaneously driving operational efficiency.
Migrating the legacy data transformation and feature engineering pipelines from SAS to Python to support the deployment of the above-mentioned neural network models in production. I also enabled automated re-training pipelines to address data drift.
2014 - 2015 | American Express | Data Analyst
I joined the customer marketing team focusing on international markets (non-US). I worked on:
Developing the targeting strategy for dynamic email campaigns in partnership with Movable Ink. The focus was to increase customer spending on small merchants in the UK. I analyzed transaction data to understand the location and category preferences of the customers. The analysis generated content-based recommendations displayed to the customer via dynamic emails. The open and click rates for these campaigns were significantly higher than the long-term average.
Supporting a joint venture with Gurunavi. Amex partnered with Gurunavi to offer dining recommendations to customers in Japan. I designed customer segments by clustering spending patterns across various industry verticals. The customer segments mapped to different personas, each of which received an exclusive set of restaurant recommendations.