LingAdeu/README.md

About

Hi, I am a junior linguist currently interested in computational stylometry. Because this area requires a solid grounding in linguistics, statistics, and computational methods, I use data science and machine learning as tools to explore it. I also apply both fields in business contexts to demonstrate transferable skills.

Currently, I'm learning how to extract important linguistic features from text data and how to experiment with machine learning models for text classification. I am also exploring how to apply statistical techniques to authorship attribution. In addition, I am working on data science projects in a business context to become more comfortable making sense of numbers.

Key Projects

Predictive Modeling

| Project Name | Description |
| --- | --- |
| Optimizing Ride Fares: A Dynamic Pricing Model for Ride-Sharing Services | Currently, ride-sharing prices are primarily set based on ride duration, overlooking fluctuating demand and supply. This project explores a dynamic pricing model powered by machine learning to enhance profitability while keeping prices appealing to customers. By experimenting with 12 ML algorithms and two feature engineering techniques (feature selection and polynomial expansion), the project developed a model that, when tested with a simulation of 100 customers, showed that increasing the key feature (expected ride duration) by 20% through a promotional campaign could generate a net profit of $2.4K. (URL) |
| Addressing Customer Churn in an E-Commerce Company | This project seeks to reduce an e-commerce company's customer churn rate from 16.8% to 10%. Using diagnostic analysis and a classification model, we focused on minimizing false negatives due to their higher financial impact. After testing various techniques and algorithms, we chose XGBoost and identified tenure and cashback amount as key factors for intervention. Simulations showed that with targeted strategies, achieving the 10% churn rate is feasible. (URL) |
| Development and Evaluation of a Classification Model for Spam Detection | This project developed a classification model to identify spam messages (1 for spam, 0 for legitimate) for a telecommunications company. F1 score was selected as the primary metric to balance false positives and false negatives. Logistic regression emerged as the best model, achieving an F1 score of 0.92 ± 0.01 across 10 folds. Additionally, the model's potential to save $23K through reduced spam impact highlights its financial and operational benefits. (URL) |
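
To make the metric choice in the churn and spam projects concrete, here is a minimal sketch of how precision, recall, and F1 follow from false positives and false negatives. The labels are made-up toy data, and the helper names (`confusion_counts`, `precision_recall_f1`) are hypothetical; this is not the code used in the projects.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels, invented for illustration: 1 = spam, 0 = legitimate.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Recall drops with every false negative (a missed churner or a missed spam message), while precision drops with every false positive. F1 is the harmonic mean of the two, which is why it suits a case where both error types matter.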

Data Analysis

| Project Name | Description |
| --- | --- |
| Evaluating Marketing Campaign Effectiveness for New Menu Items: An A/B Testing Approach | This project assesses which promotional campaign best boosts sales for a fast-food company's new menu items. Statistical analysis, including the Kruskal-Wallis H test and Dunn's post-hoc test, was used due to non-normal sales distributions and outliers. Results showed the first campaign achieved the highest median sales, but differences between campaigns were minor. It is recommended that the Marketing Manager reevaluate marketing strategies and target customers to improve campaign impact. (URL) |
| Improving the Number of Reviews: Exploring Review Patterns in Bangkok's Airbnb Landscape | Despite an increase in reviews, about 36% (5.7 thousand) of Airbnb listings in Bangkok received none from 2012 to 2022. This project explores why some listings lack reviews and offers recommendations for Airbnb Thailand. It finds that unreviewed listings often have higher prices and longer minimum stays, which may deter bookings and reviews. In contrast, reviewed listings are typically entire homes or apartments, more centrally located, and closer to popular areas. Recommendations include adjusting prices and minimum stays for unreviewed listings, running promotions to boost reviews, and improving marketing to highlight unique features and attractions. (URL) |
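
For readers unfamiliar with the Kruskal-Wallis H test mentioned in the campaign project, the following is a minimal standard-library sketch of the statistic, assuming three groups. The sales figures are invented toy numbers, and the function names are hypothetical. In practice a library routine such as `scipy.stats.kruskal` would normally be used; this is not the project's code.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(*groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def p_value_three_groups(h):
    """Chi-square approximation with df = 2 (three groups): p = exp(-H / 2)."""
    return math.exp(-h / 2)


# Toy weekly sales per campaign, invented for illustration.
campaign_1 = [1, 2, 3]
campaign_2 = [4, 5, 6]
campaign_3 = [7, 8, 9]

h = kruskal_h(campaign_1, campaign_2, campaign_3)
print(f"H = {h:.2f}, approximate p = {p_value_three_groups(h):.4f}")
```

The test works on ranks rather than raw values, which is why it tolerates the non-normal distributions and outliers described above. The p-value here relies on the chi-square approximation, which is rough for samples this small.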

Natural Language Processing

| Project Name | Description |
| --- | --- |
| Regular Expression for Rule-Based Content Moderation | This project addresses taboo expressions in computer-mediated communications by detecting and censoring specific elements of messages (e.g., "Shit, I forgot!" $\rightarrow$ "****, I forgot!"). A rule-based approach using regular expressions was chosen over machine learning for its efficient implementation, high explainability to stakeholders, and reliable detection of inappropriate content through rule matching. (URL) |
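
The censoring behaviour in the example above can be sketched with Python's built-in `re` module. The word list and the names `TABOO_WORDS` and `censor` are assumptions made for illustration; the rules in the actual project may differ.

```python
import re

# Hypothetical word list; a real deployment would maintain a curated one.
TABOO_WORDS = ["shit", "damn", "hell"]

# \b word boundaries keep the rule from firing inside longer words
# (for example, "hello" contains "hell" but is not matched).
TABOO_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in TABOO_WORDS) + r")\b",
    re.IGNORECASE,
)


def censor(message):
    """Replace each matched taboo word with asterisks of the same length."""
    return TABOO_PATTERN.sub(lambda match: "*" * len(match.group()), message)


print(censor("Shit, I forgot!"))
print(censor("hello there"))
```

Because each rule is an explicit pattern, it is easy to show a stakeholder exactly why a given message was or was not censored.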

My tools

RStudio, Python, VS Code, Markdown

Connect with me

LinkedIn, Medium

Pinned

  1. customer-churn-prediction (Public)

    This project aims to reduce the churn rate from 16.8% to 10% by combining data analysis and predictive modeling.

    Jupyter Notebook

  2. dynamic-pricing-model (Public)

    The goal of this project is to build a dynamic pricing model to adjust fares of bike-ride services based on different factors.

    Jupyter Notebook

  3. ab-testing-campaign-effectiveness (Public)

    This project investigates which of three promotional campaigns performs best in terms of generating sales, using an A/B testing approach.

    Jupyter Notebook

  4. bangkok-airbnb-review-exploration (Public)

    This repository contains the code and resources for analyzing Airbnb listing reviews in Bangkok, Thailand. My aim is to explore the factors influencing the lack of reviews for certain listings in B…

    Jupyter Notebook

  5. spam-message-prediction (Public)

    This project aims to build a classification model that detects spam messages, using term frequency-inverse document frequency (TF-IDF) as the text representation.

    Jupyter Notebook

  6. predicting-gender-based-on-name (Public)

    This project seeks to build a classifier to predict someone's gender (binary categories) based on their full names. It is IMPORTANT to note that the model's predictions are only valid for Indonesia…

    Jupyter Notebook