Adelia Januarto LingAdeu

About

Hi, I am a junior linguist currently interested in computational stylometry. Because this area requires a solid grounding in linguistics, statistics, and computational methods, I use data science and machine learning as tools to explore it. I also apply both fields in a business context to demonstrate transferable skills.

Currently, I'm learning how to extract important linguistic features from text data and how to experiment with machine learning models for text classification. I am also exploring how to apply statistical techniques to authorship attribution. In addition, I am working on data science projects in a business context to become familiar with making sense of numbers.

Key Projects

Predictive Modeling

Project Name	Description
Optimizing Ride Fares: A Dynamic Pricing Model for Ride-Sharing Services	Currently, ride-sharing prices are primarily set based on ride duration, overlooking fluctuating demand and supply. This project explores a dynamic pricing model powered by machine learning to enhance profitability while keeping prices appealing to customers. By experimenting with 12 ML algorithms and two feature engineering techniques (feature selection and polynomial expansion), the project developed a model that, when tested with a simulation of 100 customers, showed that increasing the key feature—expected ride duration—by 20% through a promotional campaign could generate a net profit of $2.4K. (URL)
Addressing Customer Churn in an E-Commerce Company	This project seeks to reduce an e-commerce company's customer churn rate from 16.8% to 10%. Using diagnostic analysis and a classification model, we focused on minimizing false negatives due to their higher financial impact. After testing various techniques and algorithms, we chose XGBoost and identified tenure and cashback amount as key factors for intervention. Simulations showed that with targeted strategies, achieving the 10% churn rate is feasible. (URL)
Development and Evaluation of a Classification Model for Spam Detection	This project developed a classification model to identify spam messages (1 for spam, 0 for legitimate) for a telecommunications company. F1 score was selected as the primary metric to balance false positives and false negatives. Logistic regression emerged as the best model, achieving an F1 score of 0.92 ± 0.01 across 10 folds. Additionally, the model's potential to save $23K through reduced spam impact highlights its financial and operational benefits. (URL)
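The spam detection project above uses the F1 score, which is the harmonic mean of precision and recall. The following is a minimal sketch in plain Python of how that metric is computed for binary labels (1 for spam, 0 for legitimate). The function name and the toy labels are hypothetical and for illustration only; they are not the project's code or data.

```python
def f1_score(y_true, y_pred):
    """Compute the F1 score for binary labels, treating 1 (spam) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged as spam
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that slipped through
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not taken from the project.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
print(round(f1_score(y_true, y_pred), 3))
```

Because F1 penalizes both false positives and false negatives, it suits a setting where blocking a legitimate message and letting spam through are both costly.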

Data Analysis

Project Name	Description
Evaluating Marketing Campaign Effectiveness for New Menu Items: An A/B Testing Approach	This project assesses which promotional campaign best boosts sales for a fast-food company's new menu items. Statistical analysis, including the Kruskal-Wallis H test and Dunn's post-hoc test, was used due to non-normal sales distributions and outliers. Results showed the first campaign achieved the highest median sales, but differences between campaigns were minor. It is recommended that the Marketing Manager reevaluate marketing strategies and target customers to improve campaign impact. (URL)
Improving the Number of Reviews: Exploring Review Patterns in Bangkok's Airbnb Landscape	Despite an increase in reviews, about 36% (5.7 thousand) of Airbnb listings in Bangkok received none from 2012 to 2022. This project explores why some listings lack reviews and offers recommendations for Airbnb Thailand. It finds that unreviewed listings often have higher prices and longer minimum stays, which may deter bookings and reviews. In contrast, reviewed listings are typically entire homes or apartments, more centrally located, and closer to popular areas. Recommendations include adjusting prices and minimum stays for unreviewed listings, running promotions to boost reviews, and improving marketing to highlight unique features and attractions. (URL)

Natural Language Processing

Project Name	Description
Regular Expression for Rule-Based Content Moderation	This project addresses taboo expressions in computer-mediated communications by detecting and censoring specific elements of messages (e.g., "Shit, I forgot!" $\rightarrow$ "****, I forgot!"). A rule-based approach using regular expressions was chosen over machine learning for its efficient implementation, high explainability to stakeholders, and reliable detection of inappropriate content through rule matching. (URL)
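The censoring behavior described in the content moderation project can be sketched with Python's standard `re` module. This is a minimal illustration only: the word list, the `TABOO_PATTERN` name, and the `censor` function are assumptions made for the example, not the project's actual rules.

```python
import re

# Hypothetical word list; the project's real rule set is not shown here.
TABOO_WORDS = ["shit", "damn"]

# Word boundaries (\b) avoid censoring harmless words that merely contain a taboo string.
TABOO_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in TABOO_WORDS) + r")\b",
    flags=re.IGNORECASE,
)


def censor(message: str) -> str:
    """Replace each matched taboo word with asterisks of the same length."""
    return TABOO_PATTERN.sub(lambda match: "*" * len(match.group(0)), message)


print(censor("Shit, I forgot!"))  # ****, I forgot!
```

Each rule is an explicit pattern, so a flagged message can be traced back to the exact word that triggered it, which is the explainability benefit mentioned above.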

