I'm a Data Analyst with experience in data engineering, system integration, and cloud-based solutions. I hold a Master of Science in Applied Data Science from Indiana University, and I am passionate about data analytics, AI, and machine learning. I'm actively seeking opportunities to work on impactful projects as a Data Analyst or Data Engineer.
- Languages: Python, R, SQL, Java, C
- Databases: PostgreSQL, MySQL, MongoDB, Snowflake, MS Access
- Data Visualization: Tableau, Power BI, Plotly, Excel
- Tools: Jira, Confluence, Lucidchart, MS Project, HP ALM
- Methodologies: Agile, Scrum, Waterfall
- Career Essentials in Data Analysis by Microsoft
- Microsoft Azure Data Fundamentals
- Data Analytics with Microsoft Fabric
- HackerRank SQL, R (Intermediate)
- Atlassian Agile Project Management Professional
- Checked efficacy using a pre-processed dataset (CA, CM, and CI classes, i.e. confirmed active, confirmed moderately active, and confirmed inactive) from Moleculenet.ai.
- Merge data: linked NSC identifiers across files to combine screening results, EC50/IC50 values, and structures.
- Filter compounds: focused on the CA and CM classes to find active candidates.
- Selectivity Index (SI): calculated SI = IC50/EC50 to identify compounds with high efficacy and low toxicity.
- Data preprocessing:
  - Managed duplicate entries.
  - Resolved mismatched screening conclusions.
  - Applied flag interpretation signs to values.
  - Handled missing data.
- ML model: performed random splitting (80% train, 20% test).
- Extracted molecular descriptors (e.g., logP, Morgan fingerprints, MORSE) from the data.
- Trained base models and checked them against the test data.
- Evaluated models using accuracy, F1-score, and Cohen’s kappa, aligning predictive insights with clinical research objectives.
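
The merge, filter, and Selectivity Index steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the NSC numbers, the EC50/IC50 values, and the helper names (`merge_by_nsc`, `selectivity_index`, `rank_active`) are all hypothetical.

```python
# Hypothetical screening conclusions keyed by NSC number (values are made up).
screening = {1001: "CA", 1002: "CM", 1003: "CI", 1004: "CA"}

# Hypothetical potency data keyed by NSC number (same units for EC50 and IC50).
potency = {
    1001: {"ec50": 0.5, "ic50": 50.0},
    1002: {"ec50": 2.0, "ic50": 20.0},
    1003: {"ec50": 10.0, "ic50": 12.0},
    1004: {"ec50": None, "ic50": 30.0},  # missing EC50
}


def merge_by_nsc(screening, potency):
    """Inner-join the two sources on NSC number."""
    shared = sorted(screening.keys() & potency.keys())
    return [
        {"nsc": nsc, "conclusion": screening[nsc], **potency[nsc]}
        for nsc in shared
    ]


def selectivity_index(ic50, ec50):
    """Return SI = IC50 / EC50, or None when it cannot be computed."""
    if ic50 is None or ec50 is None or ec50 <= 0:
        return None
    return ic50 / ec50


def rank_active(records, active=("CA", "CM")):
    """Keep active classes with a computable SI, highest SI first."""
    rows = []
    for record in records:
        if record["conclusion"] not in active:
            continue
        si = selectivity_index(record["ic50"], record["ec50"])
        if si is None:
            continue  # skip compounds with missing or invalid values
        rows.append({**record, "si": si})
    return sorted(rows, key=lambda row: row["si"], reverse=True)


ranked = rank_active(merge_by_nsc(screening, potency))
for row in ranked:
    print(row["nsc"], row["conclusion"], row["si"])
```

With these made-up values, compound 1001 ranks first (SI = 100.0), followed by 1002 (SI = 10.0). Compound 1003 is dropped as inactive and 1004 is dropped because its EC50 is missing. Skipping rows with missing values is one possible policy; imputing or flagging them would be equally valid choices.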
2. MULTI-CLASS GENRE CLASSIFICATION using R
- Automatic genre classification is a long-standing research problem in Music Information Retrieval (MIR), where the goal is to capture musical diversity computationally.
- Performed audio feature extraction and music genre classification using Spotify's audio features and a diverse dataset.
- A few other projects exploring concepts in R
- Tools: Python, NLP, Data Visualization
- Description: Applied NLP techniques to analyze customer feedback and classify sentiment as positive, negative, or neutral. Achieved 79% accuracy using machine learning models (Naive Bayes, Decision Tree, KNN).
- Tools: Python, Machine Learning, Streamlit
- Description: Developed a web app to predict real estate sales using Linear Regression, Random Forest, and Gradient Boosting. Enabled city-specific and overall sales predictions with user input.
- Tools: Shell, Airflow, Kafka
- Description: Designed and implemented ETL pipelines to integrate data from multiple sources into a centralized data warehouse, improving data quality by 25%.
- Coursera: Link
- Tools: SQL, NoSQL, GCP, Apache Airflow, GitHub, RESTful APIs, Flask, ETL/ELT, Data warehouses
- Tools: Oracle DB, HP ALM, Python, Automation testing scripts, Data warehouses
- Master of Science in Applied Data Science | Indiana University Indianapolis | Jan 2023 – May 2024
  - Coursework: Data Analytics using Python and R, Data Visualization, Deep Learning, Cloud Computing, DBMS, Statistics
  - Dean’s Scholarship Recipient
- Bachelor of Engineering | Mangalore Institute of Technology and Engineering, VTU, India
I'm open to collaborating on interesting projects or discussing new opportunities. Feel free to reach out!
- 📧 Email: parimala.js27@gmail.com
- 🔗 LinkedIn: LinkedIn Profile
- 🔗 Portfolio: Portfolio
- 🔗 HackerRank: HackerRank