HeatWave AutoML examples and performance benchmarks

HeatWave is an integrated, massively parallel, high-performance, in-memory query accelerator for MySQL Database Service that improves MySQL performance by orders of magnitude for analytics and mixed workloads. It is the only service that enables you to run OLTP and OLAP workloads simultaneously and directly from your MySQL database, without any changes to your applications. This eliminates the need for complex, time-consuming, and expensive data movement and integration with a separate analytics database. Your applications connect to the HeatWave cluster through standard MySQL protocols.

HeatWave users currently do not have an easy way to create machine learning models for the data in their database, or to generate predictions and explanations for it. These users, while database experts, are often relatively new to machine learning and can benefit from products that streamline the creation and use of machine learning models. HeatWave AutoML is the product that addresses this need.

Required Services:

  1. Oracle Cloud Infrastructure
  2. MySQL Database Service and HeatWave

Getting started

  1. Provision a MySQL Database Service instance and add a HeatWave cluster.
  2. Clone this repository and change into its directory:
       git clone https://github.com/oracle-samples/heatwave-ml.git
       cd heatwave-ml
  3. Create a Python virtual environment and activate it:
       python3.8 -m venv py_heatwaveml
       source py_heatwaveml/bin/activate
  4. Install the necessary Python packages:
       pip install pandas numpy unlzw3 scikit-learn pyreadr
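
As a quick sanity check after the installation step, a short script along these lines could confirm that the packages are importable from the active virtual environment. This is only a sketch: the helper name `missing_modules` is made up for illustration, and the module list assumes the usual import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages installed in the step above.
# Note that scikit-learn is imported under the name "sklearn".
REQUIRED_MODULES = ["pandas", "numpy", "unlzw3", "sklearn", "pyreadr"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running the script with the `py_heatwaveml` environment activated should list any package that still needs to be installed.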

Python Notebooks

To help customers get started with HeatWave AutoML and showcase its capabilities, we have prepared a set of Jupyter notebooks. Each notebook focuses on a simple, practical application of HeatWave AutoML components and walks you through a solution. Here is the list of existing notebooks.

| Description | Link |
| --- | --- |
| Training a model to predict whether a bank customer will subscribe to a term deposit | Bank marketing |
| Training a model to predict the price of a diamond | Diamonds |

SQL examples

SQL code to run training, prediction, and scoring on a variety of common machine learning classification and regression datasets.

| Example | Description | #Rows (Training Set) | #Features |
| --- | --- | --- | --- |
| airlines | Predict flight delays | 377568 | 8 |
| bank_marketing | Direct marketing – banking products | 31648 | 17 |
| cnae-9 | Documents with free-text business descriptions of Brazilian companies | 757 | 857 |
| connect-4 | 8-ply positions in the game of Connect-4 in which neither player has won yet – predict win/loss | 47290 | 161 |
| fashion_mnist | Clothing classification problem | 60000 | 785 |
| nomao | Active learning is used to efficiently detect data that refer to the same place, based on the Nomao browser | 24126 | 119 |
| numerai | Cleaned, regularized, and encrypted global equity data | 67425 | 22 |
| higgs | Monte Carlo simulations | 10500000 | 29 |
| census | Determine if a person makes > $50k | 32561 | 15 |
| titanic | Survival status of individuals | 917 | 14 |
| creditcard | Identify fraudulent transactions | 199364 | 30 |
| appetency | Predict the propensity of customers to buy new products | 35000 | 230 |
| black_friday | Customer purchases on Black Friday | 116774 | 10 |
| diamonds | Predict the price of a diamond | 37758 | 10 |
| mercedes | Time the car took to pass testing | 2946 | 377 |
| news_popularity | Predict the number of shares of an article in social networks (popularity) | 27750 | 60 |
| nyc_taxi | Predict the tip amount for an NYC taxi cab | 407284 | 15 |
| twitter | The popularity of a topic on social media | 408275 | 78 |
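
To give a feel for what a train, predict, and score run might look like, the following Python sketch assembles the SQL statements for one dataset. It is illustrative only and is not taken from the SQL files in this repository. The routine names (`sys.ML_TRAIN`, `sys.ML_MODEL_LOAD`, `sys.ML_PREDICT_TABLE`, `sys.ML_SCORE`) follow the HeatWave AutoML routines as we understand them; the trailing `NULL` options arguments are assumed to be accepted by the service version in use; and the schema, table, and column names in the usage example are hypothetical placeholders.

```python
def quote(value):
    """Quote a string as a SQL literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def automl_statements(train_table, test_table, target, task, metric, output_table):
    """Build the SQL statements for one train / predict / score run.

    Table names are given in schema.table form. All names passed in
    are placeholders chosen by the caller, not names from this repository.
    """
    return [
        # Train a model; its handle is stored in the @model session variable.
        f"CALL sys.ML_TRAIN({quote(train_table)}, {quote(target)}, "
        f"JSON_OBJECT('task', {quote(task)}), @model);",
        # Load the trained model before using it.
        "CALL sys.ML_MODEL_LOAD(@model, NULL);",
        # Write predictions for the test table into an output table.
        f"CALL sys.ML_PREDICT_TABLE({quote(test_table)}, @model, "
        f"{quote(output_table)}, NULL);",
        # Score the model on the test table with the chosen metric.
        f"CALL sys.ML_SCORE({quote(test_table)}, {quote(target)}, @model, "
        f"{quote(metric)}, @score, NULL);",
        # Read back the score.
        "SELECT @score;",
    ]


if __name__ == "__main__":
    # Hypothetical schema, table, and column names for a classification run.
    for statement in automl_statements(
        train_table="ml.census_train",
        test_table="ml.census_test",
        target="revenue",
        task="classification",
        metric="balanced_accuracy",
        output_table="ml.census_predictions",
    ):
        print(statement)
```

The printed statements can be compared with the SQL files for the corresponding example before running anything against a HeatWave cluster. For a regression dataset such as diamonds, the task would be `regression` with a regression metric instead.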

Contributing

This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2025 Oracle and/or its affiliates.

Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.
