A toolset wrapper for Sklearn, pandas, etc. for data analysis and ML tasks in the terminal.

whuang022ai/Bobatea

Bobatea 波霸奶茶

An all-in-one, low-code command-line toolbox that makes data scientists' daily work easier.

Supports common data cleaning and machine learning methods from Pandas, Sklearn, Seaborn, etc.

Install

Add /your_path/Bobatea_datatool/bin to your PATH in ~/.bashrc, for example:

PATH=~/Bobatea_datatool/bin:$PATH

Then reload the shell configuration:

$ source ~/.bashrc

and run install.sh:

$ bash install.sh

Supported commands

command     function
index       add an index to the data
header      add a header to the data
take        take the wanted columns
drop        drop unwanted columns
range       take the wanted range of rows
merge       merge two datasheets together
mergebyix   merge multiple CSV files by index
group       group by the wanted column; outputs multiple CSV files
csvf        filter by feature value
t           transpose the data
log         apply a logarithm
mean        apply the arithmetic mean
pca         apply principal component analysis (PCA)
tsne        apply t-SNE
scatter     plot a scatter plot
kmean       plot 2D k-means
pair        plot a pair plot
curve       plot a line curve
hcluster    plot a hierarchical cluster plot

These commands combine with standard shell tools and operators such as cat, |, and >, so data pipelines can be built faster and more easily.
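
How a filter of this kind composes with | and > can be sketched in Python. This is not Bobatea's source code. It is a minimal illustration using only the standard library, it assumes the input already has a header row, and the function name drop_columns is hypothetical.

```python
import csv
import io


def drop_columns(src, dst, unwanted):
    """Copy CSV rows from src to dst, omitting the columns named in unwanted."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to write
    # Remember the positions of the columns that survive.
    keep = [i for i, name in enumerate(header) if name not in unwanted]
    writer.writerow([header[i] for i in keep])
    for row in reader:
        if row:  # skip blank lines, such as a trailing newline at end of file
            writer.writerow([row[i] for i in keep])


# In-memory demo; text streams stand in for the two ends of a pipe.
sample = io.StringIO("sepal_length,species\n5.1,Iris-setosa\n")
out = io.StringIO()
drop_columns(sample, out, {"species"})
print(out.getvalue(), end="")
```

Because the function only touches two text streams, the same call could be wired to sys.stdin and sys.stdout, which is what lets a command sit between cat and > in a pipeline.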

Example: applying PCA to the iris data

Step 1. Get the data:

$ wget http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

Step 2. Add a header and index, choose the features, run PCA, and plot the result, all in less than one minute:

$ cat iris.data | header --h "sepal_length,sepal_width,petal_length,petal_width,species" | drop species | pca | scatter
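
For readers who want to see roughly what this pipeline corresponds to in the underlying libraries, here is a hedged sketch using pandas and scikit-learn directly. It is an assumption about equivalent library calls, not Bobatea's implementation. The function name iris_pca, the output column names pc1 and pc2, and the choice of two components are all illustrative.

```python
import io

import pandas as pd
from sklearn.decomposition import PCA

COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def iris_pca(source, n_components=2):
    """Name the columns, drop the label column, and project onto principal components."""
    # Roughly the `header --h "..."` step: the raw file has no header row.
    df = pd.read_csv(source, header=None, names=COLUMNS)
    # Roughly the `drop species` step: keep only the numeric features.
    features = df.drop(columns=["species"])
    # Roughly the `pca` step (two components is an assumption made here).
    coords = PCA(n_components=n_components).fit_transform(features)
    names = [f"pc{i + 1}" for i in range(n_components)]
    return pd.DataFrame(coords, columns=names, index=df.index)


# Five rows in the iris.data layout, used here in place of the downloaded file.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.4,3.2,4.5,1.5,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
projected = iris_pca(sample)
print(projected.shape)  # (5, 2)
```

Passing the path "iris.data" instead of the in-memory sample would run the same steps on the downloaded file. With matplotlib installed, projected.plot.scatter(x="pc1", y="pc2") would give a plot comparable to the final scatter step.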

For more examples, see ./test/.
