Releases: AI-Hypercomputer/JetStream
v0.2.2
Key Changes
- Enable observability in JetStream Server (Prometheus metrics)
- Enable JAX profiler support on single-host JetStream Server
- Support both text and token-ID I/O for the JetStream Decode API
- Add health check API
- Support MLPerf evaluation
- Enable JetStream Server E2E tests
- Increase unit test coverage (>=96%)
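Since this release adds Prometheus metrics to the server, here is a minimal sketch of consuming the Prometheus text exposition format that such a metrics endpoint serves. The metric names and sample payload below are illustrative assumptions, not JetStream's actual metric names.

```python
# Parse the Prometheus text exposition format into a {metric: value} dict.
# Simplified: ignores HELP/TYPE comment lines and assumes no spaces inside
# label values. Metric names here are hypothetical, for illustration only.

def parse_prometheus_text(payload: str) -> dict[str, float]:
    metrics = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip comments and blank lines
            continue
        name_part, _, value = line.rpartition(" ")
        metrics[name_part] = float(value)
    return metrics

sample = """\
# HELP jetstream_decode_requests_total Hypothetical request counter.
# TYPE jetstream_decode_requests_total counter
jetstream_decode_requests_total 42
jetstream_decode_batch_utilization 0.875
"""
print(parse_prometheus_text(sample))
```

In practice a monitoring agent scrapes the server's metrics port on an interval; the parsing above is what happens to each scraped payload.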
What's Changed
- Accuracy eval mlperf by @jwyang-google in #76
- Add metadata metrics by @yeandy in #77
- Fix pad_tokens function description by @FanhaiLu1 in #80
- Prometheus Metrics by @Bslabe123 in #71
- Update JetStream grpc proto to support I/O with text and token ids by @JoeZijunZhou in #78
- Update benchmark script to easily test llama-3 by @bhavya01 in #83
- Unit test coverage cleanup by @JoeZijunZhou in #81
- Allow tokenizer to customize stop_tokens by @qihqi in #84
- Decode Batch Percentage Metrics/Improved Scraping by @Bslabe123 in #82
- Bump requests from 2.31.0 to 2.32.0 in the pip group across 1 directory by @dependabot in #86
- Add profiling support and update docs by @JoeZijunZhou in #85
- Add ray disaggregated serving support by @FanhaiLu1 in #87
- Ensure server warmup before benchmark by @JoeZijunZhou in #91
- Add healthcheck support for JetStream by @vivianrwu in #90
- Add JetStream E2E test CI by @JoeZijunZhou in #89
- Release v0.2.2 by @JoeZijunZhou in #95
New Contributors
- @jwyang-google made their first contribution in #76
- @Bslabe123 made their first contribution in #71
- @vivianrwu made their first contribution in #90
Full Changelog: v0.2.1...v0.2.2
v0.2.1
Key Changes
- Support Llama3 tokenizer
- JetStream Tokenizer refactor
- Disaggregation preparation work
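The tokenizer refactor in this release introduced an abstract tokenizer class (PR #53) so the engine can swap implementations such as the new Llama3 tokenizer (PR #67). A toy sketch of what such an interface might look like; the class and method names are illustrative assumptions, not JetStream's actual API:

```python
# Hypothetical abstract tokenizer interface with one trivial implementation.
from abc import ABC, abstractmethod


class Tokenizer(ABC):
    """Common interface so the engine is agnostic to the tokenizer backend."""

    @abstractmethod
    def encode(self, text: str) -> list[int]: ...

    @abstractmethod
    def decode(self, token_ids: list[int]) -> str: ...


class CharTokenizer(Tokenizer):
    """Trivial concrete tokenizer: one token per character (codepoint)."""

    def encode(self, text: str) -> list[int]:
        return [ord(c) for c in text]

    def decode(self, token_ids: list[int]) -> str:
        return "".join(chr(t) for t in token_ids)


tok = CharTokenizer()
ids = tok.encode("hi")
print(ids, tok.decode(ids))  # → [104, 105] hi
```

A real backend (SentencePiece, Llama3's tokenizer) would implement the same two methods, which is what lets the serving code stay unchanged across models.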
What's Changed
- add sample_idx in InputRequest for debugging by @morgandu in #32
- Update README.md with user guides by @JoeZijunZhou in #34
- Update README.md with PT user guide by @JoeZijunZhou in #35
- Reorganize unit tests and update CICD by @JoeZijunZhou in #37
- Add badges for JetStream by @JoeZijunZhou in #38
- Bump idna from 3.6 to 3.7 by @dependabot in #39
- Reformat benchmark metrics by @yeandy in #42
- Update server host default value by @JoeZijunZhou in #43
- Refactor readme by @FanhaiLu1 in #41
- Add missing Documentation by @FanhaiLu1 in #47
- Update README.md to fix broken link by @charbull in #50
- Add np padded token support by @FanhaiLu1 in #49
- Format token utils and test by @FanhaiLu1 in #51
- Align Tokenizer in JetStream by @JoeZijunZhou in #40
- Do nothing for nd array in copy_to_host_async by @FanhaiLu1 in #52
- Add jax_padding support driver and server lib by @FanhaiLu1 in #54
- Update maxtext user guide by @JoeZijunZhou in #56
- Fix benchmark script type issue by @JoeZijunZhou in #59
- Fix requester flag default value by @JoeZijunZhou in #60
- Fix float division by zero in benchmark by @FanhaiLu1 in #62
- Register IFRT proxy backend when proxy is defined in the jax_platforms by @zhihaoshan-google in #63
- Add an abstract class for Tokenizer by @bhavya01 in #53
- refactor slice_to_num_chips to adapt to Cloud config by @zhihaoshan-google in #65
- Support llama3 tokenizer by @bhavya01 in #67
- Prerequisite work for supporting disaggregation by @zhihaoshan-google in #68
- Create __init__.py in Jetstream/third_party by @bhavya01 in #69
- Add tokenize_and_pad function for backward compatibility by @FanhaiLu1 in #70
- Release v0.2.1 by @JoeZijunZhou in #72
- Bump tqdm from 4.66.1 to 4.66.3 in the pip group across 1 directory by @dependabot in #73
- Release v0.2.1 with docs update by @JoeZijunZhou in #74
New Contributors
- @dependabot made their first contribution in #39
- @yeandy made their first contribution in #42
- @charbull made their first contribution in #50
- @zhihaoshan-google made their first contribution in #63
- @bhavya01 made their first contribution in #53
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Major Changes
- Support JetStream MaxText inference on Cloud TPU VM
- Support JetStream PyTorch inference on Cloud TPU VM
- Support Continuous Batching with interleaved mode in JetStream
- Support online serving benchmarking
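Continuous batching with interleaving means decode slots never sit idle: when one request finishes, its slot is immediately refilled from the queue rather than waiting for the whole batch to drain. A simplified sketch of the idea; the slot count and the per-request "remaining tokens" model are illustrative assumptions, not JetStream's actual scheduler:

```python
# Toy continuous-batching loop: each request is modeled only by how many
# tokens it still needs; one loop iteration decodes one token per active slot.
from collections import deque


def continuous_batching(lengths: list[int], num_slots: int = 2) -> list[int]:
    """lengths[i] = tokens request i still needs; returns completion order."""
    queue = deque(range(len(lengths)))
    remaining = list(lengths)
    slots: list[int] = []
    finished: list[int] = []
    while queue and len(slots) < num_slots:  # fill the initial batch
        slots.append(queue.popleft())
    while slots:
        for req in list(slots):  # one decode step across the active batch
            remaining[req] -= 1
            if remaining[req] == 0:
                slots.remove(req)
                finished.append(req)
                if queue:  # interleave: refill the freed slot immediately
                    slots.append(queue.popleft())
    return finished


print(continuous_batching([3, 1, 2]))  # → [1, 0, 2]
```

The short request (index 1) finishes first and its slot is handed to request 2 mid-batch, which is the throughput win over static batching, where the slot would stay empty until request 0 also finished.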
What's Changed
- Add unit tests CI github action by @JoeZijunZhou in #1
- Refine thread in orchestrator by @JoeZijunZhou in #2
- Optimize maximum threads to saturate decoding capacity by @JoeZijunZhou in #3
- Add benchmarks maximum threads config by @JoeZijunZhou in #4
- First support necessary for MaxText by @rwitten in #5
- Support gracefully stopping orchestrator and server by @JoeZijunZhou in #6
- Save request outputs and add eval accuracy support by @FanhaiLu1 in #8
- Use parameter based num as inference request max output length by @FanhaiLu1 in #10
- Fix output token drop issue by @JoeZijunZhou in #9
- Add option to warm up by @qihqi in #11
- Replace token_list with generated_text in saved outputs by @FanhaiLu1 in #12
- Refine requester util by @JoeZijunZhou in #15
- Add filtering for sharegpt based on conversation starter by @patemotter in #17
- Allow more requests than available data by @patemotter in #19
- Fix starvation with async server and interleaving optimization by @JoeZijunZhou in #13
- Add Token util unit test by @FanhaiLu1 in #20
- Fix llama2 decode bug in tokenizer by @FanhaiLu1 in #22
- Fix whitespace replacement bug by @FanhaiLu1 in #24
- Update benchmark to run openorca dataset by @morgandu in #21
- Add model ckpt conversion and AQT scripts for JetStream MaxText Serving by @JoeZijunZhou in #23
- Refactor to sample before tokenize by @morgandu in #26
- Update ckpt conversion scripts by @JoeZijunZhou in #25
- move tokenizer model to third party llama2 by @FanhaiLu1 in #27
- Support JetStream MaxText user guide by @JoeZijunZhou in #28
- Enable pylint linter and pyink formatter by @JoeZijunZhou in #29
- Update README by @JoeZijunZhou in #30
- Release v0.2.0 by @JoeZijunZhou in #31
New Contributors
- @JoeZijunZhou made their first contribution in #1
- @rwitten made their first contribution in #5
- @FanhaiLu1 made their first contribution in #8
- @qihqi made their first contribution in #11
- @patemotter made their first contribution in #17
- @morgandu made their first contribution in #21
Full Changelog: https://github.com/google/JetStream/commits/v0.2.0