KevinZhoutianyi/FoNE

Fourier Number Embedding (FoNE)

📌 FoNE: Precise Single-Token Number Embeddings via Fourier Features

🔢 Efficient and accurate numerical representation for Large Language Models (LLMs).


🔥 Why Fourier Number Embedding (FoNE)?

🚀 Solving Tokenization Limitations in LLMs

Traditional LLMs tokenize numbers inefficiently, leading to:

  • Multiple tokens per number (e.g., "12345.6789" → 5 tokens in GPT-4, 10 in LLaMA2).
  • Loss of precision, impacting arithmetic and numerical reasoning.

FoNE maps each number directly to its Fourier representation, which gives:

  • ✅ More efficient running time
  • ✅ Precise number embeddings
  • ✅ Improved data efficiency

🔗 Read the full details on our website


📈 Key Benefits of FoNE

  • ✅ Single-token number embeddings
  • ✅ Improves accuracy on arithmetic tasks
  • ✅ Reduces training data needs by up to 64×
  • ✅ Works for any numeric data, including decimals & large numbers

🎯 Example: Tokenization Comparison

| Tokenizer | Tokenized Representation | Tokens Used |
|---|---|---|
| GPT-4, LLaMA3.2 (BPE) | 123 45 . 678 9 | 5 |
| LLaMA2 (Digitwise Tokenization) | 1 2 3 4 5 . 6 7 8 9 | 10 |
| FoNE (Ours) | 12345.6789 | 1 ✅ |
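
The token counts above can be reproduced with a small toy script. This is only an illustration: the function names are hypothetical, and the chunking rule (left-to-right groups of up to three digits) is an assumption that happens to match the split shown for this one example. Real BPE tokenizers rely on learned merges, so their output on other inputs may differ.

```python
import re


def digitwise_tokens(text):
    """One token per character, as in digit-wise tokenization."""
    return list(text)


def chunked_tokens(text, max_digits=3):
    """Toy stand-in for a BPE-style split (assumption, not a real tokenizer):
    digit runs are cut left to right into groups of at most `max_digits`,
    and every non-digit character is its own token."""
    tokens = []
    for piece in re.findall(r"\d+|\D", text):
        if piece.isdigit():
            tokens.extend(
                piece[i:i + max_digits] for i in range(0, len(piece), max_digits)
            )
        else:
            tokens.append(piece)
    return tokens


def single_token(text):
    """The whole number is kept as one token, as in the FoNE row."""
    return [text]


number = "12345.6789"
for name, split in [
    ("chunked (toy BPE)", chunked_tokens),
    ("digit-wise", digitwise_tokens),
    ("single token", single_token),
]:
    tokens = split(number)
    print(f"{name:18s} {' '.join(tokens):22s} {len(tokens)}")
```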

📊 Empirical Results

📌 Accuracy Trends on Arithmetic Tasks

FoNE achieves 99%+ accuracy with 64× less data than baseline models.

📌 Performance Highlights:

  • ✅ 100% accuracy on 6-digit integer addition
  • ✅ 98.4% accuracy on 50-digit integer addition
  • ✅ Significant gains in subtraction & multiplication tasks

[Figure: accuracy on arithmetic tasks]


🔧 How Does FoNE Work?

[Figure: overview of the FoNE method]
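
As a rough illustration of mapping a number to Fourier features, the sketch below assumes one (cos, sin) pair per digit position, with periods that are powers of ten, and shows that every digit can be read back from those pairs. The function names `fone_features` and `decode_digits` are hypothetical and are not this repository's API. The sketch covers non-negative numbers only and omits how the features are placed into the model's embedding space.

```python
import math
from decimal import Decimal


def fone_features(number, int_digits, frac_digits):
    """Map a non-negative decimal string to (cos, sin) pairs.

    Assumption: one pair per period 10**i, for i from -frac_digits + 1
    up to int_digits, so each pair encodes the number modulo that period.
    """
    x = Decimal(number)
    feats = []
    for i in range(-frac_digits + 1, int_digits + 1):
        period = Decimal(10) ** i
        # Decimal arithmetic keeps (x mod period) exact before converting.
        phase = float((x % period) / period)
        angle = 2 * math.pi * phase
        feats += [math.cos(angle), math.sin(angle)]
    return feats


def decode_digits(feats):
    """Recover the digits, least significant first.

    For period T, 10 * phase_T = digit + phase_{T/10}, so subtracting the
    previous phase and rounding isolates the digit. Assumes the number has
    no digits below the smallest period.
    """
    digits = []
    prev_phase = 0.0
    for k in range(0, len(feats), 2):
        cos_v, sin_v = feats[k], feats[k + 1]
        phase = (math.atan2(sin_v, cos_v) / (2 * math.pi)) % 1.0
        digits.append(round(10 * phase - prev_phase) % 10)
        prev_phase = phase
    return digits


feats = fone_features("12345.6789", int_digits=5, frac_digits=4)
print(len(feats))            # two values per digit position
print(decode_digits(feats))  # digits of 12345.6789, least significant first
```

Under these assumptions the embedding needs only two values per digit, which is why a single token can carry the full number.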

📖 Citation

If you find this project useful, please cite our work:

@article{zhou2025fone,
  title={FoNE: Precise Single-Token Number Embeddings via Fourier Features},
  author={Zhou, Tianyi and Fu, Deqing and Soltanolkotabi, Mahdi and Jia, Robin and Sharan, Vatsal},
  journal={arXiv preprint arXiv:2502.09741},
  year={2025}
}

βœ‰οΈ Contact

If you would like to discuss applying Fourier Number Embedding (FoNE) to quantization, data analysis, time series, or othersβ€”or explore adding new features to FNE, feel free to connect! πŸ“§ Email: tzhou029@usc.edu

πŸš€ If you find this useful, don't forget to ⭐️ the repo! πŸš€
