The dependencies for this project are managed by Poetry. To install them, run
poetry install
The main requirements are:
- Python 3.10
- PyTorch 2.1
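To sanity-check the environment, a quick version check can be run (illustrative only; the expected values in the comments come from the list above):

```python
# Quick sanity check of the environment: Python and PyTorch versions, plus
# whether a CUDA GPU is visible. Expected values are taken from the list above.
import sys
import torch

print(sys.version)        # expected: 3.10.x
print(torch.__version__)  # expected: 2.1.x
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
else:
    print("No CUDA GPU visible")
```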
A Dockerfile is provided to run the code in a container. To build the image, run
./build_docker_image.sh
The image name is $HOSTNAME/llm-transformer. To run the container, run
./docker.sh python -m llmt.main --help
This code was developed and tested on an NVIDIA RTX 4090 GPU with 24 GB of memory.
To authenticate with the Hugging Face Hub, run
huggingface-cli login
cp ~/.cache/huggingface/token ./data/
The token is expected to be in the ./data directory.
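As a rough sketch of how the copied token could then be used, for example inside the container (an assumption about the setup, not necessarily how llmt does it):

```python
# Hypothetical sketch: read the copied token from ./data and authenticate with
# the Hugging Face Hub. The path and this flow are assumptions about the setup.
from pathlib import Path
from huggingface_hub import login

login(token=Path("./data/token").read_text().strip())
```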
To download the dataset, run
./docker.sh python -m llmt.main dataset download
and it will be stored under ./data.
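Under the hood this presumably fetches a Hugging Face dataset into ./data; a minimal illustration with the datasets library (the dataset id below is just an example, not necessarily the one this project uses):

```python
# Illustration only: download a Hugging Face dataset into ./data.
# "openai_humaneval" is an example id, not necessarily the project's dataset.
from datasets import load_dataset

ds = load_dataset("openai_humaneval", cache_dir="./data")
print(ds)
```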
To train a model, run
./docker.sh python -m llmt.main train
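For readers unfamiliar with the objective, training a causal language model boils down to next-token prediction; the toy sketch below shows one optimization step (the "model", sizes, and hyperparameters are placeholders, not the project's training loop):

```python
# Toy illustration of one next-token-prediction training step. The "model"
# here is just an embedding plus a linear head, not the project's transformer;
# all sizes and hyperparameters are placeholders.
import torch
from torch import nn

vocab_size, d_model = 32768, 256
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

tokens = torch.randint(0, vocab_size, (8, 128))  # a fake batch of token ids
logits = model(tokens[:, :-1])                   # predict each next token
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(loss.item())
```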
We use the tokenizer from https://huggingface.co/replit/replit-code-v1-3b
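It can be loaded with the transformers library (a minimal sketch; trust_remote_code is needed because the repository ships a custom tokenizer implementation):

```python
# Load the Replit code tokenizer; trust_remote_code is required because the
# repository provides a custom tokenizer class.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("replit/replit-code-v1-3b", trust_remote_code=True)
ids = tokenizer("def add(a, b):\n    return a + b")["input_ids"]
print(len(ids), tokenizer.decode(ids))
```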
Evaluation is still TODO. Relevant benchmark harnesses (a sketch of the evaluation idea follows the list):
- https://github.com/openai/human-eval-infilling
- https://github.com/nuprl/MultiPL-E
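Both harnesses judge functional correctness by executing the generated code against unit tests; a rough sketch of that idea (illustrative only, not this project's evaluation code):

```python
# Illustrative only: run a generated completion together with the benchmark's
# unit tests and report whether they pass. Real harnesses sandbox this step,
# since the generated code is untrusted.
import subprocess
import sys
import tempfile

def passes_tests(prompt: str, completion: str, test_code: str, timeout: float = 10.0) -> bool:
    program = prompt + completion + "\n" + test_code
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```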
TODO
- Implement the test function to evaluate the generated code (see the sketch above)
- Use syntax trees for the target languages to strip whitespace that carries no information and may slow down learning
- Use syntax trees to change the variable names (a sketch follows below)
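As a rough illustration of the syntax-tree idea for Python only, the standard-library ast module can rename user variables; other target languages would need their own parsers (e.g. tree-sitter), which is an assumption here, not something the project ships:

```python
# Sketch: rename user-defined variables through the syntax tree rather than by
# text substitution. Python-only; other languages would need their own parsers
# (e.g. tree-sitter), which is an assumption, not part of this project.
import ast
import builtins

class RenameVariables(ast.NodeTransformer):
    def __init__(self):
        self.mapping = {}

    def visit_Name(self, node):
        if node.id in vars(builtins):
            return node  # leave builtins such as print, len, range untouched
        node.id = self.mapping.setdefault(node.id, f"v{len(self.mapping)}")
        return node

source = "total = price * count\nprint(total)"
print(ast.unparse(RenameVariables().visit(ast.parse(source))))
# -> v0 = v1 * v2
#    print(v0)
```

Re-emitting code with ast.unparse also normalizes formatting, which is one way to address the whitespace item above.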