
Amazon ML Challenge 2024

Welcome to the Amazon ML Challenge 2024 repository! This project showcases our solution for the machine learning challenge organized by Amazon, which focuses on extracting attribute values from images.

Project Overview

This repository contains the code and resources for our entry in the Amazon ML Challenge 2024. The challenge involves predicting specific attribute values (a number and its unit) from images using large vision-language and language models.

Table of Contents

  1. Installation
  2. Usage
  3. Data
  4. Model
  5. Results
  6. Team Members
  7. License

Installation

To set up the project locally, follow these steps:

  1. Clone the repository:

    git clone https://github.com/gjyotin305/AmazonMLChallenge24.git
  2. Navigate to the project directory:

    cd AmazonMLChallenge24
  3. Create a virtual environment and activate it:

    python -m venv env
    source env/bin/activate  # On Windows use: env\Scripts\activate
  4. Install the required dependencies:

    pip install -r requirements.txt

Usage

To download the images, run:

python student_resource_3/src/download_data.py
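
As a rough, hypothetical sketch of what this step does (the CSV location, column name, and output directory below are assumptions, not read from the script):

    # Hypothetical sketch of the download step; the actual logic lives in
    # student_resource_3/src/download_data.py. Names and paths are assumed.
    import os
    import pandas as pd
    import requests

    df = pd.read_csv("dataset/train.csv")       # assumed CSV name/location
    os.makedirs("images_train", exist_ok=True)  # assumed output directory

    for i, url in enumerate(df["image_link"]):  # assumed column name
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        with open(os.path.join("images_train", f"{i}.jpg"), "wb") as f:
            f.write(resp.content)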

To add the local image paths to the CSV, run:

python preprocess.py
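
Conceptually, this step attaches a local path column to each CSV row so that later stages can find the downloaded images. A minimal sketch, under the same naming assumptions as above:

    # Hypothetical sketch of preprocess.py; column and file names are assumed.
    import pandas as pd

    df = pd.read_csv("dataset/train.csv")  # assumed CSV name/location
    # Map each row to the file written by the download step (assumed naming).
    df["image_path"] = [f"images_train/{i}.jpg" for i in range(len(df))]
    df.to_csv("dataset/train_with_paths.csv", index=False)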

To run the image prediction and processing pipeline:

  1. Ensure you have the dataset available at the specified paths or adjust the paths in the script accordingly.

  2. Run the main script:

    python eval.py

    This script will process images and output predictions as defined in the code.

  3. For specific tasks or stages, you can modify and execute scripts located in the scripts folder.

Data

The project uses images and related data for training and testing. Ensure that you have the dataset in the following directory structure:

  • ../dataset/: Contains CSV files and other metadata.
  • /data/.jyotin/AmazonMLChallenge24/student_resource 3/images_train/: Contains training images (an absolute path from our setup; adjust it to your own machine).

Model

The project utilizes the LlavaNextForConditionalGeneration model from Hugging Face’s Transformers library. The model is fine-tuned for the specific task of extracting and analyzing numerical information from images.

  • Model Name: llava-hf/llava-v1.6-mistral-7b-hf
  • Processor: LlavaNextProcessor
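
As a minimal, hypothetical sketch of this stage (only the model id and processor class come from this repository; the prompt wording, image path, and generation settings are assumptions):

    import torch
    from PIL import Image
    from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

    model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
    processor = LlavaNextProcessor.from_pretrained(model_id)
    # Loaded in float16 for faster inference, as noted below.
    model = LlavaNextForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")

    image = Image.open("images_train/0.jpg")  # assumed image path
    # Prompt template used by the Mistral-based LLaVA-NeXT checkpoints.
    prompt = "[INST] <image>\nDescribe all numerical values and units visible in this image. [/INST]"
    inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=200)
    description = processor.decode(out[0], skip_special_tokens=True)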

After we get the image description from LlavaNext, we then use a second model,

  • Model Name: Phi3-Mini-4K Instruct,

to distill the relevant text into structured JSON, which is then validated so that only permitted units and values are accepted.
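
A hypothetical sketch of this second stage (the Hugging Face model id, prompt wording, and allowed-unit set are assumptions; the real pipeline would validate against the challenge's full unit list):

    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, trust_remote_code=True
    ).to("cuda")

    description = "Net weight: 500 grams"  # output of the LlavaNext stage
    ALLOWED_UNITS = {"gram", "kilogram", "centimetre", "volt", "watt"}  # assumed subset

    prompt = (
        "Extract the attribute value and unit from the text below as JSON, "
        'e.g. {"value": 500, "unit": "gram"}.\n' + description
    )
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=64)
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

    # Accept the prediction only if it parses and uses a permitted unit.
    try:
        pred = json.loads(reply)
        valid = pred.get("unit") in ALLOWED_UNITS and isinstance(pred.get("value"), (int, float))
    except json.JSONDecodeError:
        valid = False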

We load the models in float16 for better execution speed.

End to end, our approach takes only about 10 seconds per image.

Results

Results of the predictions and analyses are printed to the console. Modify the script to save results to files or visualize them as needed.

Team Members

Meet our team members:

  1. Rhythm Baghel
  2. Jyotin Goel
  3. Harshiv Shah
  4. Mehta Jay Kamalkumar

License

This project is licensed under the MIT License - see the LICENSE file for details.

