Using an EfficientNet Model - Looking for advice #638

Closed
ozzyonfire opened this issue Mar 11, 2024 · 7 comments · Fixed by #639
Labels
question Further information is requested

Comments

@ozzyonfire

Question

Discovered this project from the recent Syntax podcast episode (which was excellent) - it got my mind racing with different possibilities.

I got some of the example projects up and running without too much issue and naturally wanted to try something a little more outside the box, which of course has led me down some rabbit holes.

I came across these Hugging Face models:
https://huggingface.co/chriamue/bird-species-classifier
and https://huggingface.co/dennisjooo/Birds-Classifier-EfficientNetB2

Great, the file size is only about 32 MB... however, just swapping this model into the example code didn't work - something about EfficientNet models not being supported yet. Okay, I'll just try to convert the model with the provided script.

Similar error about EfficientNet... Okay, I'll clone the repo and retrain using a different architecture... But looking at the training data, https://www.kaggle.com/datasets/gpiosenka/100-bird-species, it seems like maybe it's meant for EfficientNet?
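
For reference, the "provided script" here is scripts/convert.py from the transformers.js repo; its README documents an invocation along these lines (the model id below is just the first one linked above):

python -m scripts.convert --quantize --model_id chriamue/bird-species-classifier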

Also digging into how the above huggingface projects were done, I realized they are fine-tunes of other image classification models...

So my question is: can I fine-tune an existing Transformers.js image classification model, such as https://huggingface.co/Xenova/convnext-tiny-224? Or am I better off starting from the original https://huggingface.co/facebook/convnext-tiny-224 model, fine-tuning that, and then converting it to ONNX using the script?

Thanks for your help on this and for this awesome project. Really just looking for some direction.

ozzyonfire added the question label Mar 11, 2024
@xenova
Collaborator

xenova commented Mar 11, 2024

Hey there! 👋 So, there are two reasons why that model is (currently) not working:

  1. We don't yet support EfficientNet models (see the current list of 90 supported models here)
  2. The model repos you are referencing don't have their ONNX weights in an onnx subfolder (they are in the root of the repo); a future version of transformers.js will allow users to change the subfolder (expected layout sketched below).

Fortunately, both are quite easy to add support for! Let's see if I can get a PR out for this.
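
For context, a sketch of the repo layout transformers.js looks for (the file names are an assumption based on the converted repos used later in this thread; model.onnx matches what the conversion code below writes, and model_quantized.onnx is the usual name for the quantized variant):

  chriamue/bird-species-classifier/
    config.json
    preprocessor_config.json
    onnx/
      model.onnx
      model_quantized.onnx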

@ozzyonfire
Author

Great! Thanks for the quick reply.

My first instinct was just to clone the model into my own repo and reorganize it so it contained the onnx folder structure and weights. But that's when I ran into the script error about EfficientNet.

Appreciate the effort to get this working. I'm also totally happy to fine-tune an existing model if that's the best course of action.

@xenova
Collaborator

xenova commented Mar 11, 2024

Turns out this was actually a bit trickier than I originally guessed:

  1. The provided ONNX weights are not compatible with transformers.js (non-standard input/output names)
  2. Exporting with my normal script produces models with incorrect outputs; it turns out opset=9 is necessary.

For future reference, here is the conversion code I used:

import torch
import os
from PIL import Image
import requests
from transformers import AutoProcessor, AutoModelForImageClassification

def export_efficientnet(model_id, image):
  extractor = AutoProcessor.from_pretrained(model_id)
  model = AutoModelForImageClassification.from_pretrained(model_id)
  model.eval()

  # Input to the model
  inputs = extractor(image, return_tensors='pt')
  torch_out = model(inputs['pixel_values'])

  output_path = f"./models/{model_id}/onnx"
  os.makedirs(output_path, exist_ok=True)

  # Export the model
  torch.onnx.export(model,                                                # model being run
                    inputs['pixel_values'],                               # model input (or a tuple for multiple inputs)
                    f"{output_path}/model.onnx",                          # where to save the model (can be a file or file-like object)
                    export_params=True,                                   # store the trained parameter weights inside the model file
                    opset_version=9,                                      # the ONNX version to export the model to
                    do_constant_folding=True,                             # whether to execute constant folding for optimization
                    input_names = ['pixel_values'],                       # the model's input names
                    output_names = ['logits'],                            # the model's output names
                    dynamic_axes={'pixel_values' : {0 : 'batch_size'},    # variable length axes
                                  'logits' : {0 : 'batch_size'}})
  
  
# Load example image
url = "https://upload.wikimedia.org/wikipedia/commons/7/73/Short_tailed_Albatross1.jpg"
image = Image.open(requests.get(url, stream=True).raw)

export_efficientnet('chriamue/bird-species-classifier', image)
export_efficientnet('dennisjooo/Birds-Classifier-EfficientNetB2', image)

I might add this to the conversion script if I get time.
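
As a sanity check (not part of the original snippet, and assuming onnxruntime is installed), you could compare the exported model against the torch_out computed earlier, e.g. by appending something like this inside export_efficientnet:

  # Sanity check: run the exported model with ONNX Runtime and compare against PyTorch
  import onnxruntime
  import numpy as np

  session = onnxruntime.InferenceSession(f"{output_path}/model.onnx")
  onnx_out = session.run(None, {'pixel_values': inputs['pixel_values'].numpy()})[0]
  np.testing.assert_allclose(torch_out.logits.detach().numpy(), onnx_out, rtol=1e-3, atol=1e-5)
  print(f"{model_id}: ONNX and PyTorch outputs match")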

I've opened a PR (#639) to add support for the model to transformers.js.

In the meantime, it does work for your models 👍

import { pipeline } from '@xenova/transformers';

// Create image classification pipeline
const classifier = await pipeline('image-classification', 'chriamue/bird-species-classifier', {
    quantized: false,      // Quantized model doesn't work
    revision: 'refs/pr/1', // Needed until the model author merges the PR
});

// Classify an image
const url = 'https://upload.wikimedia.org/wikipedia/commons/7/73/Short_tailed_Albatross1.jpg';
const output = await classifier(url);
console.log(output)
// [{ label: 'ALBATROSS', score: 0.9999023079872131 }]

If you'd like to test it out now, you can install this branch with

npm install xenova/transformers.js#add-efficientnet

@xenova
Collaborator

xenova commented Mar 11, 2024

Similarly for the other model:

import { pipeline } from '@xenova/transformers';

// Create image classification pipeline
const classifier = await pipeline('image-classification', 'dennisjooo/Birds-Classifier-EfficientNetB2', {
    quantized: false,      // Quantized model doesn't work
    revision: 'refs/pr/3', // Needed until the model author merges the PR
});

// Classify an image
const url = 'https://upload.wikimedia.org/wikipedia/commons/7/73/Short_tailed_Albatross1.jpg';
const output = await classifier(url);
console.log(output)
// [{ label: 'ALBATROSS', score: 0.9996034502983093 }]

@ozzyonfire
Author

Dude, you are a machine! Was not expecting to get such a thorough reply. I will check it out as soon as I can. Thanks!

@xenova
Collaborator

xenova commented Mar 11, 2024

@ozzyonfire Happy to help! :)

@ozzyonfire
Author

Got it working - thanks again for your help on this!
