EDITVAL

Benchmarking Text-Guided Image Editing Methods

Overview

EDITVAL is a standardized benchmark for evaluating text-guided image editing methods across diverse edit types, validated through a large-scale human study.



EDITVAL consists of the following distinct components:

  • A seed dataset D consisting of carefully selected images from MS-COCO. These are the real images that the different editing methods must edit.
  • An attribute list A, which consists of the dimensions along which edits are made on the dataset D.
  • An evaluation template and procedure for a human study on the edited images.
  • An automated evaluation procedure that checks the quality of edits using pre-trained vision-language models for a subset of the attributes in A.

The attribute list A for ~100 images from MS-COCO can be downloaded from here. The format of the JSON file is as follows:

{
  "class_name" : {
    "image_id": { # image ids from MS-COCO
      "edit_attribute" : {
        "from" : ["initial state of attribute"],
        "to" : ["target states of attribute", ...]}}}
}
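As a minimal sketch of how this nested layout could be consumed, the snippet below flattens it into one record per target state. The sample class name, image id, attribute keys, and values are hypothetical and only illustrate the documented structure; the real file may use different key spellings.

```python
import json

# Hypothetical sample following the documented layout. The class name,
# image id, attribute keys, and values are made up for illustration.
SAMPLE = """
{
  "bench": {
    "000000000001": {
      "object_addition": {"from": [""], "to": ["cat", "bag"]},
      "color": {"from": ["brown"], "to": ["red"]}
    }
  }
}
"""


def iter_edits(attributes):
    """Yield (class_name, image_id, edit_attribute, source, target),
    one tuple per entry in the "to" list."""
    for class_name, images in attributes.items():
        for image_id, edit_specs in images.items():
            for edit_attribute, spec in edit_specs.items():
                # "from" holds the initial state; treat an empty list as unknown.
                source = spec["from"][0] if spec["from"] else None
                for target in spec["to"]:
                    yield class_name, image_id, edit_attribute, source, target


edits = list(iter_edits(json.loads(SAMPLE)))
for edit in edits:
    print(edit)
```

To read the downloaded file instead of the inline sample, replace `json.loads(SAMPLE)` with `json.load(...)` on an open file handle.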

The complete list of edit attributes for evaluation currently is:

  • Object Addition: adding an object to the image.
  • Object Replacement: replacing an existing object in the image with another object.
  • Size: changing the size of an object.
  • Position Replacement: changing the position of an object in the image (e.g., left, center, right).
  • Positional Addition: adding an object in a specific position in the image.
  • Alter Parts: modifying the details of an object.
  • Background: changing the background of the image.
  • Texture: changing the texture of an object (e.g., wooden table, polka dot cat).
  • Color: changing the color of an object.
  • Shape: changing the shape of an object (e.g., circle-shaped stop sign).
  • Action: changing the action that the main object is performing (e.g., dog running).
  • Viewpoint: changing the viewpoint from which the image is taken (e.g., photo of a dog from above).

More Details on EditVal Dataset and Pipeline

The EditVal benchmark contains 648 unique image-edit operations for 19 classes selected from MS-COCO, spanning a variety of real-world edits. Edit operations range from simple attribute categories, such as adding or replacing an object, to more complex ones, such as changing an action, changing the camera viewpoint, or replacing the position of an existing object.


MTurk Human Study

The template for running an MTurk study to evaluate the quality of image editing methods is provided here.

Together with the template, an input CSV file must be provided for the MTurk study. Each row of the CSV file represents one edit instance and contains these four fields:

  • url_org: url of the original image.
  • url_edit: url of the edited image.
  • prompt: the prompt used to edit the image.
  • class_name: name of the main object in the image.
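The four fields above can be assembled into an input file along these lines. This is a sketch only: the example URLs are placeholders, the helper name `write_mturk_csv` is hypothetical, and the header row is assumed to use the field names exactly as listed.

```python
import csv
import io

# Column names taken from the field list above.
FIELDNAMES = ["url_org", "url_edit", "prompt", "class_name"]


def write_mturk_csv(rows, handle):
    """Write one CSV row per edit instance, rejecting incomplete rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        missing = [name for name in FIELDNAMES if not row.get(name)]
        if missing:
            raise ValueError(f"row is missing fields: {missing}")
        writer.writerow({name: row[name] for name in FIELDNAMES})


# Placeholder URLs; replace them with publicly reachable image locations.
rows = [
    {
        "url_org": "https://example.com/original/apple.jpg",
        "url_edit": "https://example.com/edited/apple.jpg",
        "prompt": "Change apple to orange",
        "class_name": "apple",
    }
]

buffer = io.StringIO()
write_mturk_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to a file works the same way; open it with `newline=""` as the `csv` module recommends and pass the handle in place of `buffer`.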

An example of an input CSV file can be seen here. Below is an example of how the MTurk study looks to the workers.

Original Image
Edited Image

The right image is supposed to apply the prompt "Change apple to orange" to the left image.

How well is the edit from the given prompt applied?

  • Not applied
  • Minorly applied
  • Adequately applied
  • Perfectly applied

How well are the other properties (other than what the edit is targeting) of the main object (apple) preserved in the right image?

  • Object is completely changed
  • Some parts are preserved
  • Most parts are preserved
  • Other properties of the object are perfectly preserved

How well are the other properties (other than what the edit is targeting) of the main object (apple) preserved in the right image?

  • Completely changed
  • Some parts are preserved
  • Most parts are preserved
  • Perfectly preserved

Leaderboards

The human-study numbers below are computed only from the first question of the template, which does not consider changes to the rest of the image. This keeps the results comparable to our automatic evaluation framework. Each instance in the human study receives a score of 1.0 if the edit is rated Adequately applied or Perfectly applied, and a score of 0.0 otherwise.
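The scoring rule described here can be sketched as follows. The label strings and the `(edit_attribute, label)` input shape are assumptions; match the labels to whatever your MTurk results file actually contains. Averaging the binary scores per attribute is one plausible aggregation, not necessarily the exact procedure behind the leaderboard.

```python
from collections import defaultdict

# Assumed label strings for the first question; adjust them to the exact
# strings emitted by the template you ran.
SUCCESS_LABELS = {"Adequately applied", "Perfectly applied"}


def binarize(label):
    """Return 1.0 for a successful edit rating and 0.0 otherwise."""
    return 1.0 if label in SUCCESS_LABELS else 0.0


def per_attribute_scores(responses):
    """Average the binary scores for each edit attribute.

    `responses` is an iterable of (edit_attribute, label) pairs.
    """
    grouped = defaultdict(list)
    for edit_attribute, label in responses:
        grouped[edit_attribute].append(binarize(label))
    return {attr: sum(vals) / len(vals) for attr, vals in grouped.items()}


# Hypothetical responses, for illustration only.
responses = [
    ("Size", "Perfectly applied"),
    ("Size", "Not applied"),
    ("Object Addition", "Adequately applied"),
    ("Object Addition", "Minorly applied"),
    ("Object Addition", "Perfectly applied"),
]
scores = per_attribute_scores(responses)
print(scores)
```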

Human Study

Method | Object Addition | Object Replacement | Position Replacement | Positional Addition | Size | Alter Parts | Average

Automatic Evaluation

Method | Object Addition | Object Replacement | Position Replacement | Positional Addition | Size | Alter Parts | Average

Contact Us

Contact us at [email protected] if you wish to add your method to the leaderboards.

<script>
function sortTable(n, tableID) {
  var table, rows, switching, i, x, y, shouldSwitch, dir, switchcount = 0;
  table = document.getElementById(tableID);
  switching = true;
  // Start with descending order.
  dir = "desc";
  // Keep looping until a full pass makes no switch.
  while (switching) {
    switching = false;
    rows = table.rows;
    // Loop through all rows except the first, which holds the table headers.
    for (i = 1; i < (rows.length - 1); i++) {
      shouldSwitch = false;
      // Compare column n of the current row with the same column of the next row.
      x = rows[i].getElementsByTagName("TD")[n];
      y = rows[i + 1].getElementsByTagName("TD")[n];
      // Decide whether the two rows should swap, based on the direction.
      if (dir == "asc") {
        if (x.innerHTML.toLowerCase() > y.innerHTML.toLowerCase()) {
          shouldSwitch = true;
          break;
        }
      } else if (dir == "desc") {
        if (x.innerHTML.toLowerCase() < y.innerHTML.toLowerCase()) {
          shouldSwitch = true;
          break;
        }
      }
    }
    if (shouldSwitch) {
      // Swap the marked rows and record that a switch happened.
      rows[i].parentNode.insertBefore(rows[i + 1], rows[i]);
      switching = true;
      switchcount++;
    } else {
      // If nothing was switched while sorting "desc", the table was already
      // in descending order: flip to "asc" and run the loop again.
      if (switchcount == 0 && dir == "desc") {
        dir = "asc";
        switching = true;
      }
    }
  }
}
</script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/4.1.2/papaparse.js"></script>
<script>
// Append the parsed CSV rows to the table with the given id.
function arrayToTable(tableData, tableID) {
  var table = document.getElementById(tableID);
  $(tableData).each(function (i, rowData) {
    let row = table.insertRow(-1);
    $(rowData).each(function (j, cellData) {
      let c = row.insertCell(j);
      // Round every column except the first (the method name) to two decimals.
      if (j != 0) {
        cellData = Math.round(cellData * 100) / 100;
      }
      c.innerText = cellData;
    });
  });
  return table;
}
$.ajax({
  type: "GET",
  url: "./human_study_table.csv",
  success: function (data) {
    arrayToTable(Papa.parse(data).data, "human_study_table");
  }
});
$.ajax({
  type: "GET",
  url: "./aut_eval_table.csv",
  success: function (data) {
    arrayToTable(Papa.parse(data).data, "aut_eval_table");
  }
});
</script>