
CorrWu/SETI-reverse_image_search

Reverse Image Search

Summary

My goal is to run a reverse image search on a target interference pattern to find other interference patterns that resemble it.

Exploration

I found interference such as the following in mnt_blpd7/datax/dl/GBT_57436_51432_HIP77257_fine.h5.
[Image: Screen Shot 2022-11-30 at 23 55 48]
[Image: Screen Shot 2022-11-30 at 23 57 23]
I chose a much smaller portion of the data as the target interference to improve performance.
[Image: Screen Shot 2022-12-04 at 16 53 03]

Model

I tried ResNet50 and a model developed by Peter Ma. In the end, ResNet50 with ImageNet weights worked better for reverse image search.

from tensorflow.keras.applications import ResNet50  # assumed import path
model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

Generating Input Data

The variable interval is the width of each image in frequency, that is, the difference between its starting point (f_start) and its stopping point (f_stop). The variable step is the difference between the starting points (f_start) of consecutive images. Both can be changed if needed. I set interval to a small number (256 * 2.79e-6) to keep the program efficient. Setting step to a value smaller than interval makes consecutive images overlap, which can give more accurate results.
[Image: Screen Shot 2022-12-04 at 16 32 15]
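
To make the roles of interval and step concrete, here is a minimal sketch that only computes the window boundaries, without loading any data. The helper name window_bounds is hypothetical and not part of the original code; frequencies are assumed to be in the same units that are passed to blimpy as f_start and f_stop.

```python
import numpy as np

START, STOP = 1530, 1535
INTERVAL = 256 * 2.79e-6  # width of each window, same value as in the text


def window_bounds(step, interval=INTERVAL, start=START, stop=STOP):
    """Return an (N, 2) array of (f_start, f_stop) pairs, one row per window.

    Hypothetical helper for illustration only.
    """
    f_starts = np.arange(start, stop, step)
    return np.column_stack([f_starts, f_starts + interval])


# step == interval: windows sit side by side without overlapping
adjacent = window_bounds(step=INTERVAL)

# step == interval / 2: each window overlaps half of the previous one,
# which roughly doubles the number of images to process
overlapping = window_bounds(step=INTERVAL / 2)

print(len(adjacent), len(overlapping))
print(adjacent[3015])  # bounds of the window at index 3015
```

With step equal to interval, the window at index k starts at start + k * interval, which is the relation used later to map a neighbor index back to a frequency range.
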
I used skimage.transform.resize to resize each frequency interval to the shape (1, 224, 224, 3), which is the input shape ResNet50 expects here. The resulting data_list is a list of arrays with that shape. Each array is treated as an image when it is passed to the feature extraction function, and the extracted features are then compared with each other to find the nearest neighbors.

import blimpy
import numpy as np
from skimage.transform import resize

start = 1530
stop = 1535
interval = 256 * 2.79e-6    # Width of each image: f_stop - f_start
step = interval             # Spacing between the f_start values of consecutive images
data_list = []
# url is the path to the .h5 file, defined elsewhere
wf = blimpy.Waterfall(url, load_data=True, f_start=start, f_stop=stop)
for i in np.arange(start, stop, step):
    # f_start is left unrounded so that every window is exactly `interval` wide
    fstart, fstop = i, i + interval
    _, sub_data = wf.grab_data(f_start=fstart, f_stop=fstop)
    resized_data = resize(sub_data, (1, 224, 224, 3))
    data_list.append(resized_data)
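
It may be worth checking which axis ends up where in the resized array. When output_shape has more dimensions than the input, skimage.transform.resize appears to append the missing axes at the end of the input shape, so a 2-D (time, frequency) array would be treated as (time, frequency, 1, 1) before being resized to (1, 224, 224, 3). The sketch below shows an alternative that makes the layout explicit. It is an illustration under assumptions, not the code used for the results reported here. The helper to_model_input is hypothetical and expects an array that has already been resized to (224, 224), for example with resize(sub_data, (224, 224)).

```python
import numpy as np


def to_model_input(image_2d):
    """Turn a (224, 224) array into a (1, 224, 224, 3) batch of one image.

    Hypothetical helper: the single channel is copied three times and a
    leading batch axis is added.
    """
    if image_2d.shape != (224, 224):
        raise ValueError(f"expected shape (224, 224), got {image_2d.shape}")
    three_channel = np.repeat(image_2d[..., np.newaxis], 3, axis=-1)  # (224, 224, 3)
    return three_channel[np.newaxis, ...]  # (1, 224, 224, 3)


# Synthetic stand-in for one resized spectrogram window
rng = np.random.default_rng(0)
fake_window = rng.random((224, 224))
batch = to_model_input(fake_window)
print(batch.shape)
```

Built this way, each of the three channels holds the same time-frequency image, and the first axis is a batch of size one.
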

Data Preprocessing

I took the logarithm of the data and then scaled the result to the range 0 to 1.

def preprocess_input(data):
    log_input = np.log(data)
    # Min-max scaling: divide by the range so the output spans [0, 1]
    scale_input = (log_input - log_input.min()) / (log_input.max() - log_input.min())
    return scale_input
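
As a quick sanity check of the scaling step, the sketch below applies a logarithm followed by min-max scaling to synthetic, strictly positive data. The function name log_minmax and the sample values are made up for illustration. It assumes the input contains no zeros or negative values and is not constant, since the logarithm or the division would otherwise fail.

```python
import numpy as np


def log_minmax(data):
    """Log-transform strictly positive data, then rescale it to [0, 1]."""
    log_data = np.log(data)
    lo, hi = log_data.min(), log_data.max()
    return (log_data - lo) / (hi - lo)


# Synthetic power values spanning several orders of magnitude
power = np.array([[1.0, 10.0, 100.0],
                  [1e3, 1e4, 1e5]])
scaled = log_minmax(power)
print(scaled)
```

Dividing by the range (max minus min) rather than by the maximum alone is what guarantees that the smallest value maps to 0 and the largest to 1.
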

Feature Extraction

This function preprocesses the input array with the preprocess_input function above, runs it through the model, flattens the output, and L2-normalizes it to produce the feature vector.

from numpy.linalg import norm

def extract_features(input_arr, model):
    preprocessed_arr = preprocess_input(input_arr)
    features = model.predict(preprocessed_arr, verbose=0)
    flattened_features = features.flatten()
    # L2-normalize so that every feature vector has unit length
    normalized_features = flattened_features / norm(flattened_features)
    return normalized_features
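
To illustrate what the last two steps of extract_features do, here is a sketch that uses a random array in place of the model output. The shape (1, 7, 7, 2048) is what ResNet50 without its top layers produces for a 224 x 224 input; the random values and the name fake_features are stand-ins.

```python
import numpy as np
from numpy.linalg import norm

rng = np.random.default_rng(0)
fake_features = rng.random((1, 7, 7, 2048))  # stand-in for model.predict(...)

flattened = fake_features.flatten()          # 1-D vector of 7 * 7 * 2048 values
unit_vector = flattened / norm(flattened)    # L2 normalization

print(flattened.shape, norm(unit_vector))
```

Each image is therefore represented by a single unit-length vector, which is what the nearest-neighbor search compares.
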

Generating Features

This part of the code applies the extract_features function to each array in data_list and stores the resulting feature vectors in feature_list.

# One feature vector per array in data_list
feature_list = [extract_features(data, model) for data in data_list]

Finding the Nearest Neighbor

I imported NearestNeighbors from sklearn.neighbors to find the nearest neighbors using cosine distance and Euclidean distance. In this case, both yielded the same result, which is expected because the feature vectors are normalized to unit length.

neighbors = NearestNeighbors(n_neighbors=5, algorithm='brute', metric='cosine').fit(feature_list)

or,

neighbors = NearestNeighbors(n_neighbors=5, algorithm='brute', metric='euclidean').fit(feature_list)
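
The agreement between the two metrics can be checked on toy data. For unit-length vectors a and b, the squared Euclidean distance equals 2 * (1 - cos(a, b)), so sorting by either distance gives the same order. The sketch below verifies this with NumPy only; the random vectors stand in for real features.

```python
import numpy as np

rng = np.random.default_rng(42)
vectors = rng.normal(size=(50, 16))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit length, like the features

query = vectors[0]
cosine_dist = 1.0 - vectors @ query
euclid_dist = np.linalg.norm(vectors - query, axis=1)

top5_cosine = np.argsort(cosine_dist)[:5]
top5_euclid = np.argsort(euclid_dist)[:5]
print(top5_cosine, top5_euclid)
```

The distances themselves differ between the two metrics; only the ranking of the neighbors is the same.
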

Finding the Nearest Neighbor of a Certain Interval

The following image is the pattern with an f_start of 1530 + 256 * 2.79e-6 * 3015 and an interval of 256 * 2.79e-6. This pattern is at index 3015 (zero-based) of feature_list.

start = 1530 + 256 * 2.79e-6 * 3015
stop = start + 256 * 2.79e-6
wf.plot_waterfall(f_start=start, f_stop=stop)

[Image: Screen Shot 2022-12-04 at 16 53 03]

distances, indices = neighbors.kneighbors([feature_list[3015]])

The returned indices are [3015, 3014, 5538, 3348, 3981], sorted by increasing distance from pattern 3015. The first one is the target pattern itself.

# 1st nearest neighbor, excluding the target itself
start = 1530 + 256 * 2.79e-6 * 3014
stop = start + 256 * 2.79e-6
wf.plot_waterfall(f_start=start, f_stop=stop)

[Image: Screen Shot 2022-12-04 at 17 21 16]

# 2nd nearest neighbor, excluding the target itself
start = 1530 + 256 * 2.79e-6 * 5538
stop = start + 256 * 2.79e-6
wf.plot_waterfall(f_start=start, f_stop=stop)

[Image: Screen Shot 2022-12-04 at 17 22 42]

# 3rd nearest neighbor, excluding the target itself
start = 1530 + 256 * 2.79e-6 * 3348
stop = start + 256 * 2.79e-6
wf.plot_waterfall(f_start=start, f_stop=stop)

[Image: Screen Shot 2022-12-04 at 17 23 56]

# 4th nearest neighbor, excluding the target itself
start = 1530 + 256 * 2.79e-6 * 3981
stop = start + 256 * 2.79e-6
wf.plot_waterfall(f_start=start, f_stop=stop)

[Image: Screen Shot 2022-12-04 at 17 24 25]
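
The four plotting snippets above differ only in the index, so the start and stop frequencies can also be computed in a loop. This is a sketch: window_for_index is a hypothetical helper, and the plotting call is left as a comment because it needs the wf object loaded earlier.

```python
BASE = 1530
INTERVAL = 256 * 2.79e-6


def window_for_index(index, base=BASE, interval=INTERVAL):
    """Return (f_start, f_stop) for the window at a given feature_list index."""
    f_start = base + interval * index
    return f_start, f_start + interval


indices = [3015, 3014, 5538, 3348, 3981]  # result of kneighbors, from the text

# Skip the first entry, which is the target pattern itself
neighbor_windows = [window_for_index(i) for i in indices[1:]]
for rank, (f_start, f_stop) in enumerate(neighbor_windows, start=1):
    print(f"neighbor {rank}: {f_start:.6f} to {f_stop:.6f}")
    # wf.plot_waterfall(f_start=f_start, f_stop=f_stop)
```

This mapping only holds when step equals interval, as in the run described here; with a different step, the index would have to be multiplied by step instead.
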
