Train about custom data #6

Open
hewumars opened this issue Jan 27, 2021 · 9 comments
hewumars commented Jan 27, 2021

I use SOLAR for vehicle re-identification / pedestrian re-identification, but something goes wrong when I generate a custom dataset.
After the network converges, the negative-sample L2 distances are close to 0 in create_epoch_tuples.
In the evaluation, rank-1 is close to 100%, but mAP is very low.

# create db pickle
import pickle
import numpy as np
from tqdm import tqdm

# gil, train_idx_list, val_idx_list and db_dict are defined elsewhere in my script
_, _, image_paths, file_ids, labels = gil('custom_train.csv', '/home/lxk/ZHP/data/VeIDData/VERI', True)

for mode in ['train', 'val']:
    image_list = train_idx_list if mode == 'train' else val_idx_list

    for i, idx_list in tqdm(image_list.items()):
        for idx in idx_list:
            db_dict[mode]['cids'].append(image_paths[idx])  # image path
            db_dict[mode]['cluster'].append(labels[idx])    # class label

            # candidate positives: all other images of the same class
            pidxs_potential = [j for j in idx_list if j != idx]
            if len(pidxs_potential) == 0:
                continue

            pidxs = np.random.choice(pidxs_potential, 1).tolist()

            db_dict[mode]['bbxs'].append(None)        # no bounding box
            db_dict[mode]['qidxs'].append(idx)        # anchor image idx
            db_dict[mode]['pidxs'].append(pidxs[0])   # positive image idx

save_path = './db_gl18.pkl'
pickle.dump(db_dict, open(save_path, 'wb'))
# in TuplesBatchedDataset.__init__, I changed
self.images = [os.path.join(self.ims_root, db['cids'][i]+'.jpg') for i in range(len(db['cids']))]
# to
self.images = [db['cids'][i] for i in range(len(db['cids']))]
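One thing worth verifying before training: in the snippet above, qidxs stores the raw idx values, which only line up with the cids list if the loop appends an entry for every index in order. A quick consistency check, assuming qidxs/pidxs are meant to index into cids (this check is my addition, not part of SOLAR):

# sanity check: qidxs/pidxs must be valid indices into cids,
# and each anchor/positive pair must share a class label
for mode in ['train', 'val']:
    d = db_dict[mode]
    n = len(d['cids'])
    assert all(0 <= q < n for q in d['qidxs']), 'qidxs out of range'
    assert all(0 <= p < n for p in d['pidxs']), 'pidxs out of range'
    for q, p in zip(d['qidxs'], d['pidxs']):
        assert d['cluster'][q] == d['cluster'][p], 'anchor/positive label mismatch'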
tonyngjichun (Owner) commented:

Hi, thanks for opening this issue. Could you tell me how your dataset is structured? i.e., what is the content of your '/home/lxk/ZHP/data/VeIDData/VERI/custom_train.csv'?

hewumars commented Jan 28, 2021

[screenshot: the first rows of custom_train.csv]
landmark_id is the class ID; the images column lists the names of all images belonging to the same class.
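For reference, the layout this implies is something like the following (the values are purely illustrative; only the column roles come from the description above):

landmark_id,images
0002,0002_c002_000306.jpg 0002_c003_000311.jpg
0003,0003_c005_000689.jpg 0003_c011_000702.jpg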

tonyngjichun (Owner) commented Jan 28, 2021

Can you access the tensorboard log files from your training? (They are located in ./specs by default.) Your triplets should be visualised in the IMAGES tab; do you mind sharing an example?

hewumars commented Jan 28, 2021

Google Drive download URL: https://drive.google.com/file/d/1cwD8iiSeYsouimQKKn22Y9ChoVXuGnWk/view?usp=sharing

I removed '--soa --sos' for this experiment.
Training params: specs/gl18 --training-dataset gl18 --test-datasets veri_test --arch resnet101 --pool gem --p 3 --loss triplet --pretrained-type gl18 --loss-margin 1.25 --optimizer adam --lr 1e-6 -ld 1e-2 --neg-num 5 --query-size 2000 --pool-size 20000 --batch-size 32 --image-size 256 --update-every 1 --whitening --lambda 10 --no-val --flatten-desc --epochs 1000 --soa-layers ''

tonyngjichun (Owner) commented:

[screenshots: triplet visualisations from the tensorboard log]

According to your tensorboard files, many of your negatives are actually positives, which explains why the L2 distance is close to 0 during negative mining. Are you sure the landmark IDs are unique? i.e., that multiple landmark IDs do not correspond to the same vehicle class?
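A quick way to check this, assuming the CSV layout sketched earlier (pandas and the column names landmark_id/images are assumptions on my part):

import pandas as pd

df = pd.read_csv('custom_train.csv')

# every image name should map to exactly one landmark_id;
# if an image appears under two IDs, those IDs alias the same vehicle
seen = {}
for _, row in df.iterrows():
    for name in str(row['images']).split():
        if name in seen and seen[name] != row['landmark_id']:
            print('image {} appears under IDs {} and {}'.format(
                name, seen[name], row['landmark_id']))
        seen[name] = row['landmark_id']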

hewumars commented Jan 28, 2021

The dataset is VeRi-776 (github). I am sure the IDs are unique. The negatives are not positives, but the differences are very small.
I have also tried the Market1501 pedestrian dataset, and it has the same problem.

[screenshots: further triplet examples]

tonyngjichun (Owner) commented Jan 28, 2021

I see that you get rank-1 close to 100% but very low mAP; this makes sense given the triplets visualised above. The network is able to tell the subtle differences because the negatives are extremely hard (in the image/landmark retrieval community we would usually take these as positives). However, since the network is not exposed as much to moderately difficult negatives during training, like in the example below, it is less capable of ranking vehicles that differ more obviously from the query.

I am not an expert in person/vehicle re-ID, but I suppose it's plausible that in these data domains the hardest negative distances can be very close to 0, since the instances are considerably more confusing for the network than landmarks are. Therefore, hardest-negative sampling might not be the best choice for your dataset; you might want to add some thresholding or include easier negatives (see the sketch after this paragraph). Moreover, judging by your triplet examples, you might want the positives and negatives to be from the same viewpoint: right now the negatives are much closer to the anchor than the positive is, which makes the triplet loss practically impossible to minimise. So if your dataset has a viewpoint attribute, I suggest constraining the negatives' viewpoints to be as different from the anchor's as the positive's is, then mining from this constrained pool to find the hardest ones.
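A minimal sketch of what such thresholded negative mining could look like (the descriptor array, label array and dist_floor value are illustrative assumptions, not SOLAR's actual mining code):

import numpy as np

def mine_semi_hard_negatives(anchor_vec, vecs, labels, anchor_label,
                             dist_floor=0.3, num_neg=5):
    """Pick the hardest negatives whose L2 distance to the anchor stays
    above dist_floor, so near-duplicate negatives are skipped."""
    dists = np.linalg.norm(vecs - anchor_vec, axis=1)   # vecs: (N, D) array
    mask = (labels != anchor_label) & (dists > dist_floor)
    candidates = np.where(mask)[0]
    # hardest first = smallest distance among the remaining negatives
    order = candidates[np.argsort(dists[candidates])]
    return order[:num_neg].tolist()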

Also, would you be able to show me your test dataset? mAP depends on the number of ground-truth positive labels, so any mislabelling there can hurt mAP a lot even when the rank-1 predictions are nearly perfect.

[screenshot: an example triplet with a moderately difficult negative]

hewumars commented:

I modified test.py.

# NOTE: parser, datasets_names, load_network, get_data_root and extract_vectors
# come from the SOLAR repo's test.py; these are the other imports this snippet uses
import os
import glob
import time

import numpy as np
from torchvision import transforms

def main():
    args = parser.parse_args()

    # check if there are unknown datasets
    for dataset in args.datasets.split(','):
        if dataset not in datasets_names:
            raise ValueError('Unsupported or unknown dataset: {}!'.format(dataset))

    # check if test dataset are downloaded
    # and download if they are not
    # download_test(get_data_root())

    # setting up the visible GPU
    os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu_id

    # loading network
    net = load_network(network_name=args.network)
    net.mode = 'test'
    # x = torch.randn(1, 3, 256, 256, requires_grad=False)
    # torch.onnx.export(net, x, "solar.onnx", opset_version=12, verbose=True)

    print(">>>> loaded network: ")
    print(net.meta_repr())

    # setting up the multi-scale parameters
    ms = list(eval(args.multiscale))

    print(">>>> Evaluating scales: {}".format(ms))

    # moving network to gpu and eval mode
    net.cuda()
    net.eval()

    # set up the transform
    normalize = transforms.Normalize(
        mean=net.meta['mean'],
        std=net.meta['std']
    )
    transform = transforms.Compose([
        transforms.ToTensor(),
        normalize
    ])

    # evaluate on test datasets
    datasets = args.datasets.split(',')
    for dataset in datasets:
        start = time.time()

        print('')
        print('>> {}: Extracting...'.format(dataset))

        # prepare config structure for the test dataset
        dataset_root_path = os.path.join(get_data_root(),'test',dataset)
        images = []
        qimages = []
        images_path = os.listdir(os.path.join(dataset_root_path,'query'))
        for dir_name in images_path:
            image_paths = glob.glob(os.path.join(dataset_root_path,'query', dir_name, '*.jpg'))
            for image_path in image_paths:
                qimages.append(image_path)
        images_path = os.listdir(os.path.join(dataset_root_path,'gallery'))
        for dir_name in images_path:
            image_paths = glob.glob(os.path.join(dataset_root_path,'gallery', dir_name, '*.jpg'))
            for image_path in image_paths:
                images.append(image_path)
        # no query bounding boxes for this dataset
        bbxs = None

        # extract database and query vectors
        print('>> {}: database images...'.format(dataset))
        vecs = extract_vectors(net, images, args.image_size, transform, ms=ms, mode='test')
        vecs = vecs.numpy()

        print('>> {}: query images...'.format(dataset))
        qvecs = extract_vectors(net, qimages, args.image_size, transform, bbxs=bbxs, ms=ms, mode='test')
        qvecs = qvecs.numpy()

        print('>> {}: Evaluating...'.format(dataset))

        # search, rank, and print
        scores = np.dot(vecs.T, qvecs)
        ranks = np.argsort(-scores, axis=0)
        scoresT = scores.T
        ranksT = ranks.T
        top1 = 0
        top_one = 0
        mAP = 0.0
        false_alarm_num = 0
        for i in range(ranksT.shape[0]):
            t = 0
            rank = 0.0
            # the 4-character directory name before the file name encodes the ID
            query_id = qimages[i][qimages[i].rfind('/')-4:qimages[i].rfind('/')]
            gallery_id0 = images[ranksT[i][0]][images[ranksT[i][0]].rfind('/')-4:images[ranksT[i][0]].rfind('/')]
            if query_id == gallery_id0:
                top1 += 1
                if scoresT[i][ranksT[i][0]] > 0.6:
                    top_one += 1
            for j in range(ranksT.shape[1]):
                gallery_id = images[ranksT[i][j]][images[ranksT[i][j]].rfind('/')-4:images[ranksT[i][j]].rfind('/')]
                if query_id == gallery_id:
                    t += 1
                    rank += t / (j + 1)   # running precision at rank j+1
                elif scoresT[i][ranksT[i][j]] > 0.6:
                    false_alarm_num += 1
            if t == 0:
                continue
            mAP += rank / t
            print('{}.{} AP = {}%'.format(i, query_id, rank / t * 100))
        query_num = len(qimages)
        print('TOP1 num: {}'.format(top1))
        print('TOP1 recall: {}%'.format(top1 / query_num * 100))
        print('mAP = {}%'.format(mAP / query_num * 100))
        print('accuracy: {}%'.format(top_one / query_num * 100))
        print('false num: {}'.format(false_alarm_num))
        print('false rate: {}%'.format(false_alarm_num / query_num * 100))

The figure below shows the similarity matrix; its shape is [1367, 11579]. When the loss converges, all similarity values are close to 1.
[screenshots: the similarity matrix]

hewumars commented:

VeRi test data on Google Drive: https://drive.google.com/file/d/1NsH8e4NbFQYxtPc6QLL0aJfsIm3A4OHf/view?usp=sharing
I initially suspected a problem in the dataset generation, but found no errors.
I am considering whether the triplet loss needs to be combined with a classification loss to ensure retrieval accuracy.
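For reference, a minimal sketch of the triplet + classification (ID) loss combination that is common in the re-ID literature (the embedding dimension, class count of 576 for VeRi-776 train IDs, and loss weighting are illustrative assumptions):

import torch
import torch.nn as nn

class TripletPlusID(nn.Module):
    """Joint loss: triplet on the embeddings plus cross-entropy on an ID head."""
    def __init__(self, embed_dim=2048, num_classes=576, margin=0.3, id_weight=1.0):
        super().__init__()
        self.triplet = nn.TripletMarginLoss(margin=margin)
        self.classifier = nn.Linear(embed_dim, num_classes)
        self.ce = nn.CrossEntropyLoss()
        self.id_weight = id_weight

    def forward(self, anchor, positive, negative, anchor_labels):
        # metric term keeps same-ID embeddings close, different-ID ones apart;
        # the ID term anchors each embedding to an absolute class
        loss = self.triplet(anchor, positive, negative)
        logits = self.classifier(anchor)
        return loss + self.id_weight * self.ce(logits, anchor_labels)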
