
Computer Vision @ University of Sussex – Spring 2018
Coursework Assignment
Deadline: 11th May 2018 at 4PM
This assignment brief was first released on 27th March 2018
The assignment, which is worth 50% of the total grade, is separated into 2 components:
a 2-page report and source code. The 2-page report should comprise:
1. 1-page summary of the research paper “Zero-Shot Learning - A Comprehensive Evaluation
of the Good, the Bad and the Ugly” (https://arxiv.org/pdf/XXXXXXXXXX.pdf) [1];
2. 1-page summary of your implementation of a zero-shot recognition system based on
the Direct Attribute Prediction (DAP) concept (Lecture 13).
The DAP system models the probability of a certain object (e.g. a polar bear) being present
in the image using the probabilities of presence for each of the attributes that the object is
known to have. For example, if we detect the attributes “white”, “furry”, “bulbous”, “not
lean”, “not brown” etc. in the image, i.e. attributes that a polar bear is known to have, we
can be fairly confident that there is a polar bear in the image. Hence, we can recognize a
polar bear without ever having seen a polar bear, if
(1) we know what attributes a polar bear has, and
(2) we have classifiers trained for these attributes, using images from other object classes (i.e.
other animals).
Follow the steps below to implement a DAP-based zero-shot recognition system.
First, copy the Animals with Attributes dataset (originally from https://cvml.ist.ac.at/AwA2)
from http://users.sussex.ac.uk/~nq28/Subset_of_Animals_with_Attributes2.zip (4.9GB). The
dataset includes 50 animal categories/classes, and 85 attributes. The dataset provides a 50x85
predicate-matrix-binary.txt which you should read into Matlab using
M = load('predicate-matrix-binary.txt'); An entry (i, j)=1 in the matrix says that the
i-th class has the j-th attribute (e.g. a bear is white), and an entry of (i, j)=0 says that the
i-th class doesn't have the j-th attribute (e.g. a bear is not white).
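As a minimal sanity check after loading (assuming the row order of classes.txt matches the matrix rows):

M = load('predicate-matrix-binary.txt');   % 50x85 binary class-attribute matrix
assert(isequal(size(M), [50 85]));
M(45, :)                                   % attribute signature of class 45 (polar+bear)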
An image set in JPEG format is provided. You should extract SIFT (Scale-Invariant Feature
Transform) features/interest points from those images (Lecture 9). You should use Matlab's
built-in functions from the Computer Vision System Toolbox to do this feature detection and
extraction step
(https://uk.mathworks.com/help/vision/feature-detection-and-extraction.html). For
an example of how to extract a SIFT/SURF feature detector and descriptor, type
openExample('vision/ExtractSURFFeaturesFromAnImageExample') in the Matlab command
window. SURF (Speeded-Up Robust Features) is simply a sped-up version of SIFT. In Matlab,
the default feature size is 64; to make it 128, you can set the 'SURFSize' argument to 128
(https://uk.mathworks.com/help/vision/ref/extractfeatures.html).
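As a rough sketch (not required verbatim), descriptor extraction for one image could look like this, where 'someimage.jpg' is a placeholder filename:

I = rgb2gray(imread('someimage.jpg'));                          % convert to grayscale first
points = detectSURFFeatures(I);                                 % interest point detection
[descriptors, validPoints] = extractFeatures(I, points, 'SURFSize', 128);
% descriptors is an M-by-128 matrix, one row per detected interest point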
Your zero-shot recognition system should split the object classes (not images) into a training
and test set. In this scenario, the training classes are animals that your system will see, i.e.
ones whose images the system has access to. In contrast, the test set contains classes (animals)
for which your system will never see example images. The 40 training classes are given in
trainclasses.txt and the 10 test classes are given in testclasses.txt. (Use [c1, c2]
= textread('classes.txt', '%u %s'); to read in the class names. You can use the same
function but with a different second argument to read in testclasses.txt.) At test time, we will
assume that a query image can only be classified as belonging to one of the 10 unseen classes,
so chance performance (randomly guessing the label) will be 10%.
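One possible way to read the splits and map the test-class names back to row indices of M (variable names here are illustrative):

[ids, names] = textread('classes.txt', '%u %s');
testNames = textread('testclasses.txt', '%s');
[~, testIdx] = ismember(testNames, names);   % rows of M for the 10 unseen classes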
You will use all or a random sample of all images from the training classes (or rather, their
feature descriptors) to train a classifier for each of the 85 attributes. The predicate matrix
mentioned above tells you which animals have which attributes. So if a bear is brown, you
should assign the "brown=1" tag to all of its images. Similarly, if a dalmatian is not brown,
you should assign the tag "brown=0" to all of its images. You will use the images tagged with
"brown=1" as the positive data in your classifier, and the images tagged with "brown=0" as
the negative data, for the "brown" classifier. Use the Matlab fitcsvm function to train the
classifiers. Save the model output by each attribute classifier as the j-th entry in a models cell
array (initialized as models = cell(85, 1);). Note that if you sample data in such a way that you
have either no positive or no negative data for some attribute classifier, you'll get a classifier
that only knows about one class, which is a problem. However, for every attribute, there are
some classes that do and some that don't have the attribute. So you just have to make sure
you sample data from all classes when training your attribute classifiers. You now have one
classifier for each attribute.
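A minimal training-loop sketch, assuming X is an N-by-K matrix of bag-of-words features for the sampled training images and trainClassIdx is an N-by-1 vector giving each image's row in the predicate matrix (both hypothetical names):

models = cell(85, 1);
for j = 1:85
    y = M(trainClassIdx, j);       % e.g. brown=1 / brown=0 tag per image
    models{j} = fitcsvm(X, y);     % needs both labels present, so sample from all classes
end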
You next want to apply each attribute classifier j to each image l belonging to any of the
test classes. You want to save the probability that the j-th attribute is present in the l-th
image. To do so, you have to do one extra operation to your classifier. For each of the j
attribute classifiers, run the function fitSVMPosterior on them, i.e. call model = models{j};
model = fitSVMPosterior(model); models{j} = model; (or if you want, run this function on
each classifier before saving it into the cell array). Then to get the probability that the l-th
image contains the attribute j, call [label, scores] = predict(model, x); where x is the
feature descriptor for your image. Then scores will be a 1x2 vector; check model.ClassNames
to know which probability belongs to which class. Ensure that the probabilities sum to 1,
by calling assert(sum(scores) == 1), or if x contains the descriptors for multiple images,
assert(all(sum(scores, 2) == 1)). Save these probabilities so you can easily access them
in the next step.
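A sketch of this step, assuming Xtest (Ntest-by-K) holds the test-image features and models comes from the previous step:

Ntest = size(Xtest, 1);
probs_attr = zeros(85, Ntest);
for j = 1:85
    model = fitSVMPosterior(models{j});
    [~, scores] = predict(model, Xtest);                  % Ntest-by-2, columns follow model.ClassNames
    probs_attr(j, :) = scores(:, model.ClassNames == 1)'; % P(attribute j = 1 | x)
end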
You will now actually predict which animals are present in each test image. To perform
classification of a query test image, you will assign it to the test class (out of 10) whose attribute
"signature" it matches best. How can we compute the probability that an image belongs
to some animal category? Let's use a toy example where we only have 2 animal classes and 5
attributes. We know (from a predicate matrix like the one discussed above) that the first class
has the first, second, and fifth attributes, but does not have the third and fourth. Then the
probability that the query image (with descriptor x) belongs to this class is P(class = 1|x) =
P(attribute 1 = 1|x) × P(attribute 2 = 1|x) × P(attribute 3 = 0|x) × P(attribute 4 = 0|x) ×
P(attribute 5 = 1|x). The "|x" notation means "given x", i.e. we compute some probability
using the image descriptor x. Let's say the second class is known to have attributes 3 and 5,
and no others. Then the probability that the query image belongs to this class is P(class =
2|x) = P(attribute 1 = 0|x) × P(attribute 2 = 0|x) × P(attribute 3 = 1|x) × P(attribute 4 =
0|x) × P(attribute 5 = 1|x). You will assign the image with descriptor x to the class i which
gives the maximal P(class = i|x). For example, if P(class = 1|x) = 0.80 and P(class = 2|x) =
0.20, then you will assign x to class 1. You can call [~, ind] = max(probs); on a vector of
probabilities such that probs(i) is P(class = i|x); then ind will give you the "winning" class to
which x should be assigned. How do you compute P(attribute i = 1|x)? This is a probability
value you've computed already. It is just the second entry of the scores output from running
predict on the descriptor x (assuming you trained with labels of 1 and 0). If you need
P(attribute i = 0|x), that's just the first entry of scores (or more simply, 1 minus the second entry).
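Putting the toy example into code, a sketch that scores all 10 test classes at once (probs_attr and testIdx as in the earlier sketches):

Ntest = size(probs_attr, 2);
probs_class = zeros(10, Ntest);
for i = 1:10
    sig = M(testIdx(i), :)';   % 85x1 binary signature of the i-th test class
    p = probs_attr .* repmat(sig, 1, Ntest) + (1 - probs_attr) .* repmat(1 - sig, 1, Ntest);
    probs_class(i, :) = prod(p, 1);   % product over the 85 attributes
end
[~, pred] = max(probs_class);

Note that a product of 85 probabilities can underflow; summing log-probabilities instead and taking the max is numerically safer and gives the same argmax.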
You will classify each test image from the 10 unseen (test) classes, and compute the average
accuracy. What to include in your submission:
1. A function [bow_feature] = extract_bag_of_visual_words_feature(...); that out-
puts a Kx1 array (where K is the free parameter K in the K-Means clustering algorithm)
for each image. When using a SIFT/SURF feature extraction method, each image will
have a variable number of SIFT/SURF feature descriptors, usually of size 128 each. You
can aggregate all feature descriptors from multiple images, and perform K-Means cluster-
ing over all of them (first explained on 7th March 2018, re-explained on 10th April 2018).
An image can now be represented by the histogram over the K cluster centers (see the
sketch after this list).
2. A function [models] = train_attribute_models(...); that outputs an 85x1 cell array
of attribute classifier models. You are free to pass in whatever arguments you need, and
are welcome to add any additional outputs after models.
3. A function [probs_attr] = compute_attribute_probs(...); that outputs an 85xNtest
matrix of probabilities, where Ntest is the number of test images you choose to use, and
probs_attr(j, l) is the probability that the j-th attribute is present in the l-th image.
Again, use any inputs you like, and any additional outputs after the first one.
4. A function [probs_class] = compute_class_probs(...); that outputs a 10xNtest ma-
trix of probabilities, where probs_class(i, l) is the probability that the i-th class is
present in the l-th image.
5. A function [acc] = compute_accuracy(probs_class, ground_truth_class); where
ground_truth_class is a 1xNtest vector such that ground_truth_class(l) is the true
(i.e. given in the dataset) class for the l-th image. acc is a single real number denoting
the overall accuracy of your system, averaged over the Ntest test images (see the sketch
after this list). Also include the overall accuracy score in your 2-page report.
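To make deliverables 1 and 5 concrete, here is one possible shape for them. This is a sketch, not the only valid design: kmeans and knnsearch are from the Statistics and Machine Learning Toolbox, and centers is assumed to be the K-by-128 centroid matrix obtained beforehand, e.g. via [~, centers] = kmeans(allDescriptors, K); on descriptors pooled from training images.

function bow_feature = extract_bag_of_visual_words_feature(I, centers)
K = size(centers, 1);
points = detectSURFFeatures(I);
d = extractFeatures(I, points, 'SURFSize', 128);   % variable number of 128-d descriptors
idx = knnsearch(centers, double(d));               % nearest cluster centre per descriptor
bow_feature = histcounts(idx, 0.5:1:K+0.5)';       % Kx1 histogram over the K centres
end

function acc = compute_accuracy(probs_class, ground_truth_class)
[~, pred] = max(probs_class, [], 1);               % winning test class per image
acc = mean(pred == ground_truth_class);            % assumes ground truth uses the same 1..10 indexing
end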
2-page report guide
25 points Background
Here you should write a 1-page summary about the research paper “Zero-Shot Learning -
A Comprehensive Evaluation of the Good, the Bad and the Ugly”
(https://arxiv.org/pdf/XXXXXXXXXX.pdf) [1].
25 points Outline of methods employed
This does not have to be in depth, and I do not expect you to regurgitate the contents of
the lecture notes. You should state clearly what methods you have used, what parameters
you have used with those methods, and what the purpose of these methods was. If you
have developed any of your own approaches, or you have adapted a built-in approach,
or improved on it, then you should discuss that here.
25 points Results achieved and analysis
In this section, you …
Solution
MATLAB/classes.txt
1    antelope
2    grizzly+bear
3    killer+whale
4    beaver
5    dalmatian
6    persian+cat
7    horse
8    german+shepherd
9    blue+whale
10    siamese+cat
11    skunk
12    mole
13    tiger
14    hippopotamus
15    leopard
16    moose
17    spider+monkey
18    humpback+whale
19    elephant
20    gorilla
21    ox
22    fox
23    sheep
24    seal
25    chimpanzee
26    hamster
27    squirrel
28    rhinoceros
29    rabbit
30    bat
31    giraffe
32    wolf
33    chihuahua
34    rat
35    weasel
36    otter
37    buffalo
38    zebra
39    giant+panda
40    deer
41    bobcat
42    pig
43    lion
44    mouse
45    polar+bear
46    collie
47    walrus
48    raccoon
49    cow
50    dolphin
MATLAB/DAP/attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('/agbs/share/datasets/Animals_with_Attributes/code/')

from numpy import *
from platt import *
import cPickle, bz2

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))

# adapt these paths and filenames to match local installation
feature_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-%s.pic.bz2'
labels_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-labels.pic.bz2'
all_features = ['cq','lss','phog','sift','surf','rgsift']

attribute_matrix = 2*loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classes.txt',nameonly)
attributenames = loadstr('/agbs/share/datasets/Animals_with_Attributes/predicates.txt',nameonly)

def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels

def train_attribute(attribute_id, C, split=0):
    from sg import sg
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        train_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
        test_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
    else:
        classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classnames.txt')
        startid= (split-1)*10
        stopid = split*10
        test_classes = classnames[startid:stopid]
        train_classes = classnames[0:startid]+classnames[stopid:]

    Xtrn,Ltrn = create_data(train_classes,attribute_id)
    Xtst,Ltst = create_data(test_classes,attribute_id)

    if min(Ltrn) == max(Ltrn): # only 1 class
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        return prediction,probabilities,Ltst

    sg('loglevel', 'WARN')
    widths={}
    for feature in all_features:
        traindata = array(Xtrn[feature][:,::50],float) # used to be 5*offset
        sg('set_distance', 'CHISQUARE', 'REAL')
        sg('clean_features', 'TRAIN')
        sg('set_features', 'TRAIN', traindata)
        sg('init_distance', 'TRAIN')
        DM=sg('get_distance_matrix')
        widths[feature] = median(DM.flatten())
        del DM

    sg('new_svm', 'LIBSVM')
    sg('use_mkl', False) # we use fixed weights here
    sg('clean_features', 'TRAIN')
    sg('clean_features', 'TEST')

    Lplatt_trn = concatenate([Ltrn[i::10] for i in range(9)]) # 90% for training
    Lplatt_tst = Ltrn[9::10] # remaining 10% for platt scaling
    for feature in all_features:
        Xplatt_trn = concatenate([Xtrn[feature][:,i::10] for i in range(9)], axis=1)
        sg('add_features', 'TRAIN', Xplatt_trn)
        Xplatt_tst = Xtrn[feature][:,9::10]
        sg('add_features', 'TEST', Xplatt_tst)
        del Xplatt_trn,Xplatt_tst,Xtrn[feature]

    sg('set_labels', 'TRAIN', Lplatt_trn)
    sg('set_kernel', 'COMBINED', 5000)
    for featureset in all_features:
        sg('add_kernel', 1., 'CHI2', 'REAL', 10, widths[featureset]/5. )

    sg('svm_max_train_time', 600*60.) # one hour should be plenty
    sg('c', C)
    sg('init_kernel', 'TRAIN')
    try:
        sg('train_classifier')
    except (RuntimeWarning,RuntimeError): # can't train, e.g. all samples have the same labels
        Lprior = mean(Ltrn)
        prediction = sign(Lprior) * ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.) * ones(len(Ltst))
        savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)
        return prediction,probabilities,Ltst

    [bias, alphas]=sg('get_svm')
    #print bias,alphas

    sg('init_kernel', 'TEST')
    try:
        prediction=sg('classify')
        platt_params = SigmoidTrain(prediction, Lplatt_tst)
        probabilities = SigmoidPredict(prediction, platt_params)

        savetxt('./DAP/cvfold%d_C%g_%02d-val.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.labels' % (split, C, attribute_id), Lplatt_tst)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.platt' % (split, C, attribute_id), platt_params)
        #print '#train-perf ',attribute_id,C,mean((prediction*Lplatt_tst)>0),mean(Lplatt_tst>0)
        #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Lplatt_tst)>0),mean(Lplatt_tst>0)
    except RuntimeError:
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        print >> sys.stderr, "#Error during testing. Using constant platt scaling"
        platt_params=[1.,0.]

    # ----------------------------- now apply to test classes ------------------
    sg('clean_features', 'TEST')
    for feature in all_features:
        sg('add_features', 'TEST', Xtst[feature])
        del Xtst[feature]

    sg('init_kernel', 'TEST')
    prediction=sg('classify')
    probabilities = SigmoidPredict(prediction, platt_params)

    savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
    savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
    savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)

    #print '#test-perf ',attribute_id,C,mean((prediction*Ltst)>0),mean(Ltst>0)
    #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Ltst)>0),mean(Ltst>0)
    return prediction,probabilities,Ltst

if __name__ == '__main__':
    import sys
    try:
        attribute_id = int(sys.argv[1])
    except IndexError:
        print "Must specify attribute ID!"
        raise SystemExit
    try:
        split = int(sys.argv[2])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[3])
    except IndexError:
        C = 10.

    pred,prob,Ltst = train_attribute(attribute_id,C,split)
    print "Done.", attribute_id, C, split
MATLAB/DAP/attributes.sh
#!/bin/bash
# Animals with Attributes Dataset
# Train all attribute classifiers for fixed split and regularizer
SPLIT=0
C=10
for A in `seq 1 85` ;
do
./new-attributes.py $A $SPLIT $C
done
MATLAB/DAP/build_matfiles.m
clear all, close all
% dataset
pnam = '/agbs/share/datasets/Animals_with_Attributes';
% output
outpath = '.';
% There are 6 feature representations:
% - cq: (global) color histogram (1x1 + 2x2 + 4x4 spatial pyramid, 128 bins each, each histogram L1-normalized)
% - lss[1]: local self similarity (2000 entry codebook, raw bag-of-visual-word counts)
% - phog[2]: histogram of oriented gradients (1x1 + 2x2 + 4x4 spatial pyramid, 12 bins each, each histogram L1-normalized or all zero)
% - rgsift[3]: rgSIFT descriptors (2000 entry codebook, bag-of-visual-word counts, L1-normalized)
% - sift[4]: SIFT descriptors (2000 entry codebook, raw bag-of-visual-word counts)
% - surf[5]: SURF descriptors (2000 entry codebook, raw bag-of-visual-word counts)
feat = {'cq','lss','phog','rgsift','sift','surf'};
nfeat = [2688,2000,252,2000,2000,2000];
% [1] E. Shechtman, and M. Irani: "Matching Local Self-Similarities
%     across Images and Videos", CVPR 2007.
%
% [2] A. Bosch, A. Zisserman, and X. Munoz: "Representing shape with
%     a spatial pyramid kernel", CIVR 2007.
%
% [3] Koen E. A. van de Sande, Theo Gevers and Cees G. M. Snoek:
%     "Evaluation of Color Descriptors for Object and Scene
%     Recognition", CVPR 2008.
%
% [4] D. G. Lowe, "Distinctive Image Features from Scale-Invariant
%     Keypoints", IJCV 2004.
%
% [5] H. Bay, T. Tuytelaars, and L. Van Gool: "SURF: Speeded Up
%     Robust Features", ECCV 2006.

%% set some constants
% class names of all classes
[tmp,classes] = textread([pnam,'/classes.txt'],'%d %s'); clear tmp
% class names of training/test classes
trainclasses = textread([pnam,'/trainclasses.txt'],'%s');
testclasses  = textread([pnam,'/testclasses.txt' ],'%s');
% classes(trainclasses_id) == trainclasses
trainclasses_id = -ones(length(trainclasses),1);
for i=1:length(trainclasses)
    for j=1:length(classes)
        if strcmp(trainclasses{i},classes{j})
            trainclasses_id(i) = j;
        end
    end
end
% classes(testclasses_id) == testclasses
testclasses_id = -ones(length(testclasses),1);
for i=1:length(testclasses)
    for j=1:length(classes)
        if strcmp(testclasses{i},classes{j})
            testclasses_id(i) = j;
        end
    end
end
% predicate names of all 85 predicates
[tmp,predicates] = textread([pnam,'/predicates.txt'],'%d %s');
% pca matrix: probability class-attribute pca(i,j) = P(a_j=1|c=i)
% contains RELATIVE CONNECTION STRENGTH linearly scaled to 0..100
pca = textread([pnam,'/predicate-matrix-continuous.txt']);
% class antelope has 4 missing values (black,white,blue,brown) => copy from lion
pca(1,1:4) = pca(43,1:4);
% derive binary matrix from continuous
pca_bin = pca > mean(pca(:));
% pca_bin = textread([pnam,'/predicate-matrix-binary.txt']);
save([outpath,'/constants.mat'],'pnam','feat','nfeat','classes',...
    'trainclasses','testclasses','trainclasses_id','testclasses_id', ...
    'predicates','pca','pca_bin')

%% save Matlab files one per feature type
nperclass = zeros(length(classes),1);
for idc = 1:50
    for idf = [1:2,4:6]
        fnam = [pnam,'/Features/',feat{idf},'-hist/',classes{idc}];
        no = numel(dir(fnam))-2;
        nperclass(idc) = no;
        Xc = sparse(nfeat(idf),no);
        for ido = 1:no
            Xc(:,ido) = textread(sprintf('%s/%s_%04d.txt',fnam,classes{idc},ido),'%f');
        end
        fprintf('%s\t%04d: %s\n',feat{idf},ido,classes{idc})
        save(sprintf('%s/feat/x_%s_c%02d.mat',outpath,feat{idf},idc),'Xc')
    end
end
save([outpath,'/nperclass.mat'],'nperclass')
MATLAB/DAP/collect_results.m
datapath = '.';
load([datapath,'/constants.mat'])
for cvsplit = 0:5 % 1:5
    for log3_C = -13:-9 % -13:-9
        fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
        if exist(fnam,'file')
            data = load(fnam);

            % recompute predictions
            % calculate p( attribute = j | image ) from p( train class = j | image )
            pfa_te = data.pfc_te * ( pca ./ repmat(sum(pca,2),1,85) );
            % calculate p( test class = j | image ) from p( attribute = j | image )
            s_pcate = sum(pca(data.cte,:));
            is_pcate = zeros(size(s_pcate));
            is_pcate(s_pcate~=0) = 1./s_pcate(s_pcate~=0);
            pfc_pr = pfa_te * (pca(data.cte,:).*repmat(is_pcate,10,1))';
            % class assignment
            mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
            id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
            for i=1:length(ypr)
                if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
                ypr(i) = data.cte( id( mx(i,:) ) );
            end
            acc_pr = 100*sum(ypr==data.yte)/numel(ypr);
            fprintf('split %d, C=%1.2e: Acc = %1.3f%% (%d/%d)\n',...
                cvsplit,3^log3_C,acc_pr,sum(ypr==data.yte),numel(ypr))
        else
            fprintf('%s missing\n',fnam)
        end
    end
end
MATLAB/DAP/constants.mat
pnam: [1x44 char array]
feat: [1x6 cell array]
nfeat: [1x6 double array]
classes: [50x1 cell array]
predicates: [85x1 cell array]
prca: [50x85 double array]
prca_bin: [50x85 uint8 (logical) array]
MATLAB/DAP/DAP_eval.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Perform multiclass prediction from binary attributes and evaluate it.
"""
import os,sys
sys.path.append('/agbs/cluster/chl/libs/python2.5/site-packages/')

from numpy import *

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def loaddict(filename,converter=str):
    D={}
    for line in file(filename).readlines():
        line = line.split()
        D[line[0]] = converter(line[1].strip())

    return D

# adapt these paths and filenames to match local installation
classnames = loadstr('../classes.txt',nameonly)
numexamples = loaddict('numexamples.txt',int)

def evaluate(split,C):
    global test_classnames
    attributepattern = './DAP/cvfold%d_C%g_%%02d.prob' % (split,C)

    if split == 0:
        test_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
        train_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
    else:
        startid= (split-1)*10
        stopid = split*10
        test_classnames = classnames[startid:stopid]
        train_classnames = classnames[0:startid]+classnames[stopid:]

    test_classes = [ classnames.index(c) for c in test_classnames]
    train_classes = [ classnames.index(c) for c in train_classnames]
    M = loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)

    L=[]
    for c in test_classes:
        L.extend( [c]*numexamples[classnames[c]] )
    L=array(L) # (n,)

    P = []
    for i in range(85):
        P.append(loadtxt(attributepattern % i,float))
    P = array(P).T # (85,n)

    prior = mean(M[train_classes],axis=0)
    prior[prior==0.]=0.5
    prior[prior==1.]=0.5 # disallow degenerated priors

    M = M[test_classes] # (10,85)
    prob=[]
    for p in P:
        prob.append( prod(M*p + (1-M)*(1-p),axis=1)/prod(M*prior+(1-M)*(1-prior), axis=1) )

    MCpred = argmax( prob, axis=1 )

    d = len(test_classes)
    confusion=zeros([d,d])
    for pl,nl in zip(MCpred,L):
        try:
            gt = test_classes.index(nl)
            confusion[gt,pl] += 1.
        except:
            pass

    for row in confusion:
        row /= sum(row)

    return confusion,asarray(prob),L

def plot_confusion(confusion):
    from pylab import figure,imshow,clim,xticks,yticks,axis,setp,gray,colorbar,savefig,gca
    fig=figure(figsize=(10,9))
    imshow(confusion,interpolation='nearest',origin='upper')
    clim(0,1)
    xticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],rotation='vertical',fontsize=24)
    yticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],fontsize=24)
    axis([-.5,9.5,9.5,-.5])
    setp(gca().xaxis.get_major_ticks(), pad=18)
    setp(gca().yaxis.get_major_ticks(), pad=12)
    fig.subplots_adjust(left=0.30)
    fig.subplots_adjust(top=0.98)
    fig.subplots_adjust(right=0.98)
    fig.subplots_adjust(bottom=0.22)
    gray()
    colorbar(shrink=0.79)
    savefig('AwA-ROC-confusion-DAP.pdf')
    return

def plot_roc(P,GT):
    from pylab import figure,xticks,yticks,axis,setp,gray,colorbar,savefig,gca,clf,plot,legend,xlabel,ylabel
    from roc import roc
    AUC=[]
    CURVE=[]
    for i,c in enumerate(test_classnames):
        class_id = classnames.index(c)
        tp,fp,auc=roc(None,GT==class_id, P[:,i] ) # larger is better
        print "AUC: %s %5.3f" % (c,auc)
        AUC.append(auc)
        CURVE.append(array([fp,tp]))

    order = argsort(AUC)[::-1]
    styles=['-','-','-','-','-','-','-','--','--','--']
    figure(figsize=(9,5))
    for i in order:
        c = test_classnames[i]
        plot(CURVE[i][0],CURVE[i][1],label='%s (AUC: %3.2f)' % (c,AUC[i]),linewidth=3,linestyle=styles[i])

    legend(loc='lower right')
    xticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    yticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    xlabel('false positive rate',fontsize=18)  # the x-axis plots fp, so label it accordingly
    ylabel('true positive rate',fontsize=18)
    savefig('AwA-ROC-DAP.pdf')

def main():
    try:
        split = int(sys.argv[1])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[2])
    except IndexError:
        C = 10.

    confusion,prob,L = evaluate(split,C)
    print "Mean class accuracy %g" % mean(diag(confusion)*100)
    plot_confusion(confusion)
    plot_roc(prob,L)

if __name__ == '__main__':
    main()
MATLAB/DAP/liblinear_cv5.m
function liblinear_cv5(cvsplit,log3_C)
% path to liblinear
addpath /agbs/cluster/hn/mpi_animal_challenge/lib/liblinear-1.33/matlab
% path to Matlab feature representation
datapath = '/kyb/agbs/chl/mysrc/Animals-with-Attributes/code';

% build training-testing split
if cvsplit==0
    % get original split
    tmp = load([datapath,'/constants.mat'],'trainclasses_id','testclasses_id');
    cte = tmp.testclasses_id';
    ctr = tmp.trainclasses_id';
    clear tmp
else
    % build training-testing split
    cte = (cvsplit-1)*10+(1:10); % test classes
    ctr = setdiff(1:50,cte);     % training classes
end
load([datapath,'/constants.mat'])

%% load training data (40 classes)
fprintf('Load training set\n')
Xtr = []; ytr = [];
for idc = ctr % 40 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xtr = [Xtr,Xc];
    ytr = [ytr; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xtr = Xtr';

% train model
fprintf('Learning\n')
% logistic regression
C = 3^log3_C;
argstr = sprintf('-s 0 -c %f',C);
model = train(ytr, Xtr, argstr);

%% make prediction on training data
tic
[l,acc_tr,p] = predict(ytr, Xtr, model, '-b 1');
T = toc;
fprintf('training took %1.2f s\n',T)
pfc_tr = zeros(length(l),50); pfc_tr(:,model.Label) = p; % full 50 matrix

%% load test data (10 classes)
fprintf('Load test set\n')
Xte = []; yte = [];
for idc = cte % 10 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xte = [Xte,Xc];
    yte = [yte; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xte = Xte';

%% predict train classes on test data
[l,acc_te,p] = predict(yte, Xte, model, '-b 1');
pfc_te = zeros(length(l),50); pfc_te(:,model.Label) = p; % full 50 matrix

%% predict test classes on test data
% calculate p( attribute = j | image ) from p( train class = j | image )
pfa_te = pfc_te * ( prca ./ repmat(sum(prca,2),1,85) );
% calculate p( test class = j | image ) from p( attribute = j | image )
pfc_pr = pfa_te * (prca(cte,:)./repmat(sum(prca(cte,:)),10,1))';
% class assignment
mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
for i=1:length(ypr)
    if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
    ypr(i) = cte( id( mx(i,:) ) );
end
acc_pr = 100*sum(ypr==yte)/numel(ypr);
fprintf('Accuracy = %1.4f%% (%d/%d)\n',acc_pr,sum(ypr==yte),numel(ypr))

% save results
fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
save(fnam,'cvsplit','log3_C','argstr','C','acc_tr','acc_pr',...
    'ctr','cte','pfc_tr','pfc_te','pfc_pr','ytr','yte','ypr')
MATLAB/DAP/new-attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('./')

from numpy import *
from platt import *
import cPickle, bz2

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))

# adapt these paths and filenames to match local installation
feature_pattern = './feat/%s-%s.pic.bz2'
labels_pattern = './feat/%s-labels.pic.bz2'
all_features = ['cq']

attribute_matrix = 2*loadtxt('../predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('../classes.txt',nameonly)
attributenames = loadstr('../predicates.txt',nameonly)

def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels

def train_attribute(attribute_id, C, split=0):
    from shogun import Classifier,Features,Kernel,Distance
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        ...