
Computer Vision @ University of Sussex – Spring 2018
Coursework Assignment
Deadline: 11th May 2018 at 4PM
This assignment brief was first released on 27th March 2018
The assignment, which is worth 50% of the total grade, is separated into 2 components:
a 2-page report and source code. The 2-page report should comprise:
1. 1-page summary of the research paper “Zero-Shot Learning - A Comprehensive Evaluation
of the Good, the Bad and the Ugly” (https://arxiv.org/pdf/XXXXXXXXXX.pdf) [1];
2. 1-page summary of your implementation of a zero-shot recognition system based on
the Direct Attribute Prediction (DAP) concept (Lecture 13).
The DAP system models the probability of a certain object (e.g. a polar bear) being present
in the image using the probabilities of presence for each of the attributes that the object is
known to have. For example, if we detect the attributes “white”, “furry”, “bulbous”, “not
lean”, “not brown” etc. in the image, i.e. attributes that a polar bear is known to have, we
can be fairly confident that there is a polar bear in the image. Hence, we can recognize a
polar bear without ever having seen a polar bear, if
(1) we know what attributes a polar bear has, and
(2) we have classifiers trained for these attributes, using images from other object classes (i.e.
other animals).
Follow the steps below to implement a DAP-based zero-shot recognition system.
First, copy the Animals with Attributes dataset (originally from https://cvml.ist.ac.at/AwA2)
from http://users.sussex.ac.uk/~nq28/Subset_of_Animals_with_Attributes2.zip (4.9GB). The
dataset includes 50 animal categories/classes, and 85 attributes. The dataset provides a 50x85
predicate-matrix-binary.txt which you should read into Matlab using
M = load('predicate-matrix-binary.txt'); An entry (i, j)=1 in the matrix says that the
i-th class has the j-th attribute (e.g. a bear is white), and an entry of (i, j)=0 says that the
i-th class doesn't have the j-th attribute (e.g. a bear is not white).
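As a minimal sanity check after loading (assuming the row order of classes.txt matches the matrix rows):

M = load('predicate-matrix-binary.txt');   % 50x85 binary class-attribute matrix
assert(isequal(size(M), [50 85]));
M(45, :)                                   % attribute signature of class 45 (polar+bear)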
An image set in JPEG format is provided. You should extract SIFT (Scale-Invariant Feature
Transform) features/interest points from those images (Lecture 9). You should use Matlab's
built-in functions from the Computer Vision System Toolbox to do this feature detection and
extraction step
(https://uk.mathworks.com/help/vision/feature-detection-and-extraction.html). For
an example of how to extract a SIFT/SURF feature detector and descriptor, type
openExample('vision/ExtractSURFFeaturesFromAnImageExample') in the Matlab command
window. SURF (Speeded-Up Robust Features) is simply a sped-up version of SIFT. In Matlab,
the default feature size is 64; to make it 128, you can set the 'SURFSize' argument to 128
(https://uk.mathworks.com/help/vision/ref/extractfeatures.html).
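As a rough sketch (not required verbatim), descriptor extraction for one image could look like this, where 'someimage.jpg' is a placeholder filename:

I = rgb2gray(imread('someimage.jpg'));                          % convert to grayscale first
points = detectSURFFeatures(I);                                 % interest point detection
[descriptors, validPoints] = extractFeatures(I, points, 'SURFSize', 128);
% descriptors is an M-by-128 matrix, one row per detected interest point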
Your zero-shot recognition system should split the object classes (not images) into a training
and test set. In this scenario, the training classes are animals that your system will see, i.e.
ones whose images the system has access to. In contrast, the test set contains classes (animals)
for which your system will never see example images. The 40 training classes are given in
trainclasses.txt and the 10 test classes are given in testclasses.txt. (Use [c1, c2]
= textread('classes.txt', '%u %s'); to read in the class names. You can use the same
function but with a different second argument to read in testclasses.txt.) At test time, we will
assume that a query image can only be classified as belonging to one of the 10 unseen classes,
so chance performance (randomly guessing the label) will be 10%.
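One possible way to read the splits and map the test-class names back to row indices of M (variable names here are illustrative):

[ids, names] = textread('classes.txt', '%u %s');
testNames = textread('testclasses.txt', '%s');
[~, testIdx] = ismember(testNames, names);   % rows of M for the 10 unseen classes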
You will use all or a random sample of all images from the training classes (or rather, their
feature descriptors) to train a classifier for each of the 85 attributes. The predicate matrix
mentioned above tells you which animals have which attributes. So if a bear is brown, you
should assign the "brown=1" tag to all of its images. Similarly, if a dalmatian is not brown,
you should assign the tag "brown=0" to all of its images. You will use the images tagged with
"brown=1" as the positive data in your classifier, and the images tagged with "brown=0" as
the negative data, for the "brown" classifier. Use the Matlab fitcsvm function to train the
classifiers. Save the model output by each attribute classifier as the j-th entry in a models cell
array (initialized as models = cell(85, 1);). Note that if you sample data in such a way that you
have either no positive or no negative data for some attribute classifier, you'll get a classifier
that only knows about one class, which is a problem. However, for every attribute, there are
some classes that do and some that don't have the attribute. So you just have to make sure
you sample data from all classes when training your attribute classifiers. You now have one
classifier for each attribute.
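A minimal training-loop sketch, assuming X is an N-by-K matrix of bag-of-words features for the sampled training images and trainClassIdx is an N-by-1 vector giving each image's row in the predicate matrix (both hypothetical names):

models = cell(85, 1);
for j = 1:85
    y = M(trainClassIdx, j);       % e.g. brown=1 / brown=0 tag per image
    models{j} = fitcsvm(X, y);     % needs both labels present, so sample from all classes
end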
You next want to apply each attribute classifier j to each image l belonging to any of the
test classes. You want to save the probability that the j-th attribute is present in the l-th
image. To do so, you have to do one extra operation to your classifier. For each of the j
attribute classifiers, run the function fitSVMPosterior on them, i.e. call model = models{j};
model = fitSVMPosterior(model); models{j} = model; (or if you want, run this function on
each classifier before saving it into the cell array). Then to get the probability that the l-th
image contains the attribute j, call [label, scores] = predict(model, x); where x is the
feature descriptor for your image. Then scores will be a 1x2 vector; check model.ClassNames
to know which probability belongs to which class. Ensure that the probabilities sum to 1,
by calling assert(sum(scores) == 1), or if x contains the descriptors for multiple images,
assert(all(sum(scores, 2) == 1)). Save these probabilities so you can easily access them
in the next step.
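A sketch of this step, assuming Xtest (Ntest-by-K) holds the test-image features and models comes from the previous step:

Ntest = size(Xtest, 1);
probs_attr = zeros(85, Ntest);
for j = 1:85
    model = fitSVMPosterior(models{j});
    [~, scores] = predict(model, Xtest);                  % Ntest-by-2, columns follow model.ClassNames
    probs_attr(j, :) = scores(:, model.ClassNames == 1)'; % P(attribute j = 1 | x)
end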
You will now actually predict which animals are present in each test image. To perform
classification of a query test image, you will assign it to the test class (out of 10) whose attribute
"signature" it matches best. How can we compute the probability that an image belongs
to some animal category? Let's use a toy example where we only have 2 animal classes and 5
attributes. We know (from a predicate matrix like the one discussed above) that the first class
has the first, second, and fifth attributes, but does not have the third and fourth. Then the
probability that the query image (with descriptor x) belongs to this class is P(class = 1|x) =
P(attribute 1 = 1|x) × P(attribute 2 = 1|x) × P(attribute 3 = 0|x) × P(attribute 4 = 0|x) ×
P(attribute 5 = 1|x). The "|x" notation means "given x", i.e. we compute some probability
using the image descriptor x. Let's say the second class is known to have attributes 3 and 5,
and no others. Then the probability that the query image belongs to this class is P(class =
2|x) = P(attribute 1 = 0|x) × P(attribute 2 = 0|x) × P(attribute 3 = 1|x) × P(attribute 4 =
0|x) × P(attribute 5 = 1|x). You will assign the image with descriptor x to the class i which
gives the maximal P(class = i|x). For example, if P(class = 1|x) = 0.80 and P(class = 2|x) =
0.20, then you will assign x to class 1. You can call [~, ind] = max(probs); on a vector of
probabilities such that probs(i) is P(class = i|x); then ind will give you the "winning" class to
which x should be assigned. How do you compute P(attribute i = 1|x)? This is a probability
value you've computed already. It is just the second entry of the scores output from running
predict on the descriptor x (assuming you trained with labels of 1 and 0). If you need
P(attribute i = 0|x), that's just the first entry of scores (or more simply, 1 minus the second entry).
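Putting the toy example into code, a sketch that scores all 10 test classes at once (probs_attr and testIdx as in the earlier sketches):

Ntest = size(probs_attr, 2);
probs_class = zeros(10, Ntest);
for i = 1:10
    sig = M(testIdx(i), :)';   % 85x1 binary signature of the i-th test class
    p = probs_attr .* repmat(sig, 1, Ntest) + (1 - probs_attr) .* repmat(1 - sig, 1, Ntest);
    probs_class(i, :) = prod(p, 1);   % product over the 85 attributes
end
[~, pred] = max(probs_class);

Note that a product of 85 probabilities can underflow; summing log-probabilities instead and taking the max is numerically safer and gives the same argmax.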
You will classify each test image from the 10 unseen (test) classes, and compute the average
accuracy. What to include in your submission:
1. A function [bow_feature] = extract_bag_of_visual_words_feature(...); that out-
puts a Kx1 array (where K is the free parameter K in the K-Means clustering algorithm)
for each image. When using a SIFT/SURF feature extraction method, each image will
have a variable number of SIFT/SURF feature descriptors, usually of size 128 each. You
can aggregate all feature descriptors from multiple images, and perform K-Means cluster-
ing over all of them (first explained on 7th March 2018, re-explained on 10th April 2018).
An image can now be represented by the histogram over the K cluster centers (see the
sketch after this list).
2. A function [models] = train_attribute_models(...); that outputs an 85x1 cell array
of attribute classifier models. You are free to pass in whatever arguments you need, and
are welcome to add any additional outputs after models.
3. A function [probs_attr] = compute_attribute_probs(...); that outputs an 85xNtest
matrix of probabilities, where Ntest is the number of test images you choose to use, and
probs_attr(j, l) is the probability that the j-th attribute is present in the l-th image.
Again, use any inputs you like, and any additional outputs after the first one.
4. A function [probs_class] = compute_class_probs(...); that outputs a 10xNtest ma-
trix of probabilities, where probs_class(i, l) is the probability that the i-th class is
present in the l-th image.
5. A function [acc] = compute_accuracy(probs_class, ground_truth_class); where
ground_truth_class is a 1xNtest vector such that ground_truth_class(l) is the true
(i.e. given in the dataset) class for the l-th image. acc is a single real number denoting
the overall accuracy of your system, averaged over the Ntest test images (see the sketch
after this list). Also include the overall accuracy score in your 2-page report.
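To make deliverables 1 and 5 concrete, here is one possible shape for them. This is a sketch, not the only valid design: kmeans and knnsearch are from the Statistics and Machine Learning Toolbox, and centers is assumed to be the K-by-128 centroid matrix obtained beforehand, e.g. via [~, centers] = kmeans(allDescriptors, K); on descriptors pooled from training images.

function bow_feature = extract_bag_of_visual_words_feature(I, centers)
K = size(centers, 1);
points = detectSURFFeatures(I);
d = extractFeatures(I, points, 'SURFSize', 128);   % variable number of 128-d descriptors
idx = knnsearch(centers, double(d));               % nearest cluster centre per descriptor
bow_feature = histcounts(idx, 0.5:1:K+0.5)';       % Kx1 histogram over the K centres
end

function acc = compute_accuracy(probs_class, ground_truth_class)
[~, pred] = max(probs_class, [], 1);               % winning test class per image
acc = mean(pred == ground_truth_class);            % assumes ground truth uses the same 1..10 indexing
end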
2-page report guide
25 points Background
Here you should write a 1-page summary about the research paper “Zero-Shot Learning -
A Comprehensive Evaluation of the Good, the Bad and the Ugly”
(https://arxiv.org/pdf/XXXXXXXXXX.pdf) [1].
25 points Outline of methods employed
This does not have to be in depth, and I do not expect you to regurgitate the contents of
the lecture notes. You should state clearly what methods you have used, what parameters
you have used with those methods, and what the purpose of these methods was. If you
have developed any of your own approaches, or you have adapted a built-in approach,
or improved on it, then you should discuss that here.
25 points Results achieved and analysis
In this section, you …
Solution
MATLAB/classes.txt
1    antelope
2    grizzly+bear
3    killer+whale
4    beaver
5    dalmatian
6    persian+cat
7    horse
8    german+shepherd
9    blue+whale
10    siamese+cat
11    skunk
12    mole
13    tiger
14    hippopotamus
15    leopard
16    moose
17    spider+monkey
18    humpback+whale
19    elephant
20    gorilla
21    ox
22    fox
23    sheep
24    seal
25    chimpanzee
26    hamster
27    squirrel
28    rhinoceros
29    rabbit
30    bat
31    giraffe
32    wolf
33    chihuahua
34    rat
35    weasel
36    otter
37    buffalo
38    zebra
39    giant+panda
40    deer
41    bobcat
42    pig
43    lion
44    mouse
45    polar+bear
46    collie
47    walrus
48    raccoon
49    cow
50    dolphin
MATLAB/DAP/attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('/agbs/share/datasets/Animals_with_Attributes/code/')

from numpy import *
from platt import *
import cPickle, bz2

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))

# adapt these paths and filenames to match local installation
feature_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-%s.pic.bz2'
labels_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-labels.pic.bz2'
all_features = ['cq','lss','phog','sift','surf','rgsift']

attribute_matrix = 2*loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classes.txt',nameonly)
attributenames = loadstr('/agbs/share/datasets/Animals_with_Attributes/predicates.txt',nameonly)

def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels

def train_attribute(attribute_id, C, split=0):
    from sg import sg
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        train_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
        test_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
    else:
        classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classnames.txt')
        startid= (split-1)*10
        stopid = split*10
        test_classes = classnames[startid:stopid]
        train_classes = classnames[0:startid]+classnames[stopid:]

    Xtrn,Ltrn = create_data(train_classes,attribute_id)
    Xtst,Ltst = create_data(test_classes,attribute_id)

    if min(Ltrn) == max(Ltrn): # only 1 class
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        return prediction,probabilities,Ltst

    sg('loglevel', 'WARN')
    widths={}
    for feature in all_features:
        traindata = array(Xtrn[feature][:,::50],float) # used to be 5*offset
        sg('set_distance', 'CHISQUARE', 'REAL')
        sg('clean_features', 'TRAIN')
        sg('set_features', 'TRAIN', traindata)
        sg('init_distance', 'TRAIN')
        DM=sg('get_distance_matrix')
        widths[feature] = median(DM.flatten())
        del DM

    sg('new_svm', 'LIBSVM')
    sg('use_mkl', False) # we use fixed weights here
    sg('clean_features', 'TRAIN')
    sg('clean_features', 'TEST')

    Lplatt_trn = concatenate([Ltrn[i::10] for i in range(9)]) # 90% for training
    Lplatt_tst = Ltrn[9::10] # remaining 10% for platt scaling
    for feature in all_features:
        Xplatt_trn = concatenate([Xtrn[feature][:,i::10] for i in range(9)], axis=1)
        sg('add_features', 'TRAIN', Xplatt_trn)
        Xplatt_tst = Xtrn[feature][:,9::10]
        sg('add_features', 'TEST', Xplatt_tst)
        del Xplatt_trn,Xplatt_tst,Xtrn[feature]

    sg('set_labels', 'TRAIN', Lplatt_trn)
    sg('set_kernel', 'COMBINED', 5000)
    for featureset in all_features:
        sg('add_kernel', 1., 'CHI2', 'REAL', 10, widths[featureset]/5. )

    sg('svm_max_train_time', 600*60.) # one hour should be plenty
    sg('c', C)
    sg('init_kernel', 'TRAIN')
    try:
        sg('train_classifier')
    except (RuntimeWarning,RuntimeError): # can't train, e.g. all samples have the same labels
        Lprior = mean(Ltrn)
        prediction = sign(Lprior) * ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.) * ones(len(Ltst))
        savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)
        return prediction,probabilities,Ltst

    [bias, alphas]=sg('get_svm')
    #print bias,alphas

    sg('init_kernel', 'TEST')
    try:
        prediction=sg('classify')
        platt_params = SigmoidTrain(prediction, Lplatt_tst)
        probabilities = SigmoidPredict(prediction, platt_params)

        savetxt('./DAP/cvfold%d_C%g_%02d-val.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.labels' % (split, C, attribute_id), Lplatt_tst)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.platt' % (split, C, attribute_id), platt_params)
        #print '#train-perf ',attribute_id,C,mean((prediction*Lplatt_tst)>0),mean(Lplatt_tst>0)
        #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Lplatt_tst)>0),mean(Lplatt_tst>0)
    except RuntimeError:
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        print >> sys.stderr, "#Error during testing. Using constant platt scaling"
        platt_params=[1.,0.]

    # ----------------------------- now apply to test classes ------------------
    sg('clean_features', 'TEST')
    for feature in all_features:
        sg('add_features', 'TEST', Xtst[feature])
        del Xtst[feature]

    sg('init_kernel', 'TEST')
    prediction=sg('classify')
    probabilities = SigmoidPredict(prediction, platt_params)

    savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
    savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
    savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)

    #print '#test-perf ',attribute_id,C,mean((prediction*Ltst)>0),mean(Ltst>0)
    #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Ltst)>0),mean(Ltst>0)
    return prediction,probabilities,Ltst

if __name__ == '__main__':
    import sys
    try:
        attribute_id = int(sys.argv[1])
    except IndexError:
        print "Must specify attribute ID!"
        raise SystemExit
    try:
        split = int(sys.argv[2])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[3])
    except IndexError:
        C = 10.

    pred,prob,Ltst = train_attribute(attribute_id,C,split)
    print "Done.", attribute_id, C, split
MATLAB/DAP/attributes.sh
#!/bin/bash
# Animals with Attributes Dataset
# Train all attribute classifiers for fixed split and regularizer
SPLIT=0
C=10
for A in `seq 1 85` ;
do
./new-attributes.py $A $SPLIT $C
done
MATLAB/DAP/build_matfiles.m
clear all, close all
% dataset
pnam = '/agbs/share/datasets/Animals_with_Attributes';
% output
outpath = '.';
% There are 6 feature representations:
% - cq: (global) color histogram (1x1 + 2x2 + 4x4 spatial pyramid, 128 bins each, each histogram L1-normalized)
% - lss[1]: local self similarity (2000 entry codebook, raw bag-of-visual-word counts)
% - phog[2]: histogram of oriented gradients (1x1 + 2x2 + 4x4 spatial pyramid, 12 bins each, each histogram L1-normalized or all zero)
% - rgsift[3]: rgSIFT descriptors (2000 entry codebook, bag-of-visual-word counts, L1-normalized)
% - sift[4]: SIFT descriptors (2000 entry codebook, raw bag-of-visual-word counts)
% - surf[5]: SURF descriptors (2000 entry codebook, raw bag-of-visual-word counts)
feat = {'cq','lss','phog','rgsift','sift','surf'};
nfeat = [2688,2000,252,2000,2000,2000];
% [1] E. Shechtman, and M. Irani: "Matching Local Self-Similarities
%     across Images and Videos", CVPR 2007.
%
% [2] A. Bosch, A. Zisserman, and X. Munoz: "Representing shape with
%     a spatial pyramid kernel", CIVR 2007.
%
% [3] Koen E. A. van de Sande, Theo Gevers and Cees G. M. Snoek:
%     "Evaluation of Color Descriptors for Object and Scene
%     Recognition", CVPR 2008.
%
% [4] D. G. Lowe, "Distinctive Image Features from Scale-Invariant
%     Keypoints", IJCV 2004.
%
% [5] H. Bay, T. Tuytelaars, and L. Van Gool: "SURF: Speeded Up
%     Robust Features", ECCV 2006.

%% set some constants
% class names of all classes
[tmp,classes] = textread([pnam,'/classes.txt'],'%d %s'); clear tmp
% class names of training/test classes
trainclasses = textread([pnam,'/trainclasses.txt'],'%s');
testclasses  = textread([pnam,'/testclasses.txt' ],'%s');
% classes(trainclasses_id) == trainclasses
trainclasses_id = -ones(length(trainclasses),1);
for i=1:length(trainclasses)
    for j=1:length(classes)
        if strcmp(trainclasses{i},classes{j})
            trainclasses_id(i) = j;
        end
    end
end
% classes(testclasses_id) == testclasses
testclasses_id = -ones(length(testclasses),1);
for i=1:length(testclasses)
    for j=1:length(classes)
        if strcmp(testclasses{i},classes{j})
            testclasses_id(i) = j;
        end
    end
end
% predicate names of all 85 predicates
[tmp,predicates] = textread([pnam,'/predicates.txt'],'%d %s');
% pca matrix: probability class-attribute pca(i,j) = P(a_j=1|c=i)
% contains RELATIVE CONNECTION STRENGTH linearly scaled to 0..100
pca = textread([pnam,'/predicate-matrix-continuous.txt']);
% class antelope has 4 missing values (black,white,blue,brown) => copy from lion
pca(1,1:4) = pca(43,1:4);
% derive binary matrix from continuous
pca_bin = pca > mean(pca(:));
% pca_bin = textread([pnam,'/predicate-matrix-binary.txt']);
save([outpath,'/constants.mat'],'pnam','feat','nfeat','classes',...
    'trainclasses','testclasses','trainclasses_id','testclasses_id', ...
    'predicates','pca','pca_bin')

%% save Matlab files one per feature type
nperclass = zeros(length(classes),1);
for idc = 1:50
    for idf = [1:2,4:6]
        fnam = [pnam,'/Features/',feat{idf},'-hist/',classes{idc}];
        no = numel(dir(fnam))-2;
        nperclass(idc) = no;
        Xc = sparse(nfeat(idf),no);
        for ido = 1:no
            Xc(:,ido) = textread(sprintf('%s/%s_%04d.txt',fnam,classes{idc},ido),'%f');
        end
        fprintf('%s\t%04d: %s\n',feat{idf},ido,classes{idc})
        save(sprintf('%s/feat/x_%s_c%02d.mat',outpath,feat{idf},idc),'Xc')
    end
end
save([outpath,'/nperclass.mat'],'nperclass')
MATLAB/DAP/collect_results.m
datapath = '.';
load([datapath,'/constants.mat'])
for cvsplit = 0:5 % 1:5
    for log3_C = -13:-9 % -13:-9
        fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
        if exist(fnam,'file')
            data = load(fnam);

            % recompute predictions
            % calculate p( attribute = j | image ) from p( train class = j | image )
            pfa_te = data.pfc_te * ( pca ./ repmat(sum(pca,2),1,85) );
            % calculate p( test class = j | image ) from p( attribute = j | image )
            s_pcate = sum(pca(data.cte,:));
            is_pcate = zeros(size(s_pcate));
            is_pcate(s_pcate~=0) = 1./s_pcate(s_pcate~=0);
            pfc_pr = pfa_te * (pca(data.cte,:).*repmat(is_pcate,10,1))';
            % class assignment
            mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
            id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
            for i=1:length(ypr)
                if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
                ypr(i) = data.cte( id( mx(i,:) ) );
            end
            acc_pr = 100*sum(ypr==data.yte)/numel(ypr);
            fprintf('split %d, C=%1.2e: Acc = %1.3f%% (%d/%d)\n',...
                cvsplit,3^log3_C,acc_pr,sum(ypr==data.yte),numel(ypr))
        else
            fprintf('%s missing\n',fnam)
        end
    end
end
MATLAB/DAP/constants.mat
pnam: [1x44 char array]
feat: [1x6 cell array]
nfeat: [1x6 double array]
classes: [50x1 cell array]
predicates: [85x1 cell array]
prca: [50x85 double array]
prca_bin: [50x85 uint8 (logical) array]
MATLAB/DAP/DAP_eval.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Perform multiclass prediction from binary attributes and evaluate it.
"""
import os,sys
sys.path.append('/agbs/cluster/chl/libs/python2.5/site-packages/')

from numpy import *

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def loaddict(filename,converter=str):
    D={}
    for line in file(filename).readlines():
        line = line.split()
        D[line[0]] = converter(line[1].strip())

    return D

# adapt these paths and filenames to match local installation
classnames = loadstr('../classes.txt',nameonly)
numexamples = loaddict('numexamples.txt',int)

def evaluate(split,C):
    global test_classnames
    attributepattern = './DAP/cvfold%d_C%g_%%02d.prob' % (split,C)

    if split == 0:
        test_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
        train_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
    else:
        startid= (split-1)*10
        stopid = split*10
        test_classnames = classnames[startid:stopid]
        train_classnames = classnames[0:startid]+classnames[stopid:]

    test_classes = [ classnames.index(c) for c in test_classnames]
    train_classes = [ classnames.index(c) for c in train_classnames]
    M = loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)

    L=[]
    for c in test_classes:
        L.extend( [c]*numexamples[classnames[c]] )
    L=array(L) # (n,)

    P = []
    for i in range(85):
        P.append(loadtxt(attributepattern % i,float))
    P = array(P).T # (85,n)

    prior = mean(M[train_classes],axis=0)
    prior[prior==0.]=0.5
    prior[prior==1.]=0.5 # disallow degenerated priors

    M = M[test_classes] # (10,85)
    prob=[]
    for p in P:
        prob.append( prod(M*p + (1-M)*(1-p),axis=1)/prod(M*prior+(1-M)*(1-prior), axis=1) )

    MCpred = argmax( prob, axis=1 )

    d = len(test_classes)
    confusion=zeros([d,d])
    for pl,nl in zip(MCpred,L):
        try:
            gt = test_classes.index(nl)
            confusion[gt,pl] += 1.
        except:
            pass

    for row in confusion:
        row /= sum(row)

    return confusion,asarray(prob),L

def plot_confusion(confusion):
    from pylab import figure,imshow,clim,xticks,yticks,axis,setp,gray,colorbar,savefig,gca
    fig=figure(figsize=(10,9))
    imshow(confusion,interpolation='nearest',origin='upper')
    clim(0,1)
    xticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],rotation='vertical',fontsize=24)
    yticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],fontsize=24)
    axis([-.5,9.5,9.5,-.5])
    setp(gca().xaxis.get_major_ticks(), pad=18)
    setp(gca().yaxis.get_major_ticks(), pad=12)
    fig.subplots_adjust(left=0.30)
    fig.subplots_adjust(top=0.98)
    fig.subplots_adjust(right=0.98)
    fig.subplots_adjust(bottom=0.22)
    gray()
    colorbar(shrink=0.79)
    savefig('AwA-ROC-confusion-DAP.pdf')
    return

def plot_roc(P,GT):
    from pylab import figure,xticks,yticks,axis,setp,gray,colorbar,savefig,gca,clf,plot,legend,xlabel,ylabel
    from roc import roc
    AUC=[]
    CURVE=[]
    for i,c in enumerate(test_classnames):
        class_id = classnames.index(c)
        tp,fp,auc=roc(None,GT==class_id, P[:,i] ) # larger is better
        print "AUC: %s %5.3f" % (c,auc)
        AUC.append(auc)
        CURVE.append(array([fp,tp]))

    order = argsort(AUC)[::-1]
    styles=['-','-','-','-','-','-','-','--','--','--']
    figure(figsize=(9,5))
    for i in order:
        c = test_classnames[i]
        plot(CURVE[i][0],CURVE[i][1],label='%s (AUC: %3.2f)' % (c,AUC[i]),linewidth=3,linestyle=styles[i])

    legend(loc='lower right')
    xticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    yticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    xlabel('false positive rate',fontsize=18)  # the x-axis plots fp, so label it accordingly
    ylabel('true positive rate',fontsize=18)
    savefig('AwA-ROC-DAP.pdf')

def main():
    try:
        split = int(sys.argv[1])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[2])
    except IndexError:
        C = 10.

    confusion,prob,L = evaluate(split,C)
    print "Mean class accuracy %g" % mean(diag(confusion)*100)
    plot_confusion(confusion)
    plot_roc(prob,L)

if __name__ == '__main__':
    main()
MATLAB/DAP/liblinear_cv5.m
function liblinear_cv5(cvsplit,log3_C)
% path to liblinear
addpath /agbs/cluster/hn/mpi_animal_challenge/lib/liblinear-1.33/matlab
% path to Matlab feature representation
datapath = '/kyb/agbs/chl/mysrc/Animals-with-Attributes/code';

% build training-testing split
if cvsplit==0
    % get original split
    tmp = load([datapath,'/constants.mat'],'trainclasses_id','testclasses_id');
    cte = tmp.testclasses_id';
    ctr = tmp.trainclasses_id';
    clear tmp
else
    % build training-testing split
    cte = (cvsplit-1)*10+(1:10); % test classes
    ctr = setdiff(1:50,cte);     % training classes
end
load([datapath,'/constants.mat'])

%% load training data (40 classes)
fprintf('Load training set\n')
Xtr = []; ytr = [];
for idc = ctr % 40 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xtr = [Xtr,Xc];
    ytr = [ytr; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xtr = Xtr';

% train model
fprintf('Learning\n')
% logistic regression
C = 3^log3_C;
argstr = sprintf('-s 0 -c %f',C);
model = train(ytr, Xtr, argstr);

%% make prediction on training data
tic
[l,acc_tr,p] = predict(ytr, Xtr, model, '-b 1');
T = toc;
fprintf('training took %1.2f s\n',T)
pfc_tr = zeros(length(l),50); pfc_tr(:,model.Label) = p; % full 50 matrix

%% load test data (10 classes)
fprintf('Load test set\n')
Xte = []; yte = [];
for idc = cte % 10 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xte = [Xte,Xc];
    yte = [yte; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xte = Xte';

%% predict train classes on test data
[l,acc_te,p] = predict(yte, Xte, model, '-b 1');
pfc_te = zeros(length(l),50); pfc_te(:,model.Label) = p; % full 50 matrix

%% predict test classes on test data
% calculate p( attribute = j | image ) from p( train class = j | image )
pfa_te = pfc_te * ( prca ./ repmat(sum(prca,2),1,85) );
% calculate p( test class = j | image ) from p( attribute = j | image )
pfc_pr = pfa_te * (prca(cte,:)./repmat(sum(prca(cte,:)),10,1))';
% class assignment
mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
for i=1:length(ypr)
    if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
    ypr(i) = cte( id( mx(i,:) ) );
end
acc_pr = 100*sum(ypr==yte)/numel(ypr);
fprintf('Accuracy = %1.4f%% (%d/%d)\n',acc_pr,sum(ypr==yte),numel(ypr))

% save results
fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
save(fnam,'cvsplit','log3_C','argstr','C','acc_tr','acc_pr',...
    'ctr','cte','pfc_tr','pfc_te','pfc_pr','ytr','yte','ypr')
MATLAB/DAP/new-attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('./')

from numpy import *
from platt import *
import cPickle, bz2

def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))

# adapt these paths and filenames to match local installation
feature_pattern = './feat/%s-%s.pic.bz2'
labels_pattern = './feat/%s-labels.pic.bz2'
all_features = ['cq']

attribute_matrix = 2*loadtxt('../predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('../classes.txt',nameonly)
attributenames = loadstr('../predicates.txt',nameonly)

def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels

def train_attribute(attribute_id, C, split=0):
    from shogun import Classifier,Features,Kernel,Distance
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        ...