5.3. METHODOLOGY
tributes. Each attribute $a_q$ is associated with a set of elements representing its possible values, $E_q = \{e_q^1, e_q^2, \ldots, e_q^{M_q}\}$, where $e_q^i$ refers to the $i$-th element and $M_q$ is the total number of elements regarding $a_q$. For simplicity, we compile all $E_q$'s in order and hence derive a unified set of attribute elements $E = \bigcup_{q=1}^{Q} E_q = \{e_1, e_2, \ldots, e_M\}$, where $M = \sum_{q=1}^{Q} M_q$. In addition,
we have a set of positive top-bottom pairs $S = \{(t_{i_1}, b_{j_1}), (t_{i_2}, b_{j_2}), \ldots, (t_{i_N}, b_{j_N})\}$ curated by fashion experts, where $N$ is the total number of positive pairs. Accordingly, for each top $t_i$, we can derive a set of positive bottoms $B_i^+ = \{b_j \in B \mid (t_i, b_j) \in S\}$. Let $s_{ij}$ denote the compatibility between the top $t_i$ and the bottom $b_j$, based on which we can distinguish whether the given fashion items are compatible or not.
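The notation above can be illustrated with a small sketch: given the expert-curated pair set $S$, we derive each top's positive-bottom set $B_i^+$. All identifiers and pair values below are hypothetical toy data, not from the actual dataset.

```python
from collections import defaultdict

# Hypothetical toy pair set S: each tuple is a positive (t_i, b_j) pair,
# with tops and bottoms identified by integer indices.
S = [(0, 2), (0, 5), (1, 3)]

# Derive, for each top t_i, its set of positive bottoms B_i^+.
positive_bottoms = defaultdict(set)
for t_i, b_j in S:
    positive_bottoms[t_i].add(b_j)

print(positive_bottoms[0])  # {2, 5}
```

Any bottom not in $B_i^+$ can then be treated as a (potentially) negative match for $t_i$ when learning the compatibility score $s_{ij}$.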
5.3.2 SEMANTIC ATTRIBUTE REPRESENTATION
In practice, an online fashion item is usually characterized by a visual image, certain user-generated contextual descriptions, and structured category labels. The visual image and the structured category labels can faithfully capture the essential features of fashion items, such as the color, shape, and category, while the user-generated contextual description may be unreliable, as it is intrinsically noisy, not to mention the mendacious descriptions edited by crafty sellers.
Therefore, similar to the existing work [139], we exploit only the reliable visual cues and the structured category information to model the compatibility between fashion items. Notably, existing efforts mainly adopt advanced deep neural networks to learn effective representations for fashion items and measure their compatibility, owing to the compelling success of such networks in various research tasks. Nevertheless, as a purely data-driven learning scheme, a deep neural network suffers from poor interpretability, since each dimension of the learned representation does not explicitly refer to an intuitive semantic aspect of fashion items. Toward this end, we aim to learn meaningful representations for fashion items whose dimensions directly stand for semantic attributes, and hence enhance the model's interpretability.
On the one hand, regarding the sophisticated visual signals, we argue that taking advantage of well pre-trained attribute classification networks is the most natural and straightforward way to obtain interpretable semantic representations of fashion items. To ensure the performance of the attribute classification networks, we align each attribute $a_q$ with a separate attribute classification network $h_q$. It is worth noting that, as the category information also constitutes an essential attribute of fashion items, here we have $Q-1$ attributes characterized by the visual cues. We feed the visual image $I_i$ of the $i$-th top/bottom into these $h_q$'s and obtain the semantic attribute representations as follows:
$$f_i^q = h_q(I_i \mid \Theta_q), \quad q = 1, 2, \ldots, Q-1, \tag{5.1}$$
where $\Theta_q$ denotes the network parameters of $h_q$ and $f_i^q \in \mathbb{R}^{M_q}$ is the network output of $h_q$. The $d$-th entry in $f_i^q$ refers to the probability that the top $t_i$ presents the attribute element $e_q^d$. In particular, we denote $f_i^v = [f_i^1; f_i^2; \ldots; f_i^{Q-1}]$ as the final semantic attribute representation of the
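Equation (5.1) and the concatenation of the per-attribute outputs can be sketched as follows. This is a minimal illustration, not the actual architecture: random linear maps followed by a softmax stand in for the pre-trained attribute classifiers $h_q$, and the attribute sizes $M_q$, the image-feature dimension, and all values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    # Numerically stable softmax, turning scores into a probability vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


# Hypothetical stand-ins for the pre-trained classifiers h_q: each maps a
# toy image feature I_i to a probability vector over the M_q elements of
# attribute a_q (e.g., color, shape). Here Q - 1 = 3 visual attributes.
M = [4, 3, 5]      # M_q for each attribute
D = 8              # dimension of the toy image feature
weights = [rng.normal(size=(m, D)) for m in M]  # Theta_q stand-ins


def semantic_representation(image_feature):
    # f_i^q = h_q(I_i | Theta_q): one probability vector per attribute,
    # concatenated into the final representation f_i^v.
    parts = [softmax(W @ image_feature) for W in weights]
    return np.concatenate(parts)


I_i = rng.normal(size=D)
f_v = semantic_representation(I_i)
print(f_v.shape)  # (12,), i.e., the sum of the M_q's
```

Each dimension of `f_v` corresponds to one attribute element in the unified set $E$, which is exactly what makes the representation interpretable.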