Learning similarity metrics from case solution similarity


Carlos Morell, Rafael Bello, Ricardo Grau, Yanet Rodríguez
Universidad Central de Las Villas, Cuba

Computer Science Department. Universidad Central de Las Villas. Msc. Yanet Rodríguez, [email protected]


Content

1. Motivation
2. Basis
3. New feature weighting algorithm
4. Experimental Results
5. Conclusions


Motivation

[Figure: the CBR cycle. A New Case is matched against the Case Base to obtain a Retrieved Case; reusing it yields a Solved Case with a Suggested Solution; the Tested/Repaired Case is revised into a Confirmed Solution; the Learned Case is retained in the Case Base.]

• Retrieve: determine the most similar case(s).
• Reuse: solve the new problem by reusing information and knowledge from the retrieved case(s).
• Revise: evaluate the applicability of the proposed solution in the real world.
• Retain: update the case base with the newly learned case for future problem solving.


Motivation

• The relationship between the retrieval procedure and the adaptation method is crucial, because it is desirable to retrieve cases that require a smaller adaptation effort.
• It is convenient to adjust the similarity criteria so as to minimize the number of transformations to be carried out on the retrieved solutions.
• Most approaches to learning feature weights assume CBR systems for classification tasks.


Basis

• A case c is a tuple ∈ D × S consisting of a problem description and an associated solution.
• A case description is represented using a feature-value representation. σd(d, di) denotes the similarity between descriptions d and di:

σd(d, di) = ∑a=1..n wa · sima(d, di)

• A case solution can be arbitrarily complex: σS(s, si) denotes the similarity between solutions s and si.
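The weighted aggregation of local similarities can be sketched in Python. This is a minimal sketch: the numeric feature encoding and the particular local similarity function `local_sim` are assumptions for illustration, not taken from the slides.

```python
# Global description similarity as a weighted sum of local similarities:
# sigma_d(d, di) = sum_a w_a * sim_a(d, di)

def local_sim(x, y):
    # Hypothetical local similarity for numeric features in [0, 1]:
    # 1 minus the absolute difference.
    return 1.0 - abs(x - y)

def sigma_d(d, di, weights):
    """Weighted global similarity between two feature-value descriptions."""
    assert len(d) == len(di) == len(weights)
    return sum(w * local_sim(a, b) for w, a, b in zip(weights, d, di))

# Example: three features, uniform weights summing to 1.
d1 = [0.2, 0.5, 0.9]
d2 = [0.2, 0.7, 0.4]
w = [1/3, 1/3, 1/3]
print(round(sigma_d(d1, d2, w), 4))
```

With weights normalized to sum to 1, σd stays in [0, 1] whenever the local similarities do.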


Basis

• The more similar two problem descriptions are, the more useful it is to use the solution of one for the other problem.
• The similarity σS(s, si) between two solutions is an a posteriori criterion.
• The similarity σd(d, di) between two descriptions is an a priori criterion.


Basis

• Ordering generated by σS (a posteriori):

S(q) = {c1, c2, ..., cn}, where ∀i: σS(sq, sci) ≤ σS(sq, sci+1)

• Ordering generated by σD (a priori):

Dπ(q) = {cπ1, cπ2, ..., cπn}, where ∀i: σD(dq, dcπi) ≤ σD(dq, dcπi+1)

• The disorder of the a priori ordering with respect to the a posteriori one:

DσD(q) = ∑i=1..n (i − πi)² / ⌊n²/2⌋


Feature weighting algorithm

• Problem: to find the weight vector w such that, given the known local similarity functions, the resulting global similarity function makes DσD(q) minimal for all q ∈ CB:

σd(d, di) = ∑a=1..n wa · sima(d, di)


Feature weighting algorithm

• The solution to the previously stated problem is to minimize the disorder caused by the similarity function over the entire case base. This is equivalent to minimizing the following error function:

ew(q) = ∑i=1..n (σD(d, di) − σS(s, si))²   ∀ q ∈ CB

• Minimizing ew(q) for every q in CB implies, in particular, minimizing the difference, according to σD and σS, between the separation of a known case q and the rest of the CB.


Feature weighting algorithm

• The average quadratic error computed according to the previous expression is:

E(w) = (1/n²) ∑j=1..n ew(cj) = (1/n²) ∑j=1..n ∑k=1..n (σDw(cj, ck) − σS(cj, ck))²

• A formulation of the first derivative of the error with respect to the weights is:

∂E(w)/∂wi = (2/n²) ∑j=1..n ∑k=1..n (σDw(cj, ck) − σS(cj, ck)) · simi(cj, ck)
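The error and its gradient can be checked numerically. In this sketch the tiny two-case data is invented for illustration: `sim[j][k]` holds the per-feature local similarities between cases j and k, and `sol[j][k]` holds σS(cj, ck).

```python
# Average quadratic error E(w) over all case pairs and its gradient,
# following the two formulas above. All data here is synthetic.

def error_and_gradient(sim, sol, w):
    n, m = len(sim), len(w)
    E, grad = 0.0, [0.0] * m
    for j in range(n):
        for k in range(n):
            # Weighted description similarity minus solution similarity.
            diff = sum(wa * sa for wa, sa in zip(w, sim[j][k])) - sol[j][k]
            E += diff * diff
            for a in range(m):
                grad[a] += 2.0 * diff * sim[j][k][a]
    return E / n**2, [g / n**2 for g in grad]

# Two cases, two features: local similarities and solution similarities.
sim = [[[1.0, 1.0], [0.5, 0.9]],
       [[0.5, 0.9], [1.0, 1.0]]]
sol = [[1.0, 0.6],
       [0.6, 1.0]]
E, grad = error_and_gradient(sim, sol, [0.5, 0.5])
print(round(E, 6))  # 0.005
print(grad)
```

Here the second feature contributes the larger gradient component because its local similarity (0.9) amplifies the disagreement more than the first feature's (0.5).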


Feature weighting algorithm

• It is now feasible to minimize the error with a gradient method, modifying the weights in the direction of the gradient:

1. Initialize the weight vector w.
2. Determine, for each case in the CB, its utility σS with respect to the rest.
3. Compute the error E(w) according to (4).
4. Initialize the learning rate λ.
5. Repeat until λ becomes very small:
   a. Generate a new function σw'(q, c) with w'i = wi − λ · ∂E(w)/∂wi
   b. Normalize the new weights.
   c. Compute E(w') according to (4).
   d. If E(w') < E(w) then w = w', else λ = λ/2.
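Steps 1 to 5 can be sketched as gradient descent with learning-rate halving. This is a minimal sketch on synthetic data; clipping negative weights to zero before normalizing is an added assumption, since the slides only say "normalize".

```python
# Gradient descent on E(w) with weight normalization and halving of the
# learning rate lambda, following steps 1-5 above. Case data is synthetic.

def learn_weights(sim, sol, w, lam=1.0, min_lam=1e-6):
    n, m = len(sim), len(w)

    def e_grad(w):
        # E(w) and dE/dw_i as on the previous slide.
        E, g = 0.0, [0.0] * m
        for j in range(n):
            for k in range(n):
                diff = sum(wa * sa for wa, sa in zip(w, sim[j][k])) - sol[j][k]
                E += diff * diff
                for a in range(m):
                    g[a] += 2.0 * diff * sim[j][k][a]
        return E / n**2, [x / n**2 for x in g]

    E, grad = e_grad(w)                                            # steps 2-3
    while lam > min_lam:                                           # step 5
        w2 = [max(wi - lam * gi, 0.0) for wi, gi in zip(w, grad)]  # 5a (clipping is an assumption)
        s = sum(w2) or 1.0
        w2 = [wi / s for wi in w2]                                 # 5b: normalize
        E2, grad2 = e_grad(w2)                                     # 5c
        if E2 < E:
            w, E, grad = w2, E2, grad2                             # 5d: accept
        else:
            lam /= 2.0                                             # 5d: halve lambda
    return w, E

sim = [[[1.0, 1.0], [0.5, 0.9]],
       [[0.5, 0.9], [1.0, 1.0]]]
sol = [[1.0, 0.6],
       [0.6, 1.0]]
w, E = learn_weights(sim, sol, [0.5, 0.5])   # step 1: uniform initialization
print([round(x, 3) for x in w], round(E, 6))
```

Because every accepted step strictly decreases E and λ is halved otherwise, the loop terminates; on this toy data the learned weights shift toward the first feature, whose local similarity better predicts the solution similarity.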

Computer Science Department. Universidad Central de Las Villas. Msc. Yanet Rodríguez, [email protected]

11



Experimental Results

• The original purpose of this method was the development of a Computer-Aided Process Planning system for symmetrical-rotational parts.
• Selecting appropriate weights for each feature is a very difficult task for domain experts, basically due to their lack of knowledge about the adaptation process.
• In this particular application, the solution utility can be measured as the number of T-operators needed, using transformational analogy as the adaptation method, to adapt the plan of the retrieved case c to the problem q.


Experimental Results

• Given two planning problems ci and cj with known solutions (plans) si and sj, the similarity of their solutions (their utility) can be expressed as a dual function of the edit distance:

σs(si, sj) = util(si, sj) = 1 − dLEV(si, sj) / (|si| + |sj|)

where |si| denotes the number of steps in the plan solution si.

Note that if the plans are identical, then σs(si, sj) = 1; on the other hand, if they differ in all their steps, the value is 0.
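The solution similarity above can be sketched with a standard Levenshtein distance over plan steps. The plan step names below are hypothetical, invented for illustration.

```python
# Solution similarity as the dual of the edit (Levenshtein) distance
# between two plans, each given as a sequence of step identifiers.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return prev[n]

def sigma_s(si, sj):
    """util(si, sj) = 1 - d_LEV(si, sj) / (|si| + |sj|)."""
    return 1.0 - levenshtein(si, sj) / (len(si) + len(sj))

# Hypothetical manufacturing plans as step sequences.
plan_a = ["face", "rough-turn", "finish-turn", "drill"]
plan_b = ["face", "rough-turn", "drill"]
print(sigma_s(plan_a, plan_a))  # identical plans -> 1.0
print(round(sigma_s(plan_a, plan_b), 4))
```

Plans differing by one deleted step score 1 − 1/(|si| + |sj|), so longer shared prefixes and suffixes keep the utility high.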


Experimental Results

• Therefore, the hypothesis to be demonstrated in this experiment is the following:

H: In the elaboration of manufacturing plans, using a similarity function weighted by the proposed weight-calculation algorithm allows the retrieval of plans that require less adaptation effort.


Experimental Results

• The case base contains 130 cases; each case stores the description of a piece and its manufacturing plan.
• For each case of this CB, a solution given by the experts is known a priori.
• Considering each case of the CB as a problem q to solve, two different orderings of the remaining cases are obtained.


Experimental Results

• To test the previously stated hypothesis, two variants are compared when retrieving the cases most similar to problem q with the similarity function (sim):

1. Feature weights are not considered, i.e., all features have the same importance (non-weighted similarity).
2. The weights obtained with the previously proposed algorithm are used (weighted similarity).


Experimental Results

• For each case q there are two measures of "associated disorder". According to the previously defined choices, two continuous variables arise: Dsim1 (variant I) and Dsimw (variant II).
• It is desirable to reject the hypothesis that the two variables have the same distribution (equality of positive and negative ranks in the differences of the observed variables). A non-parametric test was used: the Wilcoxon signed-rank test.
• To improve the accuracy of the Wilcoxon test's significance, Monte Carlo simulation techniques were used.


Experimental Results

• The results of the comparison are as follows: the disorder measured with variant II is lower than the disorder observed with variant I in 99 of 130 cases, with no ties.
• The Wilcoxon test therefore indicates a significant advantage for the weighted similarity (significance: 0.000, with a confidence interval close to this value).


Conclusions

• In this work, the similarity between solutions is used as a heuristic to estimate the importance of features, and an experiment in the domain of case-based planning is presented that shows the effectiveness of this approach.


End

Thanks for your attention!

• Questions?
• Comments?
