Asked by:
Calculating object distance/similarity in Bayesian Networks
Question

I have a tree structure of attributes (observed/hidden). This structure is built using some objects (data points) and with the help of a prior domain ontology. Obviously, each object can be represented as a vector o=(x1, x2,..., xn) of observable variables.
I'm interested to know the similarity between two objects (o1 and o2)?
Would you please consider the following scenario and give me a viewpoint in Graphical Models? i.e I'm looking for a computational approach for object similarity in hierarchical attribute space.
Sunday, June 2, 2013 12:35 AM
All replies

Do you have any training data regarding similarity? Or is it just a bag of points? How many values does each attribute have? What is the significance of the tree structure?Monday, June 3, 2013 2:55 PMOwner

I have some training data (the relatedness value between some data points):
o1, o2, r
where "o1" and "o2" are data points and "r" is the value of relatedness.
I can build a binary or weighted feature vector. So the values of each attribute may be binary (0/1) or a real value between [0, 1].
The significance of tree structure is that it model the relatedness between observable features. Each hidden variable clusters some observable variable like latent tree model: http://www.jair.org/papers/paper2530.html
Monday, June 3, 2013 4:31 PM 
A latent tree model is simply a generative model of data. The interior nodes exist to model correlation between leaves. Similarity between objects is a rather different thing. What do the observed "r" values represent, and how are they related to the tree?
Monday, June 3, 2013 4:59 PMOwner 
Suppose I have a set of documents, for example:
d1="Emancipation Proclamation"
d2="Gettysburg Address"
These documents are represented using term features (observed variables):
d1 = ("Emancipation", "Proclamation")
d2 = ("Gettysburg", "Address")
Using observed variables (visible terms) for computing semantic relatedness between the above documents leads to mismatch.
Now, suppose that I have a hierarchical model which represent the relationship between observed variables using some latent variables (more general concepts). Exactly, I want to know if I have such a hierarchical relationship, is it possible to learn a graphical model?
If it's possible, then what computational possibilities will be provided by this learned model?, i.e. classification, object similarity and etc.
Monday, June 3, 2013 7:56 PM 
Any generative model can be used for classification or object similarity. But the notion of similarity that you will get has more to do with correlation than "semantic relatedness". For example, you could use Latent Dirichlet Allocation to do classification and similarity. But the topics produced by this method are chosen to model correlation only.
 Proposed as answer by John GuiverMicrosoft employee, Owner Wednesday, August 14, 2013 7:43 AM
Wednesday, June 5, 2013 3:18 PMOwner