Posted on January 1, 2017 at 12:00 PM
To use graph-theoretic principles to select a subset of features from a data set that has a very large number of features.
Approach
Map the features of a data set as a graph, based either on the similarity between features or on the information contribution of the features. This gives a unique visualization of feature relevance or redundancy. Then eliminate the irrelevant features and use graph-theoretic principles for drawing sub-graphs to derive a subset from the potentially redundant features.
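To make the similarity-based variant of this idea concrete, here is a minimal sketch and not the project's actual algorithm. It rests on several assumptions that the description above does not state: similarity is taken to be the absolute Pearson correlation, an edge is drawn when it reaches a hypothetical `threshold`, and the sub-graph step is approximated by a greedy independent set. The function names are illustrative only, and the elimination of irrelevant features is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant column has no defined correlation; treat it as unrelated.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def build_similarity_graph(columns, threshold=0.9):
    """Return adjacency sets; an edge joins features with |r| >= threshold.

    `columns` maps a feature name to its list of values. The threshold of
    0.9 is an arbitrary illustrative choice.
    """
    graph = {name: set() for name in columns}
    for a, b in combinations(columns, 2):
        if abs(pearson(columns[a], columns[b])) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def select_features(graph):
    """Greedy independent set: keep a feature, drop its neighbours."""
    selected, dropped = [], set()
    # Visit the most connected features first (ties broken by name) so
    # that one kept feature stands in for as many similar ones as possible.
    for name in sorted(graph, key=lambda f: (-len(graph[f]), f)):
        if name not in dropped:
            selected.append(name)
            dropped |= graph[name]
    return selected


if __name__ == "__main__":
    data = {
        "height_cm": [150, 160, 170, 180, 190],
        "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
        "weight_kg": [70, 52, 88, 61, 75],
    }
    print(select_features(build_similarity_graph(data)))
```

In this toy data the two height columns are almost perfectly correlated, so they share an edge and only one of them is kept. The pairwise loop is quadratic in the number of features, so a very high dimensional data set would need a cheaper way to build the graph.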
Current Status
The initial graph-theoretic representation of the features has been formulated in two forms, one based on similarity and one based on information contribution. These are named the Feature Association Map (FAM) and the Feature Information Map (FIM), respectively.
Next Step
The next step is to use fuzzy sets to decide the similarity of features, instead of assuming a hard similarity threshold. Rough sets will also be used to derive a single subset from multiple potentially optimal subsets. Finally, the algorithm will be tested on a very high dimensional data set.
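As an illustration of what replacing a hard threshold with a fuzzy notion of similarity could look like, the sketch below maps a similarity score to a degree of membership in the set of "similar" feature pairs. This is only one possible reading of the plan: the linear ramp and the `low` and `high` bounds are assumptions chosen for the example, and the rough-set step is not shown.

```python
def fuzzy_similarity(score, low=0.6, high=0.9):
    """Degree in [0, 1] to which a feature pair counts as 'similar'.

    `score` is any similarity measure whose magnitude lies in [0, 1],
    such as an absolute correlation. Below `low` the pair is not similar
    at all, above `high` it is fully similar, and in between the
    membership rises linearly. Both bounds are hypothetical values.
    """
    value = abs(score)
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


if __name__ == "__main__":
    for s in (0.5, 0.75, 0.95):
        print(s, round(fuzzy_similarity(s), 3))
```

A hard threshold is the special case where `low` and `high` coincide. With a ramp, borderline pairs carry a partial weight on their edge and are not forced to one side of a cut-off.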
Team Members