2015年4月20日 星期一

Online Dictionary Learning for Sparse Coding

Julien Mairal, Francis Bach, Jean Ponce and Guillermo Sapiro

Abstract

In sparse coding, where each training example is represented by sparse linear combination of basic elements, which forms a dictionary, a learned dictionary, is shown to have better performance over a pre-defined one. Though, evidences have demonstrated that learned dictionaries are crucial to achieve state-of-art results, the optimization problem one has to deal with is often the bottle neck and limits the more ubiquitous utilization. 

The authors proposed a online learning method, which hopefully solved problems often encountered in several dictionary learning algorithms such as too big a training set and a dynamic changing one. Basically, it harnesses the stochastic gradient descent, which is shown to often be the fastest in terms of convergence though not good conventionally, by taking an example (or a tiny set) a time. 


In a typical dictionary learning, one tries to minimizes the differences between training example and its sparse representation as shown in the first equation, which is equivalent to solving the one below, where D is the dictionary and alpha a slack variable ensuring sparsity. In the paper, the authors proposed a method iteratively solving D and alpha given the other fixed, which in the paper is proven of convergence. The authors also suggested several side methods to boost the performance.

Contributions
  1. Propose a method that is faster in both small and large training set than previous methods and show it could be used on real case such as removing imprinting of images.
  2. Cast dictionary learning problem as a optimization of a non-convex objective function over a convex set

2015年4月13日 星期一

To aggregate or not to aggregate: Selective match kernels for image search

Giorgos Tolias, Yannis Avrithis, Herve ́ Je ́gou

  Abstract

This paper focused on improving visual recognition accuracy, no matter it's objects, locations or scenes. The ubiquitous approach targeted at this problem is local descriptor and most of them are derived from BoW, which seeks to quantize the obtained descriptors into several discrete visual words and hence represent each image as a vector, which could be further combined with machine learning techniques such as SVM to achieve fair results. 

These methods, however, could be categorized into two "genres", of which are matching based and aggregated approaches. The authors hence analyzed them and proffered a new method, which is unprecedented in that it took the merits from these two and combine them to yield a result whose performance is even higher than that of state-of-art approaches. 

The matching based methods, in some sense, utilizes the descriptors independently, while the aggregated approach, indicated by its name, coalesced several descriptors, which if properly set not only preserves some individual information, but also has the advantage of using less memory. 

In the experiment, through setting different parameters, the authors showed that it's worth combing matching based and aggregated approaches to yield superior results.

Contribution
  1. Analyze the difference between these two seemingly different approaches 
  2. Combine flawlessly these two, proposing a method that is not only efficient in terms of memory consumption but also yields a better results.
Questions
  1. Not sure what it meant to have multiple assignments. And to "replicate each descriptor vector and assign each instance to a different visual word" ?
  2. The meaning of selectivity and threshold? Is it parameters to dictate how local descriptors are chosen or rendered?

2015年4月11日 星期六

Nonlinear Dimensionality Reduction by Locally Linear Embedding

Sam T. Roweis1 and Lawrence K. Saul

Abstract

In this paper, the authors proposed a method, which could help dimension reduction. They begun saying since in lots of different spheres, professions often encounter tons of data, which if no further processing, are normally of hundreds if not thousands of dimensions, hindering scientists to efficiently deal with them. Of this reason, the importance of dimension reduction can't be overemphasized. However, during this process, similarities, distances or certain characteristics are apt to lose or wrongly preserved. Hence, the method proffered in this paper utilized local areas that are overlapped to preserve the traits shown in the original dimensionality after reduced to a lower one.

First, several common methods harnessed in dimension reduction are discussed, during which, the author stated that several methods simply use the Euclidian distance, which could produce misleading result when dealing with distribution like manifold. Though, some methods such as one employing the shortest path could in some way avoid this nuisance, their problem is the time-consuming dynamic programming required to compute the shortest-path distance.  

The authors hence proposed the method which harnesses local information to grasp the global one. Through representing each data point by the K nearest neighbors, each with a weight controlling how important that neighbor is when reconstructing the point, the authors managed to preserve the correlations between points and their neighbors after applying dimension reduction. Besides, by using overlapped local information, this method not only gets away with the rather heavy computation in shortest path distance, but also successfully model the non-linear characteristics of the whole distribution via these linear representations. 

Contributions

  1. Proposing a method preserving meaningful information in each axises of its reduced dimensionality, which has the merits of not only efficiency but also efficacy.
  2. Combining with other feature processing and learning schemas, the method will yield even better performance.