Untitled

		Задачи, в которых используются графы в качестве данных всегда представляли вызов в задачах машинного обучения из-за своей сложной структуры.
		Однако, множество важных практических задач формулируются именно через графы.
		Например, в социальной сети мы хотим спрогнозировать профиль интересов пользователей \cite{yang2011like} или
		найти новые связи с друзьями \cite{liben2007link}. В графах белок-белковых взаимодействий можно решать задачу прогнозирования функциональных групп белков \cite{radivojac2013large}. Для всех подобных задач машинного обучения желательно иметь набор информативных, независимых признаков. В случае типичных задачах анализа данных на графах такой набор отсутсвует. В таком случае приходится извлекать хорошие признаки для вершин самостоятельно, учитывая те зависимости, которые накладывает граф. В каждой отдельной

		Graph is a complex data structure for machine learning tasks, but many important real-world problems are formulated exactly in this form. For example, in a social network, we are interested in predicting interests of users \cite{yang2011like} or finding missing link of friends \cite{liben2007link}. In a protein-protein interaction network we might be interested in predicting functional labels of proteins \cite{radivojac2013large}. Any supervised machine learning algorithm requires a set of informative, discriminative, and independent features. In usual network analysis task we do not have such kind of features and need to extract good ones, considering relations with other nodes, which can be hard to implement in each particular case. Here we have two main approaches. One is to manually construct this feature set. In this approach we need to have expert knowledge in a domain-specific area, the features are designed for the specific problem and do not generalize across different prediction tasks, not even speaking about human time costs. Another approach is to learn such features. We do not need any specific knowledge and ideally can use one technique for all kind of tasks.\\

		To sum up, machine learning tasks over graphs require careful preprocessing. Recently developed approaches in representation learning simplify feature extraction for a big variety of problems \cite{bengio2013representation}. However, many feature extraction methods are not designed for graph tasks exactly, so they can not be expressive enough to capture all information stored in graph data. In this work, we are developing new representation learning technique, which is based on modern approaches and specially designed for graph data. We expect that the method will improve quality for many types of network tasks. \\

		Recently, efficient estimation of word representations in vector space (word2vec) was introduced \cite{mikolov2013word2vec}. It is a powerful approach in natural language processing that gave rise to a wave of similar approaches in different domains. Also, several network representation methods such as deepWalk \cite{perozzi2014deepwalk} and node2vec \cite{grover2016node2vec} use word2vec as a background method.
		However, originally word2vec developed for sequence type of data and it is needed to have a sequence of nodes (instead of words) to use it. Therefore such approaches as deepWalk and node2vec differs only in the way how to sample the sequence of nodes from graph.\\

		We want to use another functional, which is more suitable for network data but keep the main representation learning framework the same. One of the candidates for such quality criteria is recently introduced Histogram Loss \cite{ustinova2016learning}. It was developed for learning deep embeddings in image retrieval tasks but seems to be a perfect candidate for graph problems too. The loss is computed in two steps. First, two distance distributions for positive and negative sample pairs are estimated. Second, the probability of a positive pair to have a lower similarity score than a negative pair is calculated. In a usual case, positive means match of labels of two data points, and negative is a mismatch. In the case of graph data edge can be considered as matching indicator. If two nodes have edge between them, they are a positive pair, if not, they are a negative pair and we directly use the graph structure.\\

		We will test this approach in the same standard data sets as  for node2vec, to make correct comparison with this method, which is state-of-the-art node embedding algorithm and expect to achieve better result on them. To be precise, we will use such datasets as BlogCatalog \cite{zafarani2009social}, Protein-Protein Interaction network (PPI) \cite{stark2011biogrid}, Wikipedia \cite{mahoney2009large} for multilabel classification and Facebook \cite{leskovec2015snap}, PPI \cite{stark2011biogrid}, arXiv \cite{leskovec2015snap} for link prediction task. \\

		To sum up, our aim is to develop a new state-of-the-art approach to handling graph data with modern techniques and methods to improve quality for a wide range of network tasks. The goal has a big potential impact on modern research in this area.