Social network search based on semantic analysis and learning

Feifei Kou, Junping Du*, Yijiang He, Lingfei Ye

Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China

Because of everyone's involvement in social networks, social networks are full of massive multimedia data, and events are released and disseminated through social networks in the form of multi-modal and multi-attribute heterogeneous data. There has been a great deal of research on social network search. Considering the spatio-temporal features of messages and the social relationships among users, we summarize an overall social network search framework from the perspective of semantics, based on existing research. For social network search, the acquisition and representation of spatio-temporal data is the basis, the semantic analysis and modeling of social network cross-media big data is an important component, deep semantic learning of social networks is the key research field, and the indexing and ranking mechanism is an indispensable part. This paper reviews the current studies in these fields, and then the main challenges of social network search are given. Finally, we give an outlook on the prospects and further work of social network search.

Semantic analysis; Semantic learning; Cross-modal; Social network search

With social networks becoming more and more popular, a large number of users are continuously active. They constantly publish what they see and what they think at any moment. Users in social networks can follow people they are interested in, and comment on messages or forward them. Social networks have accumulated a large amount of user-generated data. Owing to the prevalence of mobile phone cameras, social networks now contain a variety of heterogeneous media, such as images and videos, in addition to text information. In social networks such as Twitter and Sina Weibo, a posted message is limited to 140 characters [1]. This is too short to convey what users want to express, so they usually attach one or more images in addition to the concise text to describe an event. Social network data is multi-modal and heterogeneous, and is also accompanied by multiple attributes, such as the time stamp and location information of the posted message and the spatio-temporal information of the event described by the tweet (i.e., where and when the event happened). There are also a large number of social relationships in social networks. When releasing micro-blogs, many users use hashtags, and sometimes share related links to provide more details about the events [2]. The massive social network data contains extremely valuable information, and its characteristics bring both opportunities and challenges for social network search.

On social network platforms, through publishing, forwarding and commenting, people can share information quickly and efficiently. Many events are first reported on social network platforms and then gradually develop into hot issues. Therefore, how to obtain information of interest from the massive social network data becomes especially important [3]. Users can search for timely information, social information, and topical information. Timely information search means keeping up with what is happening and understanding trends or news. Social information search means searching for specific individuals and what they are interested in. Topical information search means searching for specific topics and the public sentiment about them. Social network search has attracted wide attention; for example, the internationally renowned TREC conference has set micro-blog search tracks in recent years. In current social network search services, such as those on Twitter and Sina Weibo, users enter search keywords and a retrieved list is returned. However, the large amount of social network data, the short contents, users' misspellings and the diversity of narrative styles lead to poor results for keyword-based social network search.

With the development of deep learning, cross-modal retrieval has made progress. Using text to retrieve images or using images to retrieve text has achieved good results [4]. In real life, when we see an image, we want to know what has happened behind it. Because of everyone's involvement in social networks, there is massive multimedia data in social networks. When searching with an image on a social network, we want to obtain the tweets that include the same or similar images, and we also need the tweets whose texts are related to or consistent with the query image. Through social network search, we can get more detailed information. If we want to know more details about the event that a brief micro-blog describes, the corresponding representative images can give us a richer and more vivid description of the situation. Therefore, it is very attractive and also challenging to analyze multi-modal heterogeneous social network data. By representing and modeling the spatial and temporal information, semantic analysis and semantic learning can be used to realize social network cross-modal search.

At present, social network search has aroused wide concern, and the main research work focuses on the following aspects. (1) The acquisition and representation of spatio-temporal social network data, namely obtaining social network data, cleaning and filtering the data in a reasonable manner, extracting data features, and storing and managing the data effectively. (2) The semantic analysis and modeling of cross-modal data in social networks, namely using semantic analysis technology to achieve accurate content understanding of cross-modal data, modeling the multiple features of social network data, such as temporal and spatial features, text features, visual features and social features, and mapping these features into the same representation space. (3) The deep semantic learning of social networks, namely realizing association mapping learning between text features and visual features to achieve heterogeneous data matching. (4) The indexing and ranking mechanism of social network search: to realize social network search, social network data should be indexed and ranked according to different search demands; considering the actual situation and assigning different weights to different features, such as text features, visual features, social features, temporal and spatial features, hashtags, link information and social relationships, different ranking methods can be designed. Based on the above analysis and recent research results on social network search, the overall framework of social network search based on semantic analysis and learning is summarized in Fig. 1.

This paper analyzes the characteristics of social network data, gives the main purposes of social network search and the existing search forms in social networks, points out the significance of social network cross-modal search, and gives the main architecture of social network search. Furthermore, this paper reviews the current studies on the acquisition and representation of spatio-temporal data, the semantic analysis and modeling of social networks, the deep semantic learning of social networks, and the indexing and ranking mechanism of social network search, and then the main challenges of social network search are given. Finally, future work in this field is discussed.

1. The acquisition and representation of the cross-modal and spatio-temporal data in online social networks

A social network is a kind of relationship network that helps users establish online friendships and share interests, hobbies, status and activity information among friends, with powerful functions for information release, dissemination, acquisition and sharing. Social network data mainly includes text information, image information, time stamps and location information [5]. The acquisition and representation of spatio-temporal data provides the research foundation for event detection [6], cross-media search [7], image annotation [8] and other fields of study.

1.1. The acquisition of the spatio-temporal data in online social networks

Fig. 1. Overall framework of social network search based on semantic analysis and learning.

There are two main ways to obtain social network data. One applies to social networks that provide a good data API, such as Twitter and Flickr; we can acquire information streams by keeping a long socket connection open after passing the OAuth validation. The other applies to social networks that do not provide a good data API, such as Sina Weibo; researchers can only use crawler technology to get the desired information.
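As a minimal sketch of this long-connection acquisition style, the snippet below streams posts over HTTP, assuming a Twitter API v2 bearer token and the sampled-stream endpoint; the credential placeholder and field handling are assumptions for illustration, not taken from the cited works.

```python
import json
import requests

# Assumed credential and endpoint (Twitter API v2 sampled stream); not part of the paper.
BEARER_TOKEN = "YOUR_BEARER_TOKEN"
STREAM_URL = "https://api.twitter.com/2/tweets/sample/stream"

def stream_tweets(url: str, token: str):
    """Keep a long-lived HTTP connection open and yield tweets as they arrive."""
    headers = {"Authorization": f"Bearer {token}"}
    with requests.get(url, headers=headers, stream=True, timeout=90) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:                      # skip keep-alive newlines
                yield json.loads(line)

if __name__ == "__main__":
    for tweet in stream_tweets(STREAM_URL, BEARER_TOKEN):
        print(tweet.get("data", {}).get("text", ""))
```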

Zhou et al. [9] put forward a Sina Weibo data crawler supporting parallelization. This tool can crawl specific Weibo information, such as a user's fans and Weibo text content, in real time. Using keyword matching and parallelization, it can obtain many users' information at the same time. Experimental results indicated that the data acquired by this tool is real-time and accurate. Zeng et al. [10] developed a topic-based micro-blog web crawler. They designed a breadth-first search strategy and applied it to the micro-blog crawler. The main characteristics of this work are the analysis and design of a short-text subject extraction technique and a multiple-keyword matching technique. The designed topic-based prototype system can scrape and store micro-blog data in real time. Although its efficiency is lower, the crawled micro-blog data has good topic relevance.

1.2. The representation of the multi-modal and multi-attribute data in online social networks

The vector space model, the LDA topic model and word embeddings are usually used to represent text features. In Ref. [11], text information was modeled by the LDA topic model. A word dictionary for the local corpus was constructed, the distributions of words over different topics were obtained by the probabilistic topic model, and the semantics of the text were mapped into the topic space. In Ref. [12], the skip-gram model is used to represent text: by predicting the surrounding words, every word in the text is represented with a fixed-length vector. Moreover, the authors trained the models on a large amount of data, so that words and phrases can be represented with high quality. All these text feature representation methods have achieved great success on long texts, especially the topic model and the word embedding methods. However, text messages in social networks are usually very short, so how to use the existing text feature representations to represent short texts is an urgent problem.
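The sketch below illustrates both representations on a toy corpus, assuming gensim 4.x; the documents and hyper-parameters are placeholders chosen only for illustration.

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel, Word2Vec

# Toy corpus standing in for tokenized social network texts.
docs = [["earthquake", "rescue", "city"],
        ["concert", "city", "crowd"],
        ["earthquake", "damage", "rescue"]]

# LDA: map each document into a topic space.
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10)
print(lda.get_document_topics(corpus[0]))    # topic distribution of the first document

# Skip-gram: represent each word with a fixed-length vector by predicting context words.
w2v = Word2Vec(sentences=docs, vector_size=50, window=2, sg=1, min_count=1, epochs=50)
print(w2v.wv["earthquake"][:5])              # first 5 dimensions of a word vector
```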

In the cross-media field, SIFT features and image features learned by deep learning methods are usually used for image representation. The SIFT (scale-invariant feature transform) feature is an important feature in computer vision. Different from traditional recognition methods, SIFT does not focus on global characteristics but on local points: it builds a scale space to describe local points, selects the appropriate scale to adapt to changes in image size, and uses orientation information to adapt well to rotation. For cross-media search, SIFT features are usually clustered by the k-means algorithm to acquire visual words. Using a topic model to model the visual words, the distributions of visual words under different topics can be obtained, and finally the image is mapped into a topic semantic space [13].
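A minimal bag-of-visual-words sketch of this pipeline is shown below, assuming OpenCV >= 4.4 and scikit-learn; the image paths and vocabulary size are placeholders.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

image_paths = ["img1.jpg", "img2.jpg"]       # assumed local image files
sift = cv2.SIFT_create()

# Extract SIFT descriptors for each image.
descriptors_per_image = []
for path in image_paths:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, des = sift.detectAndCompute(gray, None)
    if des is not None:
        descriptors_per_image.append(des)

# Cluster all descriptors into a visual vocabulary (the "visual words").
k = 100
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
kmeans.fit(np.vstack(descriptors_per_image))

# Represent each image as a normalized histogram of visual words,
# which can then serve as input to a topic model.
for des in descriptors_per_image:
    words = kmeans.predict(des)
    hist = np.bincount(words, minlength=k).astype(float)
    print(hist / hist.sum())
```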

At present, deep image features are mainly acquired through deep neural networks. Two commonly used deep neural network structures are AlexNet [14] and VGG-Net [15]. AlexNet is a convolutional neural network with five convolutional layers, two fully connected layers and a softmax layer. VGG-Net uses more network layers, generally 16 to 19, and in the convolutional layers the sizes of the convolution filters are the same. Image features acquired by deep learning methods have achieved better results than hand-crafted features on tasks such as image classification, image annotation and image retrieval. The big data brought by the Internet provides many training samples for deep neural networks, and the use of GPUs guarantees the operation speed.
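As a minimal sketch, the snippet below extracts a 4096-dimensional feature from a pre-trained VGG-16, assuming a recent torchvision with ImageNet weights available; the input tensor stands in for a preprocessed image.

```python
import torch
import torchvision

# Load VGG-16 with ImageNet weights (assumes torchvision >= 0.13 weight enums).
weights = torchvision.models.VGG16_Weights.IMAGENET1K_V1
model = torchvision.models.vgg16(weights=weights).eval()

image = torch.randn(1, 3, 224, 224)          # placeholder for a normalized image batch

with torch.no_grad():
    x = model.features(image)                # convolutional feature maps
    x = model.avgpool(x)
    x = torch.flatten(x, 1)
    feature = model.classifier[:-1](x)       # 4096-d fc7 feature, final softmax layer dropped

print(feature.shape)                         # torch.Size([1, 4096])
```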

The time and location features of messages published by users in social networks can be processed discretely or modeled continuously. For example, time is sliced in dynamic LDA [16]. In Ref. [17], the authors modeled time information as a beta distribution, represented location information by continuous latitude and longitude, and also modeled the continuous latitude and longitude as beta distributions respectively. In Ref. [18], the authors proposed STM-Twitter-LDA, which also used a beta distribution to represent continuous time information but processed location information discretely: they collected Twitter data from 16 countries over the same period and used each country as a region to model the location information. Modeling time and location features as continuous probability distributions can give more accurate results for social network search, but not all cases require continuous modeling; processing time and location features discretely can greatly improve search speed.
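The sketch below shows the continuous modeling idea in its simplest form: normalize post timestamps into [0, 1] and fit a beta distribution with SciPy. The synthetic timestamps and the use of the fitted density as a relevance weight are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic normalized post times standing in for real timestamps scaled to [0, 1].
rng = np.random.default_rng(0)
timestamps = np.sort(rng.beta(2.0, 5.0, size=1000))

# Fit a beta distribution with the support fixed to [0, 1].
a, b, loc, scale = stats.beta.fit(timestamps, floc=0, fscale=1)
print(f"fitted beta parameters: a={a:.2f}, b={b:.2f}")

# The density at a query time could then serve as a temporal relevance weight in ranking.
query_time = 0.3
print(stats.beta.pdf(query_time, a, b))
```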

1.3. Storage and management of social network data

A large amount of data is generated all the time in social networks. Social network data has the characteristics of big data and fast data at the same time, so how to effectively store and manage it is very important. To deal with the coexistence of big data and fast data characteristics in social networks, Mishne et al. [19] first designed a Hadoop-based data system, which achieved good results in dealing with big data. However, when processing real-time fast data, its performance was poor. To support Twitter query suggestions, they replaced it with an in-memory processing engine, with which real-time social network search can be realized. For most existing social network searches, query operations are carried out in memory, and memory is scarce; when a lot of data is stored in memory, it is very difficult to realize real-time social network search. Magdy et al. [20] proposed a kFlushing strategy to increase the memory hit ratio: when memory is full, it transfers the less relevant data from memory to disk, so that new data can be handled in memory and real-time search in social networks can be achieved.
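A minimal sketch of this flush-when-full idea is given below; it is loosely inspired by, but not an implementation of, the kFlushing strategy of Ref. [20], and the relevance scores, capacity and file path are illustrative assumptions.

```python
import heapq
import json

class FlushingStore:
    """Keep the most relevant posts in memory; spill the least relevant ones to disk."""

    def __init__(self, capacity: int, disk_path: str = "flushed.jsonl"):
        self.capacity = capacity
        self.disk_path = disk_path
        self.heap = []          # min-heap of (relevance, post_id)
        self.in_memory = {}     # post_id -> post

    def insert(self, post_id: str, post: dict, relevance: float) -> None:
        heapq.heappush(self.heap, (relevance, post_id))
        self.in_memory[post_id] = post
        if len(self.heap) > self.capacity:          # memory "full": flush least relevant post
            _, flushed_id = heapq.heappop(self.heap)
            flushed_post = self.in_memory.pop(flushed_id)
            with open(self.disk_path, "a") as f:
                f.write(json.dumps({"id": flushed_id, **flushed_post}) + "\n")

store = FlushingStore(capacity=2)
store.insert("t1", {"text": "minor update"}, relevance=0.1)
store.insert("t2", {"text": "breaking news"}, relevance=0.9)
store.insert("t3", {"text": "trending topic"}, relevance=0.7)   # triggers a flush of t1
```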

Social networks provide a platform for users to publish and disseminate information, and provide a data foundation for research on large-scale social networks [21]. Existing representation methods for text features, image features, time features and location features have achieved good results in their specific fields. However, for multi-modal and multi-attribute social network data, how to fuse all kinds of features together and express them consistently needs further study. In order to achieve real-time search in social networks with big and fast data, we need to develop more appropriate ways of data storage and management.

2. Semantic analysis and modeling of social network cross-media big data

2.1. Semantic analysis and modeling of short texts

Short texts are the mainstream in social networks, so how to model text information and address content sparsity has attracted much attention from scholars. In Ref. [22], a new method for modeling topics in short texts, named the biterm topic model (BTM), was proposed. To overcome the sparsity of short texts, the authors exploited rich global word co-occurrences instead of the sparse document-level ones. This probabilistic topic model does not rely on any external information and directly models unordered word pairs. Experimental results demonstrated that BTM can discover more prominent and coherent topics, and it also achieves good results on long texts. Another simple but popular way to deal with the sparsity problem is to aggregate short texts into long documents and then train them with a standard topic model. In Ref. [23], the authors assumed that a single tweet usually has a single topic, and a probabilistic topic model named Twitter-LDA was put forward for the short texts in Twitter.
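The aggregation strategy is easy to sketch: below, each user's tweets are pooled into one pseudo-document before training a standard LDA model with gensim; the pooling criterion (by author) and the toy tweets are illustrative assumptions.

```python
from collections import defaultdict
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy (user_id, tweet) pairs standing in for a real short-text collection.
tweets = [
    ("u1", "flood rescue downtown"),
    ("u1", "rescue teams arrive downtown"),
    ("u2", "new phone launch event"),
    ("u2", "phone launch crowd excited"),
]

# Author pooling: concatenate each user's tweets into one pseudo-document.
pseudo_docs = defaultdict(list)
for user, text in tweets:
    pseudo_docs[user].extend(text.split())

docs = list(pseudo_docs.values())
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=20)

for topic_id, words in lda.show_topics(num_topics=2, num_words=3, formatted=False):
    print(topic_id, [w for w, _ in words])
```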

2.2. Semantic analysis and modeling of cross-modal data

Semantic analysis technologies for big data provide critical support for understanding social network data, and they are also the basis of many big data applications. The main problem is that, with the rapid production of heterogeneous network data, the data itself exists in multiple modalities, so how to identify the corresponding concepts from heterogeneous media has become a hot issue. Blei et al. [24] designed three different generative models based on the analysis of texts and images. One is a strong topic correlation model, where the topics of texts and images are the same. Another is a weak correlation model, where texts and images are sampled from the same topic distribution respectively. The third model treats texts as annotations of images, and the topic of a text is sampled uniformly from the image topics. The authors mapped texts and images into the same topic space through the proposed Corr-LDA model, and excellent results were achieved in both image annotation and image search tasks. Bian et al. [25] proposed a novel multi-modal LDA model that can discover subtopics from micro-blogs and acquire subtopic distributions by exploring the correlation among different data modalities. With this multi-modal LDA model, visualized summaries of trending topics can be automatically generated. In the field of cross-modal semantic analysis and modeling, using probabilistic topic models to analyze and model data in different modalities is the mainstream approach. How to improve existing models and use semantic analysis and modeling technology to establish a more accurate cross-modal semantic space is still worthy of further study.

In the above generative models based on the latent Dirichlet distribution, strict text-image pairs are treated as the study objects, assuming that they have the same topic distribution or that one modality's topic distribution depends on the other's. However, descriptions of the same events or objects are not only heterogeneous in modality, but also lack a one-to-one correspondence in content, quantity and level of granularity. In Ref. [26], a bilateral correspondence topic model was proposed; it considers multiple dependence relationships between the topics of texts and images and can flexibly model social media data. The model can be used to cluster and summarize social network data. In Ref. [27], a multi-modal mutual topic reinforcement modeling method, M3R, was proposed. In this method, external data such as category information and topic interaction are used to establish a common space for multi-modal data. The learned latent representations for multi-modal data are correlated and discriminative, and the method has shown excellent effectiveness in cross-modal retrieval experiments. In Ref. [28], to deal with unpaired data, a novel matrix decomposition algorithm for cross-modal matching was proposed, in which the matrices of different modalities are jointly learned and some class label information is used.

2.3. Semantic analysis and modeling of multi-attribute data

The image and text data in social networks contains a lot of spatio-temporal information that differs from that on other platforms, such as the time and location of the events described by micro-blogs, where and when a tweet was released, and users' location information. Combining this spatio-temporal information plays a crucial role in cross-modal search for analyzing heterogeneous semantic features and establishing a common semantic space [29]. In Ref. [30], a cross-platform video recommendation method combining spatio-temporal information was proposed. In Ref. [31], to exploit the spatio-temporal information of micro-blog data, a methodology to automatically summarize events was developed, and it achieved good results in event detection. In Ref. [32], the authors considered color, spatial and temporal information to identify each person in a video. Considering spatial and temporal information in social network search can greatly improve the search results.

It is a hot issue to analyze and model social network data by utilizing the correlations among data and combining their multiple attributes. In Ref. [33], inspired by deep learning achievements in natural language processing, the authors proposed a method named DeepWalk, which models data as an undirected graph. The vertices in the graph represent data items, and the edges between them imply the relationships between data items. Through this method, latent representations can be learned. As an online algorithm, DeepWalk is also scalable and parallelizable. Wu et al. [34] utilized click data collected from a user behavior dataset, took the click counts as the relevance feature, combined them with DeepWalk, and established a common space for click features and multi-modal data. The DeepWalk algorithm can model attribute characteristics as edges and edge weights, and has achieved excellent results in applications. However, Perozzi et al. did not give strict theoretical proofs. In Ref. [35], Cao et al. presented the analysis and derivation of the graph model, and a detailed derivation of the deep random walk algorithm was also given.
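A minimal sketch of the DeepWalk idea follows: truncated random walks on a graph are treated as sentences and fed to a skip-gram model. The toy graph and hyper-parameters are illustrative and do not reproduce the settings of Ref. [33].

```python
import random
import networkx as nx
from gensim.models import Word2Vec

graph = nx.karate_club_graph()               # toy social graph

def random_walks(g, num_walks=10, walk_length=20):
    """Generate truncated random walks starting from every node."""
    walks = []
    nodes = list(g.nodes())
    for _ in range(num_walks):
        random.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = list(g.neighbors(walk[-1]))
                if not neighbors:
                    break
                walk.append(random.choice(neighbors))
            walks.append([str(n) for n in walk])
    return walks

# Treat each walk as a "sentence" and learn node embeddings with skip-gram.
walks = random_walks(graph)
model = Word2Vec(sentences=walks, vector_size=64, window=5, sg=1, min_count=1, epochs=5)
print(model.wv["0"][:5])                     # embedding of node 0 (first 5 dimensions)
```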

In the field of cross-modal semantic analysis and modeling, probabilistic topic models are mainly used to model data in different modalities. It is difficult to model multiple attributes and multiple modalities at the same time, and this is still worthy of further study.

3. Deep semantic learning of online social network cross-media big data

Social network search takes data in a specific modality as input and returns a list of all the relevant data. Therefore, applying deep semantic learning to social network data to achieve an accurate understanding of search intention and to match queries and results exactly is the key to realizing accurate search. Existing social network search is based on keywords and uses external repositories, such as ontologies or the Wikipedia encyclopedia, to achieve accurate understanding of the keywords [36]. Users often express their opinions when they release messages on social networks, so applying opinion mining and sentiment analysis to social network data to improve the effect of semantic learning and accurate search is also a research hotspot [37]. Because of the great number of pictures and videos in social networks, mapping learning between text features and visual features has attracted the attention of a large number of researchers [38]. Typical images in social networks usually describe a specific scene, so the scene recognition task for social network images is an important research direction of cross-modal social network search [39].

3.1. Semantic learning with external knowledge repositories

Because the keywords in social network search are relatively short, keyword-based search may lead to ambiguity in query intention. Therefore, external knowledge repositories are commonly used to enhance the accuracy of query understanding. In Refs. [40] and [41], hashtags attached to tweets are segmented, and the entities in hashtags are linked to Wikipedia to enrich semantics and realize accurate search. An ontology is an important knowledge base for semantic search; however, building one usually consumes a lot of time and labor, and the stored data is usually static. In Ref. [42], the authors proposed a text model based on Wikipedia rather than establishing a specific ontology: it first uses FCA (formal concept analysis) to define the concepts in a document, and then uses Wikipedia to determine the weights of the concepts. In Ref. [43], Wikipedia and WordNet were both used as external knowledge repositories to eliminate the ambiguity of text understanding, combined with a Google corpus and a topic model to classify users and realize social network user recommendation. In Ref. [44], a knowledge-based query expansion method was proposed for social network search. The external knowledge base used was Freebase; with query expansion derived from it, more related tweets can be returned. The authors applied this method to the social network search task, and because the query is well understood, the retrieval results are better.
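Ref. [44] expands queries with Freebase; as a simplified stand-in for knowledge-based expansion, the sketch below appends the nearest neighbors from a word-embedding model trained on a tiny illustrative corpus. The corpus, model and expansion size are assumptions, not the cited method.

```python
from gensim.models import Word2Vec

# Tiny illustrative corpus standing in for a tweet collection.
corpus = [
    ["earthquake", "rescue", "aftershock", "damage"],
    ["earthquake", "aftershock", "evacuation"],
    ["concert", "ticket", "stage", "crowd"],
]
model = Word2Vec(sentences=corpus, vector_size=32, window=3, sg=1, min_count=1, epochs=200)

def expand_query(terms, topn=2):
    """Append the most similar vocabulary words to each query term."""
    expanded = list(terms)
    for term in terms:
        if term in model.wv:
            expanded += [w for w, _ in model.wv.most_similar(term, topn=topn)]
    return expanded

print(expand_query(["earthquake"]))
```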

In addition to the above methods, an event often appears on multiple network platforms, such as Sina News, micro-blogs, Baidu Encyclopedia, Wikipedia and other social network sites or news platforms. Recently, some scholars have used deep neural networks to learn the similarity between multi-modal data at different levels [45]. Combined with a collaborative learning approach, the semantic learning results of other platforms can be applied to social networks, and the effect of semantic learning can be improved.

3.2. Deep semantic mapping learning of cross-modal data

Deep learning has achieved excellent results in semantic learning of cross-media big data [46]. In Ref. [47], a deep visual-semantic embedding model was proposed: the text feature is extracted by a pre-trained skip-gram model and represented as an embedded vector, and the visual feature is extracted by a pre-trained deep neural network. In Ref. [48], a deep canonical correlation analysis method was proposed; compared with CCA and KCCA, it achieved better performance in image annotation. In Ref. [49], to realize feature extraction from large and noisy data, a deep semantic relation learning method was proposed, and the semantic relations in the original image-word pairs were well preserved by this method. In Ref. [50], a correspondence deep autoencoder method was put forward, which can be used to learn the mapping relationships between texts and images. In Ref. [51], the authors used a semantic attention model combining visual features and visual concepts, so that a natural language description of an image can be automatically generated by a recurrent neural network.
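The sketch below illustrates the common two-branch embedding idea behind several of these methods: image and text features are projected into a shared space and trained with a hinge-based ranking loss. The dimensions, dummy features and loss details are assumptions for illustration, not the architecture of any single cited paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchEmbedding(nn.Module):
    """Project image and text features into a shared, L2-normalized embedding space."""

    def __init__(self, img_dim=4096, txt_dim=300, embed_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)

    def forward(self, img_feat, txt_feat):
        img_emb = F.normalize(self.img_proj(img_feat), dim=1)
        txt_emb = F.normalize(self.txt_proj(txt_feat), dim=1)
        return img_emb, txt_emb

def ranking_loss(img_emb, txt_emb, margin=0.2):
    """Hinge loss: matched image-text pairs should score higher than mismatched ones."""
    scores = img_emb @ txt_emb.t()            # cosine similarities
    pos = scores.diag().unsqueeze(1)
    cost = (margin + scores - pos).clamp(min=0)
    cost.fill_diagonal_(0)                    # ignore the positive pairs themselves
    return cost.mean()

model = TwoBranchEmbedding()
img_feat, txt_feat = torch.randn(8, 4096), torch.randn(8, 300)   # dummy batch of features
img_emb, txt_emb = model(img_feat, txt_feat)
print(ranking_loss(img_emb, txt_emb).item())
```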

The semantic similarity learning between images and texts can act at different levels, such as local similarity learning, global similarity learning and compositional similarity learning. In Ref. [52], the authors aligned image segments with text segments and learned the mapping relationship between image CNN features and text features. In Ref. [53], a method that can automatically discover semantic vocabularies was proposed, establishing a common space for image features and semantic vocabularies; cross-media retrieval experiments demonstrated its effectiveness. In Ref. [54], the local similarity and global similarity between images and texts were combined, and a compositional deep cross-modal learning method was proposed. In Ref. [55], the authors used convolutional neural networks to match image CNN features with words, phrases and sentences. The proposed matching CNN model composes texts into different semantic fragments, and the inter-modal relations between the composed fragments and images are learned. For the social network search task, a suitable level can be chosen for learning the semantic matching relationship.

The biggest obstacle to cross-media search in social networks is the semantic gap between heterogeneous data. Therefore, feature representation and mapping relationship learning between heterogeneous media data are essential to solve the cross-media search problem. In Ref. [56], the authors proposed a multi-modal retrieval method based on deep learning. To capture both inter-modal and intra-modal semantic relationships of heterogeneous data, the method first needs to learn an objective function, and two learning methods were proposed. One uses stacked autoencoders and is an unsupervised approach requiring minimal prior knowledge. The other uses a neural language model and a deep convolutional neural network, and is a supervised approach. The semantic relationship learning method is memory efficient, and experimental results have demonstrated its effectiveness. Semantic mapping learning methods based on deep learning have been widely applied to cross-modal search in social networks [57]. Using deep features to represent cross-media data and using deep architectures to learn the mapping relationship can effectively improve search accuracy. However, the dimensionality of deep features is very high; although it improves search accuracy, it also reduces search speed [58].

3.3. Sentiment analysis of cross-media big data in online social networks

Because social network data can reflect users' emotions, many scholars have introduced sentiment analysis into the semantic learning process to improve the learning effect [59]. Sixto et al. [60] presented a sentiment analysis approach that combines the BM25 ranking function with a linear support vector model, and it achieved good effectiveness in Twitter sentiment analysis. Severyn et al. [61] designed a deep convolutional neural network for the task of Twitter sentiment analysis, which achieved good results on Twitter corpora at both the phrase level and the message level. In Ref. [62], the authors divided social network images into visual-related images and emotion-related images, and extracted visual vocabularies and emotion vocabularies from them respectively. The topics of the text are sampled from the visual vocabularies and emotion vocabularies respectively, depending on different correlations. The VELDA model proposed by the authors achieved good results in cross-media retrieval tasks. Most sentiment analysis methods, however, simply divide sentiment into three categories: positive, negative and neutral [63], so how to improve social network search based on more fine-grained sentiment analysis needs further research.
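As a simplified sketch in the spirit of the lexical-feature-plus-linear-SVM approach of Ref. [60], the snippet below trains a TF-IDF + linear SVM classifier on toy labeled tweets; TF-IDF stands in for BM25 weighting, and the training data and labels are purely illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy labeled tweets standing in for a real sentiment training set.
train_texts = ["i love this concert", "great rescue effort", "terrible traffic jam",
               "awful service today", "nothing special happened"]
train_labels = ["positive", "positive", "negative", "negative", "neutral"]

# Pipeline: lexical weighting (TF-IDF here) followed by a linear SVM classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(train_texts, train_labels)

print(clf.predict(["love the rescue effort", "awful jam on the road"]))
```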

3.4. Scene recognition of social network images

Images of unexpected events transmitted in social networks tend to describe the whole scene rather than a single object, so in order to make the best use of image information, we need better scene recognition. Yuan et al. [64] proposed a scene recognition algorithm based on the human visual system, built on a manifold-regularized deep learning architecture. The algorithm makes better use of the structural information of the data and can realize scene classification with high-level features learned in an unsupervised way.

In social networks, there are many similar images describing the scene of an event or hot topic. These similar images often have different resolutions, and they also contain some redundant features. This characteristic of social networks makes it difficult to recognize the scene of an image. In Ref. [65], to deal with images of different resolutions, the authors utilized a multitask model because it can find the intrinsic relationships among images. Moreover, to ensure that the most useful information can be extracted from an image, the authors proposed a method named SFSMR, which uses sparse feature selection and manifold regularization. Using this method, the accuracy of scene recognition was effectively improved. In addition, because of different shooting angles or distances, different images of the same scene present multi-scale features and often contain different numbers and sizes of objects: the larger the objects, the smaller their number, and vice versa. In Ref. [66], to deal with this problem, the authors proposed a multi-scale neural network for scale-specific feature extraction. In the study of social network cross-modal search, image scene recognition is an important research direction. However, because images of the same scene are often multi-scale and multi-resolution, how to extract the most useful information while solving the data bias problem is worth exploring.

4. Indexing and ranking mechanism of social network search

With the popularity of social networks, users can easily spread information about events they are concerned about on the Internet, and online social networks have become very important media platforms [67]. Many of the most widely known events are first spread in social networks [68]. Because of the large amount of social network data and its strong real-time nature, people are more likely to search for things of interest in social networks. Among the research fields of social network search, the indexing and ranking mechanism is crucial to realizing real-time search. For searching among the numerous similar images in social networks, cross-modal Huffman coding is an effective strategy. We introduce these two research fields (i.e., the indexing and ranking mechanism and cross-modal Huffman coding) and the most commonly used evaluation criteria of social network search.

4.1. Indexing and ranking mechanism

With the development of network technology and the large increase in the number of Internet users, there has been a massive increase of data on new-generation network platforms. With so much complex information on the Internet, it is difficult for users to rapidly find the information they most need from a large amount of network resources. For this reason, all kinds of information retrieval and search engine technologies have been widely studied and rapidly developed. In Ref. [69], based on a comprehensive consideration of the time characteristics, social features and text features of social networks, the authors proposed a 3D cube inverted indexing mechanism and designed a 3D cube threshold algorithm, which can effectively update different parameters of the system. They proposed a ranking mechanism that fuses various features, and realized personalized real-time search in social networks. In Ref. [70], micro-blog search was realized with comprehensive consideration of micro-blog location information. Moreover, according to the query and update rates of different regions, the authors proposed a method to adjust the size of the indexes, which can save memory and achieve real-time micro-blog search.
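A minimal sketch of a time-aware inverted index is shown below; it is not the 3D-cube index of Ref. [69], only an illustration of posting lists whose scores fuse term matching with an assumed exponential recency decay.

```python
import time
from collections import defaultdict

index = defaultdict(list)          # term -> [(timestamp, post_id), ...]
posts = {}                         # post_id -> (text, timestamp)

def add_post(post_id, text, timestamp):
    """Index a post under each of its distinct terms."""
    posts[post_id] = (text, timestamp)
    for term in set(text.lower().split()):
        index[term].append((timestamp, post_id))

def search(query, now, half_life=3600.0, topk=5):
    """Score posts by fusing term matches with an exponential recency decay."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for ts, pid in index.get(term, []):
            recency = 0.5 ** ((now - ts) / half_life)   # newer posts decay less
            scores[pid] += 1.0 + recency
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:topk]

now = time.time()
add_post("p1", "earthquake rescue downtown", now - 7200)
add_post("p2", "earthquake aftershock reported", now - 60)
print(search("earthquake rescue", now))
```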

Existing micro-blog search works as follows: users enter a set of keywords, and the search system returns a list of related micro-blogs, including texts, images, videos, links, labels and social relation features. Taking the various characteristics of social network data into account and calculating the similarity between query terms and candidate micro-blogs to rank the returned list is the key step of micro-blog search. In Ref. [71], with comprehensive consideration of micro-blog content characteristics, link characteristics and user relationship characteristics, the authors proposed a micro-blog search scoring mechanism; applied to a Twitter dataset, it achieved good results. In Ref. [72], candidate documents can be generated quickly using the designed Bloom filter chains, and thus the search speed can be improved. In Ref. [73], the authors put forward a modified language model; by giving appropriate weights to time features to reorder the returned micro-blog lists, good retrieval results can be achieved.
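The weighted-fusion idea behind such scoring mechanisms can be sketched very simply, as below; the feature names, value ranges and weights are illustrative assumptions rather than the scheme of Ref. [71].

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    post_id: str
    text_sim: float      # similarity between query and micro-blog text, assumed in [0, 1]
    social: float        # normalized retweet/reply/follower signal, assumed in [0, 1]
    recency: float       # time-decay score, assumed in [0, 1]

def fused_score(c: Candidate, w_text=0.6, w_social=0.25, w_time=0.15) -> float:
    """Linearly combine text, social and temporal features with tunable weights."""
    return w_text * c.text_sim + w_social * c.social + w_time * c.recency

candidates = [
    Candidate("p1", text_sim=0.9, social=0.2, recency=0.4),
    Candidate("p2", text_sim=0.7, social=0.8, recency=0.9),
]
ranked = sorted(candidates, key=fused_score, reverse=True)
print([c.post_id for c in ranked])
```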

Liang et al. [74] proposed a modified hybrid rank aggregation algorithm named TimeRA. Different from other rank aggregation algorithms, TimeRA considers time features in addition to inheriting the merits of different search algorithms. The authors also utilized matrix factorization methods to model the missing posts, and the TimeRA algorithm performed well in retrieval tasks. In addition, Wang et al. [75] proposed a method that scores the related posts according to the number of users and the number of replies: the more replies and forwards, the higher the score. They also utilized time windows to extract outlier features, with which users' engagement is translated into an influence on the scores.

4.2. Cross-modal Huffman code

Cross-media data on the Internet is growing explosively, but most of the multimedia images for the same event are similar; these similar images are different versions of the original image after various transformations. Under these circumstances, quick retrieval in large-capacity databases has become increasingly important. Therefore, it is crucial to build an effective approximation algorithm for processing cross-media data to improve the accuracy and efficiency of cross-media social network search. The combination of deep learning techniques and hash coding has attracted wide attention [76]. The hash coding technique effectively reduces the dimensionality of multi-dimensional features while clustering similar cross-media data effectively. Cao et al. [77] proposed a correlation hashing network to represent images and texts uniformly and achieve cross-modal search. Liu et al. [78] proposed a supervised matrix factorization hashing algorithm to map data of different modalities into the same common space. Wang et al. [79] proposed the LBMCH algorithm: first, a Hamming space is established for each modality; then, a bridging mapping between the different modalities is created automatically, well preserving the characteristics of the data itself. A joint learning method for image and text embedding was proposed by Wang et al. [80], which uses a neural network with two branches and multiple linear projection layers. High-dimensional features of images and texts can be mapped into Hamming space by hashing-based cross-media retrieval methods, generating a low-dimensional hash sequence that represents an image or a text. This also improves retrieval speed and satisfies massive image retrieval requirements.
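The sketch below shows the basic idea of mapping high-dimensional features to short binary codes and comparing them by Hamming distance. It uses plain random-projection hashing, a simple unsupervised baseline, not the supervised cross-modal hashing methods of Refs. [77-79]; the shared projection and synthetic features are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_dim, code_bits = 4096, 64
projection = rng.standard_normal((feature_dim, code_bits))   # shared random projection

def to_hash_code(features):
    """Map real-valued features (n x feature_dim) to binary codes (n x code_bits)."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_distance(a, b):
    return int(np.count_nonzero(a != b))

# Synthetic features: the "matching" text feature is close to the image feature by construction.
image_feat = rng.standard_normal((1, feature_dim))
text_feat = image_feat + 0.1 * rng.standard_normal((1, feature_dim))
other_feat = rng.standard_normal((1, feature_dim))

img_code, txt_code, other_code = map(to_hash_code, (image_feat, text_feat, other_feat))
print(hamming_distance(img_code, txt_code), hamming_distance(img_code, other_code))
```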

5. Scientific challenges of social network search based on semantic analysis and learning

According to the above studies, we can see that online social network search is very attractive, and many researchers are committed to this research. As shown in the overall framework of social network search based on semantic analysis and learning, the main research fields of social network search can be divided into four parts.

(1) Acquisition and representation of the cross-modal and spatio-temporal data in online social networks

Cross-media big data in social networks contains texts and images with spatio-temporal information, as well as other multi-attribute and multi-modal heterogeneous media. The acquisition and representation of the cross-modal and spatio-temporal data in online social networks is the basis of social network search, and there are two main challenges in realizing it. The first is how to quickly and efficiently obtain accurate spatio-temporal data from massive data. To achieve this goal, we need to extract text features, visual features, temporal characteristics, spatial characteristics and social characteristics, and utilize existing feature representation methods to solve the content sparsity of short texts in social networks. The other challenge is how to model cross-media data with multi-attribute and multi-modal characteristics to implement the integration of multi-source heterogeneous data and entity resolution.

(2) Cross-media semantic analysis and modeling in support of the spatio-temporal and social characteristics

Cross-media semantic analysis and modeling technologies provide critical support for semantic learning of cross-media big data in social networks. Social network data exists in different media forms and has spatio-temporal and social attributes at the same time. The first challenge is that we need to identify the corresponding concepts from heterogeneous media. The other challenge is to design a unified model to analyze and handle multi-modal data and their various attributes, and then map them into a shared semantic space. The goal of semantic analysis and modeling is to obtain a consistent representation of multi-attribute heterogeneous data, so how to realize the semantic representation of different characteristics in the same semantic space through optimization is also a great challenge.

(3) Deep semantic learning of cross-media big data in online social networks

In the field of cross-media big data semantic learning, deep learning methods have achieved good results. Texts with spatio-temporal characteristics in social networks are usually short, so they suffer from content sparsity, and the corresponding images are commonly not strictly consistent with the texts. Considering these problems, the first challenge is that we need to combine deep features with spatio-temporal features and choose an appropriate level at which to learn the similarity between text features and visual features. The other challenge is how to set up a deep neural network architecture to learn the mapping relationship between text features and visual features. We can also use the mapping relationships between texts and images learned in other fields to improve the mapping learning results on social network platforms and realize deep visual-semantic matching.

(4) The indexing and ranking mechanism of social network search

Online social networks have become an important platform for information release and dissemination, and more and more users tend to search on social networks. What type of index to use to reduce memory consumption, and how to use a variety of social network features to establish a personalized ranking mechanism for accurate search, are the main challenges of social network search.

6. Conclusions and prospects

As a media platform that everyone is involved in, social networks play an important role in the release and dissemination of unexpected events. Cross-media social network search based on semantic analysis and learning is currently a hot issue. In this paper, we summarized the overall framework of social network search and briefly reviewed the main aspects of current research, that is, the acquisition and representation of spatio-temporal cross-media big data in social networks; the semantic analysis, modeling and deep semantic learning algorithms for social network cross-media big data in support of spatio-temporal and social characteristics; and the indexing and ranking mechanism of social network search. We summarized the key roles of various techniques in social network search based on recent research. Although the existing theoretical and technical methods have made remarkable achievements, some problems remain to be considered further. We analyzed the key challenges faced by social network search, and the future work of social network search can be summarized as follows:

As for data acquisition and representation, we need to acquire and represent effective social network information, including spatio-temporal, image, text and social information, more quickly and easily; choosing appropriate methods for storage and management also needs further research. As for data analysis and modeling, integrating the spatio-temporal, social, text and visual characteristics to realize unified modeling is an important way to promote cross-modal social network search. In deep semantic learning of multi-modal and multi-attribute heterogeneous data, how to use existing deep learning approaches combined with multiple attributes to learn the association mapping of multi-modal data is the key to social network cross-modal search. As for search algorithms, combined with existing indexing and ranking methods, research on deep learning and hash indexing techniques still needs further exploration to improve search speed and accuracy. Social network search based on semantic analysis and learning is attractive, and the future work is worthy of study.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 61532006, No. 61320106006, No. 61502042).

[1]W.Feng,C.Zhang,W.Zhang,J.Han,J.Wang,C.Aggarwal,et al., STREAMCUBE:hierarchical spatio-temporal hashtag clustering for event exploration over the Twitter stream,in:IEEE,International Conference on Data Engineering,2015,pp.1561-1572.

[2]Y.J.Duan,Research on Key Technologies of Microblog Search,Doctoral dissertation,University of Science and Technology of China,2014.

[3] D.L. Wang, G. Yu, S. Feng, Y.F. Zhang, Y.B. Bao, Research on modeling entities and their relations for social media search, Chin. J. Comput. 39 (4) (2016) 657-674.

[4]L.Castrejon,Y.Aytar,C.Vondrick,H.Pirsiavash,A.Torralba, Learning Aligned Cross-modal Representations from Weakly Aligned Data,2016.

[5]K.Duan,D.J.Crandall,D.Batra,Multimodal learning in loosely organized web images,in:2014 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),IEEE Computer Society,2014,pp. 2465-2472.

[6]J.Pang,F.Jia,C.Zhang,W.Zhang,Unsupervised web topic detection using a ranked clustering-like pattern across similarity cascades,IEEE Trans.Multimed.17(6)(2015)843-853.

[7] Y. Wei, Y. Zhao, Z. Zhu, S. Wei, Y. Xiao, J. Feng, et al., Modality-dependent cross-media retrieval, ACM Trans. Intell. Syst. Technol. 7 (4) (2015).

[8]A.Karpathy,F.F.Li,Deep visual-semantic alignments for generating image descriptions,Comput.Vis.Pattern Recognit.(2014)3128-3137.

[9]Z.H.Zhou,H.R.Zhang,J.Xie,Data crawler for Sina Weibo based on Python,J.Comput.Appl.34(11)(2014)3131-3134.

[10]X.H.Zeng,Research on Topic Based Micro-blog Web Crawler(Doctoral dissertation),Wuhan University of Technology,2014.

[11] J.C. Pereira, E. Coviello, G. Doyle, N. Rasiwasia, G.R.G. Lanckriet, R. Levy, et al., On the role of correlation and abstraction in cross-modal multimedia retrieval, IEEE Trans. Softw. Eng. 36 (3) (2013) 521-535.

[12]T.Mikolov,I.Sutskever,K.Chen,G.Corrado,J.Dean,Distributed representations of words and phrases and their compositionality,Adv. Neural Inf.Process.Syst.26(2013)3111-3119.

[13]K.Liu,Research on Semantic-based Cross-Media Consistency(Doctoral dissertation),Beijing Jiaotong University,2015.

[14]A.Krizhevsky,I.Sutskever,G.E.Hinton,Imagenet classification with deep convolutional neural networks,Adv.Neural Inf.Process.Syst.25 (2)(2012)2012.

[15]K.Simonyan,A.Zisserman,Very deep convolutional networks for largescale image recognition,Comput.Sci.(2014).

[16]D.M.Blei,J.D.Lafferty,Dynamic topic models,in:International Conference on Machine Learning,2006,pp.113-120.

[17] X. Wang, A. McCallum, Topics over time: a non-Markov continuous-time model of topical trends, in: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006, pp. 424-433.

[18]H.Cai,Y.Yang,X.Li,Z.Huang,What are popular:exploring twitter features for event detection,tracking and visualization,in:ACM International Conference on Multimedia,2015,pp.89-98.

[19] G. Mishne, J. Dalton, Z. Li, A. Sharma, J. Lin, Fast data in the era of big data: Twitter's real-time related query suggestion architecture, in: ACM SIGMOD International Conference on Management of Data, 2013, pp. 1147-1158.

[20]A.Magdy,R.Alghamdi,M.F.Mokbel,On main-memory flushing in microblogs data management systems,in:IEEE,International Conference on Data Engineering,IEEE Computer Society,2016,pp.445-456.

[21]J.Tang,W.G.Chen,Depth analysis and mining for large social data, Chin.Sci.Bull.(Z1)(2015)509-519.

[22]X.Yan,J.Guo,Y.Lan,X.Cheng,A biterm topic model for short texts, in:International Conference on World Wide Web,ACM,2013,pp. 1445-1456.

[23]W.X.Zhao,J.Jiang,J.Weng,J.He,E.P.Lim,H.Yan,et al.,Comparing twitter and traditional media using topic models,Lect.Notes Comput. Sci.(2011)338-349,6611/2011.

[24]D.M.Blei,M.I.Jordan,Modeling annotated data,in:International ACM SIGIR Conference on Research and Development in Information Retrieval,2003,pp.127-134.

[25]J.Bian,Y.Yang,T.S.Chua,Multimedia summarization for trending topics in microblogs,in:ACM International Conference on Conference on Information&Knowledge Management,2013,pp.1807-1812.

[26]Z.Y.Wang,Social Multimedia Analysis and Summarization(Doctoral dissertation),Tsinghua University,2013.

[27] Y. Wang, F. Wu, J. Song, X. Li, Y. Zhuang, Multi-modal mutual topic reinforce modeling for cross-media retrieval, in: Proceedings of the ACM International Conference on Multimedia, 2014, pp. 307-316.

[28] C. Kang, S. Xiang, S. Liao, C. Xu, Learning consistent feature representation for cross-modal multimedia retrieval, IEEE Trans. Multimed. 17 (3) (2015) 1-1.

[29]J.Bian,Y.Yang,H.Zhang,T.S.Chua,Multimedia summarization for social events in microblog stream,IEEE Trans.Multimed.17(2)(2015) 216-228.

[30]Z.Deng,M.Yan,J.Sang,C.Xu,Twitter is faster:personalized timeaware video recommendation from twitter to youtube,ACM Trans. Multimed.Comput.Commun.Appl.11(2)(2015)1-23.

[31]N.Thapen,D.Simmie,C.Hankin,The early bird catches the term: combining twitter and news data for event detection and situational awareness,Comput.Sci.(2015).

[32]R.Auguste,J.Martinet,P.Tirilly,Space-time Histograms and Their Application to Person Re-identification in TV Shows,ICMR,2015,pp. 91-97.

[33]B.Perozzi,R.Al-Rfou,S.Skiena,Deepwalk:Online Learning of Social Representations,Eprint Arxiv,2014,pp.701-710.

[34] F. Wu, X. Lu, J. Song, S. Yan, Z. Zhang, Y. Rui, et al., Learning of multimodal representations with random walks on the click graph, IEEE Trans. Image Process. 25 (2) (2016) 630-642.

[35]S.Cao,W.Lu,Q.Xu,GraRep:Learning Graph Representations with Global Structural Information,2015,pp.891-900.

[36]K.Bontcheva,D.Rout,Making sense of social media streams through semantics:a survey,Semantic Web 5(2014)373-403.

[37]K.K.Pawar,P.Shrishrimal,R.R.Deshmukh,Twitter sentiment analysis: a review,Int.J.Sci.Eng.Res.6(2015).

[38]F.X.Feng,Deep Learning for Cross-modal Retrieval(Doctoral dissertation),Beijing University of Posts and Telecommunications,2015.

[39]B.C.Ooi,K.L.Tan,S.Wang,W.Wang,Q.Cai,G.Chen,et al.,SINGA: A Distributed Deep Learning Platform,2015,pp.685-688.

[40] P. Bansal, R. Bansal, V. Varma, Towards deep semantic analysis of hashtags, in: European Conference on IR Research, ECIR 2015, Vienna, Austria, March 29-April 2, 2015, Proceedings, vol. 9022, 2015, pp. 453-464.

[41]P.Bansal,S.Jain,V.Varma,Towards semantic retrieval of hashtags in microblogs,in:World Wide Web Conference,2015.

[42]K.J.Hong,H.J.Kim,A semantic search technique with Wikipedia-based text representation model,in:International Conference on Big Data and Smart Computing,2016,pp.177-182.

[43]B.Deb,I.Mukherjee,S.N.Srirama,E.Vainikko,A semantic followee recommender in Twitter using Topicmodel and Kalman filter,in:IEEE International Conference on Control and Automation,IEEE,2016.

[44] C. Lv, R. Qiang, F. Fan, J. Yang, Knowledge-based query expansion in real-time microblog search, in: Information Retrieval Technology, Springer International Publishing, 2015.

[45]S.Qian,T.Zhang,R.Hong,C.Xu,Cross-domain collaborative learning in social multimedia,in:ACM International Conference on Multimedia, 2015,pp.99-108.

[46]H.Zang,Research and Applications of Multi-modal Feature Confusion Based on Deep Neural Networks(Doctoral dissertation),Beijing University of Posts and Telecommunications,2014.

[47] A. Frome, G.S. Corrado, J. Shlens, S. Bengio, J. Dean, M. Ranzato, et al., DeViSE: a deep visual-semantic embedding model, NIPS (2013) 2121-2129.

[48]F.Yan,K.Mikolajczyk,Deep correlation for matching images and text, Comput.Vis.Pattern Recognit.(2015).

[49]H.Zhang,X.Shang,H.Luan,Y.Yang,T.S.Chua,Learning features from large-scale,noisy and social image-tag collection,in:ACM International Conference on Multimedia,2015,pp.1079-1082.

[50]F.Feng,X.Wang,R.Li,I.Ahmad,Correspondence autoencoders for cross-modal retrieval,ACM Trans.Multimed.Comput.Commun.Appl. 12(1s)(2015)1-22.

[51]Q.You,H.Jin,Z.Wang,C.Fang,J.Luo,Image Captioning with Semantic Attention,2016.

[52] A. Karpathy, A. Joulin, F.F. Li, Deep fragment embeddings for bidirectional image sentence mapping, Adv. Neural Inf. Process. Syst. 3 (2014) 1889-1897.

[53]A.Habibian,T.Mensink,C.G.M.Snoek,Discovering semantic vocabularies for cross-media retrieval,in:ACM International Conference on Multimedia Retrieval,2015,pp.131-138.

[54]X.Jiang,F.Wu,X.Li,Z.Zhao,W.Lu,S.Tang,et al.,Deep compositional cross-modal learning to rank via local-global alignment,in:ACM International Conference on Multimedia,ACM,2015,pp.69-78.

[55]L.Ma,Z.Lu,L.Shang,H.Li,Multimodal convolutional neural networks for matching image and sentence,Comput.Sci.(2015)2623-2631.

[56]W.Wang,X.Yang,B.C.Ooi,D.Zhang,Y.Zhuang,Effective deep learning-based multi-modal retrieval,Vldb J.Int.J.Very Large Data Bases 25(1)(2016)79-101.

[57]F.Zhao,Y.Huang,L.Wang,T.Tan,Deep semantic ranking based hashing for multi-label image retrieval,Comput.Vis.Pattern Recognit. (2015)1556-1564.IEEE.

[58]Y.Xia,Fast Similar Image Search in Large Scale Image Database (Doctoral dissertation),University of Science and Technology of China, 2015.

[59] T. Niu, S. Zhu, L. Pang, A. El Saddik, Sentiment Analysis on Multi-view Social Data, MultiMedia Modeling, Springer International Publishing, 2016.

[60] J. Sixto, A. Almeida, D. López-de-Ipiña, Improving the sentiment analysis process of Spanish tweets with BM25, Nat. Lang. Process. Inf. Syst. (2016).

[61]A.Severyn,A.Moschitti,Twitter sentiment analysis with deep convolutional neural networks,in:The International ACM SIGIR Conference,2015.

[62]T.Chen,H.M.Salaheldeen,X.He,M.Y.Kan,D.Lu,VELDA:relating an image Tweet's text and images,in:Twenty-Ninth AAAI Conference on Artificial Intelligence,2015.

[63]V.A.Kharde,S.Sonawane,Sentiment Analysis of Twitter Data:A Survey of Techniques,2016.

[64]Y.Yuan,L.Mou,X.Lu,Scene recognition by manifold regularized deep learning architecture,IEEE Trans.Neural Netw.Learn.Syst.26(10) (2015)1.

[65]X.Lu,X.Li,L.Mou,Semi-supervised multitask learning for scene recognition,IEEE Trans.Cybern.45(9)(2014)1.

[66]L.Herranz,S.Jiang,X.Li,Scene recognition with CNNs:objects,scales and dataset bias,in:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2016,pp.571-579.

[67] X. Zhang, X. Chen, Y. Chen, S. Wang, Z. Li, J. Xia, Event detection and popularity prediction in microblogging, Neurocomputing 149 (2015) 1469-1480.

[68]J.Zhao,X.Wang,Z.Ma,Towards events detection from microblog messages,Int.J.Hybrid Inf.Technol.7(2014)201-210.

[69]Y.Li,Z.Bao,G.Li,K.L.Tan,Real time personalized search on social networks,in:IEEE,International Conference on Data Engineering,2015, pp.639-650.

[70] A. Magdy, M.F. Mokbel, S. Elnikety, S. Nath, Y. He, Mercury: a memory-constrained spatio-temporal real-time search on microblogs, in: IEEE International Conference on Data Engineering, 2014, pp. 172-183.

[71]S.Ravikumar,K.Talamadupula,R.Balakrishnan,S.Kambhampati, Raprop:Ranking Tweets by Exploiting the Tweet/user/web Ecosystem and Inter-tweet Agreement,2013.MAI/51-06M(E),2345-2350.

[72]N.Asadi,J.Lin,Fast candidate generation for real-time tweet search with bloom filter chains,ACM Trans.Inf.Syst.31(3)(2013),8-8.

[73] S. Li, H. Ning, Z. Han, H. Qi, A method for microblog search by adjusting the language model with time, in: Eighth International Conference on Internet Computing for Science and Engineering, 2015, pp. 25-28.

[74]S.Liang,Z.Ren,W.Weerkamp,E.Meij,M.De Rijke,Time-aware rank aggregation for microblog search,in:ACM International Conference on Conference on Information and Knowledge Management,2014,pp. 989-998.

[75]W.Wang,L.Duan,A.Koul,A.Sheth,YouRank:let user engagement rank microblog search results,in:International AAAI Conference on Weblogs and Social Media,2014.

[76]J.Tang,K.Wang,L.Shao,Supervised matrix factorization hashing for cross-modal retrieval,IEEE Trans.Image Process.Publ.IEEE Signal Process.Soc.25(7)(2016),1-1.

[77]Y.Cao,M.Long,J.Wang,Correlation Hashing Network for Efficient Cross-modal Retrieval,2016.

[78]H.Liu,R.Ji,Y.Wu,W.Liu,G.Hua,Supervised Matrix Factorization for Cross-modality Hashing,2016.

[79] Y. Wang, X. Lin, L. Wu, W. Zhang, Q. Zhang, LBMCH: learning bridging mapping for cross-modal hashing, in: International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015.

[80]L.Wang,Y.Li,S.Lazebnik,Learning deep structure-preserving imagetext embeddings,Comput.Sci.(2015).

Feifei Kou was born in 1989. She received her M.S. degree in Computer Technology from Beijing Technology and Business University. She is now a Ph.D. candidate in Computer Science and Technology at Beijing University of Posts and Telecommunications. Her research interests include social network search, semantic analysis and semantic learning.

Junping Du was born in 1963. She is now a professor and Ph.D. supervisor at the School of Computer Science and Technology, Beijing University of Posts and Telecommunications. Her research interests include artificial intelligence, image processing and pattern recognition.

Yijiang He was born in 1994. He received the B.S. degree in Network Engineering from Nanjing University of Posts and Telecommunications. He is now studying for a master's degree in Computer Science and Technology at Beijing University of Posts and Telecommunications. His research interests include data mining and deep learning.

Lingfei Ye was born in Hubei, China. She received the B.S. degree in Network Engineering from Beijing University of Posts and Telecommunications in 2015. She is currently pursuing the Master's degree in Computer Science and Technology at Beijing University of Posts and Telecommunications. Her current research interest is personalized tourism search.

Available online 22 December 2016

*Corresponding author.

E-mail address: junpingdu@126.com (J. Du).

Peer review under responsibility of Chongqing University of Technology.

http://dx.doi.org/10.1016/j.trit.2016.12.001

2468-2322/Copyright © 2016, Chongqing University of Technology. Production and hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
