Table of Contents

    20 July 2017
    Volume 52 Issue 7
    Granular computing approach for formal concept analysis and its research outlooks
    LI Jin-hai, WU Wei-zhi
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  1-12.  doi:10.6040/j.issn.1671-9352.0.2017.279
    Formal concept analysis is a useful mathematical method for knowledge representation and processing, and its key tool is the concept lattice. However, constructing a concept lattice has exponential time complexity, which to some extent makes data processing inefficient and hinders the rapid development of the theory and its applications. Granular computing is well known for the formation, transformation, synthesis, and decomposition of granules. It allows a problem to be considered at various levels of granularity, and strikes a balance between accuracy and time cost according to practical requirements. The main research aim of the granular computing approach to formal concept analysis is to incorporate these advantages of granular computing into traditional formal concept analysis so that data analysis and processing can be carried out efficiently. More specifically, this paper presents the main research topics of the granular computing approach to formal concept analysis from the perspectives of the Galois-connection-based granular computing model, object granules, attribute granules, relation granules, relation-based concept granularity, granular rules, granular reducts, granular concepts and learning, and concept granular computing systems. In addition, some challenging problems in dealing with big data and cognitive learning are proposed. The results will provide some references for further study of the granular computing approach to formal concept analysis.
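As an illustration of the Galois connection that underlies a concept lattice, the following minimal sketch computes the two standard derivation operators on a small formal context and enumerates its formal concepts by brute force. The context, the names `CONTEXT`, `intent`, `extent` and `formal_concepts`, and the exhaustive enumeration are assumptions made for this example and are not taken from the paper. The loop over all object subsets also shows why the cost grows exponentially with the number of objects.

```python
from itertools import combinations

# Hypothetical toy formal context: object -> set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}
ATTRIBUTES = {"a", "b", "c"}


def intent(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return shared


def extent(attributes):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(set(subset))   # common attributes of the subset
            a = extent(b)             # all objects sharing those attributes
            concepts.add((frozenset(a), frozenset(b)))
    return concepts
```

Calling `formal_concepts()` on this toy context returns the (extent, intent) pairs that form the nodes of its concept lattice.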
    Reduct updating method in a dynamic formal context based on granular discernibility attribute matrix
    HUANG Tao-lin, NIU Jiao-jiao, LI Jin-hai
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  13-21.  doi:10.6040/j.issn.1671-9352.4.2017.077
    Knowledge reduction is an important research direction in knowledge discovery, because it makes rule acquisition from data easier. However, in the real world, information is updated as time goes by. This paper mainly discusses how to obtain the new granular consistent set when the formal context is being updated, from the perspective of the granular discernibility matrix. Finally, some properties of the granular discernibility attribute matrix are discussed.
    The hybrid parallel rough set model based on pansystems operators
    LI Li, GUAN Tao, LIN He
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  22-29.  doi:10.6040/j.issn.1671-9352.4.2017.089
    Based on the concept of hybrid parallel space, we propose hybrid parallel rough sets built on hybrid parallel equivalence operators in pansystems, using the transformation idea of pansystems theory and the method of equivalence relations to approximate the target concept. Then, by discussing the basic properties of the hybrid parallel rough set model, we prove that the model is a generalized form of the pansystems rough set. An example shows that granules of different knowledge bases are generated under the action of different hybrid parallel equivalence operators, which provides a new approach for further research on granular computing.
    The fuzzy belief structure and attribute reduction based on multi-granulation fuzzy rough operators
    HU Qian, MI Ju-sheng, LI Lei-jun
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  30-36.  doi:10.6040/j.issn.1671-9352.4.2017.130
    Multi-granulation is an active research direction in rough set theory. To make the multi-granulation model more applicable to practical data and to improve its usability, the fuzzy concept is introduced into the multi-granulation model. A multi-granulation fuzzy rough set model is constructed based on a fuzzy similarity relation, and a fuzzy belief structure is established. The belief function and probability function are constructed from the upper and lower approximations of the multi-granulation fuzzy rough set under the belief structure. Attribute reduction of multi-granulation fuzzy rough sets is explored under a fuzzy equivalence relation, and a reduction algorithm is formulated.
    Triadic concept analysis based on rough set theory
    WANG Xia, ZHANG Qian, LI Jun-yu, LIU Qing-feng
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  37-43.  doi:10.6040/j.issn.1671-9352.4.2017.183
    Rough set approximation operators are introduced into triadic concept analysis to define object oriented triadic concepts and property oriented triadic concepts. Firstly, a possibility operator and a necessity operator are defined based on the ternary relation between the object set, attribute set and condition set of a triadic context, and the properties of these two types of derivation operators are obtained. Then object oriented triadic concepts and property oriented triadic concepts are defined by using these two types of derivation operators. Finally, triadic diagrams are designed to depict all of these object oriented and property oriented triadic concepts more directly.
    A semi-supervised spam review classification method based on heuristic rules
    ZHANG Peng, WANG Su-ge, LI De-yu, WANG Jie
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  44-51.  doi:10.6040/j.issn.1671-9352.1.2016.PC6
    Nowadays the Internet affects everyone's life. E-commerce services such as online shopping, group purchasing, and online consumption have become the most popular consumption patterns. Almost every e-commerce website enables and encourages its customers to write reviews of its products and services. These customer-generated reviews are valuable to potential consumers and merchants, which has led to spam reviews being added to e-commerce websites manually and deliberately in order to promote products or damage the reputation of other merchants. Against this background, research on spam review detection aims to remove spam reviews and make full use of normal customer reviews. This paper focuses on COAE2015-TASK4, a public task on spam review detection. We propose a semi-supervised spam review classification method based on heuristic rules, using the corpus resources provided by COAE2015-TASK4. Experiments show that our method can effectively detect spam reviews while keeping a high classification accuracy on normal customer reviews.
    Emotion-specific word embedding learning for emotion classification
    DU Man, XU Xue-ke, DU Hui, WU Da-yong, LIU Yue, CHENG Xue-qi
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  52-58.  doi:10.6040/j.issn.1671-9352.1.2016.072
    We present a method for emotion classification based on word vector learning that considers the inner patterns and emotion labels of words. Building on the CBOW model, we introduce the inner patterns and the emotion labels in order to enrich the emotional semantics of the word vectors. For an input document, we use the linear combination of its word vectors, weighted by the TF-IDF weights of the words, as the text representation. We use the word vectors or text vectors as the input of an emotion classifier built with machine learning classification methods (LR, SVM, CNN) to evaluate them on the emotion classification task. Experiments show that the presented algorithm performs better than the CBOW model.
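The TF-IDF weighted text representation mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names `tfidf_weights` and `text_vector`, the smoothed IDF formula, the normalisation by total weight, and the skipping of out-of-vocabulary words are choices made for this example, and the abstract does not specify any of them.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus):
    """TF-IDF weight per distinct token of one document (smoothed IDF, an assumption)."""
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    weights = {}
    for token, count in counts.items():
        tf = count / len(doc_tokens)
        df = sum(1 for doc in corpus if token in doc)  # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weights[token] = tf * idf
    return weights


def text_vector(doc_tokens, corpus, embeddings):
    """Weighted linear combination of word vectors, normalised by total weight."""
    weights = tfidf_weights(doc_tokens, corpus)
    dim = len(next(iter(embeddings.values())))
    vector = [0.0] * dim
    total = 0.0
    for token, weight in weights.items():
        if token not in embeddings:
            continue  # skip out-of-vocabulary words (an assumption)
        total += weight
        for i, value in enumerate(embeddings[token]):
            vector[i] += weight * value
    return [v / total for v in vector] if total else vector
```

Here `corpus` is a list of tokenised documents and `embeddings` maps each word to a list of floats; the resulting vector could then be passed to any of the classifiers the abstract names.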
    Semantic graph optimization algorithm based Chinese microblog opinion summarization
    ZHANG Cong, PEI Jia-huan, HUANG Kai-yu, HUANG De-gen, YIN Zhang-zhi
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  59-65.  doi:10.6040/j.issn.1671-9352.1.2016.PC2
    To obtain key information on different topics efficiently, microblog opinion summarization has recently become a hot spot in natural language processing. The baseline method of this paper extracts keywords using the TF-IDF algorithm and calculates the importance scores of microblogs to select the opinion summary directly. The naive improved method adds a sentiment classification step and uses the semantic distance between microblogs to remove microblogs of low importance and high semantic repetition when generating the opinion summary. The method based on the semantic graph optimization algorithm constructs a complete graph from the importance scores and semantic distances of microblogs, and selects the opinion summary using a graph optimization algorithm. According to the official evaluation results on the COAE2016 test dataset, the average ROUGE-1, ROUGE-2 and ROUGE-SU4 values over 10 topics using the naive improved method reached 26.39%, 0.68% and 5.69% respectively, and the method achieved the highest value on 6 of the 9 evaluation indices. In addition, experiments on the COAE2016 sample dataset show that with the method based on the semantic graph optimization algorithm the ROUGE-1, ROUGE-2 and ROUGE-SU4 values increased by 0.63%, 1.51% and 2.69% respectively.
    Short text clustering based on word embeddings and EMD
    HUANG Dong, XU Bo, XU Kan, LIN Hong-fei, YANG Zhi-hao
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  66-72.  doi:10.6040/j.issn.1671-9352.1.2016.123
    Short text clustering plays an important role in data mining. The traditional short text clustering model has problems such as high dimensionality, sparse data and a lack of semantic information. To overcome the shortcomings of short text clustering caused by sparse features, semantic ambiguity, dynamics and other factors, this paper presents a text feature representation based on word embeddings and a short text clustering algorithm based on the moving distance between feature words. First, the word embeddings that represent the semantics of the feature words are obtained by training on a large-scale corpus with the Continuous Skip-gram Model. Next, the Euclidean distance is used to calculate the similarity between feature words. Then EMD (Earth Mover's Distance) is used to calculate the similarity between short texts. Finally, the similarity between short texts is applied in the K-means clustering algorithm to cluster the short texts. The evaluation results on three data sets show that this method outperforms traditional clustering algorithms.
    Spam messages identification based on multi-feature fusion
    LI Run-chuan, ZAN Hong-ying, SHEN Sheng-ya, BI Yin-long, ZHANG Zhong-jun
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  73-79.  doi:10.6040/j.issn.1671-9352.1.2016.041
    Spam messages have increasingly become a serious problem affecting people's daily lives. Message texts are short and sparse, and spam messages in particular often have non-standard structure and content in order to evade filtering mechanisms, so traditional text feature extraction methods do not fully apply to their classification. This paper extracts feature items from two angles of a short message, structure and semantics, establishes a semantic feature list, and uses a multi-feature fusion method to represent SMS text quantitatively. To address the noise and data imbalance present in messages, this paper compares the performance of NB, SVM, DT, LR, MLP and RF. The experiments show that the RF classification algorithm can effectively reduce the impact of noise and data imbalance. Experiments on the data set provided by Spam Message Based on Text Content Recognition in CCF 2015 China Creative Competition show that our method works well.
    Emotion analysis on Microblog short text
    SHI Han-xiao, LI Xiao-jun, HAO Teng-da, LIU Hong, ZHU Liu-qing
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  80-90.  doi:10.6040/j.issn.1671-9352.5.2016.034
    Emotion analysis of Microblog short text is a hot topic in current research. In this paper, dependency grammar is used to analyse Microblog short text and extract relation pairs. We propose corresponding methods to compute sentiment values and add the results as features to the discriminant model for sentiment sentences. We also propose discrimination rules for sentiment sentences and use them for preprocessing before, or postprocessing after, the classification model in order to improve the discrimination rate of sentiment sentences. Finally, we use the NLP&CC’2013 Chinese Microblog data to verify the effectiveness of our method through experiments. The results show that our approach performs better than the best evaluation result of that year.
    User age regression with dual-channel LSTM
    CHEN Jing, LI Shou-shan, ZHOU Guo-dong
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  91-96.  doi:10.6040/j.issn.1671-9352.1.2016.019
    Traditional age regression approaches cannot learn context relations, so we use a deep learning approach, which can make full use of context relations, to predict users' ages. Specifically, we propose an age regression approach based on LSTM. LSTM, a long short-term memory network, can build long-range connections between input values. We use two different kinds of features, textual and social. In order to distinguish the two kinds of features and make full use of them, we propose a new age regression approach based on dual-channel LSTM. Specifically, a Merge layer is added to the LSTM to combine the textual feature representation and the social feature representation generated by LSTM, so as to fully learn the knowledge shared between textual and social features. Experimental results show that our method can effectively distinguish textual and social features and improve the performance of age regression.
    A community division method based on network distance and content similarity in micro-blog social network
    ZHANG Zhong-jun, ZHANG Wen-juan, YU Lai-hang, LI Run-chuan
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  97-103.  doi:10.6040/j.issn.1671-9352.1.2016.007
    Existing community mining methods for micro-blog social networks are based on the network structure and ignore the importance of node behavior; they cannot guarantee adaptability to large-scale complex network structures or efficient community mining. To alleviate these problems, a new method, ABDC, is proposed for community division in micro-blog networks based on network distance and content similarity. The method considers the structure of the micro-blog social network while also taking into account the historical blog content of the nodes in the network, and improves the accuracy of community division by analysing historical micro-blog data. The Louvain algorithm and its modularity are modified and used to ensure that the method can handle large-scale network data and achieve highly efficient community mining. Experiments show that the method can efficiently mine the community structure of micro-blog networks, which is of great significance for academic research and business applications.
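For orientation, the quantity that the Louvain algorithm greedily maximises is the modularity of a partition. The sketch below computes the standard Newman modularity of an undirected, unweighted graph; it is not the modified modularity used by ABDC, whose definition the abstract does not give. The function name `modularity` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )
```

A partition that groups densely connected nodes scores higher than one that places every node in a single community, for which Q is zero.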
    A reversible image data hiding scheme in homomorphic encrypted domain
    DING Yi-tao, YANG Hai-bin, YANG Xiao-yuan, ZHOU Tan-ping
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2017, 52(7):  104-110.  doi:10.6040/j.issn.1671-9352.2.2016.212
    A reversible image data hiding scheme in the homomorphic encryption domain is proposed. To reserve space for the embedded message, the original data is preprocessed first. Then the image owner uses the receiver's public key to encrypt the image, and the sender embeds the message into the encrypted image. The receiver decrypts the encrypted image with the private key and obtains an image with contrast enhancement. If the receiver has the extracting key, the embedded message can be extracted and the image recovered. Finally, MATLAB experiments verify the correctness of the scheme and show a better embedding rate.