Table of Contents

      
    20 July 2015
    Volume 50 Issue 07
    A multi-level page clustering method based on page segmentation
    FAN Yi-xing, GUO Yan, LI Xi-peng, ZHAO Ling, LIU Yue, YU Xiao-ming, CHENG Xue-qi
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  1-8.  doi:10.6040/j.issn.1671-9352.3.2014.270
    A multi-level page clustering method based on page segmentation is proposed. Pages are divided into several blocks and then clustered using block features. By adjusting the threshold of similarity between pages, three levels of clustering are obtained: the first level groups pages from the same website, the second level groups pages from the same website with the same structure, and the last level groups pages produced from the same template on the same website. Compared with traditional methods, this method both provides multi-level clustering and clusters pages effectively.
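    The threshold idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: block features are modelled as string sets, similarity is plain Jaccard similarity, clustering is a greedy single pass, and the page names, feature names, and thresholds (0.2, 0.5, 0.9) are hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two sets of block features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(pages: Dict[str, FrozenSet[str]], threshold: float) -> List[List[str]]:
    """Greedy single-pass clustering: a page joins the first cluster whose
    first member is at least `threshold` similar to it, else starts a new one."""
    clusters: List[List[str]] = []
    for name, feats in pages.items():
        for members in clusters:
            if jaccard(feats, pages[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical block features for five pages (illustrative only).
pages = {
    "news/a": frozenset({"site:x", "nav", "article", "tmpl:news"}),
    "news/b": frozenset({"site:x", "nav", "article", "tmpl:news"}),
    "blog/c": frozenset({"site:x", "nav", "article", "tmpl:blog"}),
    "list/d": frozenset({"site:x", "nav", "list", "tmpl:list"}),
    "other/e": frozenset({"site:y", "menu", "gallery", "tmpl:g"}),
}

# A low threshold separates sites, a medium one separates structures,
# and a high one separates templates.
levels = {t: cluster(pages, t) for t in (0.2, 0.5, 0.9)}
for t, groups in levels.items():
    print(t, groups)
```

    Raising the threshold only ever splits clusters in this toy data, which is what makes the three results read as nested levels.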
    Chinese Micro-blog named entity linking based on multisource knowledge
    ZAN Hong-ying, WU Yong-gang, JIA Yu-xiang, NIU Gui-ling
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  9-16.  doi:10.6040/j.issn.1671-9352.3.2014.026
    Named entities are an important component conveying information in texts. Micro-blog is a social network platform used to share brief real-time information, characterized by short text length, nonstandard words, and frequent neologisms, so an accurate understanding of named entities is needed to ensure a correct analysis of the text. A Chinese Micro-blog entity linking strategy based on multi-resource knowledge is proposed, combining a dictionary of synonyms, encyclopedia resources, and the bag-of-words model to perform named entity linking. In this strategy, named entities to be linked in Micro-blog posts are mapped to the corresponding candidate entities in the knowledge base. In experiments on the data set of the NLP&CC 2013 Chinese micro-blog entity linking track, the method obtains a micro-average accuracy of 92.97%, two percent higher than the state-of-the-art result, which demonstrates its effectiveness.
    Research on fuzzy reasoning of brand repurchase intention based on the online reviews
    QI Fang-li, CUI Xue-lian, ZHAO Narisa
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  17-22.  doi:10.6040/j.issn.1671-9352.3.2014.154
    Based on the theory of planned behavior (TPB), a fuzzy reasoning model of brand repurchase intention was established along three dimensions: brand attitude, brand reputation, and perceived value. A valuation word corpus and fuzzy reasoning rules for brand repurchase intention were then constructed for cosmetic brands. By extracting valuation words from consumers' online reviews and analyzing them semantically, brand attitude, brand reputation, and perceived value were computed, and brand repurchase intention was obtained. Combining this with total consumption, the customer classification matrix “brand repurchase intention-total consumption” was constructed. Finally, an experimental case study of 3 types of facial mask on the Jumei site was carried out, and its reasonable conclusions verified the validity of the method.
    An information extraction method for scientific literature introduction
    ZHU Li-ping, LI Hong-qi, YANG Zhong-guo, LIU Qiang
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  23-30.  doi:10.6040/j.issn.1671-9352.3.2014.307
    Based on an analysis of writing patterns, the introduction of a scientific article can be divided into three categories: background knowledge, problem analysis, and work description. Each of the three categories can be characterized by guide words, sentence structure, clue words, and sentence position. These sentence features were used to construct rules that distinguish the type of a sentence. A rule bank was generated from features extracted from a large number of scientific article sentences. The information in the three categories can then be extracted by simply matching the three types of rules. A text information extraction experiment was carried out in the fields of petroleum exploration and data mining, in which the automatically extracted results were compared with human work. The results show that all three types of information can be extracted effectively.
    An improved DV-hop algorithm based on iterative computation and two communication ranges for sensor network localization
    ZHAO Feng, XU Xiu
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  31-37.  doi:10.6040/j.issn.1671-9352.0.2014.514
    In order to improve the localization accuracy of the DV-Hop algorithm, an improved algorithm based on iterative computation and two communication ranges is proposed. The algorithm first selects a suitable communication radius for the current network topology and then uses it to estimate the average per-hop distance of beacon nodes with the default communication radius of the node. Finally, an iterative algorithm is used to revise the average per-hop distance obtained in the previous step, so that the minimum average per-hop distance is selected to calculate the distance between the unknown nodes and beacon nodes. The simulation results indicate that the improved algorithm greatly improves localization accuracy without noticeably increasing algorithm complexity or communication traffic.
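    For orientation, the average per-hop distance this abstract refers to can be sketched for the classic DV-Hop algorithm. This is only the baseline step, not the paper's improved method: the beacon coordinates and hop counts are hypothetical, and picking the minimum hop size at the end is a loose reading of the abstract's last step, without the two communication ranges or the iterative revision.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def hop_sizes(beacons: Dict[str, Point],
              hops: Dict[Tuple[str, str], int]) -> Dict[str, float]:
    """Classic DV-Hop average per-hop distance of each beacon: the sum of
    straight-line distances to the other beacons divided by the sum of
    hop counts to them."""
    sizes: Dict[str, float] = {}
    for i, pi in beacons.items():
        total_dist = 0.0
        total_hops = 0
        for j, pj in beacons.items():
            if i == j:
                continue
            total_dist += math.dist(pi, pj)
            total_hops += hops[(i, j)]
        sizes[i] = total_dist / total_hops
    return sizes


# Hypothetical beacon positions (metres) and symmetric hop counts.
beacons = {"A": (0.0, 0.0), "B": (30.0, 0.0), "C": (0.0, 40.0)}
one_way = {("A", "B"): 2, ("A", "C"): 3, ("B", "C"): 4}
hops = {**one_way, **{(j, i): h for (i, j), h in one_way.items()}}

sizes = hop_sizes(beacons, hops)

# Assumption: use the smallest hop size when converting hop counts to distances.
min_hop_size = min(sizes.values())
# Estimated distance from an unknown node that is 2 hops away from beacon A.
estimated_distance_to_a = 2 * min_hop_size
print(sizes, estimated_distance_to_a)
```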
    Opinion target extraction with active-learning and automatic annotation
    ZHU Zhu, LI Shou-shan, DAI Min, ZHOU Guo-dong
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  38-44.  doi:10.6040/j.issn.1671-9352.3.2014.106
    An opinion target extraction method combining active learning and automatic annotation is introduced. First, a classifier is trained on a small labeled corpus and applied to the unlabeled samples, yielding automatic annotations with confidence scores. Second, the confidence of every sample is calculated and the low-confidence samples are selected for annotation. Finally, the low-confidence words in the selected samples are annotated manually, while the remaining words keep their automatic annotations. The empirical results demonstrate that the proposed method effectively reduces the annotation cost and achieves good performance on opinion target extraction.
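    The selection steps described above can be sketched as follows. The details are assumptions for illustration, not the authors' implementation: a sample is scored by its least confident word, the word threshold of 0.6 is arbitrary, the predictions are made-up data, and `ask_human` is a hypothetical stand-in for the manual annotator.

```python
from typing import Callable, List, Tuple

# (word, automatically predicted label, classifier confidence)
Token = Tuple[str, str, float]


def sample_confidence(tokens: List[Token]) -> float:
    """Score a sample by its least confident word (one possible choice)."""
    return min(conf for _, _, conf in tokens)


def select_samples(samples: List[List[Token]], k: int) -> List[int]:
    """Indices of the k samples with the lowest confidence."""
    order = sorted(range(len(samples)), key=lambda i: sample_confidence(samples[i]))
    return order[:k]


def annotate(tokens: List[Token], word_threshold: float,
             ask_human: Callable[[str], str]) -> Tuple[List[str], int]:
    """Ask the human only for low-confidence words; keep automatic labels
    for the rest. Returns the final labels and the number of manual labels."""
    labels: List[str] = []
    manual = 0
    for word, label, conf in tokens:
        if conf < word_threshold:
            labels.append(ask_human(word))
            manual += 1
        else:
            labels.append(label)
    return labels, manual


# Made-up classifier output: "T" marks an opinion target, "O" anything else.
samples = [
    [("the", "O", 0.99), ("screen", "T", 0.95), ("is", "O", 0.98), ("great", "O", 0.97)],
    [("battery", "T", 0.55), ("life", "O", 0.40), ("is", "O", 0.97), ("short", "O", 0.93)],
]
oracle = {"battery": "T", "life": "T"}  # hypothetical human answers

chosen = select_samples(samples, k=1)
labels, manual = annotate(samples[chosen[0]], 0.6, lambda w: oracle[w])
print(chosen, labels, manual)
```

    In this toy run only two of the four words in the selected sample need a human label, which is where the reduction in annotation cost would come from.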
    Automatic target identification in frame semantic parsing
    CHEN Ya-dong, HONG Yu, YANG Xue-rong, WANG Xiao-bin, YAO Jian-min, ZHU Qiao-ming
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  45-53.  doi:10.6040/j.issn.1671-9352.3.2014.076
    An automatic target identification method is introduced, which analyzes classification features to distinguish target words, frame elements, and non-substantive words. A scalable and accurate target identification system is obtained. The experimental results on FrameNet show that the joint method gains 3.86% in target identification.
    Research on the multi-point collaboration detection against replication attacks
    ZHOU Xian-cun, LI Ming-xi, LI Rui-xia, XU Ming-juan, LING Hai-bo
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  54-65.  doi:10.6040/j.issn.1671-9352.0.2014.368
    In practical applications, many wireless sensor networks (WSNs) are versatile heterogeneous networks composed of several static and mobile networks, in which detecting replication attacks is a great challenge. Based on the collaboration mechanism of static and mobile networks, a multi-point collaboration detection scheme against replication attacks is proposed. The scheme consists of two parts: a polynomial-based time-identity related pairwise key predistribution scheme (PTPP), used to defend the static network, and a challenge/response based collaborative detection scheme (CCD), used to detect mobile replicas. Experiments verify that the scheme performs well in both security and cost, making it a practical detection scheme against replication attacks in heterogeneous wireless sensor networks.
    Optimization on MapReduce algorithm based on Hash table
    LI Rui-xia, LIU Ren-jin, ZHOU Xian-cun
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  66-70.  doi:10.6040/j.issn.1671-9352.0.2014.461
    Distributed parallel computing is commonly used to improve computer performance. However, because demands differ, there is no uniform way to design and implement parallel programs, and parallel programming depends on the experience of the developer. MapReduce, a distributed parallel programming model put forward by Google, supports the development and running of specific parallel programs. MapReduce was optimized by using a hash table, which decreases fragments of the Map function, skips redundant functions such as the Combiner function, reduces transmission load, and improves computing efficiency. Meanwhile, the attributes of the Map and Reduce functions were kept so that MapReduce remains parallel.
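    One common way to use a hash table inside the Map phase is in-mapper aggregation, sketched below as a single-process word count. This is an assumed reading of the abstract, not the paper's actual design: the function names and input splits are hypothetical, and no real MapReduce framework is involved.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_with_hash_table(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Aggregate counts in a hash table inside the mapper and emit one pair
    per distinct word, instead of one (word, 1) pair per occurrence. This
    takes over the role a separate Combiner would otherwise play."""
    counts: Dict[str, int] = defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word] += 1
    return list(counts.items())


def reduce_counts(mapped: Iterable[List[Tuple[str, int]]]) -> Dict[str, int]:
    """Sum the partial counts emitted by all mappers."""
    totals: Dict[str, int] = defaultdict(int)
    for pairs in mapped:
        for word, n in pairs:
            totals[word] += n
    return dict(totals)


# Two hypothetical input splits, each handled by one mapper.
splits = [["a b a", "b a"], ["b c", "c c a"]]
mapped = [map_with_hash_table(split) for split in splits]
totals = reduce_counts(mapped)

emitted_pairs = sum(len(pairs) for pairs in mapped)
naive_pairs = sum(len(line.split()) for split in splits for line in split)
print(totals, emitted_pairs, naive_pairs)
```

    Each mapper still works on its own split independently, so the map and reduce stages stay parallelizable while fewer intermediate pairs are transmitted.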
    New microblog sentiment lexicon judgment based on generalized Jaccard coefficient
    SANG Le-yuan, XU Xin-feng, ZHANG Jing, HUANG De-gen
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  71-75.  doi:10.6040/j.issn.1671-9352.3.2014.108
    Polarity judgment of the new microblog sentiment lexicon is a basic task in sentiment analysis, aiming at classifying its emotion categories. This paper proposes a new approach for judging the polarity of the new microblog sentiment lexicon. Feature vectors are employed to represent the new sentiment lexicon and the existing sentiment lexicon, with weight values calculated by PMI. The similarity between the new sentiment lexicon and the candidates, which come from three sentiment lexicon sets of different polarities, is calculated through the generalized Jaccard coefficient, and the relativity between the new sentiment lexicon and each existing sentiment lexicon set is defined as the sum of these similarities. Finally, the relativity distance differences of the three sentiment lexicon sets are applied to judge the polarity. The experimental results show that the F-score of the polarity judgment algorithm based on the generalized Jaccard coefficient is two points higher than that of the best team in COAE 2014.
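    The abstract does not give the formula, so the sketch below assumes one common definition of the generalized Jaccard coefficient for real-valued vectors (also called the Tanimoto coefficient): x·y / (|x|² + |y|² − x·y). The feature weights stand in for PMI values and are made up, and choosing the set with the largest relativity is a simplification of the paper's "relativity distance differences".

```python
from typing import Dict, List

Vector = Dict[str, float]  # feature -> weight (e.g. a PMI value)


def generalized_jaccard(x: Vector, y: Vector) -> float:
    """Tanimoto form of the generalized Jaccard coefficient (assumed definition)."""
    dot = sum(w * y.get(k, 0.0) for k, w in x.items())
    denom = sum(w * w for w in x.values()) + sum(w * w for w in y.values()) - dot
    return dot / denom if denom else 0.0


def relativity(word: Vector, lexicon_set: List[Vector]) -> float:
    """Relativity to a lexicon set: the sum of similarities to its members."""
    return sum(generalized_jaccard(word, entry) for entry in lexicon_set)


# Made-up weighted feature vectors for a new word and three small lexicon sets.
new_word = {"happy": 1.0, "smile": 2.0}
lexicon_sets = {
    "positive": [{"happy": 1.0, "smile": 2.0}],
    "negative": [{"cry": 1.0, "smile": 1.0}],
    "neutral": [{"table": 3.0}],
}

scores = {name: relativity(new_word, entries) for name, entries in lexicon_sets.items()}
polarity = max(scores, key=scores.get)
print(scores, polarity)
```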
    Improvement of Lucene full-text indexing efficiency
    LI Sheng-dong, LÜ Xue-qiang, SUN Jun, SHI Shui-cai
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  76-79.  doi:10.6040/j.issn.1671-9352.3.2014.217
    Lucene is an excellent open-source full-text search framework that can be embedded in a user's own search engine by extending its functions in accordance with the framework specification. The Lucene index structure and principles were studied, and indexing efficiency was enhanced by improving incremental indexing, increasing the size of the in-memory index buffer, and decreasing the frequency of writing the index to disk. Full-text retrieval experiments were designed. As a result, the average efficiency of creating an index for 10 000 documents was improved by 19.5%, and the method has good prospects.
    Ordering policy for substitutable products based on consumer preference bias
    LIANG Hong-yan, XU Min-li, JIAN Hui-yun
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  80-88.  doi:10.6040/j.issn.1671-9352.0.2014.406
    Based on consumer preference bias, a profit-order quantity decision model with stochastic demand under the Conditional Value-at-Risk (CVaR) criterion was established. To optimize the retailer's profit under uncertain demand, the optimal ordering policy was discussed, taking into account the retailer's risk preference, consumer preference bias, and product substitutability. The results show that when the order is in a certain range, the ordering policy depends on consumer preference bias and not on product substitutability. When the total order is in a certain range, the optimal ordering quantity is positively correlated with the consumer preference bias factor and the substitution factor, and negatively correlated with the preference bias of the other substitutable product and the retailer's risk aversion. The larger the consumer bias between substitute products, the larger the total ordering quantity.
    Research on the angle of repose of natural sand
    ZHOU Xiang-ling, muhammadtursun·ABUDUREYIM, YU Sheng-qing, LI Hua-zhen
    JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE). 2015, 50(07):  89-94.  doi:10.6040/j.issn.1671-9352.0.2014.360
    In this paper, we take sands collected from natural dunes at the southwest edge of the Taklimakan desert as our experimental subjects. We form the sands into mounds using the static funnel method and record videos of the process. We then convert the videos into pictures and analyze them with the computer software “CorelDraw”. From the analysis we obtain the dependence of the static angle and the collapse angle of the sand mounds on time and particle size. We draw the following conclusions: (1) The angle of the sand mound always follows the same cycle, i.e., from the maximum static angle to the collapse angle and back to the minimum static angle. Thus the static angles oscillate over time. (2) When the particle diameter is less than 0.3 millimeter, the difference between the collapse angle and the static angle increases with particle size, whereas when the particle diameter is greater than 0.3 millimeter, it decreases with particle size. (3) The average difference between the collapse angle and the static angle is (4.6±0.6)°. (4) The difference between the collapse angle and the static angle of the mixed sand mounds is in the range (5.2±0.3)°.