Table of Contents

    20 May 2012
    Volume 47 Issue 5
    Molecular structure matrix and study of the quantitative relationship between the matrix and molecular physicochemical properties
    LIU Xin-hua, WU Ping
    J4. 2012, 47(5):  1-8. 

    The molecular structure matrix was constructed on the basis of the distance matrix and the adjacency matrix. A mathematical model for predicting the physicochemical properties of compounds was built from the molecular structure matrix, and was then used to predict the chromatographic retention indices of some compounds (alcohols and chiral organic acids) and the octanol/water partition coefficients of chlorobenzenes. The results show that the molecular structure matrix can predict the physicochemical properties of structurally different compounds (such as chain-like, cyclic, and chiral ones), and that it offers better structure selectivity than the distance matrix and the adjacency matrix.
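
    The abstract does not define the molecular structure matrix itself, so the sketch below illustrates only the two classical matrices it is compared against. It assumes a hydrogen-suppressed molecular graph given as a 0/1 adjacency matrix and derives the topological distance matrix (number of bonds on the shortest path) by breadth-first search. The function name `distance_matrix` and the 1-propanol example are illustrative choices, not taken from the paper.

```python
from collections import deque


def distance_matrix(adjacency):
    """Topological distance matrix (bond counts) from a 0/1 adjacency matrix."""
    n = len(adjacency)
    dist = [[-1] * n for _ in range(n)]  # -1 marks "not reached"
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # Visit each bonded atom the first time it is reached.
                if adjacency[u][v] and dist[start][v] == -1:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


# Hydrogen-suppressed graph of 1-propanol: C1-C2-C3-O (a path of four atoms).
PROPANOL = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
```

    For this chain, `distance_matrix(PROPANOL)[0]` is `[0, 1, 2, 3]`: the first carbon is three bonds away from the oxygen.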

    Synthesis and characterization of a novel coordination complex [Ni(Hpdc)(2,2’-bipy)(H2O)2]·H2O
    HAN Lu1, SHENG Dao-peng1, WEI Hui-ying1, YANG Yan-zhao1,2 *
    J4. 2012, 47(5):  9-12. 

    The compound [Ni(Hpdc)(2,2’-bipy)(H2O)2]·H2O (H3pdc = 3,5-pyrazoledicarboxylic acid, 2,2’-bipy = 2,2’-bipyridine) was prepared under hydrothermal conditions. The structure of the coordination complex was determined by single-crystal X-ray diffraction, and the complex was characterized by elemental analysis, IR spectroscopy and thermogravimetric analysis. The coordination complex crystallized in the triclinic system, with space group P-1. The metal ion was located at the centre of a distorted octahedron: the Ni(II) ion was six-coordinated by three oxygen atoms and three nitrogen atoms. The four molecules that made up a unit cell were connected via hydrogen bonds, while no π-π interactions were observed between aromatic rings.

    Tracking event microblogs: a streaming dynamic topic model
    SHI Cun-hui, LIN Hong-fei*
    J4. 2012, 47(5):  13-18. 

    To address topic drift and the much higher level of noise in microblogs, an algorithm named the Streaming Dynamic Topic Model, which improves the dynamic topic model with MEntropy, was presented to track additional events on topics. The dynamic topic model was first used to update the topic throughout the tracking process, which strengthened the descriptive power of the topic model from both the positive and negative sides and thus countered topic drift. However, because a large share of posts were neutral, MEntropy was defined to evaluate the importance of a microblog for tracking a topic, and was then incorporated into the dynamic topic model to better distinguish event microblogs from neutral ones. Topic tracking experiments on a collection of 12 million microblogs from more than 170,000 users show that the algorithm is more efficient and less noisy than the traditional dynamic topic model.

    A model for CBR-based collaborative Web search and its applications
    SUN Jing-yu, CHEN Jun-jie*, YU Xue-li, HE Xiu
    J4. 2012, 47(5):  19-24. 

    With the growing number of internet users, search engines are widely used, and collaborative Web search has become an everyday behavior. However, the current mainstream search engines and Web browsers are designed for single users and are not convenient for collaborative Web search. A novel CBR-based collaborative Web search model utilizing expertise was explored. First, two ways to implement a collaborative Web search were pointed out. Then, the proposed model and two demo systems were discussed.

    Automatic extracting topic page links from Hub page
    XIA Tian1,2
    J4. 2012, 47(5):  25-31. 

    A method based on an extended label tree was proposed for extracting topic links from Hub pages. First, a sorted list of topic links was built and deny rules were learned with a prefix tree, and the link type was then pre-determined. Second, by group splitting and re-merging, each candidate link was classified into different groups. The group type and the group that represented the hub page’s core region were identified, and finally all links were put into three different collections. Experimental results show that this method can achieve high precision for topic link extraction without training.

    Bipartite graph based semi-supervised method for entity mining from the query log
    CAO Lei1,2, GUO Jia-feng1, CHENG Xue-qi1
    J4. 2012, 47(5):  32-37. 

    Named entity mining from the query log aims to mine a list of named entities of a specific type from the query log. To address the scarcity of seed entities in existing work on this task, a bipartite graph based semi-supervised ranking method was proposed, which leverages the relationship between entities (i.e., entities share common templates) to improve the ranking. First, a bipartite graph based on the candidate entities and templates was constructed. Then, the relevance score was propagated from the seed entities to the other candidate entities. Finally, the candidate entities were ranked according to the relevance score. An optimization framework for the iterative process was further developed in this ranking method. Experimental results show the effectiveness of the proposed method.
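
    The three steps described above (build the entity-template bipartite graph, propagate relevance from the seeds, rank the candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' optimization framework: the averaging update, the `damping` factor, the iteration count and the function name `rank_entities` are all hypothetical choices made for the example.

```python
def rank_entities(edges, seeds, iterations=20, damping=0.85):
    """Rank candidate entities by relevance propagated from seed entities.

    edges: iterable of (entity, template) pairs observed in the query log.
    seeds: set of entities known to belong to the target type.
    """
    ent_to_tpl, tpl_to_ent = {}, {}
    for entity, template in edges:
        ent_to_tpl.setdefault(entity, set()).add(template)
        tpl_to_ent.setdefault(template, set()).add(entity)

    # Assumed prior: seeds start with score 1, every other candidate with 0.
    seed_score = {e: (1.0 if e in seeds else 0.0) for e in ent_to_tpl}
    score = dict(seed_score)
    for _ in range(iterations):
        # Entities -> templates: a template takes the mean score of its entities.
        tpl_score = {t: sum(score[e] for e in ents) / len(ents)
                     for t, ents in tpl_to_ent.items()}
        # Templates -> entities, mixed with the seed prior (assumed update rule).
        score = {e: (1 - damping) * seed_score[e]
                    + damping * sum(tpl_score[t] for t in tpls) / len(tpls)
                 for e, tpls in ent_to_tpl.items()}
    return sorted(score, key=score.get, reverse=True)
```

    With a single seed, candidates that share a template with the seed rank above candidates that are only indirectly connected, and candidates in a disconnected part of the graph keep a score of zero.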

    Semantic search of microblogs
    LIU Xiao-hua1,2, WEI Fu-ru2, DUAN Ya-juan3, ZHOU Ming2
    J4. 2012, 47(5):  38-42. 

    To obtain useful information from the huge number of microblogs, which are short and often informally written, a search engine based on semantic analysis was proposed for the semantic search of microblogs. Unlike current microblog search engines, it conducts a series of natural language processing and text mining steps on microblogs to extract points of interest such as named entities, events and opinions, which are further indexed, thus enabling two brand new scenarios, i.e., classification browsing and advanced search. The challenges and their possible solutions, a reference implementation framework, and related core semantic computing technologies, e.g., semantic role labeling, were presented.

    A Chinese organization’s full name and matching abbreviation algorithm based on edit-distance
    HUANG Lin-sheng1, DENG Zhi-hong1,2, TANG Shi-wei1,2, WANG Wen-qing3, CHEN Ling3
    J4. 2012, 47(5):  43-48. 

    When dealing with the specific problem of matching a Chinese organization’s abbreviation to its full name, the traditional string matching algorithm based on edit-distance performs poorly. A new algorithm, also based on edit-distance, was proposed. The improvements include the following steps: (1) making the Chinese word segmentation fit the features of Chinese grammatical structure, (2) modifying the edit-operation weights with a redefined semantic similarity, (3) adjusting these weights by adaptive learning, and (4) choosing the full name with the minimum edit-distance as the matching result. Experimental results show that the algorithm achieves higher abbreviation-to-full-name matching accuracy.
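
    Steps (2) and (4) can be illustrated with a weighted edit distance followed by a minimum-distance selection. The sketch below is only an assumption-laden illustration: it works on single characters rather than segmented words, and the fixed costs (`insert_cost`, `delete_cost`, `substitute_cost`) stand in for the semantically derived, adaptively learned weights of the paper, which the abstract does not specify. Insertions are made cheap because an abbreviation usually keeps a subsequence of the full name.

```python
def weighted_edit_distance(abbr, full, insert_cost=0.2,
                           delete_cost=1.0, substitute_cost=1.0):
    """Cost of turning `abbr` into `full` with per-operation weights."""
    rows, cols = len(abbr) + 1, len(full) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * delete_cost   # delete every character of abbr
    for j in range(1, cols):
        d[0][j] = j * insert_cost   # insert every character of full
    for i in range(1, rows):
        for j in range(1, cols):
            same = abbr[i - 1] == full[j - 1]
            d[i][j] = min(
                d[i - 1][j] + delete_cost,
                d[i][j - 1] + insert_cost,
                d[i - 1][j - 1] + (0.0 if same else substitute_cost),
            )
    return d[-1][-1]


def best_full_name(abbr, candidates):
    """Return the candidate full name with the minimum weighted edit distance."""
    return min(candidates, key=lambda name: weighted_edit_distance(abbr, name))
```

    For example, `best_full_name("北大", ["清华大学", "北京师范大学", "北京大学"])` selects `"北京大学"`, because only two cheap insertions are needed to expand the abbreviation.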

    Multiple kernel learning in denoising space
    WANG Peng-ming, ZHONG Mao-sheng, LIU Zun-xiong
    J4. 2012, 47(5):  49-52. 

    A multiple kernel learning (MKL) technique called lp regularized multiple kernel Fisher discriminant analysis (lp MK-FDA) was reviewed, and the performance of MKL under fixed-norm and p-norm regularization was compared. Because noise exists in the original feature space, the effect of feature space denoising on MKL was investigated. Experiments on the VOC 2007 dataset show that, with either the original kernels or the denoised kernels, lp MK-FDA outperforms its fixed-norm counterparts; that feature space denoising boosts the performance of both single kernel FDA and lp MK-FDA; and that there is a positive correlation between the learnt kernel weights and the amount of variance kept by feature space denoising.

    Music similarity research based on the Web tag
    LIU Xuan1, XU Jie-ping1*, CHEN Jie2
    J4. 2012, 47(5):  53-58. 

    A method based on web mining was proposed for music classification using web tags, with user tags collected from Last.fm used as features to study music similarity. The web tags extracted from Last.fm served as the semantic features of the music. Latent Semantic Analysis (LSA) was used for dimension reduction, and finally clustering results were obtained with an improved K-means algorithm according to the similarity between pieces of music. Experimental results show that the proposed method achieves better results in music classification.

    EB-SVM: support vector machine based data pruning with information entropy
    CAO Lin-lin1,2, ZHANG Hua-xiang1,2*, WANG Zhi-chao1,2
    J4. 2012, 47(5):  59-62. 

    The generalization performance of an SVM applied to classification problems is reduced if the data of different classes overlap seriously. A new approach, EB-SVM (entropy based support vector machine), is presented to prune data for the support vector machine based on the concept of information entropy. EB-SVM employs the information entropies of the training data to remove the patterns far from the boundaries and to delete the noisy and overlapping instances close to the boundaries, and then uses the pruned dataset to construct an SVM classifier. Experimental results show that EB-SVM takes less time than SVM and improves the classification accuracy.
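
    The abstract does not say how the information entropy of a training pattern is computed, so the following is only one plausible reading, offered as a sketch: the Shannon entropy of the class labels among each point's k nearest neighbours. Under this assumption, an entropy of zero suggests a point deep inside its own class, while a high entropy suggests a point in an overlapping region; how EB-SVM thresholds these values is not stated in the source and is left out. The name `neighbourhood_entropy` and the parameter `k` are hypothetical.

```python
import math
from collections import Counter


def neighbourhood_entropy(points, labels, k=3):
    """Shannon entropy (bits) of class labels among each point's k nearest neighbours."""
    entropies = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p.
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        nearest = others[:k]
        n = len(nearest)
        counts = Counter(labels[j] for j in nearest)
        # H = sum_c p_c * log2(1 / p_c), with p_c = count_c / n.
        entropies.append(sum((c / n) * math.log2(n / c) for c in counts.values()))
    return entropies
```

    A pruning step could then drop points according to these scores before an SVM is trained on the remainder; the exact selection rule would have to come from the paper.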

    Multi-label RBF neural networks learning algorithm based on clustering optimization
    FENG Xin-ying1,2, JI Hua1,2, ZHANG Hua-xiang1,2
    J4. 2012, 47(5):  63-67. 

    Multi-label learning that combines an RBF neural network with the K-means clustering algorithm has achieved good results. However, because the number of clusters cannot be well determined in advance, an accurate clustering result cannot be obtained. This leads to lower-quality and unstable clustering, which in turn affects the stability and classification performance of the multi-label RBF neural network algorithm. To solve this problem, an index function for clustering validity, defined from the viewpoint of sample geometry, was employed to find the optimal number of clusters for each class. Theoretical analysis and experimental results show that the improved ML-IRBF algorithm effectively improves both stability and classification performance.

    Text segmentation of patent summary based on a classification algorithm
    DING Chang-lin, CAI Dong-feng, WANG Pei-yan
    J4. 2012, 47(5):  68-72. 

    Patent summaries are condensed representations of patents, and if patent summaries are segmented according to their contents, the corresponding patents can be located more accurately. Because each patent summary is very short and there are no markers between different content parts, traditional text segmentation methods cannot be used. In this paper, the problem of segmenting a patent summary was recast as sentence classification, and classification algorithms were used to solve it. The effects of different classification algorithms and different features were analyzed, and the results show that segmenting patent summaries by sentence classification is feasible.

    Research of Twitter data collection
    FANG Wei-wei1,2, LI Jing-yuan1, LIU Yue1, YU Zhi-hua1, CAO Peng1,2, ZHANG Kai1
    J4. 2012, 47(5):  73-77. 

    In order to achieve real-time and efficient access to Twitter data, two different methods based on the Twitter List API and Lookup API were presented after analyzing the shortcomings of traditional collection methods. By classifying users, the approach can precisely control the frequency of API calls. A series of experiments on over 260,000 users and over 6 million messages were carried out, and the results show that the combination of the two methods can be used to collect Twitter data efficiently in real time.

    Measuring user influence of a microblog based on information diffusion
    GUO Hao, LU Yu-liang, WANG Yu, ZHANG Liang
    J4. 2012, 47(5):  78-83. 

    Information diffusion and influence modeling are hot topics in microblog research. To study influence quantitatively, a concept based on message diffusion was introduced, together with a method for computing it. The proposed approach was validated on real-world datasets, and the experimental results show that the method is both effective and stable, especially when the dataset and time span are limited.

    Modeling of community structure based on P2P streaming systems
    LIU Qi, GE Lian-sheng, QIN Feng-lin
    J4. 2012, 47(5):  84-88. 

    The modeling of community structure in P2P streaming systems was studied. Two types of community structure models, named the k-n model and the k-n-t model, were presented, and their small-world network characteristics were theoretically analyzed and compared. Numerical results show that the community structure of the k-n model has a higher clustering coefficient than that of the k-n-t model, while the community structure of the k-n-t model achieves a better tradeoff between clustering coefficient and average path length.
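
    The two small-world measures named above can be computed for any undirected graph. The sketch below assumes the graph is stored as a dictionary mapping each node to the set of its neighbours; it does not construct the k-n or k-n-t models, whose definitions the abstract does not give. The clustering coefficient here is the mean of the local coefficients (nodes with fewer than two neighbours contribute 0), and the average path length is taken over all reachable ordered pairs; both conventions are assumptions of this example.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(graph):
    """Mean local clustering coefficient of an undirected graph."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # contributes 0 by convention
        # Count edges that exist between pairs of neighbours.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

    A small-world network combines a high clustering coefficient with a short average path length, which is the tradeoff the abstract refers to.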

    Research of simply separable function sets structure in partial K-valued logic
    GONG Zhi-wei1, LIU Ren-ren2*, WANG Ting2
    J4. 2012, 47(5):  89-92. 

    According to the completeness theory in partial K-valued logic, the structure of simply separable function sets in partial K-valued logic was discussed. First, the number of direct divisions of m-ary relationships in partial K-valued logic was solved. Then, on the basis of the division, all of the simply separable function sets were given. Finally, the properties of preserving K-ary simply separable function sets were analyzed.

    Research on the DC algorithm for an anomaly detection system based on TLR
    GUO Chen1, LIANG Jia-rong2, LUO Chao3, PENG Shuo1
    J4. 2012, 47(5):  93-97. 

    To improve the accuracy of traditional Dendritic Cell Algorithms in data abnormality detection tests, we propose a Dendritic Cell Algorithm for an anomaly detection system based on Toll-like receptors, drawing on the working mechanism of innate immune Toll-like receptors in the biological immune system. In this algorithm, mature and immature Dendritic Cells are first obtained by using the Dendritic Cell Algorithm and then provided to the Toll-like receptors as inputs, and the TC level is activated to judge whether the algorithm is abnormal.

    The effects of habitat loss on the spatial PD game
    ZHANG Feng-pan1, WANG Jian-bin2, DU Shu-de3
    J4. 2012, 47(5):  98-102. 

    The Prisoner’s Dilemma (PD) game is the main theoretical framework in which the maintenance of cooperation in biological populations is studied. Spatial structure serves as the key to this dilemma. A model of a spatial PD game under a metapopulation framework was built, and the effects of habitat loss on cooperation and population size were studied. The main result is that moderate habitat loss enhances the fraction of cooperators in the population. Moreover, the population size may undergo a temporary period of prosperity just before extinction, even while habitat loss is increasing. These results imply that a multi-behavior strategy within a population may be a mechanism to defend against the influences of a changing environment.

    Bifurcation solutions and stability of a predator-prey system with predator saturation and competition
    FENG Xiao-zhou1,2, NIE Hua2
    J4. 2012, 47(5):  103-107. 

    A predator-prey system with predator saturation and competition is investigated. The existence, uniqueness and stability of the bifurcation solutions that bifurcate from a double eigenvalue are obtained by using the Lyapunov-Schmidt procedure.

    A filter non-monotone trust region algorithm with a simple quadratic model
    FENG Lin1,2, DUAN Fu-jian1, HE Wen-long1
    J4. 2012, 47(5):  108-114. 

    A filter non-monotone trust region algorithm based on a simple quadratic model is proposed for unconstrained optimization problems. A filter technique is incorporated into the method so that the trial point of the trust region sub-problem is accepted more often. If the trial step is also rejected by the filter set, a search direction is obtained by a fixed formula and a step size is obtained by the non-monotone Wolfe line search, which yields a new iterate. The algorithm does not re-solve the trust region sub-problem, so the amount of computation is reduced. The global convergence of the new method is established under fewer conditions. Preliminary numerical experiments show that the new method is effective.

    Discrete approximation of the optimal dividend barrier in the dual risk model
    XU Huai
    J4. 2012, 47(5):  115-121. 

    The optimal dividend barrier of the dual risk model under a barrier dividend strategy is considered in this paper. First, the exact solution for the optimal dividend barrier is obtained by the Laplace transform. When analytic results are unavailable, the discrete time dual risk model can be used to provide approximations of the optimal dividend barrier. For illustration, the approximate values of the optimal dividends are compared in two numerical examples.

    A grey bilevel linear multi-objective programming problem and its algorithm
    LIU Bing-bing
    J4. 2012, 47(5):  122-126. 

    Based on the bilevel linear multi-objective programming problem with multiple objectives at the lower level and the characteristics of the grey system, a grey bilevel linear multi-objective programming problem is put forward, and its model and a related theorem are given. Under the assumption that the constraint region of the proposed model is nonempty and compact, it is shown that the optimal solution of the drifting grey bilevel linear multi-objective programming problem is attained at an extreme point of the constraint region. Finally, an algorithm based on the k-th best method is developed and its global convergence is proven. Numerical examples show that the proposed algorithm is effective.