JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE)

Molecular structure matrix and study of the quantitative relationship between the matrix and molecular physicochemical properties

LIU Xin-hua, WU Ping

J4. 2012, 47(5): 1-8.

The molecular structure matrix was established following the distance matrix and the adjacency matrix. A mathematical model that can forecast the physicochemical properties of compounds was set up using the molecular structure matrix, and the model was then used to predict the chromatographic retention indices of some compounds (alcohols and chiral organic acids) and the octanol/water partition coefficients of chlorobenzenes. The predicted results show that the molecular structure matrix can predict the physicochemical properties of different kinds of compounds (such as chainlike, cyclic, and chiral ones), and that it is better in structure selectivity than the distance matrix and the adjacency matrix.

Synthesis and characterization of a novel coordination complex [Ni(Hpdc)(2,2’-bipy)(H2O)2]·H2O

HAN Lu1, SHENG Dao-peng1, WEI Hui-ying1, YANG Yan-zhao1,2 *

J4. 2012, 47(5): 9-12.

The compound [Ni(Hpdc)(2,2’-bipy)(H2O)2]·H2O (H3pdc = 3,5-pyrazoledicarboxylic acid, 2,2’-bipy = 2,2’-bipyridine) was prepared under hydrothermal conditions. The structure of the coordination complex was determined by single-crystal X-ray diffraction and characterized by elemental analysis, IR spectroscopy and thermogravimetric analysis. The coordination complex crystallized in the triclinic system, with space group P-1. The metal ion was located at the centre of a distorted octahedron. The Ni(II) ion was six-coordinated by three oxygen atoms and three nitrogen atoms. The four molecules that made up a unit cell were connected via hydrogen bonds, while no π-π interactions were observed between aromatic rings.

Tracking event microblogs: a streaming dynamic topic model

SHI Cun-hui, LIN Hong-fei*

J4. 2012, 47(5): 13-18.

To address topic drift and the much higher level of noise in microblogs, an algorithm named the Streaming Dynamic Topic Model, which improves the dynamic topic model with MEntropy, was presented to track additional events on topics. The dynamic topic model was first used to update the topic throughout the tracking process, which enhanced the descriptive power of the topic model from both the positive and negative sides to overcome the topic drift problem. However, because a large proportion of posts are neutral, MEntropy was defined and used to evaluate the importance of a microblog for tracking a topic, and was then incorporated into the dynamic topic model to better distinguish event microblogs from neutral ones. Topic tracking experiments on a collection of 12 million microblogs from more than 170,000 users show that the algorithm is more efficient and has lower noise than the traditional dynamic topic model.

A model for CBR-based collaborative Web search and its applications

SUN Jing-yu, CHEN Jun-jie*, YU Xue-li, HE Xiu

J4. 2012, 47(5): 19-24.

With the growing number of Internet users, search engines are widely used, and collaborative Web search has become an everyday behavior. However, the current mainstream search engines and Web browsers are designed for single users and are not convenient for collaborative Web search. A novel CBR-based collaborative Web search model utilizing expertise was explored. First, two ways to implement collaborative Web search were pointed out. Then, the proposed model and two demo systems were discussed.

Automatic extracting topic page links from Hub page

XIA Tian1,2

J4. 2012, 47(5): 25-31.

A method for extracting topic links from a hub page based on an extended label tree was proposed. First, a sorted list of topic links was built and deny rules were learned by a prefix tree, and then the link type was pre-determined. Second, by group splitting and re-merging, each candidate link was classified into different groups. The group type and the group that represented the hub page’s core region were identified, and finally all links were put into three different collections. Experimental results show that this method can achieve high precision for topic link extraction without training.

Bipartite graph based semi-supervised method for entity mining from the query log

CAO Lei1,2, GUO Jia-feng1, CHENG Xue-qi1

J4. 2012, 47(5): 32-37.

Named entity mining from the query log aims to mine a list of named entities of a specific type from the query log. A bipartite graph based semi-supervised ranking method, which leverages the relationship between entities (i.e., entities share common templates) to help improve the ranking, was proposed to address the scarcity of seed entities in existing work on named entity mining from the query log. First, a bipartite graph based on the candidate entities and templates was constructed. Then, the relevance score was propagated from the seed entities to other candidate entities. Finally, the candidate entities were ranked according to the relevance score. An optimization framework for the iterative process was further developed in this ranking method. Experimental results show the effectiveness of the proposed method.
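
The three steps in this abstract (build the entity-template bipartite graph, propagate relevance from the seeds, rank the candidates) can be illustrated with a minimal sketch. It is not the authors' method: the averaging update, the damping parameter `alpha`, the iteration count, and the function name `rank_entities` are all assumptions chosen for illustration, and the abstract's optimization framework is not reproduced.

```python
from collections import defaultdict


def rank_entities(edges, seeds, alpha=0.5, iterations=20):
    """Rank candidate entities by relevance propagated from seed entities.

    edges: iterable of (entity, template) pairs observed in a query log.
    seeds: set of seed entities known to belong to the target type.
    alpha: hypothetical weight that keeps scores anchored to the seeds.
    """
    entity_templates = defaultdict(set)
    template_entities = defaultdict(set)
    for entity, template in edges:
        entity_templates[entity].add(template)
        template_entities[template].add(entity)

    # Seeds start with relevance 1, every other candidate with 0.
    prior = {e: (1.0 if e in seeds else 0.0) for e in entity_templates}
    score = dict(prior)

    for _ in range(iterations):
        # Entity -> template: a template is as relevant as its entities on average.
        template_score = {
            t: sum(score[e] for e in ents) / len(ents)
            for t, ents in template_entities.items()
        }
        # Template -> entity: blend the seed prior with the templates' average.
        score = {
            e: alpha * prior[e]
            + (1 - alpha) * sum(template_score[t] for t in tpls) / len(tpls)
            for e, tpls in entity_templates.items()
        }

    return sorted(score, key=score.get, reverse=True), score
```

In this sketch a candidate that shares a template with a seed receives a positive score, while a candidate that shares no template with any seed stays at zero.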

Semantic search of microblogs

LIU Xiao-hua1,2, WEI Fu-ru2, DUAN Ya-juan3, ZHOU Ming2

J4. 2012, 47(5): 38-42.

To efficiently obtain information from a huge number of microblogs, which are short and often informally written, a search engine based on semantic analysis was proposed for the semantic search of microblogs. Unlike current microblog search engines, it applies a series of natural language processing and text mining steps to microblogs to extract points of interest such as named entities, events and opinions, which are further indexed; two brand new scenarios are thus enabled, i.e., classification browsing and advanced search. The challenges and their possible solutions, a reference implementation framework, and related core semantic computing technologies, e.g., semantic role labeling, were presented.

A Chinese organization's full name and matching abbreviation algorithm based on edit-distance

HUANG Lin-sheng1, DENG Zhi-hong1,2, TANG Shi-wei1,2, WANG Wen-qing3, CHEN Ling3

J4. 2012, 47(5): 43-48.

When dealing with the specific problem of matching a Chinese organization's full name with its abbreviation, the traditional string matching algorithm based on edit-distance performs poorly. A new algorithm, also based on edit-distance, was provided. The improvements include the following steps: (1) making the Chinese word segmentation fit the Chinese grammatical structure features, (2) modifying the edit-operation weights with the redefined semantic similarity, (3) adjusting these weights by adaptive learning, and (4) choosing the full name with minimum edit-distance as the matching result. Experimental results show that the algorithm can effectively achieve higher abbreviation-full name matching accuracy.
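
Steps (2) and (4) above rest on an edit-distance with adjustable operation weights. The sketch below shows only that generic building block: the fixed costs `ins_cost`, `del_cost` and `sub_cost` are placeholder assumptions, not the semantic-similarity weights or the adaptively learned weights of the paper, and the names `weighted_edit_distance` and `best_full_name` are hypothetical. The functions accept any sequences, so a list of segmented words can be passed instead of a character string.

```python
def weighted_edit_distance(source, target, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Dynamic-programming edit distance with one fixed cost per operation."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost  # delete every source item
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost  # insert every target item
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,
                dist[i][j - 1] + ins_cost,
                dist[i - 1][j - 1] + (0.0 if same else sub_cost),
            )
    return dist[-1][-1]


def best_full_name(abbreviation, candidates, **costs):
    """Return the candidate full name with the minimum weighted edit distance."""
    return min(
        candidates,
        key=lambda name: weighted_edit_distance(abbreviation, name, **costs),
    )
```

Lowering `ins_cost` makes expanding an abbreviation into a longer full name cheap relative to substituting characters, which is one plausible reason to reweight the operations for this task.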

Multiple kernel learning in denoising space

WANG Peng-ming, ZHONG Mao-sheng, LIU Zun-xiong

J4. 2012, 47(5): 49-52.

A multiple kernel learning (MKL) technique called lp regularized multiple kernel Fisher discriminant analysis (lp MK-FDA) was reviewed, and the performance of MKL under fixed-norm and p-norm regularization was compared. Because noise exists in the original feature space, the effect of feature space denoising on MKL was investigated. Experiments on the VOC 2007 dataset show that, with either the original kernels or the denoised kernels, lp MK-FDA outperforms its fixed-norm counterparts, that feature space denoising boosts the performance of both single kernel FDA and lp MK-FDA, and that there is a positive correlation between the learnt kernel weights and the amount of variance kept by feature space denoising.

Music similarity research based on the Web tag

LIU Xuan1, XU Jie-ping1*, CHEN Jie2

J4. 2012, 47(5): 53-58.

A method was proposed for music classification using web tags obtained by web mining, in which user tags from Last.fm were used as features to study music similarity. The web tags extracted from Last.fm served as the semantic features of the music. Latent Semantic Analysis (LSA) was used for dimension reduction, and finally, according to the similarity between pieces of music, clustering results were obtained by an improved K-means. The experimental results show that the proposed method can get a better result in music classification.

EB-SVM: support vector machine based data pruning with information entropy

CAO Lin-lin1,2, ZHANG Hua-xiang1,2*, WANG Zhi-chao1,2

J4. 2012, 47(5): 59-62.

The generalization performance of SVM applied to classification problems will be reduced if data from different classes overlap seriously. A new approach, EB-SVM (entropy based support vector machine), is presented to prune data for the support vector machine based on the concept of information entropy. EB-SVM employs the information entropies of the training data to remove the patterns far from the boundaries and delete the noise and overlapped instances close to the boundaries, and then uses the pruned dataset to construct an SVM classifier. Experimental results show that EB-SVM takes less time than SVM and improves the classification accuracy.

Multi-label RBF neural networks learning algorithm based on clustering optimization

FENG Xin-ying1,2, JI Hua1,2, ZHANG Hua-xiang1,2

J4. 2012, 47(5): 63-67.

Multi-label learning that combines the RBF neural network with the K-means clustering algorithm has achieved good results. However, because the number of clusters cannot be well determined in advance, an accurate clustering cannot be obtained. This leads to lower-quality and unstable clustering, which in turn affects the stability and classification performance of the multi-label RBF neural network algorithm. To solve this problem, an index function for clustering validity, based on sample geometry, was employed to find the optimal number of clusters for each class. Theoretical analysis and experimental results show that the improved ML-IRBF algorithm can effectively improve both stability and classification capability.

Text segmentation of patent summary based on a classification algorithm

DING Chang-lin, CAI Dong-feng, WANG Pei-yan

J4. 2012, 47(5): 68-72.

Patent summaries are condensed representations of patents, and if patent summaries are divided according to their contents, the corresponding patents can be located more accurately. Because each patent summary is very short and there are no markers between two different contents, the traditional text segmentation methods cannot be used. In this paper, the problem of text segmentation of a patent summary was recast as sentence classification, and classification algorithms were used to solve it. The effects of different classification algorithms and different features were analyzed, and the results show that segmenting patent summaries by sentence classification is feasible.

Research of Twitter data collection

FANG Wei-wei1,2, LI Jing-yuan1, LIU Yue1, YU Zhi-hua1, CAO Peng1,2, ZHANG Kai1

J4. 2012, 47(5): 73-77.

In order to achieve real-time and efficient access to Twitter data, two different methods based on the Twitter List API and the Lookup API were presented after analyzing the shortcomings of traditional collection methods. By classifying users, the approach can precisely control the frequency of API calls. A series of experiments on over 260,000 users and over 6 million messages were carried out, and the results show that the combination of the two methods can be efficiently used to collect Twitter data in real time.

Measuring user influence of a microblog based on information diffusion

GUO Hao, LU Yu-liang, WANG Yu, ZHANG Liang

J4. 2012, 47(5): 78-83.

Information diffusion and influence modeling are hot topics in microblog research. To study influence quantitatively, a concept based on message diffusion was introduced, together with a way to compute it. The proposed approach was validated on real-world datasets, and the experimental results show that the method is both effective and stable, especially when the dataset and time span are limited.

Modeling of community structure based on P2P streaming systems

LIU Qi, GE Lian-sheng, QIN Feng-lin

J4. 2012, 47(5): 84-88.

The modeling of community structure based on P2P streaming systems was studied. Two types of community structure models, named the k-n model and the k-n-t model, were presented, and their small-world network characteristics were theoretically analyzed and compared. Numerical results show that the community structure of the k-n model has a higher clustering coefficient than that of the k-n-t model, while the community structure of the k-n-t model achieves a better tradeoff between clustering coefficient and average path length.
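
The two small-world measures compared in this abstract, clustering coefficient and average path length, have standard definitions that can be computed as in the sketch below. It does not construct the k-n or k-n-t models, which the abstract does not define. It assumes an undirected, connected graph with at least two nodes, given as a dictionary that maps each node to the set of its neighbours; the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean over all nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue  # a node with fewer than two neighbours contributes 0
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs of distinct reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

A small-world network in the usual sense combines a high value of the first measure with a low value of the second.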

Research of simply separable function sets structure in partial K-valued logic

GONG Zhi-wei1, LIU Ren-ren2*, WANG Ting2

J4. 2012, 47(5): 89-92.

According to the completeness theory in partial K-valued logic, the structure of simply separable function sets in partial K-valued logic was discussed. First, the number of direct divisions of m-ary relationships in partial K-valued logic was solved. Then, on the basis of the division, all of the simply separable function sets were given. Finally, the properties of preserving K-ary simply separable function sets were analyzed.

Research on the DC algorithm for an anomaly detection system based on TLR

GUO Chen1, LIANG Jia-rong2, LUO Chao3, PENG Shuo1

J4. 2012, 47(5): 93-97.

To improve the accuracy of traditional Dendritic Cell Algorithms in data abnormality detection tests, we propose the Dendritic Cell Algorithm used by an anomaly detection system based on Toll-Like receptors in combination with the working mechanism of innate immune Toll-Like receptors in the biological immune system. In this algorithm, mature and immature Dendritic Cells are first obtained by using the Dendritic Cell Algorithm, and then provided to the Toll-Like receptors as inputs, and the TC level is activated to judge whether the algorithm is abnormal.

The effects of habitat loss on the spatial PD game

ZHANG Feng-pan1, WANG Jian-bin2, DU Shu-de3

J4. 2012, 47(5): 98-102.

The Prisoner's Dilemma (PD) game is the main theoretical framework in which the maintenance of cooperation in biological populations is studied. Spatial structure serves as the key to this dilemma. A model of a spatial PD game under a metapopulation framework was built, and the effects of habitat loss on cooperation and population size were studied. The main result was that moderate habitat loss enhanced the fraction of cooperators in the population. Moreover, the population size may undergo a temporary period of prosperity just before extinction, even as habitat loss increases. These results imply that a multi-behavior strategy within a population may be a mechanism to defend against the influences of a changing environment.

Bifurcation solutions and stability of a predator-prey system with predator saturation and competition

FENG Xiao-zhou1,2, NIE Hua2

J4. 2012, 47(5): 103-107.

A predator-prey system with predator saturation and competition is investigated. The existence, uniqueness and stability of the bifurcation solution that bifurcates from a double-multiplicity eigenvalue are obtained by using the Lyapunov-Schmidt procedure.

A filter non-monotone trust region algorithm with a simple quadratic model

FENG Lin1,2, DUAN Fu-jian1, HE Wen-long1

J4. 2012, 47(5): 108-114.

A filter non-monotone trust region algorithm based on a simple quadratic model is proposed for unconstrained optimization problems. A filter technique is incorporated into the method, so that the trial point of the trust region sub-problem is accepted more often. If the trial step is also rejected by the filter set, a search direction is obtained by a fixed formula and a step size is obtained by the non-monotone Wolfe line search, and thus a new iterate is obtained. The algorithm does not re-solve the trust region sub-problem, so the amount of computation is reduced. The global convergence of the new method is established under fewer conditions. Preliminary numerical experiments show that the new method is effective.

Discrete approximation of the optimal dividend barrier in the dual risk model

XU Huai

J4. 2012, 47(5): 115-121.

The optimal dividend barrier of the dual risk model under a barrier dividend strategy is considered in this paper. First, the exact solution of the optimal dividend barrier is obtained by the Laplace transform. When analytic results are unavailable, the discrete time dual risk model can be used to provide approximations for the optimal dividend barrier. For illustration, the approximate values of the optimal dividends are compared in two numerical examples.

A grey bilevel linear multi-objective programming problem and its algorithm

LIU Bing-bing

J4. 2012, 47(5): 122-126.

Based on the bilevel linear multi-objective programming problem with multiple objectives at the lower level and the characteristics of the grey system, a grey bilevel linear multi-objective programming problem is put forward, and its model and theorem are given. Under the assumption that the constraint region of the proposed model is nonempty and compact, it is shown that the optimal solution of the drifting grey bilevel linear multi-objective programming problem can be reached at an extreme point of the constraint region. Finally, an algorithm based on the k-th best method is developed and its global convergence is proven. Numerical examples show that the proposed algorithm is effective.
