Publications

A. International Indexed Journals:

A25. Mohiuddin Solaimani, Mohammed Iftekhar, Latifur Khan, Bhavani Thuraisingham, Joe Ingram and Sadi Evren Seker, “Online Anomaly Detection For Multi-Source VMware Using A Distributed Streaming Framework”, Software: Practice And Experience, Wiley, V. 46, Is. 11, Pp. 1441-1588, 2016

Abstract- Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected patterns. Such non-conformant patterns typically correspond to samples of interest and are assigned to different labels in different domains, such as outliers, anomalies, exceptions, and malware. A daunting challenge is to detect anomalies in rapid voluminous streams of data.

This paper presents a novel, generic real-time distributed anomaly detection framework for multi-source stream data. As a case study, we investigate anomaly detection for a multi-source VMware-based cloud data center, which maintains a large number of virtual machines (VMs). This framework continuously monitors VMware performance stream data related to CPU statistics (e.g., load and usage). It collects data simultaneously from all of the VMs connected to the network and notifies the resource manager to reschedule its CPU resources dynamically when it identifies any abnormal behavior from its collected data. A semi-supervised clustering technique is used to build a model from benign training data only. During testing, if a data instance deviates significantly from the model, then it is flagged as an anomaly.

Effective anomaly detection in this case demands a distributed framework with high throughput and low latency. Distributed streaming frameworks like Apache Storm, Apache Spark, S4, and others are designed for a lower data processing time and a higher throughput than standard centralized frameworks. We have experimentally compared the average processing latency of a tuple during clustering and prediction in both Spark and Storm and demonstrated that Spark processes a tuple much quicker than Storm on average.
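The train-on-benign, flag-on-deviation idea described in the abstract above can be illustrated with a minimal single-machine sketch. It is not the framework from the paper: the one-pass "leader" clustering, the `radius` threshold, and the function names `build_model` and `is_anomaly` are all assumptions chosen for illustration, and the sample values are made up.

```python
import math

def build_model(benign_points, radius):
    """One-pass (leader) clustering over benign data only.

    Each centroid is the running mean of the points assigned to it.
    A point farther than `radius` from every centroid starts a new cluster.
    """
    centroids, counts = [], []
    for point in benign_points:
        best_index, best_distance = None, float("inf")
        for i, centroid in enumerate(centroids):
            distance = math.dist(point, centroid)
            if distance < best_distance:
                best_index, best_distance = i, distance
        if best_index is not None and best_distance <= radius:
            counts[best_index] += 1
            n = counts[best_index]
            # Incremental mean update of the matched centroid.
            centroids[best_index] = tuple(
                c + (p - c) / n for c, p in zip(centroids[best_index], point)
            )
        else:
            centroids.append(tuple(point))
            counts.append(1)
    return centroids

def is_anomaly(point, centroids, radius):
    """Flag a test point that lies farther than `radius` from every centroid."""
    return min(math.dist(point, c) for c in centroids) > radius

# Hypothetical (CPU usage %, CPU load) samples from benign operation.
benign = [(10, 0.20), (12, 0.25), (11, 0.22), (50, 0.60), (52, 0.62)]
model = build_model(benign, radius=5.0)
print(is_anomaly((11.5, 0.21), model, radius=5.0))  # close to a benign cluster
print(is_anomaly((95.0, 0.90), model, radius=5.0))  # far from every cluster
```

In a streaming deployment the same two steps would run per micro-batch or per tuple; the sketch only shows the model-building and scoring logic.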

A24. Al-Khateeb, Tahseen; Masud, Mohammad; Al-Naami, Khaled; SEKER, Sadi; Khan, Latifur; Aggarwal, Charu; Han, Jiawei; Trabelsi, Zouheir; Mustafa, Ahmad, “Recurring And Novel Class Detection Using Class-Based Ensemble For Evolving Data Stream”, IEEE Transactions On Knowledge And Data Engineering (TKDE), V. 28, Is. 10, Pp. 2752 – 2764, 2016

Abstract- Streaming data is one of the sources that receives the most attention in concept-evolution studies. When a new class occurs in the data stream, it can be considered a new concept, and hence an instance of concept-evolution. One attractive problem in concept-evolution studies, raised in our previous study, is that of recurring classes: in data streams, a class can disappear and reappear after a while. Existing data stream classification techniques either misclassify the recurring class or falsely identify recurring classes as novel classes, and this misclassification or false novel-class detection increases their error rates. In this paper we address the problem by defining a novel “class-based” ensemble technique, which replaces the traditional “chunk-based” approach in order to detect the recurring classes. We present two different class-based ensemble approaches and explain and compare them in detail. Unlike previous studies in the field, we also demonstrate the superiority of both “class-based” ensemble methods over state-of-the-art techniques empirically on a number of benchmark data sets, including Web comments as a text mining challenge.

A23. Khaled Mohammed Al-Naami, Sadi Evren Seker and Latifur Khan, “GISQAF: MapReduce Guided Spatial Query Processing And Analytics System”, Software: Practice And Experience, Wiley, V. 46, Is. 10, Pp. 1329 – 1349, 2016

Abstract – The Global Database of Events, Language, and Tone (GDELT) is the only global political georeferenced event dataset with more than 250 million observations covering all countries in the world since January 1, 1979. TABARI and CAMEO are the tools that are used to collect and code events from all international news coverage. To query such big geospatial data, traditional RDBMS can no longer be used, and parallel distributed solutions have become a necessity. The MapReduce paradigm has proven to be a scalable platform to process and analyze Big Data in the cloud. Hadoop, as an implementation of MapReduce, is an open-source application that has been widely used and accepted in academia and industry. However, when dealing with spatial data, Hadoop is not well equipped and does not perform efficiently. SpatialHadoop is an extension of Hadoop with support for spatial data. In this paper, we present the Geographic Information System Query and Analytics Framework (GISQAF), which has been built on top of SpatialHadoop. GISQAF focuses on two parts: query processing and data analytics. For the query processing part, we show how this solution outperforms Hadoop query processing by orders of magnitude when applying queries on the GDELT dataset with a size of 60 GB. We show the results for various types of queries. For the data analytics part, we present an approach for finding spatial co-occurring events. We show that GISQAF is suitable and efficient for handling data analytics techniques.

A22. Sadi Evren SEKER, “Computerized Argument Delphi Technique”, IEEE Access, 2015, v. 3, pp. 368 – 380

DOI: 10.1109/ACCESS.2015.2424703

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7089162

Abstract – The aim of this study is the computerization of the argument Delphi method. The Delphi method is mainly designed for qualitative prediction within a group of experts, where the experts make predictions and a facilitator controls these predictions until the experts end up with a level of consensus. Argument Delphi, as opposed to the classical Delphi model, is built on the contradictions of the ideas of the experts. Argument Delphi mainly focuses on a discussion topic and asks experts to create new arguments and criticize other arguments from other experts. After a certain level of contradiction, the method yields a number of contradictory, criticized arguments and builds a decision over these antitheses, as in the Hegelian approach. This is the first time the argument Delphi method has been modeled in a graph of arguments and the problem of qualitative decision has been transferred into a graph problem using the Delphi method. This paper is also the first in which argument aggregation and evaluation methods have been proposed. Moreover, the computerized version of argument Delphi is applied to real-world problems using crowd involvement through Facebook. The problem is defined as the prediction of petroleum prices for the end of the year, and more than 100 contributors from all around the world argued with and criticized each other. This paper also discusses the findings of this case study.

A21. Sadi Evren SEKER, Atik Kulaklı, “Macroeconomic ICT Facts and Mobile Telecom Operators via Social Networks and Web Pages”, Journal of Economics Business and Management, JOEBM 2016 Vol.4(2): 99-103 ISSN: 2301-3567 DOI: 10.7763/JOEBM.2016.V4.374
[PDF]
http://www.joebm.com/index.php?m=content&c=index&a=show&catid=56&id=687
Abstract— This study has three major outcomes. The first is the comparison of countries in the Balkan region by characteristic differences between mobile phone users, market structures and profitability of mobile operators. The second is the normalization of mobile operator actions and the comparison of operators from different countries with respect to their normalized success. The third is the first-time collection of the web and social network activeness of companies and the building of an internet activeness model from indicators such as the number of Facebook shares, the number of tweets mentioning the operator, the number of followers on LinkedIn, the number of unique daily visitors to the operator's web page, the number of backlinks from Google, Yahoo or Bing, the Google page rank and so on. We collected and analyzed all these data to build our model of internet activeness for 43 different operators in 13 countries. We believe the analyses are useful for all businesses in these markets that are related to internet connection, mobile phone users, telecom operators or e-business. A cross-country comparative study can also be useful for further market analysis and political and/or macroeconomic studies.

Index Terms— Business intelligence, ICT, social network analysis, mobile phone operators, GSM penetration rate.

A20. Sadi Evren SEKER, Bojan GEORGOEVSKI, (2014) “Financial Crisis and the ICT Industry, Cross Market Research on Europe, US, Turkish and Gulf Countries”, Research Journal of Finance and Accounting, v. 5, is. 24, pp. 78 – 87, 2014

[PDF]
http://www.iiste.org/Journals/index.php/RJFA/article/view/18495
Abstract- The ICT industry was the driver of economic growth for several years before the financial crisis. The purpose of the paper is to look at the effects of the last financial crisis on the ICT industry and how this macroeconomic shock was transmitted to households. Since the cost of capital became much higher, this weakened the long-term growth of companies, especially those in need of financing. It also had a major effect on consumers, who froze spending, which eventually decreased revenues for companies. Moreover, we aim to provide new empirical evidence on the impact of the world financial crisis on the ICT industry and the consumer price index (CPI) in different areas of the world. Using different simulations and analyses, we investigate the effect of the crisis on ICT industries and households. We propose a statistical model based on Pearson’s r and linear regression. We use GDP values from four different markets, namely the United States, Europe, Turkey and the Gulf countries, and we apply our statistical model with normalization. We use gross domestic product for tracking the financial crisis, the consumer price index for households, and the percentage of ICT exports and imports in the total exports and imports of the market as indicators. The proposed statistical model has a success rate of between 50% and 77%, depending on the variables and the markets. We also demonstrate the correlation between ICT and GDP in the four regions and show that there is a 1-year delay between the movement of the ICT graph and the GDP graph. From this information, it is possible to make predictions about financial crises via ICT.
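As an illustration of how a delay between two yearly series can be examined with Pearson's r, the following minimal sketch correlates one series with a shifted copy of another. The function names `pearson_r` and `lagged_r` and the toy numbers are assumptions for illustration only; they are not the data or the exact model of the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def lagged_r(leading, following, lag):
    """Correlate leading[t] with following[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson_r(leading, following)
    return pearson_r(leading[:-lag], following[lag:])

# Made-up yearly values in which the second series repeats the first one year later.
series_a = [1, 3, 2, 5, 4, 6]
series_b = [0, 1, 3, 2, 5, 4]
print(lagged_r(series_a, series_b, 0))  # same-year correlation
print(lagged_r(series_a, series_b, 1))  # correlation with a 1-year delay
```

Comparing the coefficient across several lag values is one simple way to locate the delay at which two indicators move together most closely.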

A19. Cihan Mert, Sadi Evren SEKER, Giorgi Jamburia, Mehmet Hüseyin Temel, (2014) “Technological Impact Of Employment Web Sites In Caucasus Region”, European Journal of Business Research, vol. 14, is. 3, pp. 81-86, 2014


Abstract- This research examines the web statistics of employment web sites and their technological impact on macroeconomics. The statistical information is gathered from webometrics of job-seeking web sites in the Caucasus region, such as the number of visitors, Facebook likes or shares, Twitter messages about the web site, and the number of backlinks counted by Google, Bing or Alexa. On the other hand, macroeconomic and demographic facts such as population, unemployment rate, median age and migration rate are also considered.

A18. Sadi Evren Seker, Bilal Cankir, and Mehmet Lutfi Arslan, “Information and Communication Technology Reputation for XU030 Quote Companies,” International Journal of Innovation, Management and Technology vol. 5, is. 3, pp. 221-225, June, 2014. DOI:10.7763/IJIMT.2014.V5.517

[PDF]
http://www.ijimt.org/index.php?m=content&c=index&a=show&catid=55&id=815
Abstract—With the increasing spread of information technology and Internet improvements, most large-scale companies are paying special attention to their reputation across many types of information and communication technology. The ongoing development of new technologies and their penetration into daily life bring about a paradigm shift in the perception of reputation and create new concepts such as e-societies, techno-culture and new media. Contemporary companies are trying to control their reputation among the new communities, which mostly interact through social networks, web pages and electronic communication technologies. In this study, the reputation of the top 30 Turkish companies quoted on the Istanbul Stock Market is studied, based on the information technology interfaces between company and society, such as social networks, blogs, wikis and web pages. The web reputation is gathered through 17 different parameters, collected from Google, Facebook, Twitter, Bing, Alexa, etc. The reputation index is calculated by z-index and f-scoring formulations after the min-max normalization of each web reputation parameter.
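The normalize-then-score pipeline mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: "z-index" is read here as the standard score, the composite is taken as a plain average of the normalized parameters, the f-scoring step is omitted, and the names `min_max`, `z_scores`, `reputation_index` and the sample figures are hypothetical.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def z_scores(values):
    """Standard scores using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def reputation_index(parameters):
    """parameters maps a parameter name to raw values, one per company.

    Assumption: the composite is the unweighted mean of the min-max
    normalized parameters, then expressed as a standard score.
    """
    normalized = [min_max(column) for column in parameters.values()]
    composite = [mean(row) for row in zip(*normalized)]
    return z_scores(composite)

# Made-up raw web parameters for three companies.
raw = {"followers": [100, 300, 200], "backlinks": [10, 50, 30]}
print(reputation_index(raw))
```

Min-max normalization puts parameters with very different scales, such as follower counts and backlink counts, on a common footing before they are combined.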

A17. Kızıl, C., Arslan, M. L., Şeker, Ş. E., “An Accounting Viewpoint for the Relationship Between Intellectual Capital and Web Trends of BIST 30 Firms in Turkey”, Maliye Finans Yazıları Dergisi, Year: 28, No. 101, April 2014, pp. 53-81.

[PDF]
http://www.finanskulup.org.tr/assets/maliyefinans/101/mfy-101_ckizil_larslan_seseker_an_accounting_viewploint.pdf
Abstract—This study focuses on the correlation between intellectual capital and web trends of the BIST-30 index, which holds the top 30 companies in Istanbul Stock Market (BIST). Main trends of web sites and companies are collected separately via web tools. Also, intellectual capital is studied and measured based on two methods, which are Market Value / Book Value and Value Added Intellectual Coefficient (VAIC) techniques. Data required for studying, measuring and accounting intellectual capital is gathered from web sites, firm annual reports, company financial statements and Public Disclosure Platform published by the BIST administration.

A16. Mehmet Emin Okur, Zuhal Akkas Dilbaz, Sadi Evren SEKER, “Human Resource Management Problems for Insurance Company Mergers: A Case Study”, Journal of Advanced Management Science (JOAMS), 12/2014, Vol. 2, Is. 4, pp. 316-320, DOI:10.12720/joams.2.4.316-320

[PDF]
http://www.joams.com/index.php?m=content&c=index&a=show&catid=39&id=162
Abstract—In this study, we examine the merger of two leading insurance companies operating in Turkey, Aviva and Ak. The merger is studied from the human resource management (HRM) perspective. The HRM perspective on the merger is based on interviews with C-level executives, and many HRM-related outputs, such as gender, age and education level statistics, are collected from the company merger. The case studied in this research is the biggest merger in the Turkish insurance sector to date, and the outputs, published for the first time in this paper, provide a useful baseline for further studies. In this paper, the quantitative outcomes of the merger, such as the number of employees, education levels, and the gender and age distribution, are published together with qualitative outputs such as strategic achievements, organizational commitment and the leadership model. Another important section presents the steps followed by the HRM department after the merger.

A15. M. Lutfi Arslan, Sadi Evren SEKER, Cevdet Kizil, “Innovation Driven Emerging Technology from two Contrary Perspectives: A Case Study of Internet”, Emerging Markets Journal 03/2014; 3(3):87-07. DOI:10.5195/emaj.2014.54

[PDF]
http://emaj.pitt.edu/ojs/index.php/emaj/article/view/54
Abstract: The Internet is a well-organized technological achievement of humankind and a medium that has improved rapidly over time. Novel technological achievements such as web 2.0 or web 3.0 are new epochs of Internet technology, and the Internet is spreading in multiple dimensions, reforming the paradigm, and innovating the technology in a self-renewing fashion. In this paper, the technological construction of the Internet and the social paradigms are discussed from two contrary perspectives. Whether as “problem solvers” or “technical experts”, the characteristics of incumbents of technological positions seem very problematic in terms of their roles in shaping technology. Are they so disinterested and unbiased in the creation of technology? Can we reduce their roles as such? How can we make sure that they are neutral? If we put their roles that way, what about freedom of individual decision-making?

A14. Sadi Evren Seker, Bilal Cankir, and Mehmet Emin Okur, “Strategic Competition of Internet Interfaces for XU30 Quoted Companies,” International Journal of Computer and Communication Engineering vol. 3, no. 6, pp. 464-468, 2014.

[PDF]
http://www.ijcce.org/index.php?m=content&c=index&a=show&catid=44&id=438
Abstract: The Internet is causing paradigm shifts in almost every aspect of life. One major paradigm shift is occurring in the field of strategic competition. The new strategic competition is studied based on Porter’s value chain analysis, and the Internet can be considered a technological improvement in information and communication technology that also serves as an interface between companies and their environment. The company/environment interface is directly related to strategic competition and Porter’s five forces. Under the new paradigm of information and communication, companies should pay great attention to the reputation they build through Internet-based interfaces. For example, a company with millions of shares on social media has an obvious advantage over a company in the same sector without any web page. In this study, the companies are evaluated by their Internet interfaces, which are social media interfaces such as Facebook or Twitter, company web pages and blogs measuring the hate-marks and love-marks of the companies, and Web 2.0 sources such as wikis. After collecting statistical information about these Internet interfaces for each company, the companies are indexed based on their Internet interface utilization. Furthermore, a new model of competition based on Porter’s value chain analysis is built and applied to the Internet interfaces.

A13. Kızıl, C., Şeker, Ş. E., Avarkan, T. (2014), “Türk İşletme Dünyasından Muhasebe Barter Uygulamaları ve Örnekleri” [Accounting Barter Practices and Examples from the Turkish Business World], Yalova Sosyal Bilimler Dergisi, Vol. 4, No. 7, pp. 67-78. http://yusbed.net/issue/view/5000001892 (24 March 2014).


A12. S. E. SEKER, Y. Unal, Z. Erdem, H. Erdinc Kocer (2014), “Ensembled Correlation between Liver Analysis Outputs”,  International Journal of Biology and Biomedical Engineering, ISSN: 1998-4510, Volume 8, pp. 1-5, 2014

[PDF]
http://www.naun.org/main/NAUN/bio/2014/a022010-099.pdf
Abstract: Data mining techniques for biological analysis are spreading to most areas, including health care and medical information. We have applied data mining techniques such as KNN, SVM, MLP and decision trees to a unique dataset collected from 16,380 analysis results over a year. Furthermore, we have used meta-classifiers to examine the increased correlation rate between liver disorder and the liver analysis outputs. The results show that there is a correlation among ALT, AST, Bilirubin Direct and Bilirubin Total with an error rate as low as 15%. Also, the correlation coefficient is up to 94%. This makes it possible to predict the analysis results from each other, or to apply disease patterns over the linear correlation of the parameters.

A11. Mehmet Lutfi ARSLAN, Sadi Evren SEKER (2014), “Web Based Reputation Index of Turkish Universities”, International Journal of E-Education E-Business E-Management and E-Learning (IJEEEE), 2014, Issn : 2010-3654, vol.4, is.3, pp.197-203 , DOI : 10.7763/IJEEEE.2014.V4.330

[PDF]

Available from : http://www.ijeeee.org/index.php?m=content&c=index&a=show&catid=44&id=670

Abstract : This paper attempts to develop an online reputation index of Turkish universities through their online impact and effectiveness. Using 16 different web-based parameters and normalizing the results, we have ranked the websites of Turkish universities in terms of their web presence. This index is the first attempt to determine the tools of reputation of Turkish academic websites and can serve as a basis for further studies examining the relation between reputation and the online effectiveness of universities.

A10. Sadi Evren SEKER, Oguz ALTUN, Ugur AYAN, Cihan MERT (2014), “A Novel String Distance Function based on Most Frequent K Characters”, International Journal of Machine Learning and Computation (IJMLC),2014, Issn : 2010-3700, vol.4, is.2, pp.177-183

Abstract : This study aims to publish a novel similarity metric to increase the speed of comparison operations. The new metric is also suitable for distance-based operations among strings.
Most simple calculation methods, such as string length, are fast to calculate but do not represent the string correctly. On the other hand, methods such as keeping a histogram over all characters in the string are slower but represent the string characteristics well in some areas, such as natural language.
We propose a new metric that is easy to calculate and satisfactory for string comparison.
The method is built on a hash function, which takes a string of any size and outputs the most frequent K characters with their frequencies.
The outputs can be compared directly, and our studies showed that the success rate is quite satisfactory for text mining operations.
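A minimal sketch of the hashing step described above follows: it reduces a string to its K most frequent characters with their counts. The comparison function is an assumption made for illustration (it sums the counts of characters shared by both hashes) and is not claimed to be the exact formula of the paper; the names `most_frequent_k` and `shared_frequency_similarity` are hypothetical.

```python
from collections import Counter

def most_frequent_k(text, k):
    """Return the k most frequent characters of text with their counts.

    Counter.most_common orders equal counts by first occurrence,
    so the output is deterministic for a given input string.
    """
    return Counter(text).most_common(k)

def shared_frequency_similarity(a, b, k):
    """Assumed comparison: sum the counts of characters present in both hashes."""
    hash_a = dict(most_frequent_k(a, k))
    hash_b = dict(most_frequent_k(b, k))
    return sum(hash_a[ch] + hash_b[ch] for ch in hash_a.keys() & hash_b.keys())

print(most_frequent_k("research", 2))                          # [('r', 2), ('e', 2)]
print(shared_frequency_similarity("research", "seeking", 2))   # 4
```

Because each string is reduced to at most K (character, count) pairs, comparing two hashes costs a bounded amount of work regardless of the original string lengths.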

A9. Sadi Evren SEKER, Cihan MERT (2013), “A Novel Feature Hashing For Text Mining”, International Black Sea University Journal of Technical Science & Technologies,2013, Issn : 2298-0032, vol.2, pp.37-40

Abstract : Because of the increasing number of studies on big data that use text as the data source, feature hashing now plays a major role in the literature. Text mining on big data usually requires a layer of feature hashing, which reduces the size of the feature vector. For example, getting the word count yields hundreds of thousands of features in most cases, whereas POS tagging would reduce this number to about 50 features. With feature hashing, the size of the feature vector is reduced considerably, and data mining processes such as classification, clustering or association can run faster. In some cases, executing some algorithms is impossible with current hardware, and parallel or distributed programming comes into play.
Feature hashing approaches can usually be categorized into two groups. The first group deals with natural language processing (NLP) algorithms and tries to extract relatively smarter hash results, which represent the input characteristics as fully as possible. The second group consists of mathematical hashing algorithms, which do not deal with the context or meaning of the text input and just process the input into some binary output. For example, POS tagging approaches can carry some features of the input over to the output; on the other hand, hashing algorithms like MD5 or SHA-1 preserve no features of the input and are only concerned with minimizing collisions in the output.
This study focuses on the second group of hashing algorithms and critiques the hashing algorithms using Feistel networks, which are widely utilized in text mining studies. We propose a new approach, mainly built on substitution boxes (s-boxes), which are at the core of all Feistel networks, and which processes text faster than the other implementations.

A8. Cevdet Kizil, Mehmet Lutfi ARSLAN, Sadi Evren SEKER (2013), Correlation between Intellectual Capital and Web Trends of Top 30 Companies in Turkey,  INTERNATIONAL JOURNAL OF SOCIAL SCIENCES AND HUMANITY STUDIES Vol. 5, No 2, 2013 ISSN: 1309-8063, pp. 39 – 49

[PDF]

Available from: http://www.sobiad.org/eJOURNALS/journal_IJSS/2013_2.htm
Abstract : This study focuses on the correlation between intellectual capital and web trends of the index BIST-30, which holds the top 30 companies in Istanbul Stock Market (BIST). The trends of web sites and companies are collected separately via web tools. Also, intellectual capital is studied and measured based on two methods, which are Market Value / Book Value and Value Added Intellectual Coefficient (VAIC) techniques. Data required for studying and measuring intellectual capital is gathered from web sites, firm annual reports, company financial statements and the Public Disclosure Platform published by the BIST administration.

A7. Mehmet Lutfi ARSLAN, Sadi Evren SEKER (2013), The Impact of Employment Web Sites’ Traffic on Unemployment: A Cross Country Comparison, INTERNATIONAL JOURNAL OF SOCIAL SCIENCES AND HUMANITY STUDIES Vol. 5, No 2, 2013 ISSN: 1309-8063, pp. 130 – 138

[PDF]

Available from: http://www.sobiad.org/eJOURNALS/journal_IJSS/2013_2.htm
Abstract : Although employment web sites have recently become the main source for the recruitment and selection process, the relation between those sites and unemployment rates is seldom addressed. Deriving data from … countries and … web sites, this study explores the correlation between unemployment rates of European countries and the attractiveness of country specific employment web sites. It also compares the changes in unemployment rates and traffic on all the aforementioned web sites. The results showed that there is a strong correlation between web sites traffic and unemployment rates.

A6. Sadi Evren SEKER, Cihan Mert, Khaled Al-Naami, Nuri OZALP, Ugur AYAN (2013), Time Series Analysis on Stock Market for Text Mining Correlation of Economy News, INTERNATIONAL JOURNAL OF SOCIAL SCIENCES AND HUMANITY STUDIES Vol. 6, No 1, 2014 ISSN: 1309-8063 , pp. 69 – 91

[PDF]

Available from: http://www.sobiad.org/eJOURNALS/journal_IJSS/2014_1.htm
Abstract : This paper proposes an information retrieval method for economy news. The effects of economy news are researched at the word level, and stock market values are considered as the ground truth. The correlation between stock market prices and economy news is an already addressed problem for most countries. The most well-known approach is to apply text mining to the news and time series analysis techniques to stock market closing values, and then to apply classification or clustering algorithms over the extracted features. This study goes further and asks which time series analysis techniques are available for stock market closing values and which one is the most suitable. In this study, the news and their dates are collected into a database and text mining is applied to the news; the text mining part has been kept simple, using only the term frequency – inverse document frequency method. For the time series analysis part, we have studied 10 different methods such as random walk, moving average, acceleration, Bollinger band, price rate of change, periodic average, difference, momentum and relative strength index, and their variations. We have also explained these techniques in a comparative way and applied the methods to Turkish Stock Market closing values over a period of more than 2 years. In addition, we have applied the term frequency – inverse document frequency method to the economy news of one of the high-circulation newspapers in Turkey.
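Three of the time series features named above (moving average, momentum and price rate of change) can be sketched in their common textbook forms. The window and period values and the sample closing prices below are made up, and the exact variants used in the paper may differ.

```python
def moving_average(prices, window):
    """Simple moving average over each full window of closing values."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]

def momentum(prices, period):
    """Difference between each closing value and the one `period` steps earlier."""
    return [prices[i] - prices[i - period] for i in range(period, len(prices))]

def rate_of_change(prices, period):
    """Percentage change relative to the closing value `period` steps earlier."""
    return [
        100.0 * (prices[i] - prices[i - period]) / prices[i - period]
        for i in range(period, len(prices))
    ]

closing = [10, 12, 11, 13, 15]       # hypothetical closing values
print(moving_average(closing, 3))    # [11.0, 12.0, 13.0]
print(momentum(closing, 2))          # [1, 1, 4]
print(rate_of_change(closing, 1))
```

Each function returns one value per day for which enough history exists, so the outputs can be aligned by date with word-level features extracted from the news.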

A5. Sadi Evren SEKER, Cihan Mert, Khaled Al-Naami, Nuri Ozalp, Ugur Ayan (2013), Correlation between the Economy News and Stock Market in Turkey, International Journal of Business Intelligence Research (IJBIR), vol. 4, is. 4, pp. 1-21, 2013

[PDF]
DOI:10.4018/ijbir.2013100101
Available from : http://www.igi-global.com/article/correlation-between-the-economy-news-and-stock-market-in-turkey/104735
Abstract : Depending on the market strength and structure, it is a known fact that there is a correlation between the stock market values and the content in newspapers. The correlation increases in weak and speculative markets, while it never falls to zero even in the strongest markets. This research focuses on the correlation between the economic news published in a high-circulation newspaper in Turkey and the stock market closing values in Turkey. In the research, several feature extraction methodologies are implemented on both data sources, which are the stock market values and economic news. Since the economic news is in natural language format, the text mining technique term frequency – inverse document frequency is implemented. On the other hand, time series analysis methods like random walk, Bollinger band, moving average or difference are applied over the stock market values. After the feature extraction step, the classification methods are built on the well-known classifiers support vector machine, k-nearest neighbors and decision tree. Moreover, an ensemble classifier based on majority voting is implemented on top of these classifiers. The success rates show that the results are satisfactory enough to claim that the methods implemented in this study can be extended to future research with similar data sets from other countries.
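The majority-voting step described above can be sketched independently of any particular learning library. In the sketch below the three base classifiers are hypothetical threshold rules standing in for trained models such as a support vector machine, a k-nearest neighbors classifier and a decision tree; all names and thresholds are assumptions for illustration.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]

def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([classify(sample) for classify in classifiers])

# Hypothetical stand-ins for three trained base classifiers.
def rule_a(x):
    return "up" if x > 0.0 else "down"

def rule_b(x):
    return "up" if x > 0.5 else "down"

def rule_c(x):
    return "up" if x > -0.5 else "down"

base_classifiers = [rule_a, rule_b, rule_c]
print(ensemble_predict(base_classifiers, 0.3))   # two of three vote "up"
print(ensemble_predict(base_classifiers, -0.2))  # two of three vote "down"
```

Using an odd number of base classifiers on a two-class problem avoids ties in the vote.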

A4. Cihan Mert, Sadi Evren SEKER (2014), RSA Şifreleme Sistemine Karşı Yeni Bir Çarpanlara Ayırma Saldırısı [A New Factorization Attack Against the RSA Encryption System], Erzincan Üniversitesi Fen Bilimleri Enstitüsü Dergisi, v. 7, is. 1, pp. 105-131
[PDF]
Available from: http://dergipark.ulakbim.gov.tr/erzifbed/article/view/5000034055

A3. Sadi Evren SEKER (2015), Temporal Logic Extension for Self Referring, Non-existence, Multiple Recurrence and Anterior Past, Turkish Journal of Electrical Engineering and Computer Sciences, v. 23, pp. 212 – 230, DOI:10.3906/elk-1208-93
(Accepted and online since April 03, 2013) Available from: http://journals.tubitak.gov.tr/elektrik/issues/elk-15-23-1/elk-23-1-16-1208-93.pdf
Abstract : This study focuses on possible extensions of current temporal logics. Four extensions are proposed in this study: self-referring events, non-existing events, multiple recurrence of events, and an improvement on anterior past events. Each of these extensions is on a different level of the temporal logics. The main motivation behind the extensions is the temporal analysis of the Turkish natural language. Similar to the temporal logic studies built on other natural languages, like French, Ukrainian, Italian, Korean, English or Romanian, this is the first time the Turkish language has been deeply examined in the sense of computable temporal logic from the viewpoint of a standardized temporal markup language. This study keeps the methodology of TimeML and investigates the Turkish natural language from the perspectives of Reichenbach’s and Allen’s temporal logics. The Reichenbach temporal logic is perfectly capable of handling the anterior temporal feeling, but it is not enough to handle the senses of “learnt” or “study”, which are two past tenses in Turkish. Also, Allen’s temporal logic cannot handle two events following each other continuously, a case named recurring events for the first time in this study. Finally, drawing on the experience of a 4-year PhD study on natural language texts, this study underlines the absence of self-referring events and of references to non-existing events in temporal logics. After adding the above extensions to a computable temporal logic, the capability of tagging events in Turkish texts was measured to increase from 18% to 100% on a newly created Turkish corpus. Also, new software has been implemented to visualize the tagged events, and previous software has been extended to handle events tagged for Turkish.

A2. I. Ocak, S. E. SEKER (2013), Calculation of surface settlements caused by EPBM tunneling using artificial neural network, SVM, and Gaussian processes, Environmental Earth Sciences, Springer-Verlag, Vol. 70, Is. 3, pp. 1263-1276,  DOI: 10.1007/s12665-012-2214-x, Oct. 2013

Available from: http://link.springer.com/article/10.1007%2Fs12665-012-2214-x
Abstract : Increasing demand on infrastructures increases attention to shallow soft ground tunneling methods in urbanized areas. Especially in metro tunnel excavations, due to their large diameters, it is important to control the surface settlements observed before and after excavation, which may cause damage to surface structures. In order to solve this problem, earth pressure balance machines (EPBM) and slurry balance machines have been widely used throughout the world. There are numerous empirical, analytical, and numerical analysis methods that can be used to predict surface settlements. But substantially fewer approaches have been developed for artificial neural network-based prediction methods especially in EPBM tunneling. In this study, 18 different parameters have been collected by municipal authorities from field studies pertaining to EPBM operation factors, tunnel geometric properties, and ground properties. The data source has a preprocess phase for the selection of the most effective parameters for surface settlement prediction. This paper focuses on surface settlement prediction using three different methods: artificial neural network (ANN), support vector machines (SVM), and Gaussian processes (GP). The success of the study has decreased the error rate to 13, 12.8, and 9, respectively, which is relatively better than contemporary research.

A1. I. Ocak, S. E. SEKER (2012), Estimation of Elastic Modulus of Intact Rocks by Artificial Neural Network, Rock Mechanics and Rock Engineering, Springer, Vol. 45, Is. 6, pp. 1047 – 1054, DOI: 10.1007/s00603-012-0236-z, Nov. 2012

Available from: http://link.springer.com/article/10.1007%2Fs00603-012-0236-z
Abstract : The modulus of elasticity of intact rock (E_i) is an important rock property that is used as an input parameter in the design stage of engineering projects such as dams, slopes, foundations, tunnel constructions and mining excavations. However, it is sometimes difficult to determine the modulus of elasticity in laboratory tests because high-quality cores are required. For this reason, various methods for predicting E_i have been popular research topics in recently published literature. In this study, the relationships between the uniaxial compressive strength, unit weight (γ) and E_i for different types of rocks were analyzed, employing an artificial neural network and 195 data points obtained from laboratory tests carried out on cores obtained from drilling holes within the area of three metro lines in Istanbul, Turkey. Software was developed in the Java language using Weka class libraries for the study. To determine the prediction capacity of the proposed technique, the root-mean-square error and the root relative squared error indices were calculated as 0.191 and 92.587, respectively. Both coefficients indicate that the prediction capacity of the study is high for practical use.

B. International Conference Papers (peer-reviewed, Printed in Proceedings):

B18. Sadi Evren SEKER, Enes ERYARSOY (2015) Generating Digital Reputation Index: A Case Study, Elsevier, Procedia – Social and Behavioral Sciences, Volume 195, 3 July 2015, Pages 1074–1080
Paper also presented in : “World Conference on Technology, Innovation and Entrepreneurship 2015”

Abstract : The digital world is transforming wares into software, business into e-business, and brick-and-mortar companies into online companies, and this transformation continues at an exponentially increasing pace. One crucial asset of any company, interrelated with all of its subdivisions and operations, is its reputation. This paper mainly discusses the quantitative side of reputation from the social capital perspective and its transformation by the increasing and irrepressible power of technology. We attempt to answer the question “how to represent the digital reputation of companies in a digital world?” under the transformation from web 1.0 to web 3.0. As a solution, we propose a quantitative methodology for aggregating digital quantities collected from social network sites, company web pages, blogs and wikis, together with a formulation and indexing method built on different dimensions of the digital world, so that companies can be ranked accordingly. As a case study, we focus on the stock market companies in Turkey and their digital reputation and output a digital reputation index, but more importantly we discuss the methodology of creating a digital index for the reputation of companies. Finally, we conduct diagnostics on the output index to assess its degree of validity. We believe the research will guide future studies on digital reputation indices, since it is one of the first index creation methodologies in digital reputation studies and its data source has great variety, volume and velocity, as in big data.
http://www.sciencedirect.com/science/article/pii/S1877042815036307

B1. Sadi Evren SEKER, Ender Ozcan, Z. Ilknur Karadeniz, (2004) GENERATING JAVA CLASS SKELETON USING A NATURAL LANGUAGE INTERFACE, ICEIS, NLUCS
Abstract : An intelligent natural language interface based on the Turkish language is designed for creating a Java class skeleton and listing the class and its members. This interface is developed as part of a project named TUJA, a tool for producing Java programs using Turkish sentences. Turkish sentences are converted into instances of schemata. There are three types of schemata: class definition schema, member method schema and member attribute schema. Concept hierarchies are utilized for building the classes and their hierarchical representation for Java class skeleton generation. In this paper, the details of the design and the implementation are described.

B2. Şadi Evren ŞEKER, Banu DİRİ , (2010) “Event Ordering for Turkish Natural Language Texts” , CSW 2010
Abstract : Apart from the natural languages in the Latin family, Turkish has its own temporal logic to model the order of events. TimeML is one of the challenging markup languages for temporal expressions and event orders. In this study, we have researched the philosophy behind TimeML, namely Reichenbach’s and Allen’s temporal logics, and we have adapted Turkish temporal logic to these philosophical approaches. We have also applied this improvement to TimeML applications and tested our success on a corpus created during this study, since there is no previous work in the temporal logic field for Turkish. The test results showed that the TimeML modeling capability increased from 53% to 91%.

B3. Şadi Evren ŞEKER, Banu DİRİ (2010) , “TimeML and Turkish Temporal Logic” , Proceedings of International Conference on Artificial Intelligence, IC-AI 2010, volume 10, pp. 881-887
Abstract : Turkish is one of the widely used and relatively difficult natural languages for machine processing. One of the challenges in Turkish is the temporal logic and processing the time of events. For the Latin family of natural languages, there are quite successful solutions like TimeML, which is built on the Reichenbach tense analysis and Allen’s temporal logic. Unfortunately, there has been no previous work on the Turkish language until now. This paper covers the basic temporal models of Reichenbach and Allen and then proceeds to the improvement of these models to cover the temporal logic behind the Turkish natural language. In order to test the success of this study, we have also created a corpus from children's stories and tested the success of the new implementation.

B4. Şadi Evren ŞEKER, “A Novel Temporal Visualization Framework for Relational Event Representation”, MSV12, Modelling and Simulation, 2012, CSREA Press, USA, ISBN: 1-60132-226-7, Pages 258-264
Abstract : Temporal logics are widely used in many study types, such as question answering, ontology, natural language processing, search engines, text summarization, or even visual tools like Gantt charts or UML diagrams.
Computable temporal languages are the logical systems, built on temporal logic, that can be computed to find a result. They can also be defined as computer-computable languages, built on temporal logics. All timeline drawing or planning software uses temporal logic in order to visualize or process cases.
Also, semantic web studies are one of the implementation areas where temporal modeling and reasoning are heavily needed. Relations between events, event types and the subjects of events can be modeled by using temporal logic.
This study introduces the temporal logics and the computable temporal languages in the current literature. For the first time, some temporal logic problems are pointed out and solved in this study.
Also, a novel temporal framework, which covers the solutions of these temporal logic problems, is implemented in Java and published on the web.

B5. Şadi Evren ŞEKER, “Web Spider Performance and Data Structure Analysis”, SWWS12, Semantic Web and Web Services, 2012, ISBN: 1-60132-232-1, Pages: 73-77
Abstract : The aim of this study is the performance evaluation of a web spider, which almost all search engines utilize during web crawling. A data structure is required to keep a record of the pages visited and the keywords extracted from the web site during web crawling. The paper first goes into the detail of possible data structures for a web spider and critiques all possibilities according to their time and memory efficiencies. The possibilities are then narrowed to tree variations only, and a tree is selected from each tree data structure family. Finally, a search engine is implemented together with the selected tree from each family, and the performance of each alternative is benchmarked.

B6. Şadi Evren ŞEKER, “Turkish Query Engine on Library Ontology”, IKE12, Internet Knowledge Engineering, 2012, ISBN:1-60132-222-4, Pages:26-33
Abstract : The purpose of this project is to implement conversational software that interfaces dialog-based sentences between the user and a library database.
This software is implemented with a special expertise in library dialogs. The number of possible library dialog sentences is limited, and this project covers almost all of these possible sentences. The input sentences are accepted in Turkish, and a flexible management system is added for further additions. For example, any sentence missed in this project can be added with a simple entry in the YACC file.
The technology utilized in this project is a YACC and LEX implementation on LINUX. The database of the project is implemented over MySQL. LEX and YACC produce C source code, and the semantic processing and the database queries are also implemented in the C language.
One of the hardest parts of this project is implementing Turkish language capability in a C programming environment on LINUX. All the technological modules of this project, which are MySQL, C, LEX, YACC and ZEMBEREK, created different problems with Turkish inputs. While working on these problems, I searched the Internet for Turkish input implementations of LEX and YACC and for MySQL connections through C, and according to my findings this project is the first implementation of Turkish character sets with LEX, YACC and MySQL at the same time on LINUX.
One of the most important achievements of this study is the flexibility of the input sentences. Anybody can add a new grammar rule to the YACC file by obeying the rule structure of YACC. After a successful addition, the project will search for this new rule in the input sentences and produce the answers related to this input.

B7. Şadi Evren ŞEKER, “Performance Evaluation of a Regular Expression Crawler and Indexer”, ICOMP12, Internet Computing, 2012, CSREA Press, USA, ISBN: 1-60132-220-8, Pages: 33-39
Abstract : This study aims to find a solution for the optimization of indexer and crawler modules of a search engine if the possible varieties of the search phrases are previously known as a regular expression. A search engine can be considered as an expert in any area if the search domain is narrowed and the crawling and indexing modules are optimized according to this domain. A general expertise of the search engines can be modeled with regular expressions like searching only emails or telephone numbers on the Internet. This paper mainly discusses several alternatives on an expert search engine and evaluates the performance of several varieties.

B8. Sadi Evren SEKER, Cihan Mert “Reverse Factorization and Comparison of Factorization Algorithms in Attack to RSA”, Proceedings of the International Conference on Scientific Computing, ISBN:1-60132-238-0, pp. 101-109, 2013

Abstract : Factorization algorithms have a major role in computer security and cryptography. Most of the widely used cryptographic algorithms, like RSA, are built on the mathematical difficulty of factoring products of large prime numbers. This research proposes a new approach to factorization by using two new enhancements. The new approach is also compared with six different factorization algorithms, and its performance is evaluated in a big data environment. The algorithms covered are the elliptic curve method, quadratic sieve, Fermat’s method, trial division and Pollard rho methods. Success rates are compared over a million integers of different difficulties. We have implemented our own algorithm for random number generation, which is also explained in the paper. We also empirically show that the new approach has an advantage in the factorization attack on RSA.
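
As an illustration of one of the baseline algorithms named in the abstract above, the following is a minimal sketch of the textbook form of Fermat's method, which searches for a representation n = a^2 - b^2 = (a - b)(a + b). It is not the paper's new approach or its implementation; the function name fermat_factor is hypothetical.

```python
import math

def fermat_factor(n):
    """Factor an odd positive integer n as (a - b, a + b) with n = a^2 - b^2.

    Textbook Fermat's method, shown for illustration only.
    For a prime n the result is the trivial pair (1, n).
    """
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a = math.isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    while True:
        b2 = a * a - n
        b = math.isqrt(b2)
        if b * b == b2:             # a^2 - n is a perfect square
            return a - b, a + b
        a += 1

print(fermat_factor(5959))  # (59, 101)
```

The method is fast when the two factors are close to each other and slow otherwise, which is one reason comparisons across inputs of different difficulty are informative.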

B9. Sadi Evren SEKER, Cihan MERT “S-Box Hashing for Text Mining”, International Symposium on Computing in Informatics and Mathematics (ISCIM13), pp. 851-855, 2013

http://dspace.epoka.edu.al/handle/1/851

Abstract : One of the crucial points in text mining studies is the feature hashing step. Most text mining studies start with a text data source and apply a feature extraction methodology to the text. The feature extraction method should be chosen wisely, because in most cases it directly affects the results and performance. Another well-known approach is using any feature extraction method together with feature hashing. In this way, the feature extraction can be executed without worrying about performance, and the feature hashing reduces the size of the extracted feature vector. Today, the hashing algorithms widely used in text mining are modern hashing algorithms like MD5 or SHA1, which are built over substitution permutation networks (SPN) or Feistel networks. The common property of most modern hashing algorithms is the implicitly implemented s-boxes. One of the drawbacks of the modern hashing algorithms is their collision-free purpose. The permutation step is in most cases implemented for this purpose, and the correlation between the input text and the output bits is completely obfuscated. This study focuses on the possible implementations of s-boxes for feature hashing. The purpose of feature hashing in this study is reducing the feature vector while keeping the correlation between the input text and the output bits.

B10. Sadi Evren SEKER, Cihan MERT, Khaled Al-Naami, Ugur AYAN, Nuri OZALP, “Ensemble classification over stock market time series and economy news”, Intelligence and Security Informatics (ISI), Proceeding of 2013 IEEE International Conference, pp. 272 – 273, ISBN 978-1-4673-6214-6
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6578840
Abstract : The aim of this study is to apply ensemble classification methods to stock market closing values, which can be treated as time series, and to find their relation to economy news. In order to keep the background of the study clear, the majority voting method has been applied over three classification algorithms: k-nearest neighbors, support vector machine and the C4.5 tree. The results gathered from two different feature extraction methods are correlated with a majority voting meta classifier (ensemble method) running over the three classifiers. The results show that the success rates increase by at least 2 to 3 percent after the ensemble.
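
The majority voting step described in the abstract above can be sketched as follows. This is a minimal illustration and not the paper's code: the label values and the three prediction lists are made-up placeholders standing in for the outputs of the k-nearest neighbors, SVM and C4.5 classifiers, and majority_vote is a hypothetical helper name.

```python
from collections import Counter

def majority_vote(predictions):
    """Return, for each sample, the label chosen by most classifiers.

    predictions: list of per-classifier label lists, all of equal length.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Placeholder outputs of three classifiers on four samples (assumed data).
knn_pred = ["up", "down", "up", "down"]
svm_pred = ["up", "up", "down", "down"]
tree_pred = ["down", "up", "up", "down"]

ensemble_pred = majority_vote([knn_pred, svm_pred, tree_pred])
print(ensemble_pred)  # ['up', 'up', 'up', 'down']
```

With three classifiers and two labels a tie cannot occur, which is a common reason for choosing an odd number of base classifiers.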

B11. Sadi Evren SEKER, Khaled Al-NAAMI, Latifur KHAN, “Author Attribution on Streaming Data”, Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on, IEEE IRI pp. 497 – 503, Aug. 2013
DOI: 10.1109/IRI.2013.6642511
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6642511
Abstract : The concept of novel authors occurring in a streaming data source, such as evolving social media, has been an unaddressed problem until now. Existing author attribution techniques deal with datasets where the total number of authors does not change during the training or testing time of the classifiers. This study focuses on the question, “what happens if new authors are added to the system over time?”. Moreover, in this study we also deal with the problem that some of the authors may not stay and may disappear over time or may re-appear after a while. Stream mining approaches are proposed to solve the problem. The test scenarios are created over the existing IMDB62 data set, which is already widely used by author attribution algorithms. We used our own shuffling algorithms to create the effect of novel authors. Before the stream mining, POS tagging approaches and TF-IDF methods are applied for feature extraction. We have also applied a bi-tag approach, where two consecutive tags are considered as a new feature. With the help of the novel techniques first proposed in this paper, the success rate has been increased from 35% to 61% for authorship attribution on streaming text data.
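
The bi-tag idea mentioned in the abstract above, where two consecutive POS tags form one new feature, can be sketched as below. This is only an illustration under assumptions: the tag sequence is a made-up example in Penn Treebank style, and the function name and the underscore-joined feature naming are hypothetical rather than taken from the paper.

```python
from collections import Counter

def bi_tag_features(pos_tags):
    """Count each pair of consecutive POS tags as a single feature."""
    pairs = zip(pos_tags, pos_tags[1:])
    return Counter(f"{first}_{second}" for first, second in pairs)

# Assumed tag sequence for a short sentence such as "the cat sees the dog".
tags = ["DT", "NN", "VBZ", "DT", "NN"]
features = bi_tag_features(tags)
print(features["DT_NN"])  # 2
```

The resulting counts could then be weighted or combined with other features before classification.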

B12. Sadi Evren SEKER, Khaled Al-NAAMI “Sentimental Analysis on Turkish Blogs via Ensemble Classifier”, PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON DATA MINING, ISBN: 1-60132-239-9, DMIN, pp. 10-16, 2013

Full Proceeding Book : http://world-comp.org/proc2013/dmin/DMIN_Papers.pdf

Abstract : Sentiment analysis on web-mined data has an increasing impact on most studies. The sentimental influence of any content on the web is one of the questions content creators and publishers are most curious about. In this study, we have researched the impact of the comments collected from five different web sites in Turkish, with more than 2 million comments in total. The web sites comprise newspapers, movie reviews, an e-marketing web site and a literature web site. We mix all the comments into a single file. The comments also have a like or dislike number, which we use as the ground truth for the impact of the comment, that is, as the sentiment of the comment. We try to correlate the text of the comment with its like / dislike grade. We use three classifiers: support vector machine, k-nearest neighbors and the C4.5 decision tree classifier. On top of them, we add an ensemble classifier based on majority voting. For feature extraction from the text, we use the term frequency – inverse document frequency approach and keep only the top features according to their information gain. The result of the study shows that there is about a 56% correlation between the blogs and comments and their like / dislike score under our classification model.

B13. Sadi Evren SEKER “Sentimental versus Impact of Blogs”, Proceedings of the International Conference on Internet Computing and Big Data, ISBN: 1-60132-249-6, pp. 127-133, ICOMP 2013

http://www.ucmss.com/cr/main/papersNew/papersAll/ICM3060.pdf
Abstract : Sentiment analysis on web-mined data has an increasing impact on most studies. The sentimental influence of any content on the web is one of the questions content creators and publishers are most curious about. The impact of web-mined data, however, is a separate issue from its sentiment. For example, categorizing a blog post as having positive or negative sentiment is one parameter, and measuring the like and dislike numbers of the blog post is a completely different one. In this study, the impact and the sentiment of comments collected from five different web sites in Turkish, with more than 300,000 comments in total, are examined. The web sites comprise newspapers, movie reviews, an e-marketing web site and a literature web site. All the comments are mixed into a single file. The comments have a like or dislike number, which is used as the ground truth for the impact of the comment. The ground truth for the sentiment of the blog is the smileys in the post text. The sentiments are also spread to the rest of the blogs, which have no smileys, by using the bag of words. The correlation is computed by using the support vector machine classifier, and the result is a 50.7% similarity between the sentiment and the impact of the comment.

B14. Sadi Evren Seker, Yavuz Unal, Zeki Erdem, H. Erdinc Kocer “Correlation Between Liver Analysis Outputs”, SCI 2013 (System, Control and Informatics) Proceedings of the 2013 International Conference on Systems, Control and Informatics (SCI 2013), Venice, Italy, Sept. 28-29, 2013, ISBN: 978-1-61804-206-4, pp. 217-222
Conference Proceeding: http://www.europment.org/library/2013/venice/SCI.pdf
Abstract : Data mining techniques for biological analysis are spreading to most areas, including health care and medical information. We have applied data mining techniques such as KNN, SVM, MLP and decision trees to a unique dataset collected from 16,380 analysis results over one year. The results show that there is a correlation among ALT, AST, Bilirubin Direct and Bilirubin Total, with an error rate as low as 15%. The correlation coefficient is up to 93%. This makes it possible to predict the analysis results from each other, and disease patterns can be applied over the linear correlation of the parameters.

B15. Sadi Evren Seker, Menduh DINC, M. Lutfi ARSLAN “Productive Academic Database Enrollment”, International Proceedings of Economics Development and Research (IPEDRv76), Vol. 76, pp. 40-46, Apr. 2014, DOI: 10.7763/IPEDR.2014.V76.8
The paper was also presented at the International Conference on Management and Humanities (ICMH’14), in Seoul, S. Korea, 12-13 Apr. 2014.
Proceeding: http://ipedr.com/list-103-1.html
Abstract : University libraries are, more than ever, confronted with the need to allocate their resources in more efficient and productive ways. On the one hand, they try to diversify their options to have rich collections; on the other hand, they face the risk of purchasing expensive, repetitive and outdated services, which creates a dilemma. In academic database selection, universities and research institutes are caught between poor selection and inefficient selection. In this study, we try to measure the productivity of the academic database selection of Turkish universities via statistical parameters such as the number of downloaded papers, the number of academicians and the number of students. The results show that the efficiency varies, and the research underlines the importance of productivity through a statistical parameterization of the competitive approach in the selection process. While doing this, we provide a decision support system for online database selection for academic libraries.

B16. Sadi Evren SEKER, Nuri Ozalp, Khaled Al-Naami, Cihan Mert, “Correlation Between Turkish Stock Market and Economy News”, Reliability Aware Data Fusion, held along with SIAM International Conference on Data Mining 2013 (SDM 2013), May 2013, Austin, TX, USA

B17. Khaled Mohammed Al Naami, Sadi Seker, Latifur Khan (2014), GISQF: An Efficient Spatial Query Processing System, Cloud Computing (CLOUD), 2014 IEEE 7th International Conference on, pp. 681-688, IEEE 2014

Abstract : Collecting observations from all international news coverage and using TABARI software to code events, the Global Database of Events, Language, and Tone (GDELT) is the only global political georeferenced event dataset with 250+ million observations covering all countries in the world from January 1, 1979 to the present with daily updates. The purpose of this widely used dataset is to help understand and uncover spatial, temporal and perceptual trends and behaviors of the social and international system. To query such big geospatial

C. International Book Chapters

C1. Harun Pirim and Şadi Evren Şeker (2012). Ensemble Clustering for Biological Datasets, Chapter 13, Bioinformatics, Horacio Pérez-Sánchez (Ed.), ISBN: 978-953-51-0878-8, InTech, DOI: 10.5772/49956.
Available from : http://www.intechopen.com/books/bioinformatics/ensemble-clustering-for-biological-datasets
DOI: http://dx.doi.org/10.5772/49956

D. National Reviewed Journal Publications :

D1. Şadi Evren ŞEKER, Banu Diri, (2010) Türkçe Metinler için Olay Sıralaması, syf. 63-75, Bilgisayar Mühendisleri Bölüm Başkanları Dergisi
Abstract : The aim of this study is to advance current event ordering methodologies to cover Turkish temporal logic. Currently, some additional operations are required to demonstrate the relations between events or to order the events in a natural language text after outputting the semantic representation. There are some systematic studies based on English temporal logic and covering most of the Latin family. In addition to common temporal properties, there are some differences between the temporal logics of different languages.
In this study, the temporal logics in the literature are examined. Some of these temporal logics are suitable for machine computation and some are suitable for natural language processing. An optimization of these computable and natural-language-oriented temporal logics is suggested to cover Turkish temporal logic.

E. National Conference Papers printed in Proceedings:

E1. Sadi Evren SEKER, A. C. Cem Say, Birol Aygun, (2003) “Türkçe Dogal Dil Arayüzlü bir Kişisel Takvim Programının, Tasarim ve Kodlamasi”,TAINN

E2. Sadi Evren SEKER, Banu Diri, (2006) Davranışsal Türkçe Metin Sınıflandırılması ve Kodlanması,ASYU

E3. Ender Ozcan, Alpay Alkan, Seniz Demir, Mesut Ali Ergin, Hakan Kul, and Sadi Evren Seker. (2003) STARS – Student Transcript and Registration System: an Open Source Internet Application. In Akademik Bilisim 2003, pages E-ref:87

E4. Şadi Evren ŞEKER, Banu DİRİ, (2010) “Reichenbach ve Allen Zamansal Mantığı ile TimeML”, Akademik Bilişim 2010, E-ref:88

F. Books:

F1. ŞEKER, Şadi Evren “Programlama ve Veri Yapılarına Giriş JAVA, C, C++ dilleri ile” , ISBN 978-9944-62-782-5, Publication Date: Feb 2009

F2. Şadi Evren ŞEKER, ”İş Zekası ve Veri Madenciliği (WEKA ile)”, 206 pp., İstanbul, Cinius Yayınları, ISBN 978-605-127-671-7 , Publication Date: 2013

F3. İbrahim AKSEL, Mehmet Lütfi ARSLAN, Cevdet KIZIL, Mehmet Emin OKUR, Şadi Evren ŞEKER, ”Dijital İşletme”, 213 pp., İstanbul, Cinius Yayınları, ISBN 978-605-127-675-5, Publication Date: 2013

For more information http://sadievrenseker.com/kitap/
