Simplified List of Publications

Ocak I and SEKER SE (2013), “Calculation of surface settlements caused by EPBM tunneling using artificial neural network, SVM, and Gaussian processes”, Environmental Earth Sciences, January 2013. Vol. 70, pp. 1263-1276.

Abstract: Increasing demand on infrastructures increases attention to shallow soft ground tunneling methods in urbanized areas. Especially in metro tunnel excavations, due to their large diameters, it is important to control the surface settlements observed before and after excavation, which may cause damage to surface structures. In order to solve this problem, earth pressure balance machines (EPBM) and slurry balance machines have been widely used throughout the world. There are numerous empirical, analytical, and numerical analysis methods that can be used to predict surface settlements. But substantially fewer approaches have been developed for artificial neural network-based prediction methods especially in EPBM tunneling. In this study, 18 different parameters have been collected by municipal authorities from field studies pertaining to EPBM operation factors, tunnel geometric properties, and ground properties. The data source has a preprocess phase for the selection of the most effective parameters for surface settlement prediction. This paper focuses on surface settlement prediction using three different methods: artificial neural network (ANN), support vector machines (SVM), and Gaussian processes (GP). The success of the study has decreased the error rate to 13, 12.8, and 9, respectively, which is relatively better than contemporary research.
BibTeX:

@article{smp,
  author = {Ibrahim Ocak and Sadi Evren SEKER},
  title = {Calculation of surface settlements caused by EPBM tunneling using artificial neural network, SVM, and Gaussian processes},
  journal = {Environmental Earth Sciences},
  year = {2013},
  volume = {70},
  pages = {1263--1276},
  url = {http://link.springer.com/content/pdf/10.1007%2Fs12665-012-2214-x},
  doi = {10.1007/s12665-012-2214-x}
}
Ocak I and SEKER SE (2012), “Estimation of Elastic Modulus of Intact Rocks by Artificial Neural Network”, Rock Mechanics and Rock Engineering, November 2012. Vol. 45, pp. 1047-1054.

Abstract: The modulus of elasticity of intact rock (Ei) is an important rock property that is used as an input parameter in the design stage of engineering projects such as dams, slopes, foundations, tunnel constructions and mining excavations. However, it is sometimes difficult to determine the modulus of elasticity in laboratory tests because high-quality cores are required. For this reason, various methods for predicting Ei have been popular research topics in recently published literature. In this study, the relationships between the uniaxial compressive strength, unit weight (γ) and Ei for different types of rocks were analyzed, employing an artificial neural network and 195 data obtained from laboratory tests carried out on cores obtained from drilling holes within the area of three metro lines in Istanbul, Turkey. Software was developed in Java language using Weka class libraries for the study. To determine the prediction capacity of the proposed technique, the root-mean-square error and the root relative squared error indices were calculated as 0.191 and 92.587, respectively. Both coefficients indicate that the prediction capacity of the study is high for practical use.
BibTeX:

@article{rmse,
  author = {Ibrahim Ocak and Sadi Evren SEKER},
  title = {Estimation of Elastic Modulus of Intact Rocks by Artificial Neural Network},
  journal = {Rock Mechanics and Rock Engineering},
  year = {2012},
  volume = {45},
  pages = {1047--1054},
  url = {http://link.springer.com/content/pdf/10.1007%2Fs00603-012-0236-z},
  doi = {10.1007/s00603-012-0236-z}
}
Ozcan E, Seker SE and Karadeniz ZI (2004), “Generating Java Class Skeleton Using a Natural Language Interface”, In NLUCS, pp. 126-134.

BibTeX:

@inproceedings{nlucs,
  author = {Ender Ozcan and Sadi Evren Seker and Zeynep Ilknur Karadeniz},
  title = {Generating Java Class Skeleton Using a Natural Language Interface},
  booktitle = {NLUCS},
  year = {2004},
  pages = {126--134}
}
Pirim H and SEKER SE (2012), “Ensemble Clustering for Biological Datasets, Chapter 13”, In Oxford Handbook of Innovation. Vol. Bioinformatics. InTech Press.

BibTeX:

@book{bioinfcluster2012,
  author = {Harun Pirim and Sadi Evren SEKER},
  editor = {Horacio Pérez-Sánchez},
  title = {Ensemble Clustering for Biological Datasets, Chapter 13},
  booktitle = {Oxford Handbook of Innovation},
  publisher = {InTech Press},
  year = {2012},
  volume = {Bioinformatics},
  doi = {10.5772/49956}
}
SEKER SE and Mert C (2013), “Reverse Factorization and Comparison of Factorization Algorithms in Attack to RSA”, In Proceedings of the International Conference on Scientific Computing, pp. 101-109.

Abstract: Factorization algorithms have a major role in computer security and cryptography. Most of the widely used cryptographic algorithms, like RSA, are built on the mathematical difficulty of factorization for big prime numbers. This research proposes a new approach to factorization by using two new enhancements. The new approach is also compared with six different factorization algorithms and its performance is evaluated on a big data environment. The algorithms covered are elliptic curve method, quadratic sieve, Fermat’s method, trial division and Pollard rho methods. Success rates are compared over a million integer numbers with different difficulties. We have implemented our own algorithm for random number generation, which is also explained in the paper. We also empirically show that the new approach has an advantage on the factorization attack to RSA.
BibTeX:

@inproceedings{csc2013,
  author = {Sadi Evren SEKER and Cihan Mert},
  title = {Reverse Factorization and Comparison of Factorization Algorithms in Attack to RSA},
  booktitle = {Proceedings of the International Conference on Scientific Computing},
  year = {2013},
  pages = {101--109}
}
SEKER SE and MERT C (2013), “S-Box Hashing for Text Mining”, In 2nd International Symposium on Computing in Informatics and Mathematics (13).

Abstract: One of the crucial points in text mining studies is the feature hashing step. Most text mining studies start with a text data source and apply a feature extraction methodology to the text. The feature extraction method should be chosen wisely because, most of the time, it directly affects the results and performance. Another well-known approach is using any feature extraction method together with feature hashing. In this way, the feature extraction can be executed without worrying about the performance, and the feature hashing reduces the size of the extracted feature vector. Today, widely used hashing algorithms in text mining are modern hashing algorithms like MD5 or SHA1, which are built over substitution permutation networks (SPN) or Feistel networks. The common property of most modern hashing algorithms is the implicitly implemented s-boxes. One of the drawbacks of modern hashing algorithms is their collision-free purpose. The permutation step is most of the time implemented for this purpose, and the correlation between the input text and output bits is completely obfuscated. This study focuses on the possible implementations of s-boxes for feature hashing. The purpose of feature hashing in this study is reducing the feature vector while keeping the correlation between the input text and the output bits.
BibTeX:

@inproceedings{iscim2013,
  author = {Sadi Evren SEKER and Cihan MERT},
  title = {S-Box Hashing for Text Mining},
  booktitle = {2nd International Symposium on Computing in Informatics and Mathematics (13)},
  year = {2013},
  url = {http://dspace.epoka.edu.al/handle/1/851}
}
SEKER SE and Al-NAAMI K (2013), “Sentimental Analysis on Turkish Blogs via Ensemble Classifier”, In PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON DATA MINING (DMIN13), pp. 10-16.

Abstract: Sentimental analysis on web-mined data has an increasing impact on most of the studies. The sentimental influence of any content on the web is one of the questions content creators and publishers are most curious about. In this study, we have researched the impact of the comments collected from five different web sites in Turkish, with more than 2 million comments in total. The web sites are from newspapers; movie reviews, e-marketing web site and a literature web site. We mix all the comments into a single file. The comments also have a like or dislike number, which we use as ground proof of the impact of the comment, as the sentiment of the comment. We try to correlate the text of the comment and the like / dislike grade of the proof. We use three classifiers: support vector machine, k-nearest neighborhood and C4.5 decision tree classifier. On top of them, we add an ensemble classifier based on majority voting. For the feature extraction from the text, we use the term frequency – inverse document frequency approach and limit the top most features depending on their information gain. The result of the study shows that there is about 56% correlation between the blogs and comments and their like / dislike score depending on our classification model.
BibTeX:

@inproceedings{dmin2013,
  author = {Sadi Evren SEKER and Khaled Al-NAAMI},
  title = {Sentimental Analysis on Turkish Blogs via Ensemble Classifier},
  booktitle = {PROCEEDINGS OF THE 2013 INTERNATIONAL CONFERENCE ON DATA MINING (DMIN13)},
  year = {2013},
  pages = {10--16}
}
Seker SE, Unal Y, Erdem Z and Kocer HE (2013), “Correlation Between Liver Analysis Outputs”, In Proceedings of the 2013 International Conference on Systems, Control and Informatics (SCI 2013), pp. 217-222.

Abstract: Data mining techniques on the biological analysis are spreading for most of the areas including health care and medical information. We have applied data mining techniques, such as KNN, SVM, MLP or decision trees, over a unique dataset, which is collected from 16,380 analysis results for a year. The results show that there is a correlation among ALT, AST, Bilirubin Direct and Bilirubin Total down to 15% of error rate. Also the correlation coefficient is up to 93%. This makes it possible to predict the analysis results from each other, or disease patterns can be applied over the linear correlation of the parameters.
BibTeX:

@inproceedings{sci2013,
  author = {Sadi Evren Seker and Yavuz Unal and Zeki Erdem and H. Erdinc Kocer},
  title = {Correlation Between Liver Analysis Outputs},
  booktitle = {Proceedings of the 2013 International Conference on Systems, Control and Informatics (SCI 2013)},
  year = {2013},
  pages = {217--222}
}
SEKER SE (2013), “Sentimental versus Impact of Blogs”, In Proceedings of the International Conference on Internet Computing and Big Data, pp. 127-133.

Abstract: Sentimental analysis on web-mined data has an increasing impact on most of the studies. The sentimental influence of any content on the web is one of the questions content creators and publishers are most curious about. Also the impact of the web-mined data is completely another issue than the sentiments. For example, categorizing a blog post into positive or negative sentiment is one parameter, and measuring the like and dislike numbers of the blog post is completely another issue. In this study, the impact and sentiment of the comments are examined, collected from five different web sites in Turkish with more than 300,000 comments in total. The web sites are from newspapers; movie reviews, e-marketing web site and a literature web site. All the comments are mixed into a single file. The comments have a like or dislike number, which is used as ground proof of the impact of the comment. The ground proof of the sentiment of the blog is the smileys in the post text. Also the sentiments are spread to the rest of the blogs without smileys by using the bag of words. The correlation is implemented by using the support vector machine classifier, and the success rate is 50.7% similarity between the sentiment and the impact of the comment.
BibTeX:

@inproceedings{icomp2013,
  author = {Sadi Evren SEKER},
  title = {Sentimental versus Impact of Blogs},
  booktitle = {Proceedings of the International Conference on Internet Computing and Big Data},
  year = {2013},
  pages = {127--133}
}
SEKER SE (2012), “Performance Evaluation of a Regular Expression Crawler and Indexer”, In Proceedings of the 2012 International Conference on Internet Computing (ICOMP 2012), July 2012, pp. 33-39. CSREA Press.

Abstract: This study aims to find a solution for the optimization of indexer and crawler modules of a search engine if the possible varieties of the search phrases are previously known as a regular expression. A search engine can be considered as an expert in any area if the search domain is narrowed and the crawling and indexing modules are optimized according to this domain. A general expertise of the search engines can be modeled with regular expressions like searching only emails or telephone numbers on the Internet. This paper mainly discusses several alternatives on an expert search engine and evaluates the performance of several varieties.
BibTeX:

@inproceedings{icomp2012,
  author = {Sadi Evren SEKER},
  title = {Performance Evaluation of a Regular Expression Crawler and Indexer},
  booktitle = {Proceedings of the 2012 International Conference on Internet Computing (ICOMP 2012)},
  publisher = {CSREA Press},
  year = {2012},
  pages = {33--39}
}
SEKER SE (2012), “Turkish Query Engine on Library Ontology”, In Proceedings of the 2012 International Conference on Internet Knowledge Engineering (IKE 2012), July 2012, pp. 26-33. CSREA Press.

Abstract: The purpose of this project is implementing conversational software to interface dialog based sentences between the user and a library database. This software is implemented with a special expertise on library dialogs. The number of possible library dialog sentences is limited and this project covers almost all of these possible sentences. The input sentences are accepted in Turkish and a flexible management system is added for further additions. For example, any sentence missed in this project can be added with a simple entry in the YACC file. The technology utilized during this project is a YACC and LEX implementation on LINUX. Also the database of the project is implemented over MySQL. LEX and YACC produce C source code, and the functionality of semantic processing and the database queries are also implemented in the C language. One of the hardest parts of this project is implementing Turkish language capability over the C programming environment on LINUX. All the technological modules of this project, which are MySQL, C, LEX, YACC and ZEMBEREK, created different problems with the Turkish inputs. During these problems I have searched the Internet for Turkish input implementations of LEX and YACC or the MySQL connection through C, and as a result of my findings this project is the first implementation of Turkish character sets by LEX, YACC and MySQL at the same time on LINUX. One of the most important achievements after accomplishing this study is the flexibility of the input sentences. Anybody can add a new grammar rule to the YACC file by obeying the regular expression structure of YACC. After a successful addition the project will search for this new addition in the input sentences and the answers related to this input will be produced.
BibTeX:

@inproceedings{ike2012,
  author = {Sadi Evren SEKER},
  title = {Turkish Query Engine on Library Ontology},
  booktitle = {Proceedings of the 2012 International Conference on Internet Knowledge Engineering (IKE 2012)},
  publisher = {CSREA Press},
  year = {2012},
  pages = {26--33}
}
SEKER SE (2012), “A Novel Temporal Visualization Framework for Relational Event Representation”, In PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON MODELING, SIMULATION & VISUALIZATION METHODS (MSV 2012), July 2012, pp. 258-264. CSREA Press.

Abstract: Temporal logics are widely used in many study types, such as question answering, ontology, natural language processing, search engines, text summarization, or even visual tools like Gantt charts or UML diagrams. Computable temporal languages are the logical systems, built on temporal logic, that can be computed to find a result. They can also be defined as computer-computable languages, built on temporal logics. All timeline drawing or planning software uses temporal logic in order to visualize or process cases. Also, semantic web studies are one of implementation areas where the temporal modeling and reasoning is massively needed. Relation between events or event types and event of subjects can be modeled by using temporal logic. This study introduces the temporal logics and the computable temporal languages in the current literature. For the first time, some of temporal logic problems are pointed and solved during this study. Also a novel temporal framework is implemented with JAVA and published on the web which covers the solutions of temporal logic problems.
BibTeX:

@inproceedings{msv2012,
  author = {Sadi Evren SEKER},
  title = {A Novel Temporal Visualization Framework for Relational Event Representation},
  booktitle = {PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON MODELING, SIMULATION \& VISUALIZATION METHODS (MSV 2012)},
  publisher = {CSREA Press},
  year = {2012},
  pages = {258--264}
}
SEKER SE (2012), “Web Spider Performance and Data Structure Analysis”, In Proceedings of the 2012 International Conference on Semantic Web and Web Services (SWWS 2012), July 2012, pp. 73-77. CSREA Press.

Abstract: The aim of this study is the performance evaluation of a web spider, which almost all search engines utilize during web crawling. A data structure is required to keep a record of the pages visited and the keywords extracted from the web site during the web crawling. The paper first goes into the detail of possible data structures for a web spider and critiques all possibilities depending on their time and memory efficiencies. Furthermore, the possibilities are narrowed into tree variations only and a tree is selected from each tree data structure family. Finally, a search engine is implemented, all the tree alternatives from each tree data structure family are also implemented, and the performance of each alternative is benchmarked.
BibTeX:

@inproceedings{swws2012,
  author = {Sadi Evren SEKER},
  title = {Web Spider Performance and Data Structure Analysis},
  booktitle = {Proceedings of the 2012 International Conference on Semantic Web and Web Services (SWWS 2012)},
  publisher = {CSREA Press},
  year = {2012},
  pages = {73--77}
}
SEKER SE (2004), “Possible Social Impacts of E-Government: A Case Study of Turkey”. Thesis at: Istanbul Technical University, Institute of Social Sciences, Science Technology and Society.

Abstract: In recent days, e-government is a popular subject in Turkey. Fresh political will and the effect of European Union, draw a directed path for e-government applications in Turkey. Besides the numerous studies in technological or judicial studies, there are only a few studies from the perspective of society. In this study we have targeted a more focused path by following the theoretical background of Science Technology and Society. The most common theories like Actor Network Theory, Systems Approach or Social Constructivism of Technology is shortly described just before the commenting and modeling e-government in Turkey. On the other hand we have published a web site holding an inquiry which contains questions about the e-government in Turkey.
BibTeX:

@mastersthesis{itu2004,
  author = {Sadi Evren SEKER},
  title = {Possible Social Impacts of E-Government: A Case Study of Turkey},
  school = {Istanbul Technical University, Institute of Social Sciences, Science Technology and Society},
  year = {2004},
  url = {http://www.shedai.net/e-devlet/egov.htm}
}
SEKER SE (2003), “Design and Implementation of a Personal Calendar with a Natural Language Interface in Turkish (Turkish Speaking Assistant)”. Thesis at: Yeditepe University, Computer Science Dept.

Abstract: NLP (Natural Language Processing) aims to provide an effective communication interface between human-beings and computer systems that has a very large scope based on different application areas and languages. NLP programs convert natural language sentences into a form that computers can handle by using morphological, syntactic and semantic analyses, which are usually performed separately but operate in coordination. We have implemented an interface for saving and querying appointments via Turkish sentences using a utilized infrastructure.
The developed program TuSA (Turkish Speaking Assistant), as the name implies, fetches Turkish sentences containing information about the appointments and applies morphological, syntactic and semantic analysis to convert them into logical formulas. This program provides a way to query the desired appointment data set by parsing these formulas. Although this program based on a utilizing infrastructure it has its own.
To avoid an ambiguity, at the beginning of the report we want to declare the terminology about appointments. In any part of this thesis the term appointment means a record in calendar database. Its subject can be a meeting, a dinner, an interview or anything else. But the term meeting is always used to mean a subject of appointment.
BibTeX:

@mastersthesis{msc2003,
  author = {Sadi Evren SEKER},
  title = {Design and Implementation of a Personal Calendar with a Natural Language Interface in Turkish (Turkish Speaking Assistant)},
  school = {Yeditepe University, Computer Science Dept.},
  year = {2003},
  url = {http://www.shedai.net/tusa/html/report.html}
}
SEKER SE, Al-NAAMI K and KHAN L (2013), “Author Attribution on Streaming Data”, In Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on (IEEE IRI 2013), August 2013, pp. 497-503.

Abstract: The concept of novel authors occurring in a streaming data source, such as evolving social media, is an unaddressed problem up until now. Existing author attribution techniques deal with datasets where the total number of authors does not change in the training or the testing time of the classifiers. This study focuses on the question, “what happens if new authors are added into the system over time?”. Moreover, in this study we are also dealing with the problem that some of the authors may not stay and may disappear over time or may re-appear after a while. In this study stream mining approaches are proposed to solve the problem. The test scenarios are created over the existing IMDB62 data set, which is already widely used by author attribution algorithms. We used our own shuffling algorithms to create the effect of novel authors. Also, before the stream mining, POS tagging approaches and the TF-IDF methods are applied for the feature extraction. We have also applied a bi-tag approach, where two consecutive tags are considered as a new feature. With the help of novel techniques, first proposed in this paper, the success rate has been increased from 35% to 61% for the authorship attribution on streaming text data.
BibTeX:

@inproceedings{IRI13,
  author = {Sadi Evren SEKER and Khaled Al-NAAMI and Latifur KHAN},
  title = {Author Attribution on Streaming Data},
  booktitle = {Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on (IEEE IRI 2013)},
  year = {2013},
  pages = {497--503}
}
ARSLAN ML and SEKER SE (2014), “Web Based Reputation Index of Turkish Universities”, International Journal of E-Education E-Business E-Management and E-Learning (IJEEEE), 2014. ISSN 2010-3654, Vol. 4(3), pp. 197-203.

Abstract: This paper attempts to develop an online reputation index of Turkish universities through their online impact and effectiveness. Using 16 different web based parameters and employing a normalization process on the results, we have ranked websites of Turkish universities in terms of their web presence. This index is the first attempt to determine the tools of reputation of Turkish academic websites and would be a basis for further studies to examine the relation between reputation and the online effectiveness of the universities.
BibTeX:

@article{ijeeee,
  author = {Mehmet Lutfi ARSLAN and Sadi Evren SEKER},
  title = {Web Based Reputation Index of Turkish Universities},
  journal = {International Journal of E-Education E-Business E-Management and E-Learning (IJEEEE)},
  year = {2014},
  volume = {4},
  number = {3},
  pages = {197--203},
  url = {https://www.sadievrenseker.com/publications/ij4e.pdf}
}
SEKER SE, Altun O, Ayan U and MERT C (2014), “A Novel String Distance Function based on Most Frequent K Characters”, International Journal of Machine Learning and Computation (IJMLC), January 2014. Vol. 4, pp. 177-183.

Abstract: This study aims to publish a novel similarity metric to increase the speed of comparison operations. Also the new metric is suitable for distance-based operations among strings. Most of the simple calculation methods, such as string length, are fast to calculate but don’t represent the string correctly. On the other hand, methods like keeping the histogram over all characters in the string are slower but good at representing the string characteristics in some areas, like natural language. We propose a new metric, easy to calculate and satisfactory for string comparison. The method is built on a hash function, which gets a string of any size and outputs the most frequent K characters with their frequencies. The outputs are open for comparison and our studies showed that the success rate is quite satisfactory for text mining operations.
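As a rough illustration of the hashing step this abstract describes, the following minimal Python sketch returns the K most frequent characters of a string together with their counts. The function name `most_frequent_k` and the tie-breaking rule (first occurrence wins) are assumptions made for this example; the abstract does not specify them, and the comparison step of the published metric is not reproduced here.

```python
from collections import Counter


def most_frequent_k(text, k):
    """Return the k most frequent characters of text with their counts.

    Ties are broken by first occurrence in the text. This is an assumption
    for illustration; the abstract does not state a tie-breaking rule.
    """
    # Counter keeps keys in insertion order, i.e. first-occurrence order.
    counts = Counter(text)
    # sorted() is stable, so characters with equal counts keep that order.
    ranked = sorted(counts.items(), key=lambda item: -item[1])
    return ranked[:k]


if __name__ == "__main__":
    print(most_frequent_k("research", 2))
    print(most_frequent_k("seeking", 2))
```

Two strings could then be compared through these short outputs rather than through their full character histograms, which is the speed argument the abstract makes.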
BibTeX:

@article{ijmlc,
  author = {Sadi Evren SEKER and Oguz Altun and Ugur Ayan and Cihan MERT},
  title = {A Novel String Distance Function based on Most Frequent K Characters},
  journal = {International Journal of Machine Learning and Computation (IJMLC)},
  year = {2014},
  volume = {4},
  pages = {177--183},
  url = {https://www.academia.edu/attachments/32729347/download_file}
}
SEKER SE and Diri B (2010), “Event Ordering for Turkish Natural Language Texts”, In Proceedings of the 1st Computer Science Student Workshop, February 2010, pp. 26-29.

BibTeX:

@inproceedings{cssw2010,
  author = {Sadi Evren SEKER and Banu Diri},
  title = {Event Ordering for Turkish Natural Language Texts},
  booktitle = {Proceedings of the 1st Computer Science Student Workshop},
  year = {2010},
  pages = {26--29},
  url = {http://research.sabanciuniv.edu/14162/1/3011200000231.pdf}
}
Seker SE and Diri B (2010), “TimeML and Turkish Temporal Logic”, In IC-AI, pp. 881-887.

Abstract: Turkish is one of the widely used and relatively difficult natural language for machine processing. One of the challenges in Turkish is the temporal logic and processing the time of events.
For the Latin family natural languages, there are quite successful solutions like TimeML which is built on the Reichenbach tense analysis and Allen’s temporal logic. Unfortunately, there is no previous work on Turkish languages up until now.
This paper covers the basic temporal models of Reichenbach and Allen and then proceeds to the improvement of these models to cover temporal logic behind Turkish natural language.
In order to test the success of this study, we have also created a corpus from child stories and tested the success of new implementation.
BibTeX:

@inproceedings{icai2010,
  author = {Sadi Evren Seker and Banu Diri},
  title = {TimeML and Turkish Temporal Logic},
  booktitle = {IC-AI},
  year = {2010},
  pages = {881--887},
  url = {https://www.academia.edu/1255771/TimeML_and_Turkish_Temporal_Logic}
}
SEKER SE and MERT C (2013), “A Novel Feature Hashing For Text Mining”, Journal of Technical Science & Technologies, June 2013. Vol. 2, pp. 37-40.

Abstract: Because of the increasing studies on big data holding text as a data source, feature hashing has a major role in the literature now. A usual way of text mining on big data mostly requires a layer of feature hashing, which reduces the size of the feature vector. For example, getting the word count yields hundreds of thousands of features in most of the cases, and taking the POS-tagging would reduce this number to about 50 features. By feature hashing the size of the feature vector reduces reasonably and data mining processes like classification, clustering or association can run faster. And in some cases, executing some algorithms is impossible with current hardware, where parallel or distributed programming takes into account. The feature hashing approaches can usually be categorized into two groups. The first group deals with natural language processing (NLP) algorithms and tries to extract relatively smarter hash results, which represent the input characteristics at maximum, or the mathematical hashing algorithms, which do not deal with the context or meaning of the text input and just process the input for some binary output. For example, POS-tagging approaches can carry some features of the input to the output; on the other hand, hashing algorithms like MD5 or SHA-1 have no effect of input, where they only worry about fewer collisions on the output. This study focuses on the second group of hashing algorithms and criticizes the hashing algorithms using Feistel networks, which are widely utilized in text mining studies. We propose a new approach which is mainly built on the substitution boxes (s-boxes), which are in the core of all Feistel networks, and processes the text faster than the other implementations.
BibTeX:

@article{jtst,
  author = {Sadi Evren SEKER and Cihan MERT},
  title = {A Novel Feature Hashing For Text Mining},
  journal = {Journal of Technical Science \& Technologies},
  year = {2013},
  volume = {2},
  pages = {37--40},
  url = {http://www.academia.edu/4756079/A_Novel_Feature_Hashing_for_Text_Mining}
}
SEKER SE, MERT C, Al-Naami K, AYAN U and OZALP N (2013), “Performance Evaluation of a Regular Expression Crawler and Indexer”, In Intelligence and Security Informatics (ISI), Proceeding of 2013 IEEE International Conference (ISI 2013), June 2013, pp. 272-273.

Abstract: The aim of this study is applying ensemble classification methods over stock market closing values, which can be assumed as time series, and finding out the relation between the economy news. In order to keep the study background clear, the majority voting method has been applied over three classification algorithms, which are the k-nearest neighborhood, support vector machine and the C4.5 tree. The results gathered from two different feature extraction methods are correlated with a majority voting meta classifier (ensemble method) which is running over three classifiers. The results show that the success rates are increased by at least 2 to 3 percent after the ensemble.
BibTeX:

@inproceedings{ISI13,
  author = {Sadi Evren SEKER and Cihan MERT and Khaled Al-Naami and Ugur AYAN and Nuri OZALP},
  title = {Performance Evaluation of a Regular Expression Crawler and Indexer},
  booktitle = {Intelligence and Security Informatics (ISI), Proceeding of 2013 IEEE International Conference (ISI 2013)},
  year = {2013},
  pages = {272--273}
}