similar text
Recently Published Documents


Total documents: 29 (five years: 7)
H-index: 5 (five years: 0)

2021, Vol 2095 (1), pp. 011002

All papers published in this volume of Journal of Physics: Conference Series have been peer reviewed through processes administered by the Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a proceedings journal published by IOP Publishing.
• Type of peer review: Double-blind
• Conference submission management system: All papers were submitted via the conference email: [email protected]
• Number of submissions received: 226
• Number of submissions sent for review: 209
• Number of submissions accepted: 99
• Acceptance rate (number of submissions accepted / number of submissions received × 100): 43.8%
• Average number of reviews per paper: 2
• Total number of reviewers involved: 61
• Any additional info on review process: EAME 2021 uses plagiarism software to detect instances of overlapping and similar text in submitted manuscripts.
• Contact person for queries: Ms. Yolanda Xu, Advanced Science and Industry Research Center, Hong Kong, [email protected]


2021, Vol 2068 (1), pp. 011002

All papers published in this volume of Journal of Physics: Conference Series have been peer reviewed through processes administered by the Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a proceedings journal published by IOP Publishing.
• Type of peer review: Double-blind
• Conference submission management system: All papers were submitted via the conference email: [email protected]
• Number of submissions received: 132
• Number of submissions sent for review: 105
• Number of submissions accepted: 56
• Acceptance rate (number of submissions accepted / number of submissions received × 100): 42.4%
• Average number of reviews per paper: 2
• Total number of reviewers involved: 81
• Any additional info on review process: AMMS2021 uses plagiarism software to detect instances of overlapping and similar text in submitted manuscripts.
• Contact person for queries: Ms. Yolanda Xu, Advanced Science and Industry Research Center, Hong Kong, [email protected]


2021, Vol 898 (1), pp. 011002

All papers published in this volume of IOP Conference Series: Earth and Environmental Science have been peer reviewed through processes administered by the Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a proceedings journal published by IOP Publishing.
• Type of peer review: Double-blind
• Conference submission management system: All papers were submitted via the conference email: [email protected]
• Number of submissions received: 65
• Number of submissions sent for review: 57
• Number of submissions accepted: 28
• Acceptance rate (number of submissions accepted / number of submissions received × 100): 43.1%
• Average number of reviews per paper: 2
• Total number of reviewers involved: 56
• Any additional info on review process: EPE2021 uses the iThenticate software to detect instances of overlapping and similar text in submitted manuscripts.
• Contact person for queries: Ms. Yolanda Xu, Advanced Science and Industry Research Center, Hong Kong, [email protected]


2021, Vol 18 (2), pp. 419-439
Author(s): Yi Yin, Dan Feng, Zhan Shi, Lin Ouyang

A key function of a text recommendation method is to build a correlation analysis over the whole text collection. At present, most text recommendation methods rely on the citation network and give less consideration to the internal relations among texts, which presents both a challenge and an opportunity for text recommendation research. In this paper, we therefore propose a new method, based on time series, to mitigate this problem. We specify a text collection according to the interests of users and integrate the varied label values of each text; we then build the correlation coefficient between a text and its related texts using differential analysis; finally, we calculate the similarity degree of the texts using an improved cosine similarity correlation matrix in order to recommend similar texts. Our experiments indicate that the method maintains the quality of the recommended text while improving accuracy by 8.63% and recall by 5.25%.
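The abstract does not spell out its "improved" cosine similarity, so the following is only a minimal sketch of the plain, textbook cosine similarity over word-count vectors, together with a pairwise similarity matrix. The function names and the whitespace/word tokenization are assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (lowercased word tokens)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Plain cosine similarity between two term-count vectors."""
    va, vb = term_vector(text_a), term_vector(text_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(texts):
    """Pairwise cosine similarities for a small text collection."""
    return [[cosine_similarity(a, b) for b in texts] for a in texts]
```

A recommender built on such a matrix would, for a given text, return the other texts with the highest scores in its row; the time-series and label-integration steps described above are not covered by this sketch.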


Elasticsearch is a way to organize data and make it easily accessible. It is a search server based on Lucene: a highly scalable, distributed, full-text search engine. Elasticsearch is developed in Java and published as open source under the terms of the Apache License. It is among the most popular enterprise search engines and incorporates advances in speed, security, scalability, and hardware efficiency. Elasticsearch is a tool for querying written words. It can perform some other smart tasks, but its principal tasks are returning text similar to a given query and performing statistical analyses of a body of text. Elasticsearch is a standalone database server, written in Java and accessed over HTTP/JSON; it takes in data, optimizes it for language-based searches, and stores it in a sophisticated format. Elasticsearch is very convenient, supporting clustering and leader selection out of the box. Typical uses include searching a database of trade products by description and finding similar text in a body of crawled web pages. In this manuscript, the capability of Elasticsearch to identify copied data, and the performance of its techniques for removing such data, are analyzed.
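As an illustration of "returning text similar to a given query", the sketch below builds the JSON body of an Elasticsearch `more_like_this` query. It only constructs and prints the request body and does not contact a server. The helper name, the `description` field, and the sample text are hypothetical, and the manuscript's own duplicate-detection procedure is not shown in the abstract.

```python
import json


def build_more_like_this_query(sample_text, fields,
                               min_term_freq=1, max_query_terms=25):
    """Return a search request body asking for documents similar to sample_text."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),   # text fields to compare against
                "like": sample_text,      # free text to find neighbours of
                "min_term_freq": min_term_freq,
                "max_query_terms": max_query_terms,
            }
        }
    }


# Hypothetical example: look for near-copies of a product description.
body = build_more_like_this_query(
    "wireless noise-cancelling headphones", ["description"]
)
print(json.dumps(body, indent=2))
```

The body would typically be sent to an index's `_search` endpoint over HTTP; hits scoring above some chosen threshold could then be reviewed as candidate copies.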


Data, 2018, Vol 3 (4), pp. 66
Author(s): Svitlana Petrasova, Nina Khairova, Włodzimierz Lewoniewski, Orken Mamyrbayev, Kuralay Mukhsina

Extracting similar text fragments from weakly formalized data is a task in natural language processing and intelligent data analysis, and it is used to solve the problem of automatically identifying connected knowledge fields. To search for such common communities in Wikipedia, we propose to use, as an additional stage, a logical-algebraic model for extracting similar collocations. With the Stanford Part-Of-Speech tagger and the Stanford Universal Dependencies parser, we identify the grammatical characteristics of collocation words. With WordNet synsets, we choose their synonyms. Our dataset includes Wikipedia articles from different portals and projects. The experimental results show the frequencies of synonymous text fragments in Wikipedia articles that form common information spaces. The number of highly frequent synonymous collocations can serve as an indication of key, common, up-to-date Wikipedia communities.
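The idea of matching collocations through synonyms can be sketched as follows. This is a toy illustration only: the small `SYNONYM_SETS` table is a hypothetical stand-in for WordNet synsets, collocations are assumed to be already extracted as word pairs, and the paper's logical-algebraic model and grammatical analysis are not reproduced.

```python
# Hypothetical synonym groups standing in for WordNet synsets.
SYNONYM_SETS = [
    {"big", "large"},
    {"car", "automobile"},
    {"fast", "quick"},
]


def canonical(word):
    """Map a word to a fixed representative of its synonym group."""
    word = word.lower()
    for group in SYNONYM_SETS:
        if word in group:
            return min(group)  # alphabetically first member as representative
    return word


def normalize(collocation):
    """Replace every word of a collocation by its canonical synonym."""
    return tuple(canonical(w) for w in collocation)


def shared_collocations(collocations_a, collocations_b):
    """Collocations that occur in both lists up to synonym substitution."""
    return ({normalize(c) for c in collocations_a}
            & {normalize(c) for c in collocations_b})
```

Counting how often such shared collocations occur across two article sets gives a rough analogue of the frequencies of synonymous fragments mentioned in the abstract.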


2018, Vol 14 (3), pp. 69-89
Author(s): Caiquan Xiong, Xuan Li, Yuan Li, Gang Liu

In an Online Argumentation Platform, a great many speech messages are produced. Finding similar speech texts and extracting their common summary is of great significance for improving the efficiency of argumentation and promoting consensus building. In this article, a method of speech text analysis is proposed. First, a heuristic clustering algorithm is used to cluster the speech texts and obtain sets of similar texts. Then, an improved TextRank algorithm is used to extract a multi-document summary, and the summary results are fed back to the experts (i.e., participants). The multi-document summarization method is based on TextRank and takes into account the position of sentences in paragraphs, the weight of the key sentence, and the length of the sentence. Finally, a prototype system is developed to verify the validity of the method using four evaluation parameters: recall rate, accuracy rate, F-measure, and user feedback. The experimental results show that the method performs well in the system.
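For orientation, here is a minimal sketch of baseline TextRank sentence ranking, not the improved variant described above: the position, key-sentence, and length weights are omitted, and Jaccard word overlap is an assumed simplification of the sentence-similarity measure. All function names are illustrative.

```python
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def jaccard(sentence_a, sentence_b):
    """Word-overlap similarity between two sentences."""
    wa = set(re.findall(r"\w+", sentence_a.lower()))
    wb = set(re.findall(r"\w+", sentence_b.lower()))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    sim = [[jaccard(sentences[i], sentences[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=1):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]
```

In a multi-document setting, the sentences of every text in one cluster would be pooled before ranking, so the summary reflects the whole set of similar speech texts.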


2018, Vol 2 (1), pp. 11
Author(s): Muhammad Wali, Safrizal Safrizal

This research aims to build a coding technique for plagiarism detection based on the similar_text function. Using this function, the study checks a document against 10 (ten) selected Indonesian-language journals; the supported document formats are doc, docx, pdf, and txt. Each document is converted to HTML and then split into strings at the dot (.) and comma (,) markers, each split yielding a new string. With the string-similarity percentage set to 90%, a text is produced; in this case, the journal detected as plagiarized is journal 1, with a string-similarity percentage of 48%. The use of similar_text can thus be classified as a coding technique for plagiarism detection in anti-plagiarism applications.
Keywords: Similar text, PHP, application, plagiarism
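The authors' PHP code is not shown, so the following is a Python sketch of the algorithm commonly described for PHP's `similar_text`: find the longest common substring, then recurse on the pieces to its left and right, and express the match count as a percentage of the combined length. The `split_fragments` helper, which cuts text at dots and commas as the abstract describes, is an illustrative assumption.

```python
import re


def similar_text(first, second):
    """Count matching characters: longest common substring, then recurse."""
    if not first or not second:
        return 0
    best_len = best_i = best_j = 0
    for i in range(len(first)):
        for j in range(len(second)):
            k = 0
            while (i + k < len(first) and j + k < len(second)
                   and first[i + k] == second[j + k]):
                k += 1
            if k > best_len:  # keep the first longest match found
                best_len, best_i, best_j = k, i, j
    if best_len == 0:
        return 0
    return (best_len
            + similar_text(first[:best_i], second[:best_j])
            + similar_text(first[best_i + best_len:],
                           second[best_j + best_len:]))


def similarity_percent(first, second):
    """Similarity as a percentage of the combined length of both strings."""
    total = len(first) + len(second)
    return 0.0 if total == 0 else similar_text(first, second) * 2 * 100 / total


def split_fragments(text):
    """Split text into fragments at dots and commas, dropping empty pieces."""
    return [part.strip() for part in re.split(r"[.,]", text) if part.strip()]
```

For example, `similar_text("World", "Word")` counts 4 matching characters ("Wor" plus "d"). A detector along the lines of the abstract could compare each fragment of a submitted document with fragments of the reference journals and flag pairs whose percentage exceeds the chosen threshold.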

