A Method for New Word Extraction on Chinese Large-scale Query Logs

Author(s):  
Rui Sun ◽  
Peng Jin ◽  
Juan Lai
2021 ◽  
Vol 7 (5) ◽  
pp. 76
Author(s):  
Giuseppe Amato ◽  
Paolo Bolettieri ◽  
Fabio Carrara ◽  
Franca Debole ◽  
Fabrizio Falchi ◽  
...  

This paper describes in detail VISIONE, a video search system that allows users to search for videos using textual keywords, the occurrence of objects and their spatial relationships, the occurrence of colors and their spatial relationships, and image similarity. These modalities can be combined together to express complex queries and meet users’ needs. The peculiarity of our approach is that we encode all information extracted from the keyframes, such as visual deep features, tags, color and object locations, using a convenient textual encoding that is indexed in a single text retrieval engine. This offers great flexibility when results corresponding to various parts of the query (visual, text and locations) need to be merged. In addition, we report an extensive analysis of the retrieval performance of the system, using the query logs generated during the Video Browser Showdown (VBS) 2019 competition. This allowed us to fine-tune the system by choosing the optimal parameters and strategies from those we tested.
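The idea of turning object locations into text that an ordinary text retrieval engine can index can be sketched roughly as follows. This is only an illustration under stated assumptions, not VISIONE's actual encoding: the function name `encode_boxes`, the 7x7 grid, the normalized box coordinates, and the `r<row>c<col>_<label>` token format are all hypothetical choices made for the example.

```python
def encode_boxes(detections, grid=7):
    """Turn (label, x0, y0, x1, y1) boxes, with coordinates in [0, 1], into text tokens.

    Hypothetical encoding: one token per grid cell that a box overlaps.
    """
    tokens = []
    for label, x0, y0, x1, y1 in detections:
        # Grid cells covered by the box; min() keeps a coordinate of 1.0 inside the grid.
        c0, c1 = min(int(x0 * grid), grid - 1), min(int(x1 * grid), grid - 1)
        r0, r1 = min(int(y0 * grid), grid - 1), min(int(y1 * grid), grid - 1)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                tokens.append(f"r{r}c{c}_{label}")
    # The space-separated string can be indexed like any other text field.
    return " ".join(tokens)
```

With an encoding of this kind, a query such as "a dog in the top-left corner" becomes a plain keyword query over cell tokens, so its results can be merged with those of the other textual fields inside the same engine.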


2021 ◽  
pp. 1-17
Author(s):  
Qian Guo ◽  
Wei Chen ◽  
Huaiyu Wan

Abstract: Personalized search is a promising way to improve the quality of web search, and it has attracted much attention from both academic and industrial communities. Much of the current related research is based on commercial search engine data, which cannot be released publicly for reasons such as privacy protection and information security. This leads to a serious lack of accessible public datasets in this field. The few datasets that have been released publicly have not become widely used in academia because they are complex to process. The lack of datasets, together with the difficulties of data processing, has hindered fair comparison and evaluation of personalized search models. In this paper, we constructed a large-scale dataset, AOL4PS, collected and processed from AOL query logs, to evaluate personalized search methods. We present the complete and detailed data processing and construction process. Specifically, to address the challenges of processing time and storage space demands brought by massive data volumes, we optimized the process of dataset construction and proposed an improved BM25 algorithm. Experiments are performed on AOL4PS with some classic and state-of-the-art personalized search methods, and the experimental results demonstrate that AOL4PS can measure the effect of personalized search models. AOL4PS is publicly available at http://github.com/wanhuaiyu/AOL4PS.
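For readers unfamiliar with BM25, the following is a minimal sketch of the standard Okapi BM25 scoring function. It is not the improved variant the abstract mentions, whose details are not given here. The function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the smoothed IDF form are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Standard Okapi BM25 (illustrative only, not the paper's improved version).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+1" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Calling `bm25_scores(["paris", "flights"], docs)` on a list of tokenized documents returns one score per document; documents that share no term with the query score 0.0.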


2016 ◽  
Vol 12 (8) ◽  
pp. 737-744 ◽  
Author(s):  
John Paparrizos ◽  
Ryen W. White ◽  
Eric Horvitz

Introduction: People’s online activities can yield clues about their emerging health conditions. We performed an intensive study to explore the feasibility of using anonymized Web query logs to screen for the emergence of pancreatic adenocarcinoma. The methods used statistical analyses of large-scale anonymized search logs considering the symptom queries from millions of people, with the potential application of warning individual searchers about the value of seeking attention from health care professionals. Methods: We identified searchers in logs of online search activity who issued special queries that are suggestive of a recent diagnosis of pancreatic adenocarcinoma. We then went back many months before these landmark queries were made, to examine patterns of symptoms, which were expressed as searches about concerning symptoms. We built statistical classifiers that predicted the future appearance of the landmark queries based on patterns of signals seen in search logs. Results: We found that signals about patterns of queries in search logs can predict the future appearance of queries that are highly suggestive of a diagnosis of pancreatic adenocarcinoma. We showed specifically that we can identify 5% to 15% of cases, while preserving extremely low false-positive rates (0.00001 to 0.0001). Conclusion: Signals in search logs show the possibilities of predicting a forthcoming diagnosis of pancreatic adenocarcinoma from combinations of subtle temporal signals revealed in the queries of searchers.


1999 ◽  
Vol 173 ◽  
pp. 243-248
Author(s):  
D. Kubáček ◽  
A. Galád ◽  
A. Pravda

Abstract: The unusual short-period comet 29P/Schwassmann-Wachmann 1 has inspired many observers to try to explain its unpredictable outbursts. In this paper, large-scale structures and features from the inner part of the coma in time periods around outbursts are studied. CCD images were taken at Whipple Observatory, Mt. Hopkins, in 1989 and at Astronomical Observatory, Modra, from 1995 to 1998. Photographic plates of the comet were taken at Harvard College Observatory, Oak Ridge, from 1974 to 1982. The latter were first digitized so that the same image-processing techniques could be applied to optimize the visibility of features in the coma during outbursts. Outbursts and coma structures show various shapes.


1994 ◽  
Vol 144 ◽  
pp. 29-33
Author(s):  
P. Ambrož

Abstract: The large-scale coronal structures observed during the sporadically visible solar eclipses were compared with the numerically extrapolated field-line structures of coronal magnetic field. A characteristic relationship between the observed structures of coronal plasma and the magnetic field line configurations was determined. The long-term evolution of large scale coronal structures inferred from photospheric magnetic observations in the course of 11- and 22-year solar cycles is described. Some known parameters, such as the source surface radius, or coronal rotation rate are discussed and actually interpreted. A relation between the large-scale photospheric magnetic field evolution and the coronal structure rearrangement is demonstrated.


2000 ◽  
Vol 179 ◽  
pp. 205-208
Author(s):  
Pavel Ambrož ◽  
Alfred Schroll

Abstract: Precise measurements of the heliographic position of solar filaments were used to determine the proper motion of solar filaments on a time-scale of days. The filaments tend to show a shaking or waving of their external structure and a general movement of the whole filament body, coinciding with the transport of the magnetic flux in the photosphere. The velocity scatter of individual measured points is about one order of magnitude higher than the accuracy of the measurements.


Author(s):  
Simon Thomas

Trends in the technology development of very large scale integrated circuits (VLSI) have been in the direction of higher density of components with smaller dimensions. Device dimensions have been scaled down not only laterally but also in depth. Such efforts in miniaturization bring with them new developments in materials and processing. Successful implementation of these efforts is, to a large extent, dependent on the proper understanding of the material properties, process technologies and reliability issues, through adequate analytical studies. The analytical instrumentation technology has, fortunately, kept pace with the basic requirements of devices with lateral dimensions in the micron/submicron range and depths of the order of nanometers. Often, newer analytical techniques have emerged or the more conventional techniques have been adapted to meet the more stringent requirements. As such, a variety of analytical techniques are available today to aid an analyst in the efforts of VLSI process evaluation. Generally such analytical efforts are divided into the characterization of materials, evaluation of processing steps and the analysis of failures.


Author(s):  
V. C. Kannan ◽  
A. K. Singh ◽  
R. B. Irwin ◽  
S. Chittipeddi ◽  
F. D. Nkansah ◽  
...  

Titanium nitride (TiN) films have historically been used as a diffusion barrier between silicon and aluminum, as an adhesion layer for tungsten deposition, and as an interconnect material, among other uses. Recently, the role of TiN films as contact barriers in very large scale silicon integrated circuits (VLSI) has been extensively studied. TiN films have resistivities on the order of 20 μΩ-cm, which is much lower than that of titanium (nearly 66 μΩ-cm). Deposited TiN films show resistivities that vary from 20 to 100 μΩ-cm depending on the type of deposition and process conditions. TiNx is known to have a NaCl-type crystal structure for a wide range of compositions. A change in color from metallic luster to gold reflects the stabilization of the TiNx (FCC) phase over the close-packed Ti(N) hexagonal phase. It was found that the ideal TiN (1:1) composition with the FCC (NaCl-type) structure gives the best electrical properties.


Author(s):  
J. Liu ◽  
N. D. Theodore ◽  
D. Adams ◽  
S. Russell ◽  
T. L. Alford ◽  
...  

Copper-based metallization has recently attracted extensive research because of its potential application in ultra-large-scale integration (ULSI) of semiconductor devices. The feasibility of copper metallization is, however, limited by thermal stability issues. To utilize copper in metallization systems, diffusion barriers such as titanium nitride and other refractory materials have been employed to enhance the thermal stability of copper. Titanium nitride layers can be formed by annealing a Cu(Ti) alloy film evaporated on thermally grown SiO2 substrates in an ammonia ambient. We report here the microstructural evolution of Cu(Ti)/SiO2 layers during annealing in a flowing NH3 ambient. The Cu(Ti) films used in this experiment were prepared by electron beam evaporation onto thermally grown SiO2 substrates. The nominal composition of the Cu(Ti) alloy was Cu73Ti27. Thermal treatments were conducted in a flowing NH3 ambient for 30 minutes at temperatures ranging from 450°C to 650°C. Cross-section TEM specimens were prepared by the standard procedure.

