Building a Sample Frame of SMEs Using Patent, Search Engine, and Website Data

2021 ◽  
Vol 37 (1) ◽  
pp. 1-30
Author(s):  
Sanjay K. Arora ◽  
Sarah Kelley ◽  
Sarvothaman Madhavan

Abstract This research outlines the process of building a sample frame of US SMEs. The method starts with a list of patenting organizations and defines the boundaries of the population and subsequent frame using free to low-cost data sources, including search engines and websites. Generating high-quality data is of key importance throughout the process of building the frame and subsequent data collection; at the same time, there is too much data to curate by hand. Consequently, we turn to machine learning and other computational methods to apply a number of data matching, filtering, and cleaning routines. The results show that it is possible to generate a sample frame of innovative SMEs with reasonable accuracy for use in subsequent research: Our method provides data for 79% of the frame. We discuss implications for future work for researchers and NSIs alike and contend that the challenges associated with big data collections require not only new skillsets but also a new mode of collaboration.

2020 ◽  
pp. 81-83
Author(s):  
Samsudeen S ◽  
Salomi M

This survey aims to lower the start-up burden of data collection and descriptive analytics for statistical modeling and route optimization of risk associated with motor vehicles. We show that the literature is divided into two distinct research areas: (a) predictive/explanatory methods, which attempt to understand and quantify crash risk based on different driving conditions, and (b) optimization techniques, which focus on minimizing crash risk through route and path selection and rest-break scheduling. Translation of research results between the two streams is limited. To overcome this problem, we present publicly available high-quality data sources (different study designs, outcome variables, and predictor variables) and descriptive analytic techniques (data summarization, visualization, and dimension reduction) that can be used to achieve safer routing, and we provide code to facilitate data collection/exploration by practitioners/res


Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4486 ◽  
Author(s):  
Mohan Li ◽  
Yanbin Sun ◽  
Yu Jiang ◽  
Zhihong Tian

In sensor-based systems, the data of an object is often provided by multiple sources. Since the data quality of these sources might be different, when querying the observations, it is necessary to carefully select the sources to make sure that high quality data is accessed. A solution is to perform a quality evaluation in the cloud and select a set of high-quality, low-cost data sources (i.e., sensors or small sensor networks) that can answer queries. This paper studies the problem of min-cost quality-aware query which aims to find high quality results from multi-sources with the minimized cost. The measurement of the query results is provided, and two methods for answering min-cost quality-aware query are proposed. How to get a reasonable parameter setting is also discussed. Experiments on real-life data verify that the proposed techniques are efficient and effective.


2012 ◽  
Vol 144 (5) ◽  
pp. 727-731
Author(s):  
Isabelle Létourneau ◽  
Maxim Larrivée ◽  
Antoine Morin

Abstract Assessing biodiversity is essential in conservation biology but the resources needed are often limited. Citizen science, by which volunteers gather data at low cost, represents a potential solution for the lack of resources if it produces usable data for scientific means. Scientific inventories for butterflies are often performed with a Pollard transect, a standardised surveying technique that generates high-quality data. General microhabitat surveys (GMSs) are potentially more appealing to amateurs participating in citizen science projects because they are less constrained. We compare estimates of butterfly species richness acquired by Pollard transects to those obtained by GMSs. We demonstrate that GMSs allow surveyors to detect more butterfly species and a more complete portrait of local butterfly assemblages for the same number of individuals captured.


2018 ◽  
Vol 7 (2) ◽  
pp. 535-541 ◽  
Author(s):  
Louisa Scholz ◽  
Alvaro Ortiz Perez ◽  
Benedikt Bierer ◽  
Jürgen Wöllenstein ◽  
Stefan Palzer

Abstract. The availability of datasets providing information on the spatial and temporal evolution of greenhouse gas concentrations is of high relevance for the development of reliable climate simulations. However, current gas detection technologies do not allow for obtaining high-quality data at intermediate spatial scales with high temporal resolution. In this regard the deployment of a wireless gas sensor network equipped with in situ gas analysers may be a suitable approach. Here we present a novel, non-dispersive infrared absorption spectroscopy (NDIR) device that can possibly act as a central building block of a sensor node to provide high-quality data of carbon dioxide (CO2) concentrations under field conditions at a high measurement rate. Employing a gas-based, photoacoustic detector we demonstrate that miniaturized, low-cost, and low-power consuming CO2 sensors may be built. The performance is equal to that of standard NDIR devices but at a much reduced optical path length. Because of the spectral properties of the photoacoustic detector, no cross-sensitivities to humidity exist.


HardwareX ◽  
2020 ◽  
Vol 8 ◽  
pp. e00138
Author(s):  
Audun D. Myers ◽  
Joshua R. Tempelman ◽  
David Petrushenko ◽  
Firas A. Khasawneh

Author(s):  
Mary Kay Gugerty ◽  
Dean Karlan

Without high-quality data, even the best-designed monitoring and evaluation systems will collapse. Chapter 7 introduces some of the basics of collecting high-quality data and discusses how to address challenges that frequently arise. High-quality data must be clearly defined and have an indicator that validly and reliably measures the intended concept. The chapter then explains how to avoid common biases and measurement errors like anchoring, social desirability bias, the experimenter demand effect, unclear wording, long recall periods, and translation context. It then guides organizations on how to find indicators, test data collection instruments, manage surveys, and train staff appropriately for data collection and entry.


2019 ◽  
Vol 14 (3) ◽  
pp. 338-366
Author(s):  
Kashif Imran ◽  
Evelyn S. Devadason ◽  
Cheong Kee Cheok

This article analyzes the overall developmental impacts of remittances, and the types of those impacts, for migrant-sending households (HHs) in districts of Punjab, Pakistan. For this purpose, an HH-based human development index is constructed based on the dimensions of education, health and housing, with a view to enriching insights into interactions between remittances and HH development. Using high-quality data from a HH micro-survey for Punjab, the study finds that most migrant-sending HHs are better off than the HHs without this stream of income. More importantly, migrant HHs have significantly higher development in terms of housing in most districts of Punjab relative to non-migrant HHs. Thus, the government would need policy interventions focusing on housing to address inequalities in human development at the district-HH level, and subsequently balance its current focus on the provision of education and health.


2017 ◽  
Vol 47 (1) ◽  
pp. 46-55 ◽  
Author(s):  
S Aqif Mukhtar ◽  
Debbie A Smith ◽  
Maureen A Phillips ◽  
Maire C Kelly ◽  
Renate R Zilkens ◽  
...  

Background: The Sexual Assault Resource Center (SARC) in Perth, Western Australia provides free 24-hour medical, forensic, and counseling services to persons aged over 13 years following sexual assault. Objective: The aim of this research was to design a data management system that maintains accurate quality information on all sexual assault cases referred to SARC, facilitating audit and peer-reviewed research. Methods: The work to develop SARC Medical Services Clinical Information System (SARC-MSCIS) took place during 2007–2009 as a collaboration between SARC and Curtin University, Perth, Western Australia. Patient demographics, assault details, including injury documentation, and counseling sessions were identified as core data sections. A user authentication system was set up for data security. Data quality checks were incorporated to ensure high-quality data. Results: An SARC-MSCIS was developed containing three core data sections having 427 data elements to capture patient’s data. Development of the SARC-MSCIS has resulted in comprehensive capacity to support sexual assault research. Four additional projects are underway to explore both the public health and criminal justice considerations in responding to sexual violence. The data showed that 1,933 sexual assault episodes had occurred among 1,881 patients between January 1, 2009 and December 31, 2015. Sexual assault patients knew the assailant as a friend, carer, acquaintance, relative, partner, or ex-partner in 70% of cases, with 16% of assailants being a stranger to the patient. Conclusion: This project has resulted in the development of a high-quality data management system to maintain information for medical and forensic services offered by SARC. This system has also proven to be a reliable resource enabling research in the area of sexual violence.

