Connected and Autonomous Vehicles (CAVs) are introduced to improve individuals' quality of life by offering a wide range of services. They collect a huge amount of data and exchange it with each other and with the infrastructure. The collected data usually include sensitive information about the users and the surrounding environment; therefore, data security and privacy are among the main challenges in this industry. Blockchain, an emerging distributed ledger technology, has been considered by the research community as a potential solution for enhancing data security, integrity, and transparency in Intelligent Transportation Systems (ITS). However, despite the emphasis of governments on the transparency of personal data protection practices, CAV stakeholders have not been successful in communicating appropriate information to end users regarding how their personal data are collected, stored, and processed, or who owns the data. This article provides a vision of the opportunities and challenges of adopting blockchain in ITS from the "data transparency" and "privacy" perspectives. The main aim is to answer the following questions: (1) Considering the amount of personal data collected by CAVs, such as location, how would the integration of blockchain technology affect the transparency of personal data processing with respect to the data subjects (as this is one of the main principles in existing data protection regulations)? (2) How can the trade-off between transparency and privacy be addressed in blockchain-based ITS use cases?
The abundance of data collected by sensors in Internet of Things devices and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns. This is because private and sensitive information can be potentially learned from sensor data by applications that have access to this data. In this article, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of input data in the latent space such that the reconstructed data is likely to have the same public attribute but a different private attribute than the original input data. In the probabilistic case, we apply the linear transformation to the latent representation of input data with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.
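A minimal sketch of the latent-space transformation described above, assuming a toy linear encoder/decoder in place of the paper's trained variational autoencoder; the weights and the shift direction `v_private` are illustrative stand-ins, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a trained VAE encoder/decoder (hypothetical linear maps;
# the actual method uses a trained variational autoencoder).
D, K = 8, 2                      # data and latent dimensionality
W_enc = rng.normal(size=(K, D))  # "encoder" weights (assumed pretrained)
W_dec = np.linalg.pinv(W_enc)    # "decoder" as pseudo-inverse for this sketch

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

# Direction in latent space intended to flip the private attribute while
# preserving the public one; in practice it would be estimated from labeled data.
v_private = rng.normal(size=K)

def obfuscate(x, p=1.0):
    """Deterministic (p=1) or probabilistic (p<1) latent-space transformation."""
    z = encode(x)
    if rng.random() < p:          # apply the linear shift with probability p
        z = z + v_private
    return decode(z)

x = rng.normal(size=D)
x_priv = obfuscate(x, p=1.0)      # deterministic variant: shift always applied
x_prob = obfuscate(x, p=0.5)      # probabilistic variant: shift applied half the time
```

The probabilistic variant trades a weaker per-sample guarantee for reconstructions that stay closer to the input distribution on average, which is the utility/privacy trade-off the abstract examines.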
Phishing is the fraudulent attempt to obtain sensitive information by disguising oneself as a trustworthy entity in digital communication. It is a type of cyber attack that often succeeds because users are unaware of their vulnerabilities or unable to understand the risks. This article presents a systematic literature review conducted to draw a "big picture" of the most important research on human factors and phishing. The analysis of the retrieved publications, framed along the research questions addressed in the review, helps in understanding how human factors should be considered to defend against phishing attacks. Future research directions are also highlighted.
Technological advancements in data science have offered us affordable storage and efficient algorithms to query large volumes of data. Our health records are a significant part of these data; they are pivotal for healthcare providers and can be utilized for our well-being. The clinical note in electronic health records is one such category: it collects a patient's complete medical information at different time points of care in the form of free text. These unstructured textual notes thus contain events from a patient's admission to discharge, which can prove significant for future medical decisions. However, because these texts also contain sensitive information about the patient and the attending medical professionals, such notes cannot be shared publicly. This privacy issue has thwarted timely discoveries from this plethora of untapped information. Therefore, in this work, we generate synthetic medical texts from a private or sanitized (de-identified) clinical text corpus and rigorously analyze their utility across different metrics and levels. Experimental results support the applicability of our generated data: they achieve high accuracy in different pragmatic classification problems and match (or outperform) the original text data.
Background: Corporate social responsibility (CSR) disclosure was made mandatory in Malaysia in 2007 with the introduction of the CSR Framework by Bursa Malaysia. Since then, the practice of CSR disclosure has grown as Malaysia joins global efforts towards sustainable development. Despite increased research on CSR, few studies assess the relationship of its specific dimensions (environmental, community, workplace, and marketplace) with dividend payout, which is crucial to investment and corporate financial decision making. Method: The study involved 32 Malaysian public listed finance companies as of 2017 and drew data from annual reports and databases. Content analysis was used to measure the CSR disclosure score, and dividend payout was calculated from the database. Results: There was a significant correlation between the community and workplace dimensions and dividend payout. Despite the absence of significant results, the regression analysis showed a positive relationship between the community and workplace dimensions and dividend payout, and an inverse relationship between the environmental and marketplace dimensions and dividend payout. The results indicated that active involvement in the community dimension results in an immediate positive impact on brand equity, attracting current and new customers and thereby improving earnings levels and dividend payout. Additionally, greater participation in the workplace dimension solidifies employees' engagement and motivation and improves productivity, which can be translated into enhanced earnings levels and dividend payout. Meanwhile, participation in the environmental and marketplace dimensions requires a longer period to yield an impact, entails higher development expenditure, and involves sensitive information that might benefit competitors. Hence, companies tend to utilise internal funding instead of redistributing wealth through dividend payout.
Conclusion: The study contributes to the CSR literature by explaining the relationship of the specific dimensions of environmental, community, workplace, and marketplace with dividend payout using evidence from an emerging economy.
A major challenge of prospective cohort studies is attrition in follow-up surveys. This study investigated attrition in a prospective cohort of medical graduates in China. We described the status of attrition, identified participants with a higher likelihood of attrition, and examined whether attrition affected the estimation of the key outcome measures.
The cohort study recruited 3,620 new medical graduates from four medical universities in central and western China between 2015 and 2019. Online follow-up surveys were conducted annually. Follow-up status was defined as complete (the participant completed all follow-up surveys) or incomplete, and incomplete follow-up was further divided into 'always-out', 'rejoin', and 'other'. Multivariable logistic and linear regressions were used to examine factors predicting attrition and their influence on the outcome measures of career development.
2,364 (65.3%) participants completed all follow-up surveys. Among those with incomplete follow-up, 520 (14.4%) were 'always-out' and 276 (7.6%) rejoined in the 2020 survey. Willingness to participate in residency training (OR = 0.80, 95% CI [0.66–0.98]) and willingness to provide sensitive information in the baseline survey predicted a lower rate of attrition (providing university entrance exam scores: OR = 0.82, 95% CI [0.69–0.97]; providing contact information: OR = 0.46, 95% CI [0.32–0.66]; providing household income: OR = 0.60, 95% CI [0.43–0.84]). Participants with compulsory rural service (OR = 1.52, 95% CI [1.05–2.19]) and those providing university entrance scores (OR = 1.64, 95% CI [1.15–2.33]) were more likely to rejoin the follow-up survey. These factors associated with follow-up status did not have a significant impact on the key outcome measures of career development.
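As a reminder of the arithmetic behind estimates like those reported above, a logistic-regression coefficient beta with standard error se maps to an odds ratio exp(beta) with a 95% confidence interval exp(beta ± 1.96·se). The sketch below uses illustrative values, not the study's actual coefficients:

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient and its standard error
    into an odds ratio with a (by default) 95% confidence interval."""
    or_ = math.exp(beta)
    lo = math.exp(beta - z * se)
    hi = math.exp(beta + z * se)
    return or_, lo, hi

# Illustrative numbers only: beta = -0.223 corresponds roughly to OR ~ 0.80.
or_, lo, hi = odds_ratio_ci(-0.223, 0.10)
```

Because the transformation is monotonic, a coefficient whose interval excludes 0 yields an OR interval excluding 1, which is how the significance of predictors such as willingness to provide contact information is read off.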
Graduates who were unwilling to participate in residency training or to provide sensitive information should be targeted early in the cohort study to reduce attrition; providing them with more information about the study early on may help them understand the value of participation. In contrast, medical graduates with compulsory rural service and those who provided university entrance scores were more likely to rejoin the cohort. The research team should invest more effort in contacting these graduates and returning them to the cohort.
We examine whether proprietary costs drive R&D-active firms' choice of private loan structure. We find that R&D-active firms are more likely to choose single-lender over multi-lender private loan financing. This is consistent with the theory that high-ability entrepreneurs protect their proprietary knowledge by communicating it to a single lender while disclosing generic and less sensitive information to the public. This propensity, however, significantly decreases after the enactment of the American Inventors Protection Act (AIPA), which accelerated public disclosure of firms' patent details in filings with the US Patent and Trademark Office. This accelerated disclosure potentially caused R&D information to spill over to rivals, increasing the proprietary costs of single-lender borrowers. AIPA enactment also increased the spread on R&D-active firms' single-lender loans. These findings contribute to the voluntary disclosure and financing-choice literature by linking R&D-active firms' choice of single-lender financing to the proprietary costs of public disclosure.
With the ubiquity of the Internet and the frequent use of mobile devices for transactions involving personal and sensitive information, the security of mobile devices has become critically important. As mobile applications are increasingly used for purposes such as healthcare or banking, they have become an easy and attractive target for attackers seeking access to mobile devices and users' sensitive information. Developing secure applications is essential; otherwise, attackers can exploit vulnerabilities in mobile applications, leading to serious security issues such as information leakage or the injection of malicious code to access user data. In this paper, we survey the literature on application security on mobile devices, specifically devices running the Android platform, and exhibit security threats in the Android system. In addition, we study reverse-engineering tools that are used to exploit vulnerabilities in applications, examining several of them in terms of their methodology, the security holes they can exploit, and how they can be used to help develop more secure applications.
With the rapid evolution and emergence of smart, self-reliant, low-power devices, the Internet of Things (IoT) has expanded enormously and impacted almost every real-life application. Machines and devices such as cars, unmanned aerial vehicles (UAVs), and medical devices are now fully reliant on computer control and expose their own programmable interfaces. With this increased use of IoT, attack capabilities have grown in response, making it imperative to develop new methods for securing these systems that detect attacks launched against IoT devices and gateways. Such attacks usually aim to access, change, or destroy sensitive information; extort money from users; or interrupt normal business processes. In this research, we propose a new, efficient, and generic top-down architecture for intrusion detection and classification in IoT networks using non-traditional machine learning. The proposed architecture can be customized for intrusion detection/classification with any IoT cyber-attack dataset, such as the CICIDS dataset, the MQTT dataset, and others. Specifically, the proposed system is composed of three subsystems: a feature engineering (FE) subsystem, a feature learning (FL) subsystem, and a detection and classification (DC) subsystem. All subsystems are thoroughly described and analyzed in this article. The proposed architecture employs deep learning models to detect slightly mutated attacks on IoT networks with high detection/classification accuracy for IoT traffic obtained either from a real-time system or from a pre-collected dataset.
Because this work combines systems engineering (SE) techniques, machine learning technology, and the field of IoT cybersecurity, the collective cooperation of the three fields has yielded a systematically engineered system that can be implemented with high performance.
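A minimal sketch of the three-subsystem layout (FE, FL, DC) described above, with illustrative stand-ins for each stage: z-score normalization for feature engineering, PCA in place of the article's deep feature learner, and nearest-centroid assignment in place of the deep classifier. All names and models here are hypothetical, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def feature_engineering(raw):
    """FE subsystem: impute missing values and z-score normalize raw traffic features."""
    x = np.nan_to_num(raw)
    mu, sd = x.mean(axis=0), x.std(axis=0) + 1e-9
    return (x - mu) / sd

def feature_learning(x, k=4):
    """FL subsystem: learn a compact representation (PCA via SVD stands in
    for the deep feature learner)."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T      # project onto the top-k principal directions

def detect_classify(z, centroids):
    """DC subsystem: assign each flow to the nearest class centroid
    (a stand-in for the deep detection/classification model)."""
    d = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

# End-to-end pass over synthetic "traffic" (100 flows, 10 raw features).
raw = rng.normal(size=(100, 10))
x = feature_engineering(raw)
z = feature_learning(x)
centroids = rng.normal(size=(3, z.shape[1]))   # e.g. benign plus two attack classes
labels = detect_classify(z, centroids)
```

The value of the top-down decomposition is that each subsystem can be swapped independently, e.g. retraining only the DC stage when a new attack dataset such as CICIDS or MQTT is plugged in.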
Co-authored by a computer scientist and a digital humanist, this article examines the challenges faced by cultural heritage institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It focuses particularly on cultural organizations such as libraries, museums, and archives used by historians, literary scholars, and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy, copyright, commercial, and technical issues. Even when born-digital data are publicly available (as in the case of web archives), users often need to travel physically to repositories such as the British Library or the Bibliothèque Nationale de France to consult web pages. Provided with enough sample data from which to learn and train their models, AI, and more specifically machine learning algorithms, offer the opportunity to improve and ease access to digital archives by learning to perform complex human tasks. These range from providing intelligent support for searching the archives to automating tedious and time-consuming tasks. In this article, we focus on sensitivity review as a practical solution to unlock digital archives that would allow archival institutions to make non-sensitive information available. This promise to make archives more accessible does not come free of warnings about potential pitfalls and risks: inherent errors, "black box" approaches that make the algorithm inscrutable, and risks related to biased, fake, or partial information. Our central argument is that AI can deliver on its promise to make digital archival collections more accessible, but it also creates new challenges, particularly in terms of ethics. In the conclusion, we insist on the importance of fairness, accountability, and transparency in the process of making digital archives more accessible.