THE DIGITAL CLOSET: PLATFORM CENSORSHIP AND LGBTQIA+ INDEPENDENCE ONLINE

Author(s):  
Alexander Paul Monea

This presentation draws on data from my forthcoming book with MIT Press to demonstrate how heteronormative and cisnormative biases pervade Silicon Valley culture, become embedded in benchmark datasets and machine learning algorithms, and are formalized in company policies and labor practices surrounding content moderation. The presentation begins with an examination of workplace culture at Google, drawing insights from Department of Labor investigations, testimonials from former employees, and informal surveys and discourse analysis conducted by employees during the circulation of James Damore's infamous 'Google memo'. It then examines bias embedded in benchmark datasets like WordNet and ImageNet, both of which served as training datasets for Google's image recognition algorithms (like GoogLeNet). Lastly, the presentation turns to Facebook's heteronormative and cisnormative content moderation policies and the outsourced labor practices it uses to institute what Facebook has described as 'human algorithms' to review content in accordance with these policies. Throughout the presentation I demonstrate that we can piece together information about proprietary code by looking to leaked documents, public records, press releases, open-source code, and benchmark datasets, all of which, in this instance, evidence a systemic heteronormative and cisnormative bias that is increasingly being embedded in the internet.

Telecom IT ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. 50-55
Author(s):  
D. Saharov ◽  
D. Kozlov

The article deals with the CoAP protocol, which regulates the transmission and reception of information traffic by terminal devices in IoT networks. The article describes a model for detecting abnormal traffic in 5G/IoT networks using machine learning algorithms, as well as the main methods for solving this problem. The relevance of the article is due to the wide spread of the Internet of Things and the upcoming update of mobile networks to the 5G generation.
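The detection model itself is not reproduced in the abstract; purely as a hedged illustration of the general idea, the sketch below flags abnormal per-device message rates with a simple z-score rule. The function name, the rate features, and the threshold are illustrative assumptions, not the authors' method.

```python
import statistics

def flag_abnormal(rates, threshold=3.0):
    """Flag traffic-rate samples whose z-score exceeds the threshold.

    `rates` is a list of per-device message rates (msgs/sec); returns
    the indices of samples considered abnormal.
    """
    mean = statistics.mean(rates)
    stdev = statistics.pstdev(rates)
    if stdev == 0:
        return []
    return [i for i, r in enumerate(rates) if abs(r - mean) / stdev > threshold]

normal = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
burst = normal + [300]          # one flooding device
print(flag_abnormal(burst))     # → [10], the index of the burst
```

A real 5G/IoT pipeline would learn the notion of "normal" from many features at once, but the single-feature rule already shows the shape of the task: model the baseline, then score deviations from it.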


2021 ◽  
pp. 1-31
Author(s):  
Sarah E. Lageson ◽  
Elizabeth Webster ◽  
Juan R. Sandoval

Digitization and the release of public records on the Internet have expanded the reach and uses of criminal record data in the United States. This study analyzes the types and volume of personally identifiable data released on the Internet via two hundred public governmental websites for law enforcement, criminal courts, corrections, and criminal record repositories in each state. We find that public disclosures often include information valuable to the personal data economy, including the full name, birthdate, home address, and physical characteristics of arrestees, detainees, and defendants. Using administrative data, we also estimate the volume of data disclosed online. Our findings highlight the mass dissemination of pre-conviction data: every year, over ten million arrests, 4.5 million mug shots, and 14.7 million criminal court proceedings are digitally released at no cost. Post-conviction, approximately 6.5 million current and former prisoners and 12.5 million people with a felony conviction have a record on the Internet. While justified through public records laws, such broad disclosures reveal an imbalance between the “transparency” of data releases that facilitate monitoring of state action and those that facilitate the monitoring of individual people. The results show how the criminal legal system increasingly distributes Internet privacy violations and community surveillance as part of contemporary punishment.


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 656
Author(s):  
Xavier Larriva-Novo ◽  
Víctor A. Villagrá ◽  
Mario Vega-Barbas ◽  
Diego Rivera ◽  
Mario Sanz Rodrigo

Security in IoT networks is currently mandatory due to the high volume of data that has to be handled. These systems are vulnerable to a variety of cybersecurity attacks, which are increasing in number and sophistication. For this reason, new intrusion detection techniques have to be developed that are as accurate as possible for these scenarios. Intrusion detection systems based on machine learning algorithms have already shown high performance in terms of accuracy. This research proposes the study and evaluation of several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm. The evaluation uses two benchmark datasets, UGR16 and UNSW-NB15, as well as one of the most widely used datasets, KDD99. The preprocessing techniques were evaluated in combination with scaling and normalization functions. All of these preprocessing models were applied to different sets of characteristics based on a categorization composed of four groups of features: basic connection features, content characteristics, statistical characteristics and, finally, a group composed of traffic-based features and connection direction-based traffic characteristics. The objective of this research is to evaluate this categorization by using various data preprocessing techniques to obtain the most accurate model. Our proposal shows that, by applying the categorization of network traffic and several preprocessing techniques, accuracy can be enhanced by up to 45%. Preprocessing a specific group of characteristics allows for greater accuracy, allowing the machine learning algorithm to correctly classify parameters related to possible attacks.
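The paper's exact pipeline is not given in the abstract, but the core idea of applying a scaler only to selected feature groups can be sketched in a few lines. The min-max scaler and the column indices below are illustrative assumptions, not the authors' code.

```python
def min_max_scale(column):
    """Scale one feature column to [0, 1]; constant columns map to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def preprocess(rows, groups):
    """Apply min-max scaling column by column, but only to the column
    indices listed in `groups` (e.g. the 'basic connection' features);
    other columns (flags, categorical codes) pass through unchanged."""
    cols = list(zip(*rows))
    scaled = [min_max_scale(list(c)) if i in groups else list(c)
              for i, c in enumerate(cols)]
    return [list(r) for r in zip(*scaled)]

# columns: duration, bytes transferred, protocol flag
rows = [[0, 100, 1], [5, 300, 0], [10, 500, 1]]
print(preprocess(rows, groups={0, 1}))
# → [[0.0, 0.0, 1], [0.5, 0.5, 0], [1.0, 1.0, 1]]
```

Evaluating the same model against several such scaler/group combinations is what lets a study of this kind attribute accuracy gains to the categorization rather than to the network architecture.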


Author(s):  
Stephanie Do ◽  
Dan Nathan-Roberts

Although online sex work has become more accessible to people of all socio-economic statuses, labor practices and work safety have not improved since the widespread adoption of the internet. One way that we can help empower sex workers is to understand their motivations and experiences when using the internet. In a survey conducted by Sanders et al. (2017), the most commonly experienced crime, reported by 56.2% of sex workers, was being threatened or harassed through texts, calls, and emails. Because no theory has been applied to this marginalized group to date, three theories were proposed. This literature review highlights the need to explore why sex workers, as end-users, should be included in the user cybersecurity defense conversation, including the cybercrimes that they face, their relationship with law enforcement, and the other factors that affect their safety.


Author(s):  
Pavel Karpov ◽  
Guillaume Godin ◽  
Igor Tetko

We present SMILES embeddings derived from the internal encoder state of a Transformer model trained to canonicalize SMILES as a Seq2Seq problem. Using a CharNN architecture on top of the embeddings results in higher-quality QSAR/QSPR models on diverse benchmark datasets, including regression and classification tasks. The proposed Transformer-CNN method uses SMILES augmentation for training and inference, so its predictions are grounded in an internal consensus. Both the augmentation and transfer learning based on the embeddings allow the method to provide good results for small datasets. We discuss the reasons for this effectiveness and draft future directions for the development of the method. The source code and the embeddings are available at https://github.com/bigchem/transformer-cnn, and the OCHEM environment (https://ochem.eu) hosts an online implementation.
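The actual Transformer-CNN implementation lives in the linked repository; purely as an illustration of the character-level SMILES encoding that a CharNN-style model consumes, here is a minimal sketch. The vocabulary construction and zero-padding scheme are assumptions for the example, not the paper's exact preprocessing.

```python
def build_vocab(smiles_list):
    """Map each character seen in the SMILES corpus to an integer id."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate(chars)}

def encode(smiles, vocab, max_len):
    """Encode one SMILES string as a list of one-hot vectors,
    right-padded with zero vectors up to `max_len`."""
    rows = []
    for ch in smiles[:max_len]:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(vocab))
    return rows

corpus = ["CCO", "c1ccccc1", "CC(=O)O"]   # ethanol, benzene, acetic acid
vocab = build_vocab(corpus)
matrix = encode("CCO", vocab, max_len=10)
print(len(matrix), len(matrix[0]))  # 10 rows, one column per vocab character
```

SMILES augmentation then amounts to encoding several alternative (non-canonical) strings for the same molecule and averaging the model's outputs, which is the "internal consensus" the abstract refers to.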


Author(s):  
Gandhali Malve ◽  
Lajree Lohar ◽  
Tanay Malviya ◽  
Shirish Sabnis

Today the amount of information on the internet grows very rapidly, and people need tools to find and access relevant information. One such tool is the recommendation system. Recommendation systems help users navigate quickly and retrieve the information they need. Many of us find it difficult to decide which movie to watch, so we decided to build a recommender system to better judge which movies we are more likely to enjoy. In this project we use machine learning algorithms to recommend movies to users based on genres and user ratings. A recommendation system attempts to predict the preference or rating that a user would give to an item.
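A minimal content-based variant of such a recommender can be sketched with cosine similarity over genre vectors; the movie titles and the genre encoding below are invented for illustration and are not from the project itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(liked_title, movies, top_n=2):
    """Rank the other movies by genre-vector similarity to a liked movie."""
    target = movies[liked_title]
    scores = [(title, cosine(target, vec))
              for title, vec in movies.items() if title != liked_title]
    return [t for t, _ in sorted(scores, key=lambda p: -p[1])[:top_n]]

# Hypothetical genre vectors: [action, comedy, drama, sci-fi]
movies = {
    "Star Quest":  [1, 0, 0, 1],
    "Laugh Riot":  [0, 1, 0, 0],
    "Space Wars":  [1, 0, 0, 1],
    "Tear Jerker": [0, 0, 1, 0],
}
print(recommend("Star Quest", movies))  # "Space Wars" ranks first
```

A collaborative-filtering variant would replace the genre vectors with per-user rating vectors, but the similarity-then-rank skeleton stays the same.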


Agent technology has developed into a robust instrument for e-commerce in recent years. The use of agent technology in e-commerce systems can address traditional e-commerce weaknesses, respond to the intelligence and individual needs of users, and significantly improve the efficiency of online transactions. There are some weaknesses in the system designed in this paper: to achieve complete decentralization, the system sacrifices efficiency, since every decentralized node must redundantly preserve a huge volume of information, which not only takes up a lot of storage space but also makes cross-requesting and detail verification inefficient. This paper presents an evaluation of the integrity of e-commerce systems using blockchain and large-scale data analysis. The fast growth of the Internet, in particular in the well-developed field of e-commerce, has advanced digital marketing. To understand the common code generating the conventional file and to identify the associated event configuration, we analyze the source code of Improved Practical Byzantine (IPBF) algorithms. The simulation shows the efficiency of the model.
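The IPBF algorithm itself is not given in the abstract. As background only, PBFT-family protocols tolerate f faulty replicas out of n = 3f + 1 and commit once a quorum of 2f + 1 matching votes is reached; the hypothetical sketch below encodes just that arithmetic, not the paper's improved variant.

```python
def max_faults(n):
    """BFT replica groups of size n tolerate f = (n - 1) // 3 faults."""
    return (n - 1) // 3

def can_commit(n, matching_votes):
    """A PBFT-style protocol commits a request once 2f + 1 replicas
    report matching prepare messages for it."""
    return matching_votes >= 2 * max_faults(n) + 1

# A 4-replica group tolerates 1 fault and needs 3 matching votes.
print(max_faults(4), can_commit(4, 3), can_commit(4, 2))  # → 1 True False
```

The redundancy the abstract complains about follows directly from this arithmetic: every one of the n replicas must hold the full state so that any 2f + 1 of them can vouch for it.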


2021 ◽  
Author(s):  
Atiq Rehman ◽  
Samir Brahim Belhaouari

Detection and removal of outliers in a dataset is a fundamental preprocessing task without which the analysis of the data can be misleading. Furthermore, the existence of anomalies in the data can heavily degrade the performance of machine learning algorithms. In order to detect the anomalies in a dataset in an unsupervised manner, some novel statistical techniques are proposed in this paper. The proposed techniques are based on statistical methods considering data compactness and other properties. The newly proposed ideas are found efficient in terms of performance, ease of implementation, and computational complexity. Furthermore, two proposed techniques presented in this paper use only a single dimensional distance vector to detect the outliers, so irrespective of the data’s high dimensions, the techniques remain computationally inexpensive and feasible. Comprehensive performance analysis of the proposed anomaly detection schemes is presented in the paper, and the newly proposed schemes are found better than the state-of-the-art methods when tested on several benchmark datasets.
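The paper's concrete statistics are not spelled out in the abstract, but the single-dimensional-distance idea can be sketched: collapse each high-dimensional point to its distance from the centroid, then apply an ordinary one-dimensional IQR rule to that vector. The IQR cutoff below is an assumption for illustration, not necessarily the authors' scheme.

```python
import math
import statistics

def centroid_distances(points):
    """Collapse high-dimensional points into a 1-D vector of their
    Euclidean distances to the data centroid."""
    dims = len(points[0])
    centre = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return [math.dist(p, centre) for p in points]

def outliers(points, k=1.5):
    """Flag points whose centroid distance lies beyond Q3 + k * IQR."""
    d = centroid_distances(points)
    q1, _, q3 = statistics.quantiles(d, n=4)
    cutoff = q3 + k * (q3 - q1)
    return [i for i, v in enumerate(d) if v > cutoff]

cluster = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [10, 10]]
print(outliers(cluster))  # → [5], the far-away point
```

Because the thresholding happens on a single distance vector rather than in the original feature space, the cost stays linear in the number of points regardless of dimensionality, which is the computational advantage the abstract claims.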


Author(s):  
Darko Pevec ◽  
Zoran Bosnic ◽  
Igor Kononenko

Current machine learning algorithms perform well in many problem domains, but in risk-sensitive decision making – for example, in medicine and finance – experts do not rely on common evaluation methods that provide overall assessments of models because such techniques do not provide any information about single predictions. This chapter summarizes the research areas that have motivated the development of various approaches to individual prediction reliability. Based on these motivations, the authors describe six approaches to reliability estimation: inverse transduction, local sensitivity analysis, bagging variance, local cross-validation, local error modelling, and density-based estimation. Empirical evaluation of the benchmark datasets provides promising results, especially for use with decision and regression trees. The testing results also reveal that the reliability estimators exhibit different performance levels when used with different models and in different domains. The authors show the usefulness of individual prediction reliability estimates in attempts to predict breast cancer recurrence. In this context, estimating prediction reliability for individual predictions is of crucial importance for physicians seeking to validate predictions derived using classification and regression models.
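Of the six estimators listed above, bagging variance is the simplest to sketch: disagreement among an ensemble of models trained on resamples flags an unreliable individual prediction. The sketch below uses a 1-nearest-neighbour regressor and leave-one-out resamples as a deterministic stand-in for bootstrap resampling; all names and data are illustrative, not the chapter's implementation.

```python
import statistics

def knn1_predict(train, x):
    """1-nearest-neighbour regression over (feature, target) pairs."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def prediction_with_reliability(train, x):
    """Return (mean prediction, variance across an ensemble).

    Each ensemble member is trained on a leave-one-out resample; high
    variance across members marks the individual prediction at x as
    unreliable, in the spirit of the bagging-variance estimator.
    """
    preds = [knn1_predict(train[:i] + train[i + 1:], x)
             for i in range(len(train))]
    return statistics.mean(preds), statistics.pvariance(preds)

train = [(0, 0.0), (1, 1.1), (2, 1.9), (3, 3.2), (10, 30.0)]
print(prediction_with_reliability(train, 1.5))  # dense region: low variance
print(prediction_with_reliability(train, 7.0))  # sparse region: high variance
```

This is exactly the kind of per-prediction signal the chapter argues physicians need: not "the model is 85% accurate overall", but "this particular prediction sits where the ensemble disagrees".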

