With the technology trend of hardware and workload consolidation for embedded systems and the rapid development of edge computing, there has been increasing interest in supporting parallel real-time tasks to better utilize multi-core platforms while meeting stringent real-time constraints. For parallel real-time tasks, the federated scheduling paradigm, which assigns each parallel task a set of dedicated cores, achieves good theoretical bounds by ensuring exclusive use of processing resources to reduce interference. However, because cores share the last-level cache and memory bandwidth, in practice tasks may still interfere with each other despite executing on dedicated cores. Such resource interference due to concurrent accesses can be even more severe on embedded platforms or edge servers, where computing power and cache/memory space are limited. To tackle this issue, in this work, we present a holistic resource allocation framework for parallel real-time tasks under federated scheduling. Under the proposed framework, in addition to dedicated cores, each parallel task is also assigned dedicated cache and memory bandwidth resources. Further, we propose a holistic resource allocation algorithm that balances the allocation of the different resources to achieve good schedulability. Additionally, we provide a full implementation of our framework by extending the federated scheduling system with Intel’s Cache Allocation Technology and MemGuard. Finally, we demonstrate the practicality of our proposed framework via extensive numerical evaluations and empirical experiments using real benchmark programs.
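As an illustrative sketch of the trade-off such a holistic allocator must make (not the paper's actual algorithm), the snippet below combines the classic federated-scheduling core bound n_i = ceil((C_i - L_i)/(D_i - L_i)) with a hypothetical per-task slowdown table indexed by cache-way budget; all task parameters and slowdown factors are assumptions:

```python
import math

def cores_needed(work, span, deadline):
    # Classic federated-scheduling bound for a high-utilization DAG task:
    # n_i = ceil((C_i - L_i) / (D_i - L_i)), where C_i is total work and
    # L_i is the critical-path length (span).
    if span >= deadline:
        return None  # infeasible: the critical path alone misses the deadline
    return math.ceil((work - span) / (deadline - span))

def allocate(tasks, total_cores, total_ways):
    """Greedy sketch: for each task, try cache-way budgets from large to
    small; more ways shrink the task's work and span (via a hypothetical
    measured slowdown table), trading cache space for cores."""
    cores_left, ways_left = total_cores, total_ways
    plan = []
    for t in tasks:
        best = None
        for ways in range(min(ways_left, t["max_ways"]), 0, -1):
            slow = t["slowdown"][ways]   # slowdown factor at this budget
            n = cores_needed(t["work"] * slow, t["span"] * slow, t["deadline"])
            if n is not None and n <= cores_left:
                # prefer the choice using the fewest combined resources
                if best is None or n + ways < best[0] + best[1]:
                    best = (n, ways)
        if best is None:
            return None  # unschedulable under this greedy sketch
        cores_left -= best[0]
        ways_left -= best[1]
        plan.append({"task": t["name"], "cores": best[0], "ways": best[1]})
    return plan
```

A real allocator must also budget memory bandwidth (e.g., a MemGuard quota per task); the sketch keeps only cores and cache ways to show the balancing structure.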
Memory models play an important role in verified compilation of imperative programming languages. A representative one is the block-based memory model of CompCert---the state-of-the-art verified C compiler. Despite its success, the abstraction over memory space provided by CompCert's memory model is still primitive and inflexible. In essence, it uses a fixed representation for identifying memory blocks in a global memory space and uses a globally shared state for distinguishing between used and unused blocks. Therefore, any reasoning about memory must work uniformly for the global memory; it is impossible to individually reason about different sub-regions of memory (i.e., the stack and global definitions). This not only incurs unnecessary complexity in compiler verification, but also poses significant difficulty for supporting verified compilation of open or concurrent programs which need to work with contextual memory, as manifested in many previous extensions of CompCert.
To remove the above limitations, we propose an enhancement to the block-based memory model based on nominal techniques; we call it the nominal memory model. By adopting the key concepts of nominal techniques, such as atomic names and supports, to model the memory space, we are able to 1) generalize the representation of memory blocks to any type satisfying the properties of atomic names and 2) remove the global constraints for managing memory blocks, enabling flexible memory structures for open and concurrent programs.
To demonstrate the effectiveness of the nominal memory model, we develop a series of extensions of CompCert based on it. These extensions show that the nominal memory model 1) supports a general framework for verified compilation of C programs, 2) enables intuitive reasoning about compiler transformations on partial memory, and 3) enables modular reasoning about programs working with contextual memory. We also demonstrate that these extensions require limited changes to the original CompCert, making the verification techniques based on the nominal memory model easy to adopt.
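To make the idea concrete, here is a loose Python analogy (not Coq, and not CompCert's actual definitions): block identifiers are opaque names drawn from an arbitrary generator, and freshness is judged against the memory's finite support rather than a single global next-block counter, so differently shaped name spaces (e.g. per-region tagged names) can coexist:

```python
import itertools

class NominalMemory:
    """Sketch of a nominal-style memory: block identifiers are opaque
    atomic names, and allocation picks any name fresh with respect to
    this memory's *support* (the finite set of names it depends on),
    rather than incrementing one globally shared counter. This is a
    hypothetical simplification, not the model's formal definition."""
    def __init__(self, fresh_names):
        self._fresh = fresh_names   # any infinite source of names
        self.support = set()        # names this memory depends on
        self.contents = {}

    def alloc(self, size):
        # freshness: the new name must lie outside the current support
        b = next(n for n in self._fresh if n not in self.support)
        self.support.add(b)
        self.contents[b] = [0] * size
        return b

def int_names():
    # CompCert-style positive integer identifiers
    return itertools.count(1)

def tagged_names(tag):
    # e.g. names tagged by memory region, impossible with a fixed nat type
    return ((tag, i) for i in itertools.count(1))
```

Because each memory carries its own support, reasoning about one sub-region (say, the stack) never needs to mention the names used by another, which is the modularity the abstract describes.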
Cluster computing has attracted much attention as an effective way of solving large-scale problems. However, only a few attempts have been made to explore mobile computing clusters that can be easily built from commodity smartphones and tablets. To investigate the possibility of mobile cluster-based rendering of large datasets, we developed a mobile GPU ray tracer that renders nontrivial 3D scenes with many millions of triangles at an interactive frame rate on a small-scale mobile cluster. To cope with the limited processing power and memory space, we first present an effective 3D scene representation scheme suitable for mobile GPU rendering. Then, to avoid the performance impairment caused by the high latency and low bandwidth of mobile networks, we propose using a static load balancing strategy, which we found to be more appropriate for the fragile mobile clustering environment than a dynamic strategy. Our mobile distributed rendering system achieved a few frames per second when ray tracing 1024 × 1024 images, using only 16 low-end smartphones, for large 3D scenes, some with more than 10 million triangles. Through a conceptual demonstration, we also show that the presented rendering scheme can be effectively exploited to augment real scene images, captured or perceived by augmented and mixed reality devices, with high-quality ray-traced images.
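A static load-balancing strategy of the kind described can be sketched as a one-off proportional split of scanlines across nodes, so that no work is redistributed over the slow mobile network at run time (the per-node speed weights below are assumptions, not the paper's measured values):

```python
def static_partition(height, speeds):
    """Split image scanlines among nodes proportionally to a one-off
    per-node speed estimate. Unlike dynamic work stealing, the
    assignment is fixed before rendering, so nothing crosses the
    high-latency mobile network mid-frame."""
    total = sum(speeds)
    rows = [round(height * s / total) for s in speeds]
    rows[-1] += height - sum(rows)   # absorb rounding drift in the last node
    spans, start = [], 0
    for r in rows:
        spans.append((start, start + r))   # half-open scanline range
        start += r
    return spans
```

For example, a 1024-row image split across three nodes where the third is twice as fast yields spans of 256, 256, and 512 rows.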
Recent climate change has brought extremely heavy rains and widespread flooding to many areas around the globe. However, previous flood prediction methods usually require extensive computation to obtain prediction results, imposing a heavy burden on the unit cost of prediction. This paper proposes the use of a deep learning model (DLM) to overcome these problems. We alleviated the high computational overhead of this approach by developing a novel framework for the construction of lightweight DLMs. The proposed scheme involves training a convolutional neural network (CNN) on radar echo maps in conjunction with historical flood records at target sites, and using Grad-CAM to extract key grid cells from these maps (representing the regions with the greatest impact on flooding) for use as inputs to another DLM. Finally, we used real radar echo maps of five locations and the corresponding flood height records to verify the validity of the proposed method. The experimental results show that our lightweight model can achieve similar or even better prediction accuracy at all locations with only about 5-15% of the operation time and about 30-35% of the memory space of the CNN.
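The key-cell extraction step can be sketched as follows (a hypothetical simplification: the Grad-CAM output is taken as a 2-D grid of scores over the radar map, and only the top-k cells feed the lightweight model):

```python
def key_cells(cam, k):
    """Return the k grid cells with the highest class-activation scores.
    cam is a 2-D list of Grad-CAM scores; cell granularity and the
    choice of k are assumptions, not the paper's settings."""
    scored = [(cam[r][c], r, c)
              for r in range(len(cam)) for c in range(len(cam[0]))]
    scored.sort(reverse=True)                 # highest score first
    return sorted((r, c) for _, r, c in scored[:k])

def lightweight_input(frame, cells):
    # The lightweight model sees only the selected cells' radar values,
    # shrinking both compute and memory versus the full CNN input.
    return [frame[r][c] for r, c in cells]
```

The pruned input is what makes the second DLM cheap: its size is k, independent of the full radar-map resolution.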
The modern era of classical silicon-based computing faces a number of technological challenges, including huge energy consumption, the requirement of massive memory space, and the generation of e-waste. The proposed alternative to these pitfalls is nanocomputing, which was first exemplified in the form of DNA computing. Recently, DNA computing has been gaining acceptance in the field of eco-friendly, unconventional, nature-inspired computation. The future of computing depends on making it renewable, as this can drastically improve energy consumption. Thus, to conserve natural resources and to stop the growing toxicity of the planet, reversibility is being imposed on DNA computing so that it can replace the traditional form of computation. This chapter reflects on the foundations of DNA computing and the renewability of this multidisciplinary domain, which can be realized optimally and run on available natural resources.
In the machine learning and computer vision domains, images are represented using their features. Color, shape, and texture are some of the prominent types of features. Over time, the local features of an image have gained importance over global features due to their high discerning ability in localized regions. Texture features are widely used in image indexing and content-based image retrieval. In the last two decades, various local texture features have been formulated. For a complete description of images, effective and efficient features are necessary. In this paper, we provide algorithms for extracting 10 local texture features, all of them descriptors formulated since 2015. We have designed the algorithms to be both time-efficient and memory-space-efficient, and we have implemented them and verified the correctness of their output.
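As an illustration of how a local texture descriptor is computed, here is classic LBP (Local Binary Pattern); LBP predates the 10 descriptors surveyed in the paper, but it shares their per-pixel, constant-space structure:

```python
def lbp_code(img, r, c):
    """Local Binary Pattern at pixel (r, c): threshold the 8 neighbours
    against the centre pixel and pack the comparison bits clockwise into
    one byte. O(1) time and space per pixel; shown only to illustrate
    the general shape of a local texture code."""
    centre = img[r][c]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offs):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code
```

A histogram of such codes over an image region then serves as the texture signature used for indexing and retrieval.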
The operating state of the switch cabinet is significant for the reliability of the whole power system; collecting and monitoring its data through a wireless sensor network is an effective way to avoid accidents. This paper proposes a data compression method based on a periodic transmission model under the conditions of limited energy and memory space in the complex environment of switch cabinet sensor networks. The proposed method is presented rigorously and intuitively through theoretical derivation and an algorithm flow chart. Finally, numerical simulations are carried out and compared against the original data. The comparisons of compression ratio and reconstruction error indicate that the improved algorithm performs better on periodic sensing data with interference and preserves the trend of the data while maintaining its timing sequence.
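The periodic-model idea can be sketched as follows (the period and tolerance below are assumptions, not the paper's parameters): transmit a sample only when it deviates from the value one period earlier, and reconstruct everything else from the periodic model:

```python
def compress_periodic(samples, period, tol):
    """Keep (index, value) pairs only for the first period and for
    samples deviating from the value one period earlier by more than
    tol; the rest is implied by the periodic model."""
    return [(i, v) for i, v in enumerate(samples)
            if i < period or abs(v - samples[i - period]) > tol]

def decompress_periodic(kept, n, period):
    """Rebuild n samples: transmitted values are placed directly,
    missing ones repeat the reconstructed value one period earlier."""
    out = [None] * n
    for i, v in kept:
        out[i] = v
    for i in range(n):
        if out[i] is None:
            out[i] = out[i - period]
    return out
```

With a well-chosen tolerance, steady periodic readings cost nothing after the first period, while interference spikes are still transmitted exactly, which is how the method keeps the data's trend and timing.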
With the increasing popularity of the Internet of Things (IoT), the issue of its information security has drawn more and more attention. To overcome the resource-constraint barrier to secure and reliable data transmission on widely used IoT devices such as wireless sensor network (WSN) nodes, many studies consider hardware acceleration of traditional cryptographic algorithms to be one of the effective methods. Meanwhile, as one of the current research topics in reduced instruction set computers (RISC), RISC-V provides a solid foundation for implementing domain-specific architectures (DSA). To this end, we propose an extended instruction scheme for the Advanced Encryption Standard (AES) based on RISC-V custom instructions and present a coprocessor designed on the open-source core Hummingbird E203. The AES coprocessor uses direct memory access channels to achieve parallel data access and processing, which provides flexibility in memory space allocation and improves the efficiency of the cryptographic components. Applications with embedded AES custom instructions running on an experimental prototype on a field-programmable gate array (FPGA) platform demonstrated a 25.3% to 37.9% improvement in running time over previous similar works when processing no less than 80 bytes of data. In addition, application-specific integrated circuit (ASIC) experiments show that in most cases the coprocessor consumes at most 20% more power than the bare AES operations require.
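For a flavor of how such custom instructions are encoded: the RISC-V ISA reserves the custom-0 opcode (0b0001011) for vendor extensions, and an R-type instruction packs funct7, two source registers, funct3, and a destination register around it. The `aes.enc` mnemonic and the funct3/funct7 values below are purely hypothetical, not the paper's encodings; only the opcode value and the R-type layout come from the RISC-V specification:

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack a 32-bit RISC-V R-type instruction word:
    funct7[31:25] | rs2[24:20] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)

CUSTOM0 = 0b0001011  # opcode space reserved for custom extensions

def aes_enc(rd, rs1, rs2):
    # hypothetical "aes.enc rd, rs1, rs2": rs1 = data buffer address,
    # rs2 = key-schedule address, rd = status; funct values are made up
    return encode_r_type(0b0000001, rs2, rs1, 0b000, rd, CUSTOM0)
```

In a real toolchain the same packing is done via an assembler `.insn` directive or intrinsic, and the coprocessor decodes the custom opcode to trigger its DMA-driven AES datapath.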
In recent years, Internet of Things (IoT) research has become one of the top ten most popular research topics. IoT devices embed many sensing chips for detecting physical signals from the outside environment. In a wireless sensor network (WSN), a user can wear several IoT devices on the body, such as a smart watch, smart band, or smart glasses. These devices collect analog environmental data around the user’s body and store the processed data in memory. We have observed that some IoT devices have resource limitations, such as power shortages or insufficient memory for data computation and preservation. When an IoT device such as a smart band uploads a user’s body information to the cloud server by adopting a public-key cryptosystem to generate the corresponding ciphertext and a related signature for concrete data security, the computation time increases linearly and the device can run out of memory, which is inconvenient for users. For this reason, we consider that if a smart IoT device can perform encryption and signing simultaneously, it can save significant resources for the execution of other applications. As a result, our approach is to design an efficient, practical, and lightweight blind signcryption (SC for short) scheme for IoT devices. Not only does our methodology offer efficient privacy protection for the sensed data, but it also fits the above application scenario with limited resources, such as battery shortage or little memory space, in the IoT device network.
The advent of 4G and 5G broadband wireless networks brings several challenges with respect to resource allocation. In an interconnected network, wireless devices and users all compete for scarce resources, which underscores the need for fair and efficient allocation of those resources for the proper functioning of the networks. The purpose of this study is to discover the different factors involved in resource allocation in 4G and 5G networks. The methodology was an empirical study using qualitative techniques: we reviewed the literature on the state of the art in 4G and 5G networks, analyzed their respective architectures and resource allocation mechanisms, identified parameters and criteria, and derived recommendations. It was observed that resource allocation in 4G and 5G networks primarily concerns radio resources, owing to their wireless nature, and is measured in terms of delay, fairness, packet loss ratio, spectral efficiency, and throughput. Minimal consideration is given to other resources along the end-to-end 4G and 5G network architectures. This paper defines more types of resources, such as electrical energy, processor cycles, and memory space, along end-to-end architectures, whose allocation processes need to be emphasized owing to the inclusion of software-defined networking and network function virtualization in 5G architectures. Accordingly, additional criteria, such as electrical energy usage, processor cycles, and memory, are proposed for evaluating resource allocation. Finally, ten recommendations are made to enhance resource allocation along the whole 5G network architecture.