Dynamic Ride-Hailing with Electric Vehicles

Author(s):  
Nicholas D. Kullman ◽  
Martin Cousineau ◽  
Justin C. Goodson ◽  
Jorge E. Mendoza

We consider the problem of an operator controlling a fleet of electric vehicles for use in a ride-hailing service. The operator, seeking to maximize profit, must assign vehicles to requests as they arise as well as recharge and reposition vehicles in anticipation of future requests. To solve this problem, we employ deep reinforcement learning, developing policies whose decision making uses Q-value approximations learned by deep neural networks. We compare these policies against a reoptimization-based policy and against dual bounds on the value of an optimal policy, including the value of an optimal policy with perfect information, which we establish using a Benders-based decomposition. We assess performance on instances derived from real data for the island of Manhattan in New York City. We find that, across instances of varying size, our best policy trained with deep reinforcement learning outperforms the reoptimization approach. We also provide evidence that this policy may be effectively scaled and deployed on larger instances without retraining.
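To make the decision-making setup concrete, the following is a minimal sketch of a neural Q-value approximator scoring a single vehicle's candidate actions (serve a request, recharge, reposition, or idle). It is not the authors' implementation; the state features, action set, and network sizes are assumptions for illustration.

```python
# Hypothetical sketch (not the authors' code): a small Q-network that scores
# a vehicle's candidate actions from a hand-crafted state vector.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Approximates Q(state, action) for a single vehicle's dispatch decision."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),   # one Q-value per candidate action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def greedy_action(qnet: QNetwork, state: torch.Tensor, feasible: torch.Tensor) -> int:
    """Pick the feasible action (e.g., enough charge, reachable request) with the largest Q-value."""
    with torch.no_grad():
        q = qnet(state)
        q[~feasible] = float("-inf")   # mask infeasible actions
        return int(torch.argmax(q).item())

# Example: 10 assumed state features (battery level, position, time of day, nearby demand, ...)
# and 5 candidate actions.
qnet = QNetwork(state_dim=10, action_dim=5)
state = torch.rand(10)
feasible = torch.tensor([True, True, False, True, True])
print(greedy_action(qnet, state, feasible))
```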

2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of the CSI data. The state machine learns temporal dependency information from the history of classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. It achieves 97% average accuracy when the testing devices and persons are not seen during training. It is also evaluated on two public datasets, achieving accuracies of 80% and 83%. The proposed design requires very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
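As an illustration of the recognition component described above, here is a minimal sketch of a 2D CNN that classifies a CSI window. It is not the published model; the input shape (antenna pairs × subcarriers × time samples), layer sizes, and number of classes are assumptions.

```python
# Hypothetical sketch (not the published model): a compact 2D CNN mapping a
# CSI "image" to an activity class.
import torch
import torch.nn as nn

class CSIActivityCNN(nn.Module):
    def __init__(self, in_channels: int = 3, num_classes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),          # tolerant of varying CSI window lengths
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, antenna_pairs, subcarriers, time_samples)
        h = self.features(x).flatten(1)
        return self.classifier(h)

# Example: batch of 8 CSI windows, 3 antenna pairs, 30 subcarriers, 200 time samples.
model = CSIActivityCNN()
logits = model(torch.rand(8, 3, 30, 200))
print(logits.shape)  # torch.Size([8, 6])
```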


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 737
Author(s):  
Fengjie Sun ◽  
Xianchang Wang ◽  
Rui Zhang

An Unmanned Aerial Vehicle (UAV) can greatly reduce manpower in agricultural plant protection tasks such as watering, sowing, and pesticide spraying. It is essential to develop a Decision-making Support System (DSS) that helps UAVs choose the correct action in each state according to a policy. In an unknown environment, hand-crafting rules for UAVs to choose actions is not applicable, and obtaining the optimal policy through reinforcement learning is a feasible alternative. However, experiments show that existing reinforcement learning algorithms cannot obtain the optimal policy for a UAV in the agricultural plant protection environment. In this work we propose an improved Q-learning algorithm based on similar state matching, and we prove theoretically that, in the agricultural plant protection environment, a UAV has a greater probability of choosing the optimal action under the policy learned by our algorithm than under the policy learned by the classic Q-learning algorithm. The proposed algorithm is implemented and tested on evenly distributed datasets built from real UAV parameters and real farm information. The performance evaluation of the algorithm is discussed in detail. Experimental results show that the proposed algorithm can efficiently learn the optimal policy for UAVs in the agricultural plant protection environment.
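A minimal sketch of the similar-state-matching idea follows, assuming tabular Q-learning over discretized state tuples and Euclidean distance as the similarity measure; it illustrates the concept rather than the authors' implementation.

```python
# Hypothetical sketch: tabular Q-learning in which an unseen state borrows
# Q-values from its most similar previously visited state.
import numpy as np

class SimilarStateQLearner:
    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = {}            # state (tuple of features) -> array of action values
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def _values(self, state):
        if state in self.q:
            return self.q[state]
        if self.q:  # initialize from the nearest visited state instead of zeros
            nearest = min(self.q, key=lambda s: np.linalg.norm(np.array(s) - np.array(state)))
            self.q[state] = self.q[nearest].copy()
        else:
            self.q[state] = np.zeros(self.n_actions)
        return self.q[state]

    def act(self, state):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax(self._values(state)))

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * np.max(self._values(next_state))
        values = self._values(state)
        values[action] += self.alpha * (target - values[action])

# Toy usage: states are discretized feature tuples (e.g., grid cell, battery level).
learner = SimilarStateQLearner(n_actions=4)
learner.update((0, 3), action=1, reward=1.0, next_state=(1, 3))
print(learner.act((0, 2)))   # the unseen state (0, 2) borrows values from its nearest neighbor
```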


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Tiago Pereira ◽  
Maryam Abbasi ◽  
Bernardete Ribeiro ◽  
Joel P. Arrais

In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules using SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. The Generator is then optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process, which seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model, which remains fixed, and a copy of it that is updated during training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and high inhibitory power against the adenosine $$A_{2A}$$ and $$\kappa$$-opioid receptors. The results reveal that the model can effectively steer the newly generated molecules in the desired direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
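The following is a minimal sketch of the two-Generator sampling scheme described above, assuming each Generator is exposed as a callable that returns a next-token distribution for a given prefix; the mixing rule driven by the reward history is a simplified stand-in, not the authors' exact schedule.

```python
# Hypothetical sketch (not the authors' code): at every step of SMILES generation,
# either the frozen prior Generator or the fine-tuned Generator picks the next token,
# with the mixing probability driven by how the Predictor's reward has evolved.
import numpy as np

def sample_smiles(prior_step, agent_step, vocab, reward_history, max_len=80, end_token="\n"):
    """prior_step/agent_step: callables mapping a token prefix to a probability
    vector over `vocab`. `reward_history` is a list of recent Predictor rewards."""
    # If recent rewards are stagnating, lean more on the frozen prior to explore.
    if len(reward_history) >= 2 and reward_history[-1] <= np.mean(reward_history[:-1]):
        p_prior = 0.5   # explore
    else:
        p_prior = 0.1   # exploit the fine-tuned agent
    tokens = []
    for _ in range(max_len):
        step = prior_step if np.random.rand() < p_prior else agent_step
        probs = step(tokens)
        tok = np.random.choice(vocab, p=probs)
        if tok == end_token:
            break
        tokens.append(tok)
    return "".join(tokens)

# Toy usage with a uniform "model" over a tiny vocabulary.
vocab = ["C", "c", "O", "N", "(", ")", "=", "1", "\n"]
uniform = lambda prefix: np.full(len(vocab), 1.0 / len(vocab))
print(sample_smiles(uniform, uniform, vocab, reward_history=[0.2, 0.3, 0.25]))
```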


2011 ◽  
Vol 2 (1) ◽  
pp. 108-110
Author(s):  
Sweta Chakraborty ◽  
Naomi Creutzfeldt-Banda

Saturday, 18 December 2010 was the first of a two-day complete closure of all London-area airports due to freezing temperatures and approximately five inches of snow. A week later, on 26 December, New York City-area airports closed in a similar manner because of the sixth-largest snowstorm in NYC history, which blanketed the city in approximately twenty inches of snow. Both storms grounded flights for days and resulted in severe delays long after the snow stopped falling. Both London- and NYC-area airports produced risk communications to explain the necessity of the closures and delays. This short flash news report examines, in turn, the risk communications presented during the airport closures. A background is provided to understand how risk perceptions differ between the London and NYC publics. Finally, the report compares and contrasts perceptions of the decision-making process and of the outcomes of the closures, whose economic and social impacts continue to accumulate.


2020 ◽  
Vol 66 (7) ◽  
pp. 2929-2950 ◽  
Author(s):  
Clifford Stein ◽  
Van-Anh Truong ◽  
Xinshang Wang

We study a fundamental model of resource allocation in which a finite number of resources must be assigned in an online manner to a heterogeneous stream of customers. The customers arrive randomly over time according to known stochastic processes. Each customer requires a specific amount of capacity and has a specific preference for each of the resources, with some resources being feasible for the customer and some not. The system must find a feasible assignment of each customer to a resource or must reject the customer. The aim is to maximize the total expected capacity utilization of the resources over the horizon. This model has applications in services, freight transportation, and online advertising. We present online algorithms with bounded competitive ratios relative to an optimal off-line algorithm that knows all stochastic information. Our algorithms perform extremely well compared with common heuristics, as demonstrated on a real data set from a large hospital system in New York City. This paper was accepted by Yinyu Ye, optimization.
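For context on the model (not the paper's competitive algorithms), here is a minimal sketch of a greedy online baseline in which each arriving customer is assigned to the feasible resource with the most remaining capacity, or rejected; all names and the toy data are illustrative.

```python
# Hypothetical sketch of the allocation model using a simple greedy baseline
# (not the paper's algorithms): assign each arriving customer to the feasible
# resource with the most remaining capacity, or reject if none fits.
from typing import Dict, List, Optional

def assign(customer_demand: int,
           feasible: List[str],
           remaining: Dict[str, int]) -> Optional[str]:
    """Return the chosen resource id, or None to reject the customer."""
    candidates = [r for r in feasible if remaining[r] >= customer_demand]
    if not candidates:
        return None
    choice = max(candidates, key=lambda r: remaining[r])
    remaining[choice] -= customer_demand
    return choice

# Example stream of (demand, feasible resources) pairs.
remaining = {"A": 5, "B": 3, "C": 4}
stream = [(2, ["A", "B"]), (4, ["B", "C"]), (3, ["A"]), (5, ["C"])]
for demand, feas in stream:
    print(demand, feas, "->", assign(demand, feas, remaining))
```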


2020 ◽  
Vol 55 (8) ◽  
pp. 1427-1430 ◽  
Author(s):  
Jennifer R. DeFazio ◽  
Anastasia Kahan ◽  
Erica M. Fallon ◽  
Cornelia Griggs ◽  
Sandra Kabagambe ◽  
...  
