Tuning the Discount Factor in Order to Reach Average Optimality on Deterministic MDPs

Author(s):  
Filipo Studzinski Perotto ◽  
Laurent Vercouter
2019 ◽  
Vol 118 (1) ◽  
pp. 42-47
Author(s):  
KwangSeok Han

Background/Objectives: This study investigated how user attitudes differ by type of scarcity message and price discount condition, in order to inform the composition of T-commerce sales messages and identify effective strategies. Methods/Statistical analysis: The study empirically tests differences in promotion attitude and purchase intention across the type of T-commerce scarcity message (quantity limit / time limit) and the price discount policy (price discount / no discount). For this purpose, a 2 (scarcity type: limited quantity, limited time) × 2 (price discount: with, without) between-subjects factorial design was used.


Author(s):  
Holger Herz ◽  
Martin Huber ◽  
Tjaša Maillard-Bjedov ◽  
Svitlana Tyahlo

Abstract: Differences in patience across language groups have recently received increased attention in the literature. We provide evidence on this issue by measuring time preferences of French and German speakers from a bilingual municipality in Switzerland where institutions are shared and socioeconomic conditions are very similar across the two language groups. We find that French speakers are significantly more impatient than German speakers, and differences are particularly pronounced when payments in the present are involved. Estimates of preference parameters of a quasi-hyperbolic discounting model suggest significant differences in both present bias (β) and the long-run discount factor (δ) across language groups.
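
The abstract above refers to a quasi-hyperbolic discounting model with parameters β and δ. As a minimal sketch, assuming the standard β-δ form (a payment received now has weight 1, a payment delayed by t ≥ 1 periods has weight β·δ^t), the weights can be computed as follows. The function names and the example parameter values are illustrative assumptions, not the estimates or code of the study.

```python
def quasi_hyperbolic_weight(t, beta, delta):
    """Discount weight for a payment delayed by t periods (beta-delta model).

    Assumes the standard form: weight 1 at t = 0, beta * delta**t for t >= 1.
    """
    if t < 0:
        raise ValueError("delay t must be non-negative")
    return 1.0 if t == 0 else beta * delta ** t


def present_value(payments, beta, delta):
    """Discounted value of a payment stream; payments[t] arrives after t periods."""
    return sum(
        quasi_hyperbolic_weight(t, beta, delta) * amount
        for t, amount in enumerate(payments)
    )


# Hypothetical parameter values, chosen only for illustration.
beta, delta = 0.7, 0.9
value_now = present_value([100, 0], beta, delta)    # 100 paid today
value_later = present_value([0, 100], beta, delta)  # 100 paid next period
```

A β below 1 lowers every delayed payment relative to an immediate one, which is why present bias shows up most strongly when one of the options pays in the present.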


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 380
Author(s):  
Emanuele Cavenaghi ◽  
Gabriele Sottocornola ◽  
Fabio Stella ◽  
Markus Zanker

The Multi-Armed Bandit (MAB) problem has been extensively studied in order to address real-world challenges related to sequential decision making. In this setting, an agent selects the best action to be performed at time-step t, based on the past rewards received from the environment. This formulation implicitly assumes that the expected payoff for each action is kept stationary by the environment through time. Nevertheless, in many real-world applications this assumption does not hold and the agent has to face a non-stationary environment, that is, one with a changing reward distribution. Thus, we present a new MAB algorithm, named f-Discounted-Sliding-Window Thompson Sampling (f-dsw TS), for non-stationary environments, that is, when the data stream is affected by concept drift. The f-dsw TS algorithm is based on Thompson Sampling (TS) and exploits a discount factor on the reward history and an arm-related sliding window to counteract concept drift in non-stationary environments. We investigate how to combine these two sources of information, namely the discount factor and the sliding window, by means of an aggregation function f(.). In particular, we propose a pessimistic (f=min), an optimistic (f=max), and an averaged (f=mean) version of the f-dsw TS algorithm. A rich set of numerical experiments is performed to evaluate the f-dsw TS algorithm against both stationary and non-stationary state-of-the-art TS baselines. We use synthetic environments (both randomly generated and controlled) to test the MAB algorithms under different types of drift, that is, sudden/abrupt, incremental, gradual and increasing/decreasing drift. Furthermore, we adapt four real-world active learning tasks to our framework: a prediction task on crimes in the city of Baltimore, a classification task on insect species, a recommendation task on local web news, and a time-series analysis on microbial organisms in the tropical air ecosystem.
The f-dsw TS approach emerges as the best performing MAB algorithm. At least one of the versions of f-dsw TS performs better than the baselines in synthetic environments, demonstrating the robustness of f-dsw TS under different concept drift types. Moreover, the pessimistic version (f=min) is the most effective in all real-world tasks.
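
The abstract does not spell out the update rules, so the following is only a rough sketch of the idea it describes: a Thompson Sampling agent that keeps a discounted reward history and a per-arm sliding window, and combines one posterior sample from each with an aggregation function f. It assumes Bernoulli rewards, Beta(1, 1) priors, and a discount applied to every arm at each step; the class name, parameters, and defaults are hypothetical and not taken from the paper.

```python
import random
from collections import deque


class FDswThompsonSampling:
    """Illustrative f-dsw-style Thompson Sampling for Bernoulli rewards (assumed form)."""

    def __init__(self, n_arms, gamma=0.95, window=50, f=min, seed=None):
        self.n_arms = n_arms
        self.gamma = gamma  # discount factor applied to the reward history
        self.f = f          # aggregation function, e.g. min, max, statistics.mean
        self.rng = random.Random(seed)
        # Discounted success / failure counts per arm.
        self.successes = [0.0] * n_arms
        self.failures = [0.0] * n_arms
        # Arm-related sliding windows holding the most recent rewards of each arm.
        self.windows = [deque(maxlen=window) for _ in range(n_arms)]

    def select_arm(self):
        scores = []
        for k in range(self.n_arms):
            # Sample from the posterior built on the discounted history.
            theta_discounted = self.rng.betavariate(
                1 + self.successes[k], 1 + self.failures[k]
            )
            # Sample from the posterior built on the sliding window only.
            wins = sum(self.windows[k])
            losses = len(self.windows[k]) - wins
            theta_window = self.rng.betavariate(1 + wins, 1 + losses)
            # Combine the two samples with the aggregation function f.
            scores.append(self.f((theta_discounted, theta_window)))
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, reward):
        # Discount the history of every arm, then record the new observation.
        for k in range(self.n_arms):
            self.successes[k] *= self.gamma
            self.failures[k] *= self.gamma
        self.successes[arm] += reward
        self.failures[arm] += 1 - reward
        self.windows[arm].append(reward)
```

In this sketch, passing `f=min` gives a pessimistic variant, `f=max` an optimistic one, and `f=statistics.mean` an averaged one, mirroring the three versions named in the abstract.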


2021 ◽  
pp. 101000
Author(s):  
David Newton ◽  
Emmanouil Platanakis ◽  
Dimitrios Stafylas ◽  
Charles Sutcliffe ◽  
Xiaoxia Ye

Author(s):  
Alain Jean-Marie ◽  
Mabel Tidball ◽  
Víctor Bucarey López

We consider a discrete-time, infinite-horizon dynamic game of groundwater extraction. A Water Agency charges an extraction cost to water users and controls the marginal extraction cost so that it depends not only on the level of groundwater but also on total water extraction (through a parameter [Formula: see text] that represents the degree of strategic interactions between water users) and on rainfall (through parameter [Formula: see text]). The water users are selfish and myopic, and the goal of the agency is to give them incentives so as to improve their total discounted welfare. We look at this problem in several situations. In the first situation, the parameters [Formula: see text] and [Formula: see text] are considered to be fixed over time. The first result shows that when the Water Agency is patient (the discount factor tends to 1), the optimal marginal extraction cost calls for strategic interactions between agents. The contrary holds for a discount factor near 0. In a second situation, we look at the dynamic Stackelberg game where the Agency decides at each time which cost parameter to announce. We study the solution to this problem theoretically and numerically. Simulations illustrate the possibility that threshold policies are good candidates for optimal policies.


2019 ◽  
Vol 3 (1) ◽  
pp. 82-92
Author(s):  
Fatahurrazak Fatahurrazak

This study aims to assess the business feasibility of developing a Maritime Industrial Complex (Industrial Center) for seaweed processing in Moro District, Karimun Regency, Riau Islands Province. The development is intended to spur regional industrial growth through the involvement of stakeholders, including small and medium-sized industrial entrepreneurs, so that it benefits a range of parties. The center's development pattern is drawn up using the concept set out in legislation and is expected to deliver a center based on the principles of a clean environment and green industry, with attention to sanitation and hygiene, balance with open space, and effective use of space and costs. The methodology for producing these outputs consisted of studying primary and secondary data obtained directly from the site and from the Karimun Regency local government. The secondary documents comprise the Karimun Regency RPJPD, RPJMD, and RTRW, strategic plans related to the development of industry and infrastructure, the roles of institutions and stakeholders in developing the seaweed-processing Industrial Center, and relevant regional economic policies. Data and information were also collected through direct on-site observation, interviews, and questionnaires. The analytical tools used are Net Present Value with a discount factor of 15%, Net B/C, Internal Rate of Return, and Payback Period. The results are NPV > 0 (feasible), Net B/C > 1 (feasible), IRR > the 15% discount factor (feasible), and a Payback Period of 4.18 years.
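
The four feasibility criteria named above can be sketched as follows. This is a minimal illustration using common textbook definitions: NPV as the sum of discounted cash flows, Net B/C as discounted positive net flows over discounted negative net flows, IRR as the rate at which NPV is zero (found here by bisection), and an undiscounted payback period with linear interpolation. The cash-flow figures are hypothetical and are not the study's data, and the study may define the criteria differently.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[t] is the net cash flow in year t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def net_benefit_cost_ratio(rate, cash_flows):
    """Discounted positive net flows divided by discounted negative net flows."""
    discounted = [cf / (1 + rate) ** t for t, cf in enumerate(cash_flows)]
    gains = sum(v for v in discounted if v > 0)
    losses = -sum(v for v in discounted if v < 0)
    return gains / losses


def irr(cash_flows, low=0.0, high=1.0, tol=1e-9):
    """Internal rate of return by bisection; needs a sign change on [low, high]."""
    if npv(low, cash_flows) * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the given interval")
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def payback_period(cash_flows):
    """Years until cumulative undiscounted cash flow reaches zero (interpolated)."""
    cumulative = 0.0
    for t, cf in enumerate(cash_flows):
        if cumulative < 0 <= cumulative + cf:
            return (t - 1) + (-cumulative) / cf
        cumulative += cf
    return None  # the investment is never recovered


# Hypothetical project: an initial outlay followed by five equal annual inflows.
flows = [-1000, 300, 300, 300, 300, 300]
rate = 0.15  # the 15% discount factor used in the study
feasible = (
    npv(rate, flows) > 0
    and net_benefit_cost_ratio(rate, flows) > 1
    and irr(flows) > rate
)
```

For a project with a single initial outlay, the three discounted criteria agree with one another: NPV > 0 at a given rate exactly when Net B/C > 1 and the IRR exceeds that rate.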

