Increasing Data TLB Resilience to Transient Errors

Evaluating instruction cache vulnerability to transient errors

Proceedings of the 2005 workshop on MEmory performance DEaling with Applications, systems and architectures - MEDEA '06 ◽

10.1145/1166133.1166136 ◽

2006 ◽

Cited By ~ 6

Author(s):

Jun Yan ◽

Wei Zhang

Keyword(s):

Instruction Cache ◽

Transient Errors

An Area-Efficient Approach to Improving Register File Reliability against Transient Errors

21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07) ◽

10.1109/ainaw.2007.78 ◽

2007 ◽

Cited By ~ 14

Author(s):

Mallik Kandala ◽

Wei Zhang ◽

Laurence T. Yang

Keyword(s):

Register File ◽

Efficient Approach ◽

Transient Errors ◽

Area Efficient

Evaluating instruction cache vulnerability to transient errors

ACM SIGARCH Computer Architecture News ◽

10.1145/1327312.1327317 ◽

2007 ◽

Vol 35 (4) ◽

pp. 21-28 ◽

Cited By ~ 6

Author(s):

Jun Yan ◽

Wei Zhang

Keyword(s):

Instruction Cache ◽

Transient Errors

Lost in transcription: transient errors in information transfer

Current Opinion in Microbiology ◽

10.1016/j.mib.2015.01.010 ◽

2015 ◽

Vol 24 ◽

pp. 80-87 ◽

Cited By ~ 19

Author(s):

Alasdair JE Gordon ◽

Dominik Satory ◽

Jennifer A Halliday ◽

Christophe Herman

Keyword(s):

Information Transfer ◽

Transient Errors

Analyzing Radiation-Induced Transient Errors on SRAM-Based FPGAs by Propagation of Broadening Effect

IEEE Access ◽

10.1109/access.2019.2915136 ◽

2019 ◽

Vol 7 ◽

pp. 140182-140189 ◽

Cited By ~ 1

Author(s):

Corrado De Sio ◽

Sarah Azimi ◽

Luca Sterpone ◽

Boyang Du

Keyword(s):

Transient Errors ◽

Radiation Induced

Understanding the evolution of conditions data access through Frontier for the ATLAS Experiment

EPJ Web of Conferences ◽

10.1051/epjconf/201921403020 ◽

2019 ◽

Vol 214 ◽

pp. 03020

Author(s):

Michal Svatos ◽

Alessandro De Salvo ◽

Alastair Dewhurst ◽

Emmanouil Vamvakopoulos ◽

Julio Lozano Bahilo ◽

...

Keyword(s):

Distributed Computing ◽

Monitoring System ◽

Data Access ◽

Computing System ◽

High Load ◽

Cascading Failure ◽

Cascading Failures ◽

Atlas Experiment ◽

Transient Errors ◽

Increasing Demand

The ATLAS Distributed Computing system uses the Frontier system to access the Conditions, Trigger, and Geometry database data stored in the Oracle Offline Database at CERN by means of the HTTP protocol. All ATLAS computing sites use Squid web proxies to cache the data, greatly reducing the load on the Frontier servers and the databases. One feature of the Frontier client is that in the event of failure, it retries with different services. While this allows transient errors and scheduled maintenance to happen transparently, it does open the system up to cascading failures if the load is high enough. Throughout LHC Run 2 there has been an ever-increasing demand on the Frontier service. There have been multiple incidents where parts of the service failed due to high load. A significant improvement in the monitoring of the Frontier service was required. The monitoring was needed both to identify problematic tasks, which could then be killed or throttled, and to identify failing site services, as the consequence of a cascading failure is much higher. This presentation describes the implementation and features of the monitoring system.

Design methodology for mitigating transient errors in analogue and mixed-signal circuits

IET Circuits Devices & Systems ◽

10.1049/iet-cds.2012.0053 ◽

2012 ◽

Vol 6 (6) ◽

pp. 447-456 ◽

Cited By ~ 1

Author(s):

S. Askari ◽

M. Nourani

Keyword(s):

Design Methodology ◽

Mixed Signal ◽

Transient Errors ◽

Mixed Signal Circuits

Transient errors resiliency analysis technique for automotive safety critical applications

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2014 ◽

10.7873/date.2014.022 ◽

2014 ◽

Cited By ~ 1

Author(s):

Sujan Pandey ◽

Bart Vermeulen

Keyword(s):

Automotive Safety ◽

Safety Critical ◽

Transient Errors ◽

Analysis Technique

Addressing network-on-chip router transient errors with inherent information redundancy

ACM Transactions on Embedded Computing Systems ◽

10.1145/2485984.2485993 ◽

2013 ◽

Vol 12 (4) ◽

pp. 1-21 ◽

Cited By ~ 3

Author(s):

Qiaoyan Yu ◽

Meilin Zhang ◽

Paul Ampadu

Keyword(s):

Network On Chip ◽

Information Redundancy ◽

Transient Errors ◽

On Chip

Evaluating the security threat of firewall data corruption caused by instruction transient errors

Proceedings International Conference on Dependable Systems and Networks ◽

10.1109/dsn.2002.1028938 ◽

2003 ◽

Cited By ~ 1

Author(s):

Shuo Chen ◽

Jun Xu ◽

R.K. Iyer ◽

K. Whisnant

Keyword(s):

Security Threat ◽

Transient Errors ◽

Data Corruption
