Guest lectures

We would like to invite all doctoral students to the upcoming guest lectures, which will be given by Dr. Duong Nguyen from the University of Wyoming, USA.
 

“Resilient distributed systems”

Date: 15/05/2026
Time: 11.45 - 15.00
Place: Technical Library, Piotrowo 2, room L.02.10 (ground floor)
Language: English

Abstract:

Ensuring that a program behaves as intended by its designer is both a fundamental requirement and a significant challenge. Even for traditional sequential programs, this process can require exploring an exponentially large state space. In distributed systems, this challenge is further compounded by factors beyond the designer’s control, including component failures and communication uncertainty.

Despite these difficulties, mission-critical distributed systems underpin much of modern infrastructure, from communication networks and online transactions to traffic management and monitoring. Designing correct distributed systems is therefore not only desirable but a practical necessity.

Part 1: Foundations and Fault Models in Distributed Systems
This lecture explains the inevitability of faults in practical distributed systems and surveys common fault types. We then develop a formal framework for modeling faults, program executions, and desired correctness properties, providing a foundation for rigorous system design.

Part 2: Fault Tolerance Mechanisms and Strategies
Building on the first lecture, this session explores algorithmic and architectural approaches to fault detection and tolerance, such as the protocols underlying the operation of TCP, the automated addition of fault tolerance to existing programs, and self-stabilization.

 

Short bio of the speaker:

Dr. Duong Nguyen is an assistant professor at the University of Wyoming. He received his BSc, MSc, and PhD degrees and completed his postdoctoral training at Hanoi University of Science and Technology, Purdue University, Michigan State University, and Georgetown University, respectively.

His research interests include distributed computing and systems (including cloud computing and distributed machine learning), runtime monitoring, fault tolerance, self-stabilization, and formal methods. His current research projects focus on designing efficient and resilient algorithmic solutions for large-scale distributed graph computations and distributed machine learning in uncertain and dynamic environments.

His work has been published in leading distributed computing journals, such as IEEE Transactions on Parallel and Distributed Systems (IF 6.0) and Distributed Computing (IF 2.1), and at highly selective conferences, such as the International Conference on Distributed Computing and Networking, the International Symposium on Reliable Distributed Systems, and Runtime Verification. His research has been supported by funding from NSF and NASA.
