Deadlock detected! — HPC Network Research

"Multicast loops are bad since the same multicast packet will go around and around, inevitably creating a black hole that will destroy the Earth in a firey conflagration." —OpenSM

Publications and Talks

M. Mubarak, N. Jain, J. Domke, N. Wolfe, C. Ross, K. Li, A. Bhatele, C. D. Carothers, K. L. Ma, and R. B. Ross, "Toward Reliable Validation of HPC Interconnect Simulations," in Proceedings of the 2017 Winter Simulation Conference, WSC ’17, (Las Vegas, NV, USA), p. 15, IEEE Press, Dec. 2017. Accepted at WSC ’17.

J. Domke, "Existing De-Facto Standards for Interconnects: InfiniBand, GigE & OmniPath," in 32nd ISC High Performance (ISC'17), Frankfurt, Germany, June 2017. Invited Talk.

J. Domke, "Routing on the Channel Dependency Graph: A New Approach to Deadlock-Free, Destination-Based, High-Performance Routing for Lossless Interconnection Networks," at Technische Universität Dresden, Dresden, Germany, June 2017. Dissertation.

N. Wolfe, M. Mubarak, N. Jain, J. Domke, A. Bhatele, C. D. Carothers, and R. B. Ross, "Preliminary Performance Analysis of Multi-rail Fat-tree Networks," in 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid ’17, (Madrid, Spain), pp. 258–261, IEEE Press, May 2017. Short paper.

J. Domke and T. Hoefler, "Scheduling-Aware Routing for Supercomputers," in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’16, (Piscataway, NJ, USA), pp. 13:1-13:12, IEEE Press, 2016.

J. Domke, T. Hoefler, and S. Matsuoka, "Routing on the Dependency Graph: A New Approach to Deadlock-Free High-Performance Routing," in Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC ’16, (New York, NY, USA), pp. 3-14, ACM, 2016.

D. Wang, J. Domke, J. Mao, X. Shi, and D. M. Ricciuto, "A scalable framework for the global offline community land model ensemble simulation," Int. J. Comput. Sci. Eng., vol. 12, pp. 73-85, Feb. 2016.

K. A. Brown, J. Domke, and S. Matsuoka, "Hardware-Centric Analysis of Network Performance for MPI Applications," in 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015, Melbourne, Australia, December 14-17, 2015, pp. 692-699, 2015.

J. Domke, "Increasing Fabric Utilization with Job-Aware Routing," 2015. Poster presented at International Conference for High Performance Computing, Networking, Storage and Analysis (SC '15).

K. A. Brown, J. Domke, and S. Matsuoka, "Tracing Data Movements within MPI Collectives," Poster presented at 21st European MPI Users' Group Meeting (EuroMPI/ASIA '14).

J. Domke, T. Hoefler, and S. Matsuoka, "Fail-in-place Network Design: Interaction Between Topology, Routing Algorithm and Failures," in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '14, (Piscataway, NJ, USA), pp. 597-608, IEEE Press, 2014.

J. Domke and D. Wang, "Runtime Tracing of the Community Earth System Model: Feasibility Study and Benefits," Procedia Computer Science, vol. 9, pp. 1950-1958, 2012. Proceedings of the International Conference on Computational Science, ICCS 2012.

J. Domke, T. Hoefler, and W. E. Nagel, "Deadlock-Free Oblivious Routing for Arbitrary Topologies," in Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), (Washington, DC, USA), pp. 613-624, IEEE Computer Society, May 2011.

J. Mueller, T. Schneider, J. Domke, R. Geyer, M. Haesing, T. Hoefler, S. Hoehlig, G. Juckeland, A. Lumsdaine, M. Mueller, and W. Nagel, "Cluster Challenge 2008: Optimizing Cluster Configuration and Applications to Maximize Power Efficiency," in Proceedings of the 10th LCI International Conference on High-Performance Clustered Computing, Mar. 2009. LCI’09 2nd Best Paper Award.

Research

DFSSSP Routing

BenchIT

Nue Routing

Topology Generator

SAR Routing

CODES Simulator

Teaching at TU Dresden

Courses in Winter Semester 2016/2017:   Ø

Courses in Summer Semester 2016:

Courses in Winter Semester 2015/2016:   Ø

Courses in Summer Semester 2015:

Courses in Winter Semester 2014/2015:   Ø

About Me

Jens is a postdoctoral researcher in the Matsuoka Laboratory at the Tokyo Institute of Technology, Tokyo, Japan. He received his doctoral degree from the Technische Universität Dresden, Germany, in 2017 for his work on HPC routing algorithms and interconnects. Jens started his career in HPC in 2008, after he and a team of five students from TU Dresden and Indiana University won the Student Cluster Competition at SC08. Since then, he has published several peer-reviewed journal and conference articles. Jens contributed the DFSSSP routing algorithm to the InfiniBand subnet manager. His research focuses on interconnects, topologies, and routing algorithms for HPC systems. He is also interested in software-defined networking (SDN), scheduling algorithms for parallel architectures, and performance evaluation and optimization of parallel applications.

Contact