BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220812T074334Z
LOCATION:Singapore Room
DTSTART;TZID=Europe/Stockholm:20220627T114500
DTEND;TZID=Europe/Stockholm:20220627T121500
UID:submissions.pasc-conference.org_PASC22_sess173_pap104@linklings.com
SUMMARY:Reducing Communication in the Conjugate Gradient Method: A Case St
 udy on High-Order Finite Elements
DESCRIPTION:Paper\n\nReducing Communication in the Conjugate Gradient Meth
 od: A Case Study on High-Order Finite Elements\n\nKarp, Jansson, Podobas, 
 Schlatter, Markidis\n\nCurrently, a major bottleneck for several scientifi
 c computations is communication, both communication between different proc
 essors, so-called horizontal communication, and vertical communication bet
 ween different levels of the memory hierarchy. With this bottleneck in min
 d, we target a notoriously communication-bound solver at the core of many 
 high-performance applications, namely the conjugate gradient method (CG). 
 To reduce the communication we present lower bounds on the vertical data m
 ovement in CG and go on to make a CG solver with reduced data movement. Us
 ing our theoretical analysis we apply our CG solver on a high-performance 
 discretization used in practice, the spectral element method (SEM). Guided
  by our analysis, we show that for the Poisson equation on modern GPUs we 
 can improve the performance by 30% by both rematerializing the discrete sy
 stem and by reformulating the system to work on unique degrees of freedom.
  In order to investigate how horizontal communication can be reduced, we c
 ompare CG to two communication-reducing techniques, namely communication-
 avoiding and pipelined CG. We strong scale up to 4096 CPU cores and
  showcase performance improvements of upwards of 70% for pipelined CG
  compared to standard CG when applied on SEM at scale. We show that in
  addition to improving the scaling capabilities of the solver, initial
  measurements indicate that the convergence of SEM is largely unaffected
  by pipelined CG.\n\nDomain: Climate, Weather and Earth Sciences
END:VEVENT
END:VCALENDAR
