BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220812T074334Z
LOCATION:Foyer 2nd Floor
DTSTART;TZID=Europe/Stockholm:20220628T090000
DTEND;TZID=Europe/Stockholm:20220628T110000
UID:submissions.pasc-conference.org_PASC22_sess181_pos152@linklings.com
SUMMARY:P33 - Topology Aware Collective Communication based on Cyclic Shif
 t and Recursive Exchange
DESCRIPTION:Poster\n\nP33 - Topology Aware Collective Communication based
  on Cyclic Shift and Recursive Exchange\n\nJocksch\, Karakasis\n\nThe
  cyclic shift and recursive exchange algorithms for collective
  communication on parallel computers were comprehensively investigated
  recently [1]. With suitable parameters of the schemes\, determined by a
  benchmark at installation time and a heuristic at runtime\,
  high-performance implementations for the Message Passing Interface
  (MPI) can be obtained. For the collective communication patterns
  reduce_scatter\, allgatherv\, and allreduce on hybrid shared- and
  distributed-memory computers\, the topology is mostly addressed using a
  hierarchical approach. We show how a combination of the cyclic shift
  and recursive exchange algorithms can match the topology of
  architectures with multiple CPUs or GPUs per node. The algorithm
  applied is recursive exchange with a higher radix and different factors
  for each step\, where the steps are performed with cyclic shift and
  multiple ports per node are used. Out of many algorithmic options\, the
  communication is arranged such that the largest data volumes are
  exchanged through the fast shared memory\, while smaller volumes are
  sent over the network. Comparisons with the hierarchical implementation
  are made for persistent collective communication\, but our approach is
  not limited to this case.\n\n[1] A. Jocksch\, N. Ohana\, E. Lanti\,
  E. Koutsaniti\, V. Karakasis\, L. Villard\, "An optimisation of
  allreduce communication in message-passing systems"\, Parallel Comput.\,
  2021.
END:VEVENT
END:VCALENDAR
