BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20220812T074334Z
LOCATION:Foyer 2nd Floor
DTSTART;TZID=Europe/Stockholm:20220628T090000
DTEND;TZID=Europe/Stockholm:20220628T110000
UID:submissions.pasc-conference.org_PASC22_sess181_pos163@linklings.com
SUMMARY:P38 - Designing a Modern 3D FFT Library for HPC with Data-Centric 
 Parallel Programming
DESCRIPTION:Poster\n\nP38 - Designing a Modern 3D FFT Library for HPC with
  Data-Centric Parallel Programming\n\nAndersson, Markidis\n\nFFT is one of
  the essential algorithms in scientific computing, and many
  applications, from CFD to MD, rely on fast implementations of it. Among
  the many FFT libraries, the classic FFTW is still considered the state
  of the art. FFTW solves n-dimensional FFTs on many-core systems by
  optimizing the algorithm for the current hardware. For NVIDIA GPUs, the
  de facto standard is cuFFT, available in the CUDA Toolkit. Several 3D
  FFT libraries use cuFFT and FFTW for the 1D calculations while
  controlling decomposition and communication for higher dimensions. DaCe
  is a parallel programming framework that uses stateful dataflow
  multigraphs (SDFGs) as a transformable intermediate representation. We
  propose a modern FFT library written in DaCe that is portable and can
  be optimized for most HPC hardware, such as multi-core CPUs, GPUs, and
  FPGAs. We aim to leverage SDFGs and their transformations
  for better hardware optimization by exploiting the many forms of
  parallelism found in FFT algorithms, from SIMD vectorization on CPUs to
  efficient inter-node 2D decompositions of 3D FFTs. Using DaCe, we aim
  to be faster and more portable than FFTW and cuFFT while remaining
  simple to maintain and develop. Finally, we demonstrate the new library
  within the GROMACS molecular dynamics code.
END:VEVENT
END:VCALENDAR
