BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20211207T055413Z
LOCATION:242
DTSTART;TZID=America/Chicago:20211115T110000
DTEND;TZID=America/Chicago:20211115T113000
UID:submissions.supercomputing.org_SC21_sess340_ws_espm103@linklings.com
SUMMARY:Accelerating Messages by Avoiding Copies in an Asynchronous Task-B
 ased Programming Model
DESCRIPTION:Workshop\n\nAccelerating Messages by Avoiding Copies in an Asy
 nchronous Task-Based Programming Model\n\nBhat, White, Ramos, Kale\n\nTask
 -based programming models promise improved communication performance for i
 rregular, fine-grained, and load-imbalanced applications. They do so by re
 laxing some of the messaging semantics of stricter models and taking advan
 tage of those at the lower levels of the software stack. For example, whil
 e MPI's two-sided communication model guarantees in-order delivery, requir
 es matching sends to receives, and has the user schedule communication, ta
 sk-based models generally favor the runtime system scheduling all executio
 n based on the dependencies and message deliveries as they happen. The mes
 saging semantics are critical to enabling high performance.\n\nIn this pa
 per, we build on previous work that added zero-copy semantics to Converse/
 LRTS. We examine the messaging semantics of Charm++ as it relates to large
  message buffers, identify shortcomings, and define new communication APIs
  to address them. Our work enables in-place communication semantics in the
  context of point-to-point messaging, broadcasts, transmission of read-onl
 y variables at program startup, and for migration of chares. We showcase t
 he performance of our new communication APIs using benchmarks for Charm++ 
 and Adaptive MPI, which result in nearly 90% latency improvement and 2x lo
 wer peak memory usage.\n\nTag: Architectures, Big Data, Cloud and Distribu
 ted Computing, Extreme Scale Computing, Heterogeneous Systems, Parallel Pr
 ogramming Languages and Models, Parallel Programming Systems, Quantum Comp
 uting, Scientific Computing, System Software and Runtime Systems\n\nRegist
 ration Category: Workshop Reg Pass
END:VEVENT
END:VCALENDAR
