The arrival of exascale computer architectures such as Argonne National Laboratory's A21, alongside very different pre-exascale machines such as the fastest present machine, the Sunway TaihuLight, and the DOE Summit and Sierra systems, raises the challenge of porting codes across very different computer architectures. One alternative to conventional porting is an approach based upon Asynchronous Many Task (AMT) runtime systems, such as the Uintah framework considered here. Uintah structures the problem as a series of tasks that the runtime executes via a task scheduler. The central challenge in porting a large AMT runtime like Uintah is thus how to devise an appropriate scheduler and how to write tasks that take advantage of a particular architecture.
The Uintah approach already ports to current large platforms for real applications. It will be shown how an asynchronous Sunway-specific scheduler, based upon MPI and athreads, may be written, and how individual task code may be written for a typical, though model, structured-grid fluid-flow problem. The lesson drawn from this is that an interface is needed to make it practical to port the many hundreds of loops in large codes to different architectures.
With this in mind, we look at the use of the Kokkos portability library for forthcoming GPU-based architectures. This involves both improving how Kokkos works on GPUs and rewriting loops using Kokkos. Preliminary results are presented for the use of Kokkos on key Uintah kernels on both GPUs and Intel KNL architectures. In doing so, we also look ahead to how the codes can evolve for future architectures beyond those existing today.
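
To make the idea of a loop-level portability interface concrete, the following is a minimal sketch in standard C++ of a loop body written once and dispatched to different back ends through a tag type. It is an illustration only: the names `SerialExec`, `TiledExec`, `portable_for`, and `smooth` are hypothetical, are not part of Uintah or Kokkos, and the two back ends shown stand in for the architecture-specific ones (threads, GPUs, athreads) that a real portability layer would provide.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical execution-space tags (illustrative names, not the Kokkos or Uintah API).
struct SerialExec {};
struct TiledExec {
    std::size_t tile;
    explicit TiledExec(std::size_t t = 64) : tile(t == 0 ? 1 : t) {}
};

// Back end 1: a plain serial loop over [begin, end).
template <class Body>
void portable_for(SerialExec, std::size_t begin, std::size_t end, Body body) {
    for (std::size_t i = begin; i < end; ++i) body(i);
}

// Back end 2: the same range visited tile by tile, standing in for a back end
// that would stage each tile into fast local memory or hand it to a worker.
template <class Body>
void portable_for(TiledExec exec, std::size_t begin, std::size_t end, Body body) {
    for (std::size_t lo = begin; lo < end; lo += exec.tile) {
        const std::size_t hi = std::min(end, lo + exec.tile);
        for (std::size_t i = lo; i < hi; ++i) body(i);
    }
}

// A model structured-grid kernel (1D three-point average) written once.
// Only the execution-space argument changes between architectures.
template <class Exec>
std::vector<double> smooth(Exec exec, const std::vector<double>& u) {
    std::vector<double> out(u);          // boundary values are copied unchanged
    if (u.size() < 3) return out;        // no interior points to update
    portable_for(exec, 1, u.size() - 1, [&](std::size_t i) {
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0;
    });
    return out;
}
```

Under these assumptions, calling `smooth(SerialExec{}, u)` and `smooth(TiledExec{2}, u)` runs the same loop body through two different iteration strategies, which is the property that lets many hundreds of loops be retargeted without rewriting each one.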