FTI: High performance fault tolerance interface for hybrid systems L Bautista-Gomez, S Tsuboi, D Komatitsch, F Cappello, N Maruyama, ... Proceedings of 2011 international conference for high performance computing …, 2011 | 449 | 2011 |
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer T Shimokawabe, T Aoki, T Takaki, T Endo, A Yamanaka, N Maruyama, ... Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 264 | 2011 |
Statistical power modeling of GPU kernels using performance counters H Nagasaka, N Maruyama, A Nukada, T Endo, S Matsuoka International conference on green computing, 115-122, 2010 | 262 | 2010 |
Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers N Maruyama, T Nomura, K Sato, S Matsuoka Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 259 | 2011 |
Evaluating and optimizing OpenCL kernels for high performance computing with FPGAs HR Zohouri, N Maruyama, A Smith, M Matsuda, S Matsuoka SC'16: Proceedings of the International Conference for High Performance …, 2016 | 223 | 2016 |
An 80-fold speedup, 15.0 TFlops full GPU acceleration of non-hydrostatic weather model ASUCA production code T Shimokawabe, T Aoki, C Muroi, J Ishida, K Kawano, T Endo, A Nukada, ... SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 178 | 2010 |
Design and modeling of a non-blocking checkpointing system K Sato, N Maruyama, K Mohror, A Moody, T Gamblin, BR de Supinski, ... SC'12: Proceedings of the International Conference on High Performance …, 2012 | 148 | 2012 |
CUDA vs OpenACC: Performance case studies with kernel benchmarks and a memory-bound CFD application T Hoshino, N Maruyama, S Matsuoka, R Takaki 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid …, 2013 | 136 | 2013 |
Trends in data locality abstractions for HPC systems D Unat, A Dubey, T Hoefler, J Shalf, M Abraham, M Bianco, ... IEEE Transactions on Parallel and Distributed Systems 28 (10), 3007-3020, 2017 | 122 | 2017 |
An efficient, model-based CPU-GPU heterogeneous FFT library Y Ogata, T Endo, N Maruyama, S Matsuoka 2008 IEEE international symposium on parallel and distributed processing, 1-10, 2008 | 122 | 2008 |
Problem diagnosis in large-scale computing environments AV Mirgorodskiy, N Maruyama, BP Miller Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 88-es, 2006 | 119 | 2006 |
Scalable kernel fusion for memory-bound GPU applications M Wahib, N Maruyama SC'14: Proceedings of the International Conference for High Performance …, 2014 | 116 | 2014 |
Virtual clusters on the fly - fast, scalable, and flexible installation H Nishimura, N Maruyama, S Matsuoka Seventh IEEE International Symposium on Cluster Computing and the Grid …, 2007 | 114 | 2007 |
Optimizing stencil computations for NVIDIA Kepler GPUs N Maruyama, T Aoki Proceedings of the 1st international workshop on high-performance stencil …, 2014 | 101 | 2014 |
A user-level InfiniBand-based file system and checkpoint strategy for burst buffers K Sato, K Mohror, A Moody, T Gamblin, BR de Supinski, N Maruyama, ... 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2014 | 81 | 2014 |
Improving the computing efficiency of HPC systems using a combination of proactive and preventive checkpointing MS Bouguerra, A Gainaru, LB Gomez, F Cappello, S Matsuoka, ... 2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013 | 78 | 2013 |
Linpack evaluation on a supercomputer with heterogeneous accelerators T Endo, S Matsuoka, A Nukada, N Maruyama 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 78 | 2010 |
Distributed diskless checkpoint for large scale systems LAB Gomez, N Maruyama, F Cappello, S Matsuoka 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid …, 2010 | 69 | 2010 |
Improving strong-scaling of CNN training by exploiting finer-grained parallelism N Dryden, N Maruyama, T Benson, T Moon, M Snir, B Van Essen 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019 | 62 | 2019 |
A high-performance fault-tolerant software framework for memory on commodity GPUs N Maruyama, A Nukada, S Matsuoka 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 61 | 2010 |