I need a simple program that multiplies two 2D arrays in CUDA in two ways: sequentially and in parallel. It must compare the execution time of the sequential version against the parallel version.
I think a console application is the best fit, with these steps:
1. Initialize two 2D arrays (enter the width, height, and default value of each element)
2. Multiply the arrays sequentially (execute, print the result on screen, then print the execution time)
3. Multiply the arrays in parallel (execute, print the result on screen, then print the execution time)
4. Print the two execution times from steps 2 and 3.
Be sure the program is written in CUDA C++.
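
To make the request concrete, here is a minimal sketch of what the four steps could look like. It rests on several assumptions that are not stated above: "multiplication" is read as the matrix product C = A x B (not element-wise), elements are float, and the sizes and default values are hard-coded where the real program would read them from the console. The names multiplyCpu, multiplyKernel, multiplyGpu and CUDA_CHECK are illustrative only. The sequential version is timed with std::chrono and the kernel with CUDA events; only one element is printed instead of the full result.

```cuda
#include <chrono>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",     \
                         cudaGetErrorString(err_), __LINE__);      \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Sequential baseline: C (m x n) = A (m x k) * B (k x n), row-major storage.
void multiplyCpu(const float* a, const float* b, float* c, int m, int k, int n) {
    for (int row = 0; row < m; ++row) {
        for (int col = 0; col < n; ++col) {
            float sum = 0.0f;
            for (int i = 0; i < k; ++i) sum += a[row * k + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
    }
}

// Parallel version: each GPU thread computes one element of C.
__global__ void multiplyKernel(const float* a, const float* b, float* c,
                               int m, int k, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < m && col < n) {  // guard threads that fall outside the matrix
        float sum = 0.0f;
        for (int i = 0; i < k; ++i) sum += a[row * k + i] * b[i * n + col];
        c[row * n + col] = sum;
    }
}

// Copies inputs to the GPU, runs the kernel, copies the result back.
// Returns the kernel time in milliseconds (host/device copies excluded).
float multiplyGpu(const float* a, const float* b, float* c, int m, int k, int n) {
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    CUDA_CHECK(cudaMalloc(&dA, sizeof(float) * m * k));
    CUDA_CHECK(cudaMalloc(&dB, sizeof(float) * k * n));
    CUDA_CHECK(cudaMalloc(&dC, sizeof(float) * m * n));
    CUDA_CHECK(cudaMemcpy(dA, a, sizeof(float) * m * k, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, b, sizeof(float) * k * n, cudaMemcpyHostToDevice));

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    dim3 block(16, 16);  // 256 threads per block, an arbitrary common choice
    dim3 grid((n + block.x - 1) / block.x, (m + block.y - 1) / block.y);
    CUDA_CHECK(cudaEventRecord(start));
    multiplyKernel<<<grid, block>>>(dA, dB, dC, m, k, n);
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    CUDA_CHECK(cudaMemcpy(c, dC, sizeof(float) * m * n, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
    return ms;
}

int main() {
    // Assumed sizes and default values; the real program would read these in.
    const int m = 512, k = 512, n = 512;
    std::vector<float> a(m * k, 1.0f), b(k * n, 2.0f), cCpu(m * n), cGpu(m * n);

    auto t0 = std::chrono::steady_clock::now();
    multiplyCpu(a.data(), b.data(), cCpu.data(), m, k, n);
    auto t1 = std::chrono::steady_clock::now();
    double cpuMs = std::chrono::duration<double, std::milli>(t1 - t0).count();

    float gpuMs = multiplyGpu(a.data(), b.data(), cGpu.data(), m, k, n);

    // Compare the two results element by element.
    float maxDiff = 0.0f;
    for (int i = 0; i < m * n; ++i)
        maxDiff = std::fmax(maxDiff, std::fabs(cCpu[i] - cGpu[i]));

    std::printf("C[0][0]: cpu = %.1f, gpu = %.1f, max diff = %g\n",
                cCpu[0], cGpu[0], maxDiff);
    std::printf("Sequential: %.3f ms\nParallel (kernel only): %.3f ms\n",
                cpuMs, gpuMs);
    return 0;
}
```

This should build with something like nvcc -O2 matmul.cu -o matmul (the file name is arbitrary). Note that the GPU figure here covers the kernel only; if the comparison should include the host-to-device and device-to-host copies, the timing would have to wrap those calls as well, and for small arrays that overhead can make the parallel version slower than the sequential one.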