0

Nasm multithreading

I'd like to know how multithreading works on the CPU and the GPU. Does anyone know?

12/28/2020 6:51:19 PM

nicolas turek

4 Answers

+1

Unfortunately, I've never written an operating system, so treat this as a starting point rather than a full answer. To start and schedule processes on other cores, an operating system uses privileged instructions and registers that ordinary applications have no access to. On x86 these are only available in supervisor mode (ring 0), which is where the kernel runs. Windows also ships different processor drivers for different CPUs, so some of the details may differ from one CPU to another.

I didn't find exactly what instructions would be involved in getting another core to run code, but Intel's documentation is the place to look. This is the datasheet for a recent generation of x86 CPUs, which mostly covers the hardware side: https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/8th-gen-core-family-datasheet-vol-1.pdf

Here is a more complete reference, the combined Software Developer's Manual. It lists every supported opcode, and volume 3 (the system programming guide) covers multiple-processor management: https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4.html

Another place to look for how a second core is brought up and given a process to run would be the source code for Linux.

A GPU driver probably has something similar, except that GPU drawing contexts could be a lot simpler. A multicore CPU really is running multiple processes at exactly the same time, and the CPU is basically where the operating system comes to life. An operating system could live without a GPU, since it can run without outputting any graphics. A GPU is a peripheral device, which should make it easier to manage than the CPU.
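As a rough pointer into the Software Developer's Manual: on x86 the boot processor wakes another core by writing to its local APIC's interrupt command register (ICR) from ring 0. It sends an INIT inter-processor interrupt followed by one or two Start-up IPIs (SIPI). The SIPI vector is the 4 KiB page number of the code the woken core should start executing in real mode. The sketch below only computes the values involved, so it can be compiled and checked as an ordinary program. It assumes xAPIC mode and the default APIC base address, and the function names are made up for illustration. A real kernel would write these values to the memory-mapped registers and wait between the steps.

```cpp
#include <cassert>
#include <cstdint>

// Local APIC in xAPIC mode: memory-mapped registers at the default physical base.
// (Assumption: the base has not been relocated through the IA32_APIC_BASE MSR.)
const std::uint32_t kLapicBase   = 0xFEE00000u;
const std::uint32_t kIcrLow      = kLapicBase + 0x300u; // writing this word sends the IPI
const std::uint32_t kIcrHigh     = kLapicBase + 0x310u; // destination APIC ID in bits 31:24

// Fields of the ICR low word.
const std::uint32_t kModeInit    = 0x5u << 8;  // delivery mode: INIT
const std::uint32_t kModeStartup = 0x6u << 8;  // delivery mode: Start-up (SIPI)
const std::uint32_t kAssert      = 1u << 14;   // level: assert

// Hypothetical helper: value for the ICR high word that targets one core.
std::uint32_t icr_high_for(std::uint8_t apic_id) {
    return static_cast<std::uint32_t>(apic_id) << 24;
}

// Hypothetical helper: ICR low word for the INIT IPI.
std::uint32_t init_ipi() {
    return kModeInit | kAssert;
}

// Hypothetical helper: ICR low word for a Start-up IPI.
// The vector is the page number of the start address, so the address must be
// 4 KiB aligned and below 1 MiB. Returns false if it is not.
bool startup_ipi(std::uint32_t trampoline_phys, std::uint32_t* out) {
    if ((trampoline_phys & 0xFFFu) != 0) return false;
    if (trampoline_phys >= 0x100000u) return false;
    *out = kModeStartup | kAssert | (trampoline_phys >> 12);
    return true;
}

// Worked values for APIC ID 1 and start-up code placed at physical 0x8000.
void check_examples() {
    std::uint32_t sipi = 0;
    assert(icr_high_for(1) == 0x01000000u);
    assert(init_ipi() == 0x00004500u);
    assert(startup_ipi(0x8000u, &sipi) && sipi == 0x00004608u);
    assert(!startup_ipi(0x8001u, &sipi));    // not page aligned
    assert(!startup_ipi(0x100000u, &sipi));  // not reachable in real mode
}
```

On real hardware the order would be: write `icr_high_for(id)` to `kIcrHigh`, write `init_ipi()` to `kIcrLow`, wait roughly 10 ms, then write the SIPI value to `kIcrLow` (commonly twice, with a short delay in between). The woken core then starts in 16-bit real mode at the page encoded in the vector.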

0

Whether or not you write your software in assembly with NASM makes no difference to what happens when multiple threads try to use the GPU at the same time. This discussion on the NVIDIA forums about CUDA might not apply to all graphics cards, or exactly to your question, but it seems like a good specific example to learn from: https://forums.developer.nvidia.com/t/gpu-sharing-among-different-application-with-different-cuda-context/53057/5

From what I gather, the operating system (e.g. Windows) schedules all the threads itself, time-slicing them whenever there are more threads than logical cores to run them on. The GPU remembers some information about each context that is currently in use, and every CPU thread or process that uses the GPU has its own set of GPU contexts. Every time a thread asks the GPU to process something, the request is queued and runs the next time the GPU is available.

The GPU usually runs each drawing task to completion before starting another one. Preemptive context switching would be very slow, because a switch could require replacing most of the discrete video RAM with data from regular RAM. Processing each task from start to finish minimizes this expensive context switching and maximizes throughput. If a drawing operation takes so long that it appears frozen, the GPU or its driver can give up and destroy the context, and the thread loses access to the GPU until it restarts.

Even though the GPU focuses on one rendering task at a time, its hundreds or thousands of shader units and other cores split the current task up and work on it in parallel to finish each frame as quickly as possible. Shader programs often run millions of times for each frame in a game. Exactly how the units team up can be very complicated and specific to the GPU.
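To make the queueing idea a bit more concrete, here is a toy model in plain C++. It is not how any real GPU or driver is implemented. `DrawTask`, `run_to_completion` and `drain` are made-up names, and the "shader units" are simulated with an ordinary loop instead of running concurrently. It only shows the policy described above: requests from different contexts wait in one queue, each request runs from start to finish, and the work inside a request is divided among the units.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// One queued request: a (hypothetical) context wants `pixel_count` pixels shaded.
struct DrawTask {
    int context_id;
    std::size_t pixel_count;
};

// Record of a finished task: who submitted it and how many pixels each unit handled.
struct Completed {
    int context_id;
    std::vector<std::size_t> pixels_per_unit;
};

// Runs one task from start to finish, handing pixels to the units round-robin.
// On real hardware the units would work at the same time; this loop only
// models how the work is divided.
Completed run_to_completion(const DrawTask& task, std::size_t units) {
    if (units == 0) units = 1;  // avoid dividing work among zero units
    Completed done{task.context_id, std::vector<std::size_t>(units, 0)};
    for (std::size_t pixel = 0; pixel < task.pixel_count; ++pixel) {
        done.pixels_per_unit[pixel % units] += 1;
    }
    return done;
}

// Serves the queue strictly in submission order, with no preemption:
// a task is only started once the previous one has completed.
std::vector<Completed> drain(std::queue<DrawTask>& pending, std::size_t units) {
    std::vector<Completed> finished;
    while (!pending.empty()) {
        DrawTask next = pending.front();
        pending.pop();
        finished.push_back(run_to_completion(next, units));
    }
    return finished;
}
```

For example, if context 1 submits a 10-pixel task, context 2 a 7-pixel task, and context 1 another 4-pixel task, `drain` with 4 units finishes them in exactly that order, and the first task is split 3, 3, 2, 2 across the units.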

0

I should have said at the beginning that I need this for my own "OS": there won't be any operating system underneath my program. I know how to program with threads in C++ and I've tried CUDA a little, but I'm still not sure how threading works at the lowest level, that is, how to tell the CPU to use its other cores.

What I'm asking is how to tell the CPU to use another core from some line of code (e.g. which instructions, which is why I used the nasm tag), and then how the CPU tells the GPU (or another internal component) what to do. My guess is that it sends the address of the memory location where the program for each core begins, but I don't really know, so I'm asking. Any documentation would be enough.

0

Thanks, this should help a lot. Your sheet should be good enough, since I'm working in 32-bit.