Output of C++ code | SoloLearn: Learn to code for FREE!


Output of C++ code

Hello again, can someone explain this code? I understand n1, but not n2. How can sizeof(p) be 8 bytes when it is an array of 9 ints?

    #include <iostream>
    using namespace std;

    int main() {
        int mas[9] = {2, 6, 3, 9, 1, 0, 5, 8, 4};
        int *p = new int[9];
        for (int i = 0; i < 9; ++i) {
            p[i] = mas[i];
        }
        int n1 = sizeof(mas) / sizeof(*mas);
        int n2 = sizeof(p) / sizeof(*p);
        cout << n1 << n2;
        return 0;
    }

1/25/2021 8:49:59 PM

Edward Finkelstein

13 Answers


It is not an array! It's a pointer. Pointers store memory addresses, so their size is typically 32 or 64 bits (4 or 8 bytes), depending on the platform. And please delete the memory you borrow when you don't need it anymore.


Edward Finkelstein, the IA32 and X64 processors have a Complex Instruction Set Computer (CISC) architecture, as opposed to a Reduced Instruction Set Computer (RISC) architecture. This means that they have single instructions that can perform complex operations such as block moves. Essentially, if optimised well, memcpy() loads the source pointer into one register, the destination pointer into a second register, and the number of bytes to move into a third register. It then issues a single machine code instruction and off it goes transferring the data. Even with such a small array the difference is noticeable, despite the fact that the function call, the return and loading the registers take extra time. A high precision timer would be more accurate, though, for timing loops this short. I like your code, by the way. I may have to have a play with that tomorrow with some larger blocks and the high precision timer.


Part 1 I had a look at your code and yes, you did mix up the cout statements. The middle one is the memcpy() and the outer two are loops. Notice how the first and second timings are different for the for-loops. This is possibly due to caching of instructions and memory. When I ran your code on my local machine I couldn't get a reading higher than 0.00005 msecs for anything! The SoloLearn server is multitasking multiple users at the same time, so it will process the code slower than a local machine.

Regarding optimization: because there are multiple ways to express the same high level code in assembly language, a compiler generally strikes a balance between the number of instructions (the size of the program) and the execution speed. These optimization settings can be changed using command line switches to favour program size or execution speed. Usually the compiler will generate reasonable assembly code. The generated code is often as good as, or better than, what an average assembly language programmer would write, but not as good as assembly language hand optimised by a skilled programmer. It depends on how well the back end of the compiler was written. continued...


Part 2 Since C/C++ are stack based languages, variables are passed between functions on the stack using a stack frame, a data structure constructed on the stack. The function accesses the arguments using relative offsets from the stack pointer. This is why C requires all variables to be declared before use, so it knows the size, and thus the offset, on the stack frame.

In our loop example the pointers must be loaded from the stack frame into registers, as well as the count. Data is transferred from the source to the destination using a movs or movsb instruction and the count is adjusted. This is repeated until all bytes are transferred, i.e. count = 0. If the count is not zero, the code loops back and repeats the mov instruction. An optimised routine would do the same thing but load the count into the counter register and invoke a single rep movsb instruction. This single instruction will transfer everything by performing the mov, decrementing the counter and repeating until the counter is zero, without the need for additional instructions.

Back in the stone age of computing these were the sort of things we had to pay attention to, since clock speeds were around 4 MHz and CPU throughput was around 1 MIPS on average. These days, with CPU caches and multiple cores, it's not so critical. Usually this kind of optimization is only worried about if the application is not performing as well as expected.

I'm an old fossil (pushing 60) from an electronics background. When computerised equipment first came out it was dropped in my lap and I was told "It's electronic, you deal with it". So much of my knowledge is from dealing with low level code at the actual hardware interface level. While PC programmers just say "buy more RAM" or "upgrade your equipment" to run the code, that isn't an option for embedded systems programmers. The hardware is fixed and you have to make the new code fit. Every processor cycle and byte counts.


That's a terribly inefficient, and naive, way to duplicate an array. Use memcpy, it's much faster: http://www.cplusplus.com/reference/cstring/memcpy/ Write memcpy(p, mas, sizeof(int) * 9); instead of the for loop.


Edward Finkelstein, the for loop is broken down into a group of instructions that use variables on the stack frame and use more instructions than memcpy(). If the memcpy() function is heavily optimised it will be using register variables instead of the stack frame. As your program shows it can be as much as twice as fast. Now imagine doing this for a 12 MB image that you are processing.


Martin Taylor, Thanks, learned something new today


Martin Taylor, it appears you are right: https://code.sololearn.com/crxx92841Pm3/# How did you know? Why is memcpy more efficient?


Martin Taylor, then it makes sense why memcpy takes a shorter time, since there is only one transfer as opposed to 9 in this case? But I don't know what you mean by optimized. Since this appears to be a standard function from string.h, I would assume it is optimized, or else no one would use it? Yes, the clock_t type and clock() function are pretty nifty. I came across them when trying to compare recursive vs iterative solutions for Fibonacci. If you input 30 below, you see there is a major difference: https://code.sololearn.com/cr0bbz1x5t65/# Btw, you seem very knowledgeable, what is your background? I am an undergrad physics student, graduating this May. Update: it seems I swapped the cout statements, now it seems the for-loop is faster?


Martin Taylor "When I ran your code on my local machine I couldn't get a reading higher than 0.00005 msecs for anything! The SoloLearn server is multitasking multiple users at the same time so will process the code slower than a local machine." So I increased the array size to 2000 (https://code.sololearn.com/cRi071czkxxM/#) and now the difference is more apparent: https://code.sololearn.com/cCQdSZCyH562/#


Nice, very nicely discussed, and agreed that we can all learn from each other. This is very educational for a lot of SoloLearn enthusiasts. Thanks for your efforts to teach 👍🤗


sizeof(p) gives you the size of the pointer itself only, not the total size of the integers it points to.


Angelo, thanks. This is from a code challenge, but I agree there needs to be a delete [] p; at the end.