
As someone who used to be a hardware engineer, I found Figure 1 in the first section surprising. All modern OSes run processes in independent virtual address spaces, to ensure that processes don't collide with one another or with the OS itself. But if Figure 1 is to be believed, the kernel shares the same address space as the process. Is this a mistake on the part of the writer?

[1] http://www.cs.utexas.edu/users/witchel/372/lectures/15.Virtu...

[2] https://en.wikipedia.org/wiki/Virtual_address_space



It's pretty common to map kernel memory to the same fixed range in the virtual address space of every process. Typically the kernel portion occupies the upper range of the address space. In the 32-bit 2/2 split, 0GB-2GB is reserved for user mode and 2GB-4GB for the kernel. With the 3/1 split, 0GB-3GB is for user mode and 3GB-4GB for the kernel.
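To make the split concrete, here is a minimal Python sketch that classifies a 32-bit virtual address as user or kernel. The function name `region` and the `user_gib` parameter are made up for illustration; a real kernel fixes the boundary at build or boot time rather than checking it like this.

```python
GIB = 1 << 30  # bytes in one GiB


def region(addr: int, user_gib: int) -> str:
    """Classify a 32-bit virtual address as 'user' or 'kernel'.

    user_gib is a hypothetical parameter: the size of the user portion
    in GiB (2 for the 2/2 split, 3 when user mode gets 0GB-3GB).
    """
    if not 0 <= addr < 4 * GIB:
        raise ValueError("not a 32-bit address")
    # Everything below the boundary is user space; the rest is kernel.
    return "user" if addr < user_gib * GIB else "kernel"


print(region(0x7FFF_FFFF, 2))  # user
print(region(0x8000_0000, 2))  # kernel
print(region(0xBFFF_FFFF, 3))  # user
print(region(0xC000_0000, 3))  # kernel
```

The boundary is the same in every process, which is why the kernel range can be mapped identically everywhere.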

This makes it easy to work with memory shared between user-mode and kernel-mode code, since both live in the same address space. Passing a buffer from user mode to kernel mode is just a matter of passing down the virtual address pointer; there is no need to copy it into a separate address space. The kernel code simply accesses the lower range of the virtual address space.

Having the kernel code mapped to the same fixed range in every process also makes it easy to call a kernel routine from user mode. A syscall just elevates the privilege level to kernel mode and jumps to the kernel routine, which sits at the exact same address in every process. You can think of the kernel as a special library that gets "linked" into every process at the exact same location, along with all its data.

Although user mode and kernel mode share the same address space, user-mode code cannot access kernel-mode memory. Memory pages are protected with flags such as R/W (Read/Write) and U/S (User/Supervisor). An S-marked page cannot be accessed by user-mode code, so the protection between kernel and user mode is still in place.
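A rough sketch of that check, as a simplified Python model. The bit positions loosely follow x86 page-table entries (Present, R/W, U/S in bits 0 to 2), but the function `access_allowed` and the example pages are hypothetical, and real hardware has extra rules this ignores (for example, x86 only enforces read-only pages against supervisor writes when CR0.WP is set).

```python
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # R/W flag: set means writes are allowed
PTE_USER = 1 << 2      # U/S flag: set means user mode may access


def access_allowed(pte_flags: int, user_mode: bool, write: bool) -> bool:
    """Simplified permission check for a single page (illustrative only)."""
    if not pte_flags & PTE_PRESENT:
        return False  # unmapped page: always faults
    if user_mode and not pte_flags & PTE_USER:
        return False  # supervisor-only page touched from user mode
    if write and not pte_flags & PTE_WRITABLE:
        return False  # write to a read-only page (simplified for both modes)
    return True


KERNEL_PAGE = PTE_PRESENT | PTE_WRITABLE  # S-marked: U/S bit clear
USER_RO_PAGE = PTE_PRESENT | PTE_USER     # user-readable, not writable

print(access_allowed(KERNEL_PAGE, user_mode=True, write=False))   # False
print(access_allowed(KERNEL_PAGE, user_mode=False, write=True))   # True
print(access_allowed(USER_RO_PAGE, user_mode=True, write=False))  # True
print(access_allowed(USER_RO_PAGE, user_mode=True, write=True))   # False
```

The same page-table entry gives different answers depending on the CPU mode, which is what lets kernel and user pages coexist in one address space.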


No, it's not a mistake. Each process has its own virtual address space, but part of that space is typically used to map operating system code and data. The reserved part is anywhere from a quarter to a half of the 4GB address space on a 32-bit system (I have no idea about 64-bit). Those pages are marked kernel-only. The remaining half to three-quarters will be different for each process.

One reason for this is so that you can easily trap to the kernel (for an interrupt, a system call, an exception, etc.) without changing the page tables at all: you just change the CPU mode bit.
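As a toy illustration of that point (a Python model, not how any real kernel is written), the sketch below treats the CPU as a mode bit plus a pointer to the active page table. The `Cpu` class and the function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    mode: str        # "user" or "kernel"
    page_table: int  # hypothetical identifier of the active page table


def trap(cpu: Cpu) -> None:
    """Enter the kernel (syscall, interrupt, exception): only the mode changes."""
    cpu.mode = "kernel"


def return_to_user(cpu: Cpu) -> None:
    """Leave the kernel: again, only the mode changes."""
    cpu.mode = "user"


def context_switch(cpu: Cpu, new_page_table: int) -> None:
    """Switching to another process is what actually swaps page tables."""
    cpu.page_table = new_page_table


cpu = Cpu(mode="user", page_table=1)
trap(cpu)
print(cpu)  # Cpu(mode='kernel', page_table=1)
return_to_user(cpu)
context_switch(cpu, 2)
print(cpu)  # Cpu(mode='user', page_table=2)
```

Because the kernel is already mapped in the current process's address space, `trap` never has to touch `page_table`; only a switch to a different process does.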



