maps – memory maps to executables and library files

1. Classroom

maps   Memory maps to executables and library files

Get The Hang

$sudo cat /proc/971/maps
00400000-00408000 r-xp 00000000 08:01 7422174                            /sbin/syslogd
00607000-00608000 rw-p 00007000 08:01 7422174                            /sbin/syslogd
00608000-00609000 rw-p 00000000 00:00 0
01429000-0144a000 rw-p 00000000 00:00 0                                  [heap]
7fd360fa5000-7fd360fb0000 r-xp 00000000 08:01 1466895                    /lib/x86_64-linux-gnu/libnss_files-2.13.so
7fd360fb0000-7fd3611af000 ---p 0000b000 08:01 1466895                    /lib/x86_64-linux-gnu/libnss_files-2.13.so
7fd3611af000-7fd3611b0000 r--p 0000a000 08:01 1466895                    /lib/x86_64-linux-gnu/libnss_files-2.13.so
7fd3611b0000-7fd3611b1000 rw-p 0000b000 08:01 1466895                    /lib/x86_64-linux-gnu/libnss_files-2.13.so
7fd3611b1000-7fd3611bb000 r-xp 00000000 08:01 1466891                    /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7fd3611bb000-7fd3613ba000 ---p 0000a000 08:01 1466891                    /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7fd3613ba000-7fd3613bb000 r--p 00009000 08:01 1466891                    /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7fd3613bb000-7fd3613bc000 rw-p 0000a000 08:01 1466891                    /lib/x86_64-linux-gnu/libnss_nis-2.13.so
7fd3613bc000-7fd3613d1000 r-xp 00000000 08:01 1466681                    /lib/x86_64-linux-gnu/libnsl-2.13.so
7fd3613d1000-7fd3615d0000 ---p 00015000 08:01 1466681                    /lib/x86_64-linux-gnu/libnsl-2.13.so
7fd3615d0000-7fd3615d1000 r--p 00014000 08:01 1466681                    /lib/x86_64-linux-gnu/libnsl-2.13.so
7fd3615d1000-7fd3615d2000 rw-p 00015000 08:01 1466681                    /lib/x86_64-linux-gnu/libnsl-2.13.so
7fd3615d2000-7fd3615d4000 rw-p 00000000 00:00 0
7fd3615d4000-7fd3615db000 r-xp 00000000 08:01 1466617                    /lib/x86_64-linux-gnu/libnss_compat-2.13.so
7fd3615db000-7fd3617da000 ---p 00007000 08:01 1466617                    /lib/x86_64-linux-gnu/libnss_compat-2.13.so
7fd3617da000-7fd3617db000 r--p 00006000 08:01 1466617                    /lib/x86_64-linux-gnu/libnss_compat-2.13.so
7fd3617db000-7fd3617dc000 rw-p 00007000 08:01 1466617                    /lib/x86_64-linux-gnu/libnss_compat-2.13.so
7fd3617dc000-7fd361956000 r-xp 00000000 08:01 1466612                    /lib/x86_64-linux-gnu/libc-2.13.so
7fd361956000-7fd361b56000 ---p 0017a000 08:01 1466612                    /lib/x86_64-linux-gnu/libc-2.13.so
7fd361b56000-7fd361b5a000 r--p 0017a000 08:01 1466612                    /lib/x86_64-linux-gnu/libc-2.13.so
7fd361b5a000-7fd361b5b000 rw-p 0017e000 08:01 1466612                    /lib/x86_64-linux-gnu/libc-2.13.so
7fd361b5b000-7fd361b60000 rw-p 00000000 00:00 0
7fd361b60000-7fd361b7f000 r-xp 00000000 08:01 1466901                    /lib/x86_64-linux-gnu/ld-2.13.so
7fd361d58000-7fd361d5b000 rw-p 00000000 00:00 0
7fd361d7c000-7fd361d7d000 rw-p 00000000 00:00 0
7fd361d7d000-7fd361d7f000 rw-p 00000000 00:00 0
7fd361d7f000-7fd361d80000 r--p 0001f000 08:01 1466901                    /lib/x86_64-linux-gnu/ld-2.13.so
7fd361d80000-7fd361d81000 rw-p 00020000 08:01 1466901                    /lib/x86_64-linux-gnu/ld-2.13.so
7fd361d81000-7fd361d82000 rw-p 00000000 00:00 0
7fff96afb000-7fff96b1c000 rw-p 00000000 00:00 0                          [stack]
7fff96bbc000-7fff96bbd000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
sudo: pam_mount.c:417: modify_pm_count: Assertion `user != ((void *)0)' failed.
Aborted
$
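
Each line of the listing above has the same columns: address range, permissions, offset into the backing file, device (major:minor), inode, and an optional path or pseudo-name such as [heap]. As a rough sketch of how one line could be split into those fields, here is a small user-space parser. The struct map_entry type and the parse_maps_line() helper are hypothetical names made up for this example, not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps (illustrative type, not a real API). */
struct map_entry {
    unsigned long long start;   /* first address of the region */
    unsigned long long end;     /* first address past the region */
    char perms[5];              /* e.g. "r-xp" */
    unsigned long long offset;  /* offset into the backing file */
    unsigned int dev_major;     /* device holding the file (hex in the file) */
    unsigned int dev_minor;
    unsigned long long inode;   /* 0 for anonymous mappings */
    char path[256];             /* empty for anonymous mappings */
};

/* Parse one maps line. Returns 0 on success, -1 on malformed input. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    int n;

    assert(line != NULL && e != NULL);
    e->path[0] = '\0';
    /* The path column is optional, so 7 converted fields is still a match. */
    n = sscanf(line, "%llx-%llx %4s %llx %x:%x %llu %255[^\n]",
               &e->start, &e->end, e->perms, &e->offset,
               &e->dev_major, &e->dev_minor, &e->inode, e->path);
    return n >= 7 ? 0 : -1;
}
```

Fed the first line of the listing, the parser reports a region from 0x400000 to 0x408000 with permissions r-xp, inode 7422174 and path /sbin/syslogd. Anonymous mappings (device 00:00, inode 0) simply come back with an empty path.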

mca_disable_dma – channel to disable DMA on

1. mca_disable_dma – channel to disable DMA on

void mca_disable_dma(unsigned int dmanr);
dmanr  DMA channel

2. Classroom

First  of  all,  DMA  (per  se) is  almost  entirely  obsolete.  As
originally defined, DMA controllers depended on the fact that the
bus had  separate lines  to assert for  memory read/write,  and I/O
read/write. The DMA controller  took advantage of that by asserting
both  a memory  read and  I/O  write (or  vice versa)  at the  same
time. The DMA controller then generated successive addresses on the
bus, and  data was read from  memory and written to  an output port
(or vice versa) each bus cycle.

The  PCI bus,  however, does  not  have separate  lines for  memory
read/write and  I/O read/write. Instead,  it encodes one  (and only
one) command for any given transaction. Instead of using DMA, PCI
normally does bus-mastering transfers. This means instead of a DMA
controller that transfers data between the I/O device and memory,
the I/O device itself transfers data directly to or from memory.

As for what else  the CPU can do at the time,  it all depends. Back
when  DMA was  common, the  answer was  usually "not  much"  -- for
example,  under early  versions of  Windows, reading  or  writing a
floppy disk (which  did use the DMA controller)  pretty much locked
up the system for the duration.

Nowadays,  however, the memory  typically has  considerably greater
bandwidth than the  I/O bus, so even while  a peripheral is reading
or writing memory, there's usually  a fair amount of bandwidth left
over for the CPU to use. In addition, a modern CPU typically has a
fairly large cache, so it can often execute some instructions without
using main memory at all.

source : http://stackoverflow.com/questions/5150719/direct-memory-access-dma-how-does-it-work

[PATCH] udf: Fix deadlock when converting file from in-ICB one to normal one

During  BKL removal,  conversion  of files  from  in-ICB format  to
normal format got broken.  We call ->writepage with i_data_sem held
but  udf_get_block()  also acquires  i_data_sem  thus creating  A-A
deadlock.

We  fix   the  problem   by  dropping  i_data_sem   before  calling
->writepage() which is safe since i_mutex still protects us against
any  changes in  the  file.  Also fix  pagelock  - i_data_sem  lock
inversion  in   udf_expand_file_adinicb()  by  dropping  i_data_sem
before calling find_or_create_page().

Big Kernel Lock

Linux  contains  a  global   kernel  lock,  kernel_flag,  that  was
originally introduced  in kernel 2.0  as the only SMP  lock. During
2.2 and 2.4, much work went  into removing the global lock from the
kernel and replacing it  with finer-grained localized locks. Today,
the global  lock's use  is minimal. It  still exists,  however, and
developers need to be aware of it.

The global kernel lock is called  the big kernel lock or BKL. It is
a  spinning  lock  that  is recursive;  therefore  two  consecutive
requests for it will not deadlock  the process (as they would for a
spinlock).  Further,  a  process  can  sleep  and  even  enter  the
scheduler while  holding the  BKL. When a  process holding  the BKL
enters the  scheduler, the lock  is dropped so other  processes can
obtain it. These attributes of the BKL helped ease the introduction
of SMP  during the 2.0  kernel series. Today, however,  they should
provide plenty of reason not to use the lock.

source : http://www.linuxjournal.com/article/5833?page=0,2

mutex_init – initialize the mutex

1. mutex_init

Initialize the mutex to unlocked state.

2. Classroom

The point of a mutex  is to synchronize two threads. When
you  have  two  threads  attempting to  access  a  single
resource, the general pattern  is to have the first block
of  code  attempting  access  to  set  the  mutex  before
entering the  code. When  the second code  block attempts
access,  it sees  the mutex  is set  and waits  until the
first block of code  is complete (and un-sets the mutex),
then continues.

source : http://stackoverflow.com/questions/34524/what-is-a-mutex
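
The pattern described above can be sketched in user space with POSIX threads. This is an analogy only: pthread_mutex_init() is the user-space counterpart used here for illustration, not the kernel's mutex_init(), and struct shared_counter, worker() and run_two_workers() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared state: a counter and the mutex that guards it. */
struct shared_counter {
    pthread_mutex_t lock;
    long value;
    long iterations;   /* increments each worker performs */
};

/* Worker: take the mutex, update the shared value, release the mutex. */
static void *worker(void *arg)
{
    struct shared_counter *c = arg;

    for (long i = 0; i < c->iterations; i++) {
        pthread_mutex_lock(&c->lock);    /* waits while the other thread holds it */
        c->value++;                      /* the protected resource */
        pthread_mutex_unlock(&c->lock);  /* lets the waiting thread continue */
    }
    return NULL;
}

/* Run two workers and return the final counter value, or -1 on error. */
long run_two_workers(long iterations)
{
    struct shared_counter c = { .value = 0, .iterations = iterations };
    pthread_t t1, t2;

    assert(iterations >= 0);
    if (pthread_mutex_init(&c.lock, NULL) != 0)   /* mutex starts unlocked */
        return -1;
    if (pthread_create(&t1, NULL, worker, &c) != 0) {
        pthread_mutex_destroy(&c.lock);
        return -1;
    }
    if (pthread_create(&t2, NULL, worker, &c) != 0) {
        pthread_join(t1, NULL);
        pthread_mutex_destroy(&c.lock);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

Because every increment happens with the mutex held, run_two_workers(n) returns exactly 2*n. Without the lock/unlock pair the two threads could overwrite each other's updates and the total could come up short. On older toolchains the program may need to be built with -pthread.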

tcp_rmem – vector of 3 INTEGERs: min, default, max

ABOUT tcp_rmem

min: minimal size of receive buffer used by TCP sockets. It is guaranteed to each TCP socket,
even under moderate memory pressure. Default: 1 page.

default: initial size of receive buffer used by TCP sockets. This value overrides
net.core.rmem_default used by other protocols. Default: 87380 bytes. This value results in a
window of 65535 with the default setting of tcp_adv_win_scale and tcp_app_win:0, and a bit less
for the default tcp_app_win. Both variables are described in the kernel's ip-sysctl documentation.

max:  maximal  size of  receive  buffer  allowed for  automatically selected  receiver buffers  for  TCP
socket.  This  value does  not override  net.core.rmem_max.  Calling  setsockopt()  with SO_RCVBUF
disables automatic tuning of  that socket's receive buffer size, in which case this value is ignored.
Default: between 87380B and 4MB, depending on RAM size.
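
The relation between the default buffer size and the resulting window can be worked through in a few lines. The sketch below follows the rule documented for tcp_adv_win_scale (buffering overhead is bytes/2^scale when the scale is positive) and assumes the historical default of 2 for that sysctl. window_from_space() is a hypothetical helper name for this example and ignores tcp_app_win.

```c
#include <assert.h>

/*
 * Window available from a receive buffer of `space` bytes.
 * scale > 0:  overhead is space / 2^scale, the rest is window.
 * scale <= 0: the window is space / 2^(-scale).
 */
int window_from_space(int space, int adv_win_scale)
{
    assert(space >= 0);
    assert(adv_win_scale >= -31 && adv_win_scale <= 31);

    if (adv_win_scale <= 0)
        return space >> -adv_win_scale;
    return space - (space >> adv_win_scale);
}
```

With space = 87380 and adv_win_scale = 2 the overhead is 87380 / 4 = 21845 bytes, which leaves a window of 87380 - 21845 = 65535 bytes, the largest value that fits in the 16-bit TCP window field without window scaling.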

TYPICAL COMMANDLINE SESSION
[bash]
$cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 2003296
$
[/bash]
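
A program can read the same three values directly. The following is a minimal sketch, assuming the file holds three whitespace-separated integers as shown in the session above. struct tcp_mem_triple, parse_tcp_rmem() and read_tcp_rmem() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdio.h>

/* The three tcp_rmem values, in bytes (illustrative type). */
struct tcp_mem_triple {
    long min;
    long def;
    long max;
};

/* Parse "min default max" from text. Returns 0 on success, -1 otherwise. */
int parse_tcp_rmem(const char *text, struct tcp_mem_triple *out)
{
    assert(text != NULL && out != NULL);
    if (sscanf(text, "%ld %ld %ld", &out->min, &out->def, &out->max) != 3)
        return -1;
    return 0;
}

/* Read the live setting from procfs. Returns 0 on success, -1 otherwise. */
int read_tcp_rmem(struct tcp_mem_triple *out)
{
    char buf[128];
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_rmem", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_tcp_rmem(buf, out);
}
```

On the machine from the session above, read_tcp_rmem() would fill in min = 4096, def = 87380 and max = 2003296. The parsing step is kept separate so it can be exercised on systems where the procfs file is not available.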

TYPICAL NETWORK RELATED SOURCE CODE
[c]
/* the caller has to hold the sock lock */
static int rds_tcp_read_sock(struct rds_connection *conn, gfp_t gfp,
			     enum km_type km)
{
	struct rds_tcp_connection *tc = conn->c_transport_data;
	struct socket *sock = tc->t_sock;
	read_descriptor_t desc;
	struct rds_tcp_desc_arg arg;

	/* It's like glib in the kernel! */
	arg.conn = conn;
	arg.gfp = gfp;
	arg.km = km;
	desc.arg.data = &arg;
	desc.error = 0;
	desc.count = 1; /* give more than one skb per call */

	tcp_read_sock(sock->sk, &desc, rds_tcp_data_recv);
	rdsdebug("tcp_read_sock for tc %p gfp 0x%x returned %d\n", tc, gfp,
		 desc.error);

	return desc.error;
}
[/c]

RELATED KNOWLEDGE

TCP  performance depends  not upon  the transfer  rate  itself, but rather upon  the product  of the
transfer  rate and  the round-trip delay.  This "bandwidth*delay product"  measures the amount of data
that  would "fill the  pipe"; it  is the  buffer space  required at sender  and  receiver  to  obtain
maximum throughput  on  the  TCP connection over  the path, i.e., the amount  of unacknowledged data
that  TCP must  handle in  order to  keep the  pipeline  full.  TCP performance  problems  arise when
the  bandwidth*delay product  is large.  We refer to an Internet  path operating in this region as a
"long, fat  pipe", and a network  containing this path  as an "LFN" (pronounced "elephan(t)").
 

High-capacity  packet satellite  channels  (e.g., DARPA's  Wideband Net) are LFN's.   For example, a DS1
speed satellite  channel has a bandwidth*delay product of 10**6  bits or more; this corresponds to
100  outstanding  TCP segments  of  1200  bytes each.   Terrestrial fiber-optical paths will also fall
into the LFN class; for example, a cross-country  delay of  30 ms at  a DS3 bandwidth  (45Mbps) also
exceeds 10**6 bits.

LINK
https://tools.ietf.org/html/rfc1323
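
The bandwidth*delay arithmetic quoted above is easy to reproduce. The sketch below uses integer math with the RFC's example figures as inputs. bdp_bits(), bdp_bytes() and segments_in_flight() are hypothetical helper names for this example only.

```c
#include <assert.h>

/* Bandwidth*delay product in bits, from bits per second and milliseconds. */
unsigned long long bdp_bits(unsigned long long bandwidth_bps,
                            unsigned long long delay_ms)
{
    return bandwidth_bps * delay_ms / 1000ULL;
}

/* Buffer space in bytes needed to hold `bits` of unacknowledged data. */
unsigned long long bdp_bytes(unsigned long long bits)
{
    return (bits + 7ULL) / 8ULL;
}

/* Number of segments of `segment_bytes` each needed to cover `bits`. */
unsigned long long segments_in_flight(unsigned long long bits,
                                      unsigned int segment_bytes)
{
    unsigned long long segment_bits;

    assert(segment_bytes > 0);
    segment_bits = 8ULL * segment_bytes;
    return (bits + segment_bits - 1ULL) / segment_bits;  /* round up */
}
```

For the DS3 example, bdp_bits(45000000, 30) gives 1350000 bits, which is above the 10**6 threshold and corresponds to 168750 bytes of buffer, well over the 87380-byte default receive buffer. For 10**6 bits and 1200-byte segments, segments_in_flight() gives 105, in line with the RFC's round figure of 100.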

BusyBox – The Swiss Army Knife of Embedded Linux

UNIX Command 

$dpkg -L busybox
/.
/bin
/bin/busybox
/usr
/usr/share
/usr/share/man
/usr/share/man/man1
/usr/share/man/man1/busybox.1.gz
/usr/share/doc
/usr/share/doc/busybox
/usr/share/doc/busybox/copyright
/usr/share/doc/busybox/changelog.Debian.gz
$mount
/dev/sda1 on / type ext3 (rw,errors=remount-ro,commit=0)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,size=5242880,mode=755,size=5242880,mode=755)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=755,size=10%,mode=755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,size=20%,mode=1777,size=20%,mode=1777)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620,gid=5,mode=620)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
$busybox mount
rootfs on / type rootfs (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=991460k,nr_inodes=2478100,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=199580k,mode=755)
/dev/disk/by-uuid/26cca090-8a72-4443-859f-7a67b7188357 on / type ext3 (rw,relatime,errors=remount-ro,commit=5,barrier=1,data=ordered)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,relatime,size=5120k,mode=755)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,relatime,size=399156k)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
$

UNIX Explanation

BusyBox combines tiny versions of many common UNIX utilities into a
single  small executable. It  provides minimalist  replacements for
most of the utilities you usually find in GNU coreutils,
util-linux, etc. The utilities in BusyBox generally have fewer options
than their full-featured GNU cousins; however, the options that are
included provide  the expected  functionality and behave  very much
like their GNU counterparts.

Why?


BusyBox  has  been   written  with  size-optimization  and  limited
resources in mind.  It is  also extremely modular so you can easily
include  or exclude commands  (or features)  at compile  time. This
makes  it easy  to customize  your  embedded systems.  To create  a
working system, just  add /dev, /etc, and a  Linux kernel.  BusyBox
provides  a fairly  complete  POSIX environment  for  any small  or
embedded system.

fs: Make write(2) interruptible by a fatal signal

Kernel Space
linux-fsdevel@vger.kernel.org :


Currently write(2)  to a file  is not interruptible by  any signal.
Sometimes interrupting it is desirable, e.g. when you want to quickly kill a
process  hogging your  disk. Also,  with commit  499d05ecf990 ("mm:
Make  task in balance_dirty_pages()  killable"), it's  necessary to
abort the  current write accordingly  to avoid it  quickly dirtying
lots more pages at unthrottled rate.

balance_dirty_pages_ratelimited_nr - balance dirty memory state

Processes which  are dirtying memory  should call in here  once for
each page which was  newly dirtied.  The function will periodically
check  the system's  dirty  state and  will  initiate writeback  if
needed.  On really  big machines, get_writeback_state is expensive,
so try to avoid calling it too often (ratelimiting). But once we're
over the dirty memory limit  we decrease the ratelimiting by a lot,
to  prevent individual  processes  from overshooting  the limit  by
(ratelimit_pages) each.

Classroom

A memory-mapped file is a  segment of virtual memory which has been
assigned a direct byte-for-byte  correlation with some portion of a
file or file-like resource. This  resource is typically a file that
is physically  present on-disk,  but can also  be a  device, shared
memory  object,  or  other   resource  that  the  operating  system
can reference through a file descriptor. Once present, this
correlation  between   the  file  and  the   memory  space  permits
applications  to treat  the mapped  portion as  if it  were primary
memory.


source: http://en.wikipedia.org/wiki/Memory-mapped_file
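
To make "treat the mapped portion as if it were primary memory" concrete, here is a minimal sketch that maps a file read-only with mmap(2) and scans it like an ordinary array. count_newlines_mmap() is a hypothetical helper written for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count '\n' bytes in a file through a read-only mapping. -1 on error. */
long count_newlines_mmap(const char *path)
{
    struct stat st;
    long count = 0;
    char *data;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }
    data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    for (off_t i = 0; i < st.st_size; i++)   /* plain memory reads, no read(2) */
        if (data[i] == '\n')
            count++;

    munmap(data, (size_t)st.st_size);
    return count;
}
```

While the mapping exists, the file shows up as its own line in /proc/<pid>/maps for the process, with the permissions and offset that were passed to mmap().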

User Space
write - write to a file descriptor


A successful return  from write() does not make  any guarantee that
data  has  been  committed  to   disk.   In  fact,  on  some  buggy
implementations,  it  does  not   even  guarantee  that  space  has
successfully been reserved  for the data.  The only  way to be sure
is to call fsync(2) after you are done writing all your data.  If a
write() is  interrupted by  a signal handler  before any  bytes are
written,  then  the call  fails  with the  error  EINTR;  if it  is
interrupted  after at  least one  byte has  been written,  the call
succeeds, and returns the number of bytes written.


fsync, fdatasync - synchronize a file's in-core state with storage device

fsync() transfers  ("flushes") all modified in-core  data of (i.e.,
modified buffer cache  pages for) the file referred  to by the file
descriptor  fd  to the  disk  device  (or  other permanent  storage
device) where that file resides.   The call blocks until the device
reports that the transfer  has completed.  It also flushes metadata
information  associated  with  the  file  (see  stat(2)).   Calling
fsync() does not necessarily ensure that the entry in the directory
containing the  file has also  reached disk.  For that  an explicit
fsync() on a file descriptor for the directory is also needed.
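
Putting the write() and fsync() notes together, a cautious writer retries on EINTR, continues after short writes, flushes the file, and then flushes the containing directory. The following is a minimal sketch under those assumptions. write_all() and save_durably() are hypothetical helper names, and real code would usually want finer-grained error reporting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write all len bytes, retrying on EINTR and short writes. 0 or -1. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before any byte: retry */
            return -1;
        }
        p += n;                     /* short write: continue with the rest */
        len -= (size_t)n;
    }
    return 0;
}

/* Create or overwrite dir/name, then fsync the file and the directory. */
int save_durably(const char *dir, const char *name,
                 const void *data, size_t len)
{
    int dfd, fd, rc = -1;

    assert(dir != NULL && name != NULL);
    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        if (write_all(fd, data, len) == 0 && fsync(fd) == 0)
            rc = 0;                 /* file data and metadata flushed */
        if (close(fd) != 0)
            rc = -1;
    }
    if (rc == 0 && fsync(dfd) != 0) /* make the directory entry durable too */
        rc = -1;
    close(dfd);
    return rc;
}
```

A call such as save_durably("/tmp", "example.txt", "hello\n", 6) returns 0 only after both fsync() calls have succeeded.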