Monday, June 15, 2009

Homework for 2009-06-09

1.Report on your progress on your project.
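I have not yet been able to configure interrupt coalescing.
I will try a NIC other than the onboard one. I will also try changing the setting by calling the function that modifies interrupt coalescing directly from a program.
As a measurement tool, I am adding code to iperf that records the user and system CPU time.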
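A minimal sketch of how the user and system CPU time could be sampled, assuming the POSIX getrusage() call is used. The helper names cpu_times() and report_cpu_delta() are hypothetical and are not part of iperf.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds as a double. */
static double tv_to_sec(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Hypothetical helper (not part of iperf): store the user and system
 * CPU time consumed so far by the calling process.
 * Returns 0 on success, -1 if getrusage() fails.
 */
int cpu_times(double *user_sec, double *sys_sec)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *user_sec = tv_to_sec(ru.ru_utime);
    *sys_sec = tv_to_sec(ru.ru_stime);
    return 0;
}

/* Print the CPU time spent between two samples. */
void report_cpu_delta(double u0, double s0, double u1, double s1)
{
    printf("user: %.6f s, system: %.6f s\n", u1 - u0, s1 - s0);
}
```

Calling cpu_times() once before and once after a transfer and passing both samples to report_cpu_delta() would give the CPU cost of that transfer.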


2.A socket has parts both in the kernel and in the user library. Find the definition of both in an OS of your choice, and describe the differences in the information.

[linux-kernel]
asmlinkage long sys_socket(int family, int type, int protocol)
{
	int retval;
	struct socket *sock;

	retval = sock_create(family, type, protocol, &sock);
	if (retval < 0)
		goto out;

	retval = sock_map_fd(sock);
	/* ... */
}

int sock_create(int family, int type, int protocol, struct socket **res)
{
	return __sock_create(current->nsproxy->net_ns, family, type, protocol, res, 0);
}

static int __sock_create(struct net *net, int family, int type, int protocol,
			 struct socket **res, int kern)
{
	int err;
	struct socket *sock;
	const struct net_proto_family *pf;

	/*
	 * Check protocol is in range
	 */
	if (family < 0 || family >= NPROTO)
		return -EAFNOSUPPORT;
	if (type < 0 || type >= SOCK_MAX)
		return -EINVAL;

	/* Compatibility.

	   This uglymoron is moved from INET layer to here to avoid
	   deadlock in module load.
	 */
	if (family == PF_INET && type == SOCK_PACKET) {
		static int warned;
		if (!warned) {
			warned = 1;
			printk(KERN_INFO "%s uses obsolete (PF_INET,SOCK_PACKET)\n",
			       current->comm);
		}
		family = PF_PACKET;
	}

	err = security_socket_create(family, type, protocol, kern);
	if (err)
		return err;

	/*
	 * Allocate the socket and allow the family to set things up. if
	 * the protocol is 0, the family is instructed to select an appropriate
	 * default.
	 */
	sock = sock_alloc();
	if (!sock) {
		if (net_ratelimit())
			printk(KERN_WARNING "socket: no more sockets\n");
		return -ENFILE;	/* Not exactly a match, but its the
				   closest posix thing */
	}

	sock->type = type;

#if defined(CONFIG_KMOD)
	/* Attempt to load a protocol module if the find failed.
	 *
	 * 12/09/1996 Marcin: But! this makes REALLY only sense, if the user
	 * requested real, full-featured networking support upon configuration.
	 * Otherwise module support will break!
	 */
	if (net_families[family] == NULL)
		request_module("net-pf-%d", family);
#endif

	rcu_read_lock();
	pf = rcu_dereference(net_families[family]);
	err = -EAFNOSUPPORT;
	if (!pf)
		goto out_release;

	/*
	 * We will call the ->create function, that possibly is in a loadable
	 * module, so we have to bump that loadable module refcnt first.
	 */
	if (!try_module_get(pf->owner))
		goto out_release;

	/* Now protected by module ref count */
	rcu_read_unlock();

	err = pf->create(net, sock, protocol);
	if (err < 0)
		goto out_module_put;

	/* ... */
	if (!try_module_get(sock->ops->owner))
		goto out_module_busy;

	/*
	 * Now that we're done with the ->create function, the [loadable]
	 * module can have its refcnt decremented
	 */
	module_put(pf->owner);
	err = security_socket_post_create(sock, family, type, protocol, kern);
	if (err)
		goto out_sock_release;
	*res = sock;

	return 0;

out_module_busy:
	err = -EAFNOSUPPORT;
out_module_put:
	sock->ops = NULL;
	module_put(pf->owner);
out_sock_release:
	sock_release(sock);
	return err;

out_release:
	rcu_read_unlock();
	goto out_sock_release;
}
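
For comparison, the user-library side of a socket is only a small integer file descriptor, while the struct socket and the protocol state stay in the kernel. The sketch below is an illustration under that assumption, not an excerpt from any library: it opens a socket and asks the kernel which type it recorded, using the standard socket(), getsockopt() and close() calls. The helper name kernel_socket_type() is hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name): open a socket, ask the kernel
 * which type it recorded for it, then close it.
 * Returns the type reported by the kernel, or -1 on error.
 */
int kernel_socket_type(int family, int type)
{
    int fd = socket(family, type, 0);   /* on Linux this reaches sys_socket() */
    int ktype = -1;
    socklen_t len = sizeof(ktype);

    if (fd < 0)
        return -1;

    /* All that user space holds is this small integer. */
    printf("user-space view: fd = %d\n", fd);

    /* The type itself is stored in the kernel's socket structure. */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &ktype, &len) != 0)
        ktype = -1;

    close(fd);
    return ktype;
}
```

For example, kernel_socket_type(AF_UNIX, SOCK_STREAM) should return SOCK_STREAM, which the program can only learn by asking the kernel.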




3.
Write a program that, when run, prints out its own source code.

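#include <stdio.h>
char*a="#include <stdio.h>%cchar*a=%c%s%c;int main(void){printf(a,10,34,a,34,10);return 0;}%c";int main(void){printf(a,10,34,a,34,10);return 0;}

(10 and 34 are the ASCII codes for the newline and the double quote, so the string can print itself without escape characters. The string and main() must stay on one line, and the file must end with a single newline.)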


Homework for 2009-06-02

1.Report on your progress on your project.
2.Estimate how long it would take to swap out an entire process on your machine.
a.How fast is your disk in megabytes/second, roughly?
[Output]
$ sudo hdparm -t /dev/sda1

/dev/sda1:
Timing buffered disk reads: 140 MB in 3.03 seconds = 46.19 MB/sec

b.Pick a process on your system (say, Word or Firefox). How big is it, in MB of memory?
[Output]
$ ps aux | grep firefox
cream 6571 10.9 8.3 350620 170416 ? Sl 13:26 14:41 /usr/lib/firefox-3.0.11/firefox
cream 13600 0.0 0.0 2816 788 pts/0 S+ 15:40 0:00 grep firefox

Virtual memory in use (VSZ)
350620 KB
Physical memory in use (RSS)
170416 KB

c.Divide. How long will it take to write out the whole process, assuming that it can be written linearly at full disk bandwidth?
170.416 MB / 46.19 MB/sec = 3.689 sec (taking the 170416 KB of physical memory as roughly 170.4 MB)
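
The same estimate as a small function, in case other processes or disks need to be compared. This is only a sketch of the division above; the name swap_out_seconds() is hypothetical, and like the hand calculation it treats 1 MB as 1000 KB.

```c
/*
 * Estimate the time needed to write a process image to disk at full
 * sequential bandwidth.
 *   rss_kb          resident size in KB (the RSS column of ps aux)
 *   disk_mb_per_sec measured disk throughput in MB/sec (from hdparm -t)
 * Assumption: 1 MB is treated as 1000 KB, as in the hand calculation.
 */
double swap_out_seconds(double rss_kb, double disk_mb_per_sec)
{
    return (rss_kb / 1000.0) / disk_mb_per_sec;
}
```

With the values measured above, swap_out_seconds(170416.0, 46.19) gives about 3.69 seconds.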

3.Go back and rerun your memory copy experiments for sizes up to 100MB or so, and produce a graph with error bars and a linear fit. What is your Y intercept (the fixed, overhead cost) and your slope (the per-unit cost)? Tell me why you believe the linear fit does or does not represent the actual cost of the operation
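
A minimal sketch of how the Y intercept and slope could be computed from the measurements, assuming an ordinary least-squares fit is acceptable for this question. The function name linear_fit() is hypothetical; x would hold the copy sizes and y the measured times.

```c
#include <stddef.h>

/*
 * Ordinary least-squares fit of y = intercept + slope * x.
 * Returns 0 on success, -1 if there are fewer than two points
 * or if all x values are identical.
 */
int linear_fit(const double *x, const double *y, size_t n,
               double *slope, double *intercept)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    double denom;
    size_t i;

    if (n < 2)
        return -1;

    for (i = 0; i < n; i++) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }

    denom = (double)n * sxx - sx * sx;
    if (denom == 0.0)
        return -1;

    *slope = ((double)n * sxy - sx * sy) / denom;   /* per-unit cost */
    *intercept = (sy - *slope * sx) / (double)n;    /* fixed overhead */
    return 0;
}
```

The intercept is the estimated fixed overhead per copy and the slope is the estimated cost per unit of size.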






4.Now run up to sizes much larger than your physical memory. What happens? Graph the output. (Note: this may take a long time to run!)
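The machine has 2 GB of physical memory, so I ran the experiment with 3 GB specified.
As a result, the run slowed down and then stopped responding.
However, it stopped before reaching 1 GB, a size smaller than physical memory, so I do not really understand the reason.
This is the graph of the results obtained up to that point.

The graph shows that partway through the run the system starts swapping out other running applications, so the copy becomes much slower than the theoretical value.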

Tuesday, June 2, 2009

Homework for 05/26

1.Experimentally construct a rough memory map for an application on your operating system.
  1. Write a program that prints out (in hexadecimal) the addresses of the following:
    1. main()
    2. a variable on the outermost stack frame (main()'s stack frame)
    3. a variable on the stack frame of a recursively-called function called to a depth of five times
    4. a statically-defined but uninitialized variable
    5. a statically-defined, initialized variable
    6. several large chunks of malloc()ed memory
    7. a library routine, such as strcpy()
    8. a system call wrapper, such as the one for write()
[Program]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void recursive_function(void);

static int cnt = 5;

int main(int argc, char **argv)
{
	static int uninitial;
	/* A zero initializer usually still lands in .bss;
	   a nonzero value would place the variable in .data. */
	static int initial = 0;
	int *mem, a;

	mem = (int *)malloc(16);

	printf(" main() : %p\n", (void *)main);
	printf(" argc : %p\n a: %p\n", (void *)&argc, (void *)&a);
	printf(" recursive_function : %p\n", (void *)recursive_function);
	printf(" uninitial : %p\n", (void *)&uninitial);
	printf(" initial : %p\n", (void *)&initial);
	/* Print the address of the allocated block, not of the pointer. */
	printf(" malloc : %p\n", (void *)mem);
	/* Print the address of the routine itself instead of calling it. */
	printf(" strcpy : %p\n", (void *)strcpy);
	printf("system call wrapper : %p\n", (void *)fork);

	free(mem);
	return 0;
}

void
recursive_function(void)
{
	if (cnt <= 0) {
		return;
	}
	cnt--;
	recursive_function();
}
[Output]
main() : 0x8048414
argc : 0xbff2e450
a : 0xbff2e428
recursive_function : 0x804850d
uninitial : 0x8049808
initial : 0x8049804
malloc : 0x804a020
strcpy : 0xb7f73ff4
system call wrapper : 0x8048378

B. Take that information and draw a memory map for your OS. It should indicate which direction the stack and the heap grow in. An ASCII picture is okay, or you can use a drawing program of some sort if you wish.
  1. How big is the distance between your stack and your heap?
  2. Was your program compiled with static libraries or shared libraries?
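1. Computing from the memory addresses gives a distance of 0xbf829a26. (With the values printed above, a at 0xbff2e428 minus the malloc()ed block at 0x804a020 is 0xb7ee4408, about 2.9 GB; the exact figure can change between runs if stack addresses are randomized.)

2. strcpy is in a shared library, since the value printed for it (0xb7f73ff4) is far above the program's own code near 0x8048000. The other functions (main, recursive_function) are statically linked (?)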
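
To see which direction the stack grows, the address of a local variable can be recorded at each level of recursion. The following is a minimal sketch under that assumption; record_frames() and print_stack_direction() are hypothetical names, and the result is only meaningful when the compiler does not turn the recursion into a loop (for example when compiled without optimization).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Record the address of one local variable per stack frame, recursing
 * until `depth` frames have been recorded.
 * Returns the number of frames recorded.
 */
int record_frames(uintptr_t *addrs, int depth)
{
    volatile int local = depth;   /* lives in this call's stack frame */

    if (depth <= 0)
        return 0;
    addrs[0] = (uintptr_t)&local;
    return 1 + record_frames(addrs + 1, depth - 1);
}

/* Print the recorded addresses and the apparent growth direction. */
void print_stack_direction(void)
{
    uintptr_t addrs[5];
    int n = record_frames(addrs, 5);
    int i;

    for (i = 0; i < n; i++)
        printf("depth %d: %p\n", i + 1, (void *)addrs[i]);
    if (n >= 2)
        printf("stack grows %s\n", addrs[1] < addrs[0] ? "down" : "up");
}
```

Calling print_stack_direction() from main() prints five addresses. If each deeper frame has a lower address, the stack grows toward the heap.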


2.
A. Interrupt coalescing as implemented in the device driver
/* Get the coalescing parameters, and put them in the cvals
 * structure. */
static int gfar_gcoalesce(struct net_device *dev, struct ethtool_coalesce *cvals)
{
	struct gfar_private *priv = netdev_priv(dev);

	if (!(priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_COALESCE))
		return -EOPNOTSUPP;

	if (NULL == priv->phydev)
		return -ENODEV;

	cvals->rx_coalesce_usecs = gfar_ticks2usecs(priv, priv->rxtime);
	cvals->rx_max_coalesced_frames = priv->rxcount;

	cvals->tx_coalesce_usecs = gfar_ticks2usecs(priv, priv->txtime);
	cvals->tx_max_coalesced_frames = priv->txcount;

	cvals->use_adaptive_rx_coalesce = 0;
	cvals->use_adaptive_tx_coalesce = 0;

	cvals->pkt_rate_low = 0;
	cvals->rx_coalesce_usecs_low = 0;
	cvals->rx_max_coalesced_frames_low = 0;
	cvals->tx_coalesce_usecs_low = 0;
	cvals->tx_max_coalesced_frames_low = 0;

	/* When the packet rate is below pkt_rate_high but above
	 * pkt_rate_low (both measured in packets per second) the
	 * normal {rx,tx}_* coalescing parameters are used.
	 */

	/* When the packet rate is (measured in packets per second)
	 * is above pkt_rate_high, the {rx,tx}_*_high parameters are
	 * used.
	 */
	cvals->pkt_rate_high = 0;
	cvals->rx_coalesce_usecs_high = 0;
	cvals->rx_max_coalesced_frames_high = 0;
	cvals->tx_coalesce_usecs_high = 0;
	cvals->tx_max_coalesced_frames_high = 0;

	/* How often to do adaptive coalescing packet rate sampling,
	 * measured in seconds. Must not be zero.
	 */
	cvals->rate_sample_interval = 0;

	return 0;
}

/* Change the coalescing values.
 * Both cvals->*_usecs and cvals->*_frames have to be > 0
 * in order for coalescing to be active
 */
static int gfar_scoalesce(struct net_device *dev, struct ethtool_coalesce *cvals)
{
	struct gfar_private *priv = netdev_priv(dev);

	if (!(priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_COALESCE))
		return -EOPNOTSUPP;

	/* Set up rx coalescing */
	if ((cvals->rx_coalesce_usecs == 0) ||
	    (cvals->rx_max_coalesced_frames == 0))
		priv->rxcoalescing = 0;
	else
		priv->rxcoalescing = 1;

	if (NULL == priv->phydev)
		return -ENODEV;

	/* Check the bounds of the values */
	if (cvals->rx_coalesce_usecs > GFAR_MAX_COAL_USECS) {
		pr_info("Coalescing is limited to %d microseconds\n",
			GFAR_MAX_COAL_USECS);
		return -EINVAL;
	}

	if (cvals->rx_max_coalesced_frames > GFAR_MAX_COAL_FRAMES) {
		pr_info("Coalescing is limited to %d frames\n",
			GFAR_MAX_COAL_FRAMES);
		return -EINVAL;
	}

	priv->rxtime = gfar_usecs2ticks(priv, cvals->rx_coalesce_usecs);
	priv->rxcount = cvals->rx_max_coalesced_frames;

	/* Set up tx coalescing */
	if ((cvals->tx_coalesce_usecs == 0) ||
	    (cvals->tx_max_coalesced_frames == 0))
		priv->txcoalescing = 0;
	else
		priv->txcoalescing = 1;

	/* Check the bounds of the values */
	if (cvals->tx_coalesce_usecs > GFAR_MAX_COAL_USECS) {
		pr_info("Coalescing is limited to %d microseconds\n",
			GFAR_MAX_COAL_USECS);
		return -EINVAL;
	}

	if (cvals->tx_max_coalesced_frames > GFAR_MAX_COAL_FRAMES) {
		pr_info("Coalescing is limited to %d frames\n",
			GFAR_MAX_COAL_FRAMES);
		return -EINVAL;
	}

	priv->txtime = gfar_usecs2ticks(priv, cvals->tx_coalesce_usecs);
	priv->txcount = cvals->tx_max_coalesced_frames;

	if (priv->rxcoalescing)
		gfar_write(&priv->regs->rxic,
			   mk_ic_value(priv->rxcount, priv->rxtime));
	else
		gfar_write(&priv->regs->rxic, 0);

	if (priv->txcoalescing)
		gfar_write(&priv->regs->txic,
			   mk_ic_value(priv->txcount, priv->txtime));
	else
		gfar_write(&priv->regs->txic, 0);

	return 0;
}

B. I have two HP machines, each with a Gbit NIC. I can also run ethtool -C eth0 on them.
However, I have not yet managed to change the coalescing settings.
The ThinkPad PC used as the packet monitor also has a Gbit NIC.

C. milestone & schedule
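6/2 Check the performance of the machines & look at how coalescing is implemented in the device driver
6/9 Find out how to configure interrupt coalescing & set up the experimental environment
6/16 Implement the CPU time measurement code in iperf & measure the time between packets
6/23 Determine the accuracy of the CPU time measurement & start measuring
6/30 Finish measuring CPU time and packet loss while varying coalescing (with a limit imposed on the CPU)
7/7 Analysis and discussion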


3.
4 hours