Wt embedded » History » Version 11
Koen Deforche, 03/16/2010 08:55 AM
1 | 1 | Pieter Libin | h1. Wt embedded |
---|---|---|---|
2 | |||
3 | {{toc}} |
||
4 | |||
5 | This page collects information on running Wt on resource-constrained embedded systems: performance, code size, memory usage, and other notes. |
||
6 | |||
7 | h2. General |
||
8 | |||
9 | Wt can easily be built for and deployed on embedded POSIX systems, such as embedded Linux. |
||
10 | |||
11 | h3. Cross-building |
||
12 | |||
13 | Using CMake with a cross compilation environment: to be completed... |
||
14 | |||
15 | 8 | Koen Deforche | Instructions for cross compiling with cmake can be found on the "CMake Wiki":http://www.cmake.org/Wiki/CMake_Cross_Compiling. |
16 | 1 | Pieter Libin | |
17 | h3. Optimizing executable size |
||
18 | |||
19 | 8 | Koen Deforche | Points to consider when optimizing the executable size. |
20 | 1 | Pieter Libin | |
21 | 8 | Koen Deforche | For building boost: |
22 | * Use a static build of boost, which allows the linker to strip away unused symbols |
||
23 | * Use the following compile flags for boost: |
||
24 | ** @-fvisibility=hidden -fvisibility-inlines-hidden@: to avoid exporting symbols in the executable |
||
25 | ** @-ffunction-sections -fdata-sections@: to allow fine-grained garbage collection of unused functions/data |
||
26 | |||
27 | For building Wt: |
||
28 | * Choose build-type @MinSizeRel@ |
||
29 | * Extra compile flags (@CMAKE_CXX_FLAGS@): |
||
30 | ** @-fvisibility=hidden -fvisibility-inlines-hidden@: to avoid exporting symbols in the executable |
||
31 | ** @-ffunction-sections -fdata-sections@: to allow fine-grained garbage collection of unused functions/data |
||
32 | ** @-DHAVE_GNU_REGEX@: to avoid the dependency on libboost_regex, when building on a system that is based on glibc or uClibc |
||
33 | ** @-DWT_NO_LAYOUT@: to avoid pulling in Wt's layout managers, if you are not using any WLayout classes |
||
34 | ** @-DWT_NO_SPIRIT@: to avoid depending on spirit to parse locale and cookies (if you don't need that) |
||
35 | ** @-DWT_NO_XSS_FILTER@: to avoid the extra (runtime) overhead of XSS filtering, usually not relevant for a trusted embedded platform |
||
36 | 1 | Pieter Libin | * Build static libraries (for libwt.a and libwthttp.a) |
37 | 8 | Koen Deforche | ** in CMake: @SHARED_LIBS:BOOL=OFF@ |
38 | 1 | Pieter Libin | * Disable build options that you don't need and that introduce extra dependencies (for example libz and OpenSSL)
39 | 8 | Koen Deforche | * Further tune your linker command: |
40 | ** Append @-v@ to the linker command used by CMake to see the raw @collect2@ command-line. |
||
41 | ** By default, linking against shared or static libraries is all-or-nothing with CMake. However, you probably want to use the system-wide shared versions of libstdc++, libm and libc, depending on the other applications on your device. |
||
42 | *** Use @-Bdynamic@ in front of libraries you wish to link dynamically against |
||
43 | ** There are some other flags that you need to use to make sure the linker does not keep unused symbols: |
||
44 | *** Remove @-export-dynamic@ |
||
45 | *** Add @--gc-sections@ |
||
46 | 9 | Koen Deforche | * Strip your binary using @strip -s@. |
47 | 1 | Pieter Libin | * Optionally, when available for your platform, you may want to compress your binary using the "Ultimate Packer for eXecutables (upx)":http://upx.sourceforge.net/. This typically reduces executable size by a further 60-70%, without noticeable run-time performance hits.
48 | |||
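The Wt build settings listed above can be collected into a single CMake invocation. The following is a minimal sketch, not a tested build recipe: the helper name @cmake_command@ and the source path are hypothetical, and passing @--gc-sections@ through @CMAKE_EXE_LINKER_FLAGS@ as @-Wl,--gc-sections@ assumes the link step is driven by gcc.

```python
import shlex

# Size-oriented compiler flags from the list above.
SIZE_CXX_FLAGS = [
    "-fvisibility=hidden",
    "-fvisibility-inlines-hidden",
    "-ffunction-sections",
    "-fdata-sections",
    "-DHAVE_GNU_REGEX",  # only on glibc or uClibc based systems
]


def cmake_command(source_dir, extra_defines=()):
    """Return the argv list for a size-optimized, static Wt configure step.

    extra_defines: optional names such as "WT_NO_LAYOUT" that are turned
    into -D compiler flags.
    """
    cxx_flags = " ".join(SIZE_CXX_FLAGS + [f"-D{name}" for name in extra_defines])
    return [
        "cmake",
        "-DCMAKE_BUILD_TYPE=MinSizeRel",
        "-DSHARED_LIBS:BOOL=OFF",
        f"-DCMAKE_CXX_FLAGS={cxx_flags}",
        # Assumption: gcc drives the linker, so the option is wrapped in -Wl,
        "-DCMAKE_EXE_LINKER_FLAGS=-Wl,--gc-sections",
        source_dir,
    ]


# Hypothetical source location; prints a command line that can be pasted into a shell.
print(shlex.join(cmake_command("../wt", ["WT_NO_LAYOUT", "WT_NO_XSS_FILTER"])))
```

Only add the @WT_NO_*@ defines for features your application really does not use; the remaining linker tuning (@-Bdynamic@, removing @-export-dynamic@) still has to be done on the final link command.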
49 | h3. Measuring performance |
||
50 | |||
51 | To report the run-time performance of Wt on a particular embedded platform, you must connect to the device using a local area connection (through at most one switch), and measure the time between transmission and reception of packets (using a packet sniffer). For the measurements, we use two examples that are included in the Wt distribution: "hello":http://www.webtoolkit.eu/wt/examples/hello/hello.wt (as an example of a minimal application), and "composer":http://www.webtoolkit.eu/wt/examples/composer/composer.wt (as an example of a simple, yet functional, application). |
||
52 | |||
53 | We propose to measure the time to create a new session, and the time of a small event. |
||
54 | |||
55 | |||
56 | h4. Runtime: new session |
||
57 | |||
58 | Wt starts a new session by serving a small page that determines browser capabilities and then triggers a second request for the "main page", which has all visible content. To compare the relative performance for a particular platform, you should measure this "load" time as the total duration of these two requests, that is, the time from sending the first request to sending the third request. The third request is either a GET request for auxiliary content (CSS or images), a GET request to a Wt resource, or a POST request to load invisible content in the background. |
||
59 | |||
60 | |||
61 | h4. Runtime: event |
||
62 | |||
63 | We estimate the time needed to process a small event, such as a click on the "Greet me" button in hello, and "Save now" in composer, by measuring the total time for the packet exchange triggered by such an event. |
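Both runtime measurements reduce to simple arithmetic on the timestamps reported by the packet sniffer. The sketch below assumes you have already exported those timestamps (in seconds) by hand or with a tool of your choice; the function names and the sample values are made up for illustration.

```python
def new_session_time(request_times):
    """Time from sending the first request to sending the third request.

    request_times: timestamps (seconds) of the client's requests, in capture order.
    """
    if len(request_times) < 3:
        raise ValueError("need at least three requests to delimit the session load")
    return request_times[2] - request_times[0]


def event_time(packet_times):
    """Total duration of the packet exchange triggered by one event."""
    if not packet_times:
        raise ValueError("no packets captured for this event")
    return max(packet_times) - min(packet_times)


# Made-up capture data, not measurements from a real device.
session_requests = [0.000, 0.150, 0.400, 0.450]
click_packets = [5.000, 5.010, 5.060, 5.075]

print(f"new session: {new_session_time(session_requests):.2f} s")
print(f"event:       {event_time(click_packets):.3f} s")
```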
||
64 | |||
65 | |||
66 | h4. Memory usage: basis |
||
67 | |||
68 | Measuring memory usage is tricky, since code and read-only data used by shared libraries are effectively shared between processes, while writable data segments are private to each process. |
||
69 | |||
70 | 10 | Koen Deforche | Therefore, we use @pmap@ to study the memory in different segments. The basis RAM usage is divided between read-only segments and writable segments. Only the latter are really constrained by physical RAM. We get the total writable size by summing the sizes of all writable segments, which pmap marks with a *w*. The total size reported by pmap and top, minus the size of all writable segments, is then the read-only RAM usage. This number includes shared libraries, and therefore overestimates actual RAM usage.
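
The summation described above can be scripted. This is a minimal sketch that assumes the default output layout of procps @pmap <pid>@ (address, size with a @K@ suffix, permissions, mapping name); other pmap implementations, such as the BusyBox one, may format their output differently. The sample text is invented and does not come from a real device.

```python
def summarize_pmap(text):
    """Split the mapped memory reported by pmap into writable and read-only KB."""
    writable = total = 0
    for line in text.splitlines():
        fields = line.split()
        # Mapping lines look like: ADDRESS SIZEK PERMS NAME.
        # The header line and the final "total" line have fewer fields.
        if len(fields) < 3:
            continue
        size = fields[1]
        if not size.endswith("K") or not size[:-1].isdigit():
            continue
        size_kb = int(size[:-1])
        total += size_kb
        if "w" in fields[2]:
            writable += size_kb
    return {"writable_kb": writable, "read_only_kb": total - writable}


# Invented sample in the assumed procps layout.
SAMPLE = """\
1234:   ./hello.wt
00008000    1200K r-x--  /root/hello.wt
0013c000      16K rw---  /root/hello.wt
00140000     100K rw---    [ anon ]
40000000     900K r-x--  /lib/libstdc++.so.6
400e8000      12K rw---  /lib/libstdc++.so.6
 total      2228K
"""

print(summarize_pmap(SAMPLE))
```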
71 | 1 | Pieter Libin | |
72 | h4. Memory usage: per session |
||
73 | |||
74 | Compare the memory usage after starting 10 sessions with base memory usage, and divide the difference by 10 to estimate the memory used by a single session. |
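
The per-session estimate is a single subtraction and division. The sketch below assumes that the writable size (as summed from pmap) is the figure being compared, which the text above does not state explicitly; the input values are hypothetical.

```python
def per_session_kb(base_kb, loaded_kb, sessions=10):
    """Estimate the memory used by one session, in KB.

    base_kb:   memory usage before any session is started
    loaded_kb: memory usage after `sessions` sessions have been started
    """
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return (loaded_kb - base_kb) / sessions


# Hypothetical readings: 200 KB at rest, 350 KB with 10 open sessions.
print(per_session_kb(200, 350))
```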
||
75 | |||
76 | h2. Platforms |
||
77 | |||
78 | h3. ARM926EJ-S |
||
79 | |||
80 | h4. Processor features |
||
81 | |||
82 | * Clock-speed: 200 MHz |
||
83 | * Linux BogoMIPS: 89.70 |
||
84 | * Caches: 8K instruction, 8K data |
||
85 | |||
86 | 8 | Koen Deforche | Configurations are listed in reverse chronological order, latest first.
87 | 1 | Pieter Libin | |
88 | 8 | Koen Deforche | h4. Config 2: minimal (16/03/2010) |
89 | 1 | Pieter Libin | |
90 | 8 | Koen Deforche | h5. Setup |
91 | |||
92 | * *Wt version:* git (16/03/2010, >= Wt 3.1.1) |
||
93 | * *Target system:* Linux uclibc 2.6.23 |
||
94 | * *Build environment:* buildroot, arm-linux-gcc 4.2.1 |
||
95 | * *Options:* without multi-threading, libz and OpenSSL |
||
96 | 11 | Koen Deforche | * *Build type:* full static build, except for: libstdc++, libc, and libm |
97 | 8 | Koen Deforche | * *Runtime settings:* ./app.wt --docroot . --http-address 0.0.0.0 --no-compression |
98 | |||
99 | h5. Performance results |
||
100 | |||
101 | *Code size and RAM usage (in KBytes)* |
||
102 | |_.Program|_.Code size (strip)|_.Code size (strip + upx)|_.RAM: basis † (read-only)|_.RAM: basis (writable)|_.RAM: per session| |
||
103 | 9 | Koen Deforche | | hello| 1.214 | 362 | 2.544 | 228 | 14.8 | |
104 | | composer| 1.462 | 420 | 2.796 | 232 | 83.6 | |
||
105 | 8 | Koen Deforche | |
106 | † includes shared libraries ! |
||
107 | |||
108 | *Runtime-performance* |
||
109 | |_.Program |_.New session (http) |_.Event (http)| |
||
110 | 9 | Koen Deforche | | hello | 0.26 s | 0.07 s | |
111 | 8 | Koen Deforche | |composer| 0.69 s | 0.08 s | |
112 | |||
113 | h4. Config 1: minimal (18/03/2008) |
||
114 | 1 | Pieter Libin | |
115 | h5. Setup |
||
116 | |||
117 | * *Wt version:* CVS-snapshot 18/03/08 |
||
118 | * *Target system:* Linux uclibc 2.6.23 |
||
119 | * *Build environment:* buildroot, arm-linux-gcc 4.2.1 |
||
120 | * *Options:* with multi-threading, but without libz and OpenSSL |
||
121 | * *Build type:* full static build, except for: libc, libpthread, libdl, libstdc++, and libm |
||
122 | * *Build settings:* MinSizeRel, -DHAVE_GNU_REGEX |
||
123 | * *Runtime settings:* ./app.wt --docroot . --http-address 0.0.0.0 --threads=2 --no-compression |
||
124 | |||
125 | h5. Performance results |
||
126 | 3 | Pieter Libin | |
127 | 4 | Pieter Libin | *Code size and RAM usage (in KBytes)* |
128 | 6 | Pieter Libin | |_.Program|_.Code size (strip)|_.Code size (strip + upx)|_.RAM: basis † (read-only)|_.RAM: basis (writable)|_.RAM: per session| |
129 | 5 | Pieter Libin | | hello| 1.130 | 304 | 2.580 | 372 | 28| |
130 | | composer| 1.265 | 332 | 2.712 | 372 | 126| |
||
131 | 1 | Pieter Libin | |
132 | 3 | Pieter Libin | † includes shared libraries ! |
133 | 1 | Pieter Libin | |
134 | 5 | Pieter Libin | *Runtime-performance* |
135 | 7 | Pieter Libin | |_.Program |_.New session (http) |_.Event (http)| |
136 | 5 | Pieter Libin | | hello | 0.58 s | 0.15 s | |
137 | |composer| 1.8 s | 0.15 s | |
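
The effect of upx can be read off the code size columns of both configurations. The following sketch computes the reduction for each row; it assumes that the period in values such as "1.214" is a thousands separator (1214 KBytes), which the tables do not state explicitly.

```python
# (stripped KB, stripped + upx KB) copied from the two tables on this page.
# Assumption: "1.214" means 1214 KB (period used as thousands separator).
SIZES_KB = {
    ("config 2", "hello"): (1214, 362),
    ("config 2", "composer"): (1462, 420),
    ("config 1", "hello"): (1130, 304),
    ("config 1", "composer"): (1265, 332),
}


def upx_reduction(stripped_kb, packed_kb):
    """Fraction of the stripped executable size that upx removes."""
    return 1.0 - packed_kb / stripped_kb


for (config, program), (stripped, packed) in SIZES_KB.items():
    print(f"{config} {program}: {upx_reduction(stripped, packed):.1%} smaller")
```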