Wt embedded » History » Version 31

Wim Dumon, 11/15/2022 02:18 PM

Wt embedded
===========

{{toc}}

This page collects information on running Wt on resource-constrained embedded systems: performance, code size, memory usage, and related notes.

General
-------

Wt can easily be built for and deployed on embedded POSIX systems, such as embedded Linux.

### Cross-building

Using CMake with a cross compilation environment: to be completed...

Instructions for cross-compiling with CMake can be found on the [CMake Wiki](http://www.cmake.org/Wiki/CMake_Cross_Compiling).

Wt user Alistair of QuickForge has written a blog post about cross-compiling for ARM on Windows that uses Wt as its example: [Exploration of Cross-Compiling on Windows for ARM Linux Distributions](http://blog.quickforge.co.uk/2011/10/exploration-of-cross-compiling-on-windows-for-arm-linux-distributions/).
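
As a starting point for the CMake route, a toolchain file usually names the target system and the cross compiler. The sketch below is a small shell helper that prints such a file. The function name `write_toolchain`, the compiler prefix and the sysroot path are illustrative assumptions, not values taken from an actual Wt build.

```shell
#!/bin/sh
# Print a minimal CMake toolchain file for a cross compiler.
# $1: compiler prefix (assumed example: arm-linux-)
# $2: target sysroot  (assumed example: /opt/target/sysroot)
write_toolchain() {
    prefix=$1
    sysroot=$2
    cat <<EOF
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_C_COMPILER ${prefix}gcc)
set(CMAKE_CXX_COMPILER ${prefix}g++)
set(CMAKE_FIND_ROOT_PATH ${sysroot})
# Look for libraries and headers only in the target tree,
# and for programs only on the build host.
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
EOF
}
```

A possible use is `write_toolchain arm-linux- /opt/target/sysroot > toolchain.cmake`, followed by passing `-DCMAKE_TOOLCHAIN_FILE=toolchain.cmake` to `cmake` when configuring.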

### Optimizing executable size

Points to consider when optimizing the executable size.

For building Boost:

-   Use a static build of Boost, which allows the linker to strip away unused symbols.

-   Use the following compile flags for Boost:

    -   `-fvisibility=hidden -fvisibility-inlines-hidden`: to avoid exporting symbols in the executable

    -   `-ffunction-sections -fdata-sections`: to allow fine-grained garbage collection of unused functions/data

For building Wt:

-   Choose build type `MinSizeRel`.

-   Extra compile flags (`CMAKE_CXX_FLAGS`):

    -   `-fvisibility=hidden -fvisibility-inlines-hidden`: to avoid exporting symbols in the executable

    -   `-ffunction-sections -fdata-sections`: to allow fine-grained garbage collection of unused functions/data

    -   `-DHAVE_GNU_REGEX`: to avoid the dependency on libboost\_regex, when building on a system that is based on glibc or uClibc

    -   `-DWT_NO_LAYOUT`: to avoid pulling in Wt's layout managers, if you are not using any WLayout classes

    -   `-DWT_NO_SPIRIT`: to avoid depending on Boost.Spirit to parse locale and cookies (if you don't need that)

    -   `-DWT_NO_XSS_FILTER`: to avoid the extra (runtime) overhead of XSS filtering, usually not relevant for a trusted embedded platform

-   Build static libraries (for libwt.a and libwthttp.a):

    -   in CMake: `SHARED_LIBS:BOOL=OFF`

-   Disable build options that you don't need and that introduce extra dependencies (libz, OpenSSL?).

-   Further tune your linker command:

    -   Append `-v` to the linker command used by CMake to see the raw `collect2` command line.

    -   By default, linking against shared or static libraries is all-or-nothing with CMake. However, you probably want to link dynamically against the system-wide versions of libstdc++, libm and libc, depending on which other applications on your device use them.

        -   Use `-Bdynamic` in front of the libraries you wish to link against dynamically.

    -   Some other flags are needed to make sure the linker does not keep unused symbols:

        -   Remove `-export-dynamic`

        -   Add `--gc-sections`

-   Strip your binary using `strip -s`.

-   Optionally, when available for your platform, you may want to compress your binary using the [Ultimate Packer for eXecutables (upx)](http://upx.sourceforge.net/). This typically reduces executable size by a further 60-70%, without noticeable run-time performance hits.
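
The Wt build settings listed above can be collected into a single configure line. The following is a minimal sketch: the helper name `wt_configure_cmd` and the source path `../wt` are hypothetical, and the full set of `-D` defines is only appropriate if your application needs none of the corresponding features.

```shell
#!/bin/sh
# Compile flags from the list above. Drop any -D option that does not fit
# your application (for example -DWT_NO_LAYOUT if you use WLayout classes).
WT_SIZE_CXXFLAGS="-fvisibility=hidden -fvisibility-inlines-hidden \
-ffunction-sections -fdata-sections \
-DHAVE_GNU_REGEX -DWT_NO_LAYOUT -DWT_NO_SPIRIT -DWT_NO_XSS_FILTER"

# Print (without running) a CMake configure line for a size-optimized,
# static Wt build. $1: path to the Wt source tree (hypothetical: ../wt).
wt_configure_cmd() {
    printf 'cmake -DCMAKE_BUILD_TYPE=MinSizeRel -DSHARED_LIBS:BOOL=OFF'
    printf ' -DCMAKE_CXX_FLAGS="%s" %s\n' "$WT_SIZE_CXXFLAGS" "$1"
}
```

Running `wt_configure_cmd ../wt` prints the command so it can be reviewed before use. The linker options (`--gc-sections`, `-Bdynamic`) belong on the link line of your own application; when linking through the gcc driver they are typically passed as `-Wl,--gc-sections` and `-Wl,-Bdynamic`.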

### Measuring performance

To report the run-time performance of Wt on a particular embedded platform, connect to the device over a local area connection (through at most one switch) and use a packet sniffer to measure the time between transmission and reception of packets. For the measurements, we use two examples that are included in the Wt distribution: [hello](http://www.webtoolkit.eu/wt/examples/hello/hello.wt) (as an example of a minimal application), and [composer](http://www.webtoolkit.eu/wt/examples/composer/composer.wt) (as an example of a simple, yet functional, application).

We propose to measure the time to create a new session, and the time to process a small event.

#### Runtime: new session

Wt starts a new session by serving a small page that determines browser capabilities, which then triggers a second request to get the "main page" with all visible content. To compare the relative performance for a particular platform, measure this "load" time as the total duration of these two requests, that is, the time from sending the first request to sending the third request. The third request is either a GET request for auxiliary content (CSS or images), a GET request to a Wt resource, or a POST request to load invisible content in the background.
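
The first-to-third-request rule can be applied mechanically to a packet capture. The sketch below assumes the capture has been exported as one line per HTTP request, with the capture timestamp in seconds as the first field. That export format and the helper name `new_session_time` are assumptions, and the timestamps in the example are made up.

```shell
#!/bin/sh
# Read one HTTP request per line on stdin (first field: timestamp in
# seconds) and print the time from the first to the third request.
# Exits with status 1 if fewer than three requests were seen.
new_session_time() {
    awk 'NR == 1 { first = $1 }
         NR == 3 { printf "%.3f\n", $1 - first; found = 1; exit }
         END     { if (!found) exit 1 }'
}

# Example with made-up timestamps:
#   printf '1.00 GET /\n1.12 GET /?js=yes\n1.25 GET /style.css\n' | new_session_time
```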

#### Runtime: event

We estimate the time needed to process a small event, such as a click on the "Greet me" button in hello or on "Save now" in composer, by measuring the total time for the packet exchange triggered by that event.

#### Memory usage: basis

Measuring memory usage is tricky, since the code and read-only data memory used by shared libraries is effectively shared between processes, while writable data segments are private to each process.

Therefore, we use `pmap` to study the memory in the different segments. The basis RAM usage is divided between read-only segments and writable segments. Only the latter are really constrained by physical RAM. We get the total writable size by summing the sizes of all writable segments, which pmap marks with a **w**. The total size reported by pmap and top, minus the size of all writable segments, is then the read-only RAM usage. This number therefore includes shared libraries and overestimates the actual RAM usage.
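
The split into writable and read-only sizes can be computed from the pmap listing with a short awk filter. This sketch assumes the procps-style output of `pmap PID` (address, size with a `K` suffix, mode, mapping); other pmap implementations may use a different layout. The helper name `pmap_split` is hypothetical.

```shell
#!/bin/sh
# Read `pmap PID` output on stdin and print the summed size of writable
# segments and of all remaining (read-only) segments, in KBytes.
pmap_split() {
    awk '$2 ~ /^[0-9]+K$/ && $3 ~ /^[r-][w-][x-]/ {
             kb = $2 + 0                      # "228K" -> 228
             total += kb
             if (substr($3, 2, 1) == "w") writable += kb
         }
         END { printf "writable=%dK read-only=%dK\n", writable, total - writable }'
}
```

With a running application this could be used as `pmap "$(pidof app.wt)" | pmap_split`, assuming `pidof` is available on the target.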

#### Memory usage: per session

Compare the memory usage after starting 10 sessions with the basis memory usage, and divide the difference by 10 to estimate the memory used by a single session.
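
The per-session estimate is a simple difference quotient. As a minimal sketch, with a hypothetical helper name and made-up numbers in the example:

```shell
#!/bin/sh
# Estimate the memory used per session, in KBytes.
# $1: basis usage, $2: usage after starting N sessions, $3: N (default 10)
per_session_kb() {
    awk -v base="$1" -v after="$2" -v n="${3:-10}" \
        'BEGIN { printf "%.1f\n", (after - base) / n }'
}

# Example with made-up numbers: per_session_kb 200 340 prints 14.0
```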

Platforms
---------

### ARM926EJ-S

#### Processor features

-   Clock speed: 200 MHz

-   Linux BogoMIPS: 89.70

-   Caches: 8K instructions, 8K data

Configurations are ordered chronologically, latest first.

#### Configuration 3: minimal (15/12/2010)

##### Setup

-   **Wt version:** Git (15/12/2010, > Wt 3.1.7)

-   **Target system:** Linux uClibc 2.6.23

-   **Build environment:** buildroot, arm-linux-gcc 4.2.1

-   **Options:** without multi-threading, libz and OpenSSL

-   **Build type:** full static build, except for: libstdc++, libc, and libm

-   **Runtime settings:** `./app.wt --docroot . --http-address 0.0.0.0 --no-compression`

##### Performance results

**Runtime performance**

| Program  | New session (http) | Event (http) |
|----------|--------------------|--------------|
| hello    | 0.19 s             | 0.06 s       |
| composer | 0.60 s             | 0.07 s       |

#### Configuration 2: minimal (16/03/2010)

##### Setup

-   **Wt version:** Git (16/03/2010, >= Wt 3.1.1)

-   **Target system:** Linux uClibc 2.6.23

-   **Build environment:** buildroot, arm-linux-gcc 4.2.1

-   **Options:** without multi-threading, libz and OpenSSL

-   **Build type:** full static build, except for: libstdc++, libc, and libm

-   **Runtime settings:** `./app.wt --docroot . --http-address 0.0.0.0 --no-compression`

##### Performance results

**Code size and RAM usage (in KBytes)**

| Program  | Code size (strip) | Code size (strip + upx) | RAM: basis † (read-only) | RAM: basis (writable) | RAM: per session |
|----------|-------------------|-------------------------|--------------------------|-----------------------|------------------|
| hello    | 1214              | 362                     | 2544                     | 228                   | 14.8             |
| composer | 1462              | 420                     | 2796                     | 232                   | 83.6             |

† includes shared libraries

**Runtime performance**

| Program  | New session (http) | Event (http) |
|----------|--------------------|--------------|
| hello    | 0.26 s             | 0.07 s       |
| composer | 0.69 s             | 0.08 s       |

#### Configuration 1: minimal (18/03/2008)

##### Setup - Original Test Page

-   **Wt version:** CVS-snapshot 18/03/08

-   **Target system:** Linux uClibc 2.6.23

-   **Build environment:** buildroot, arm-linux-gcc 4.2.1

-   **Options:** with multi-threading, but without libz and OpenSSL

-   **Build type:** full static build, except for: libc, libpthread, libdl, libstdc++, and libm

-   **Build settings:** MinSizeRel, `-DHAVE_GNU_REGEX`

-   **Runtime settings:** `./app.wt --docroot . --http-address 0.0.0.0 --threads=2 --no-compression`

##### Performance results

**Code size and RAM usage (in KBytes)**

| Program  | Code size (strip) | Code size (strip + upx) | RAM: basis † (read-only) | RAM: basis (writable) | RAM: per session |
|----------|-------------------|-------------------------|--------------------------|-----------------------|------------------|
| hello    | 1130              | 304                     | 2580                     | 372                   | 28               |
| composer | 1265              | 332                     | 2712                     | 372                   | 126              |

† includes shared libraries

**Runtime performance**

| Program  | New session (http) | Event (http) |
|----------|--------------------|--------------|
| hello    | 0.58 s             | 0.15 s       |
| composer | 1.8 s              | 0.15 s       |