ARB compiles with vectorization (enabled by -O3 in NDEBUG or RELEASE mode).
The vectorization check is automatically activated for gcc versions >= 4.9.0.
Compiler output of earlier versions is not suitable for automated checks.

To ensure that code does not degrade (i.e. does not fail to vectorize), the
compiler output is searched for successful vectorizations and the result is
checked against special comments in the code.

Currently there are two types of comments for vectorization checks:

    IRRELEVANT_LOOP
    LOOP_VECTORIZED[[<count>][<conditions>]]*

Use IRRELEVANT_LOOP to mark a loop where vectorization occurs (maybe only
sometimes) and where it isn't mandatory. No error or warning will show if
vectorization fails.

Note: You have to use IRRELEVANT_LOOP for conditional code (e.g. unittest code).

Use LOOP_VECTORIZED for loops that SHALL get vectorized (i.e. loops with
relevant impact on performance). If vectorization fails for a loop commented
with LOOP_VECTORIZED, compilation will fail with an error.
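
As an illustration of the two markers, here is a minimal sketch. The functions
scale_add and fill_test_pattern are made up for this example and are not part
of ARB. Placing the marker on the line of the loop header is an assumption
(that is the line a compiler would report for a vectorized loop). Whether gcc
really vectorizes these loops depends on compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hot loop: vectorization is required here, so a failure
// to vectorize is meant to break the build.
void scale_add(int* dst, const int* src, std::size_t n, int factor) {
    assert(dst && src);
    for (std::size_t i = 0; i < n; ++i) { // LOOP_VECTORIZED
        dst[i] += src[i] * factor;
    }
}

// Hypothetical helper as it might appear in test-only code:
// vectorization is welcome, but its absence is not reported.
void fill_test_pattern(int* buf, std::size_t n) {
    assert(buf);
    for (std::size_t i = 0; i < n; ++i) { // IRRELEVANT_LOOP
        buf[i] = static_cast<int>(i) * 3;
    }
}
```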

The optional argument <count> specifies the number of vectorizations that
have to be performed. This is relevant if the loop is part of a template, gets
vectorized multiple times and it shall be ensured that vectorization
happens for each instantiation of that template.

The standard syntax of the <count> argument is "=<num>".
This requires that <num> vectorizations take place.
The default for <count> is "=1" (if not specified; only directly after LOOP_VECTORIZED).

Sometimes the count depends on the optimization level (esp. whether a
function gets inlined or not). You may specify '=<num>|<num>' to accept
several different numbers of vectorizations.
(Hint: test with DEBUG=0 and UNIT_TEST=0 to check under RELEASE conditions)
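
The <count> argument can be illustrated with a templated loop. This is only a
sketch: add_arrays is a made-up function, the marker position on the loop
header line is an assumption, and the number of vectorizations a given gcc
version actually reports for it has not been verified.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical template; each instantiation produces its own copy of the loop.
template <typename T>
void add_arrays(T* dst, const T* src, std::size_t n) {
    assert(dst && src);
    for (std::size_t i = 0; i < n; ++i) { // LOOP_VECTORIZED=2
        dst[i] += src[i];
    }
}

// Two explicit instantiations, hence two expected vectorizations.
template void add_arrays<int>(int*, const int*, std::size_t);
template void add_arrays<float>(float*, const float*, std::size_t);
```

If the number of reported vectorizations differed between build modes, the
marker could list the accepted alternatives instead, e.g.
// LOOP_VECTORIZED=2|3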

The 2nd optional argument <conditions> allows including/excluding compiler
versions that succeed/fail to vectorize that loop with the specified count.

The syntax is "["<cond>[","<cond>]*"]".
<cond> has the following syntax:

    ["!"]("<"|">")["="]<version>

<version> is a specific compiler version (e.g. "4.9.3" or "493" or "5").

Multiple conditions can be specified in one <cond>, e.g. "!>=5<7" excludes
all compilers of the 5.x and 6.x series.

Examples for <cond>:

    493       only version 4.9.3
    !493      all but 4.9.3
    >493      all versions after 4.9.3
    >=5<7     5.x and 6.x series
    !>=8<9    all but 8.x series
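
To make the meaning of a single <cond> concrete, the following sketch
evaluates the examples above against a compiler version. It is not the logic
of postcompile.pl. The names parse_version and cond_matches are invented here,
and several points are assumptions: a version is encoded as
major*10000 + minor*100 + patch, an undotted version such as "493" is read
digit by digit, missing parts count as 0, and a version without an operator
means an exact match.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Assumed encoding: "4.9.3" and "493" both become 40903, "5" becomes 50000.
int parse_version(const std::string& s) {
    assert(!s.empty());
    int part[3] = {0, 0, 0};
    int idx = 0;
    const bool dotted = s.find('.') != std::string::npos;
    for (char c : s) {
        if (c == '.') { ++idx; continue; }
        if (idx > 2) break;                      // ignore surplus components
        if (dotted) part[idx] = part[idx] * 10 + (c - '0');
        else        part[idx++] = c - '0';       // one digit per component
    }
    return part[0] * 10000 + part[1] * 100 + part[2];
}

// Evaluates one <cond>, e.g. "493", "!493", ">493", ">=5<7" or "!>=8<9".
// All comparisons inside the <cond> must hold; a leading '!' inverts the result.
bool cond_matches(const std::string& cond, int compiler) {
    std::size_t pos = 0;
    bool negate = false;
    if (pos < cond.size() && cond[pos] == '!') { negate = true; ++pos; }
    bool result = true;
    while (pos < cond.size()) {
        char op = '=';                           // no operator: exact match (assumption)
        bool or_equal = false;
        if (cond[pos] == '<' || cond[pos] == '>') {
            op = cond[pos++];
            if (pos < cond.size() && cond[pos] == '=') { or_equal = true; ++pos; }
        }
        const std::size_t start = pos;
        while (pos < cond.size() &&
               (std::isdigit(static_cast<unsigned char>(cond[pos])) || cond[pos] == '.')) {
            ++pos;
        }
        if (pos == start) return false;          // malformed: no version found
        const int v = parse_version(cond.substr(start, pos - start));
        bool ok = (op == '<') ? compiler < v
                : (op == '>') ? compiler > v
                              : compiler == v;
        if (or_equal && compiler == v) ok = true;
        result = result && ok;
    }
    return negate ? !result : result;
}
```

For example, cond_matches(">=5<7", parse_version("5.3.1")) yields true, while
cond_matches("!>=8<9", parse_version("8.1.0")) yields false.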

Example: // LOOP_VECTORIZED[!493,!531]
    => expects that vectorization occurs for all compiler versions but 4.9.3 and 5.3.1

Warning: <conditions> reports true only if ALL <cond> are true!
    => // LOOP_VECTORIZED[493,531]
       will NEVER be true!!!

To chain conditions with the OR operator, use
    // LOOP_VECTORIZED[493][531]
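
The AND/OR behaviour described above can be sketched as follows. This is a
deliberately limited illustration, not the checker's implementation:
simple_cond and conditions_match are invented names, versions are compared as
plain numbers like 493, and only the forms "<num>" and "!<num>" are supported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Limited <cond>: "<num>" matches exactly, "!<num>" matches everything else.
bool simple_cond(const std::string& cond, int compiler) {
    if (cond.empty()) return false;
    const bool negate = cond[0] == '!';
    const int version = std::atoi(cond.c_str() + (negate ? 1 : 0));
    const bool equal = compiler == version;
    return negate ? !equal : equal;
}

// <conditions> such as "[!493,!531]" or "[493][531]":
// the <cond>s inside one bracket pair are ANDed, the bracket groups are ORed.
// A string without any bracket group is not handled here and yields false.
bool conditions_match(const std::string& conditions, int compiler) {
    assert(compiler > 0);                        // versions are positive, e.g. 493
    bool any = false;
    std::size_t pos = 0;
    while ((pos = conditions.find('[', pos)) != std::string::npos) {
        const std::size_t end = conditions.find(']', pos);
        if (end == std::string::npos) break;     // unterminated group
        const std::string group = conditions.substr(pos + 1, end - pos - 1);
        bool all = true;
        std::size_t start = 0;
        while (start <= group.size()) {
            std::size_t comma = group.find(',', start);
            if (comma == std::string::npos) comma = group.size();
            all = all && simple_cond(group.substr(start, comma - start), compiler);
            start = comma + 1;
        }
        any = any || all;
        pos = end + 1;
    }
    return any;
}
```

With this sketch, "[493,531]" matches neither 493 nor 531, whereas
"[493][531]" matches both.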

Included code:

Vectorization checks do not work very well with included code (e.g. inline
code from headers): IRRELEVANT_LOOP behaves normally, but LOOP_VECTORIZED may
differ for different includers. 'LOOP_VECTORIZED=*' might work, but is not
recommended, because if code starts to fail vectorization, no warning will
show up! If possible, move the code from the header into a source file.

Recompilation needed for complete check:

To force a complete check of expected vectorizations, call target
'clean_checked_vect' before the build! A failure that occurs only in the
vectorization check does not delete the generated object file, i.e. without
a change to the affected source file no recompilation will happen.

Alternative ways to use the vectorization checker:

Set ../Makefile@TRACE_MISSED_VECTORIZATIONS
to 1 to dump details about performed and failed vectorizations.

Set ../Makefile@VECTORIZATION_CHECK_CANDIDATES
to 1 to dump candidates from unchecked files.

Related stuff:

If you encounter problems caused by vectorization, you may use
../TEMPLATES/attributes.h@__ATTR__DONT_VECTORIZE

Some pointers (if you need to adapt vectorization checking):

- global vectorization toggles:
    ../Makefile@DISABLE_VECTORIZE_CHECK
        (automatically disabled if sanitizer enabled)
    ../Makefile@TRACE_MISSED_VECTORIZATIONS
    ../Makefile@VECTORIZATION_CHECK_CANDIDATES

- the build maintains a list of source files to be checked using
  target 'vectorize_checks' (called via target 'depends').
  The list is stored in ./vectorized.source
  and is generated by ./Makefile@vectorize_checks

- the actual check is performed by ./postcompile.pl@parse_expected_vectorizations