Fig. 23. A fragment of our Supple re-implementation of the Microsoft Ribbon: (a) rendered for a wide window; (b) Supple automatically provides the size adaptations (enlarging commonly-used functionality based on the user trace) that are manually designed in the original version of the MS Ribbon; (c) unlike the manually designed Ribbon, the Supple version allows users to add, delete, copy and move functionality; in this example, a New Container section was added, its contents were copied via drag-and-drop operations from other parts of the interface, and the Quick Style button was removed from the Drawing panel; the customized Supple version of the Ribbon can still adapt to different size constraints.

Because execution time is proportional to the number of nodes expanded (see Fig. 26) and is hardware-dependent, we omit it when comparing different algorithm variants, but we report it in the next subsection when we consider the scalability of the approach.

Fig. 24 shows the performance of the three variable ordering heuristics across the range of screen size constraints for three interfaces of different levels of complexity: the classroom controller (like the one in Fig. 20), a print dialog box (Fig. ), and a stereo controller interface (Fig. ). This figure illustrates the existence of narrow bands in the problem space where the algorithms perform up to several orders of magnitude worse than in other parts of the problem space. It also illustrates another important phenomenon: the MRV and bottom-up heuristics tend to exhibit their worst performance in slightly different parts of the problem space. This is an important observation because it suggests that an actual algorithm-independent phase transition phenomenon may not be taking place, and that combining these two approaches can result in an algorithm that performs orders of magnitude better in the worst case than either of the two approaches alone.

Motivated by the above results, we implemented two variants of a parallel algorithm. The first, which will be referred to as Parallel₂, concurrently runs (in parallel threads) two searches driven by the two variable ordering heuristics whose regions of worst performance do not overlap: the bottom-up and the MRV heuristics. The second, Parallel₃, runs three concurrent searches, one for each of the three heuristics. In both parallel algorithms, we expected to see the benefit of the individual algorithms experiencing their worst performance in different regions of the problem space. In addition, the parallel searches explore the solution space in different orders, coming across solutions at different times, but they share the bestCost variable (see Table 1) used for branch-and-bound pruning. Therefore, we expected that sharing the cost of the best solution found so far by one of the searches would improve the pruning power of the others.

Fig. 25 shows the performance of the two parallel algorithms on the three example interfaces introduced in Fig. 24. In each case, the best-performing variant from Fig. 24 is also shown for comparison. Note that unlike the previous figure, this one uses a logarithmic scale on the y-axes to highlight large differences in performance. The average-case performance in all instances remained the same as that of the single search, but, as expected, the worst-case performance improved dramatically: by an order of magnitude in the case of the classroom interface, and by nearly two orders of magnitude for the print dialog and the stereo interface.
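To make the shared-bound mechanism concrete, the sketch below illustrates one way such a portfolio can be implemented. It is a hypothetical illustration rather than Supple's actual code: the class, interface, and method names (PortfolioSearch, Search, publish, solve) are our own, and each Search stands in for a branch-and-bound search driven by one variable ordering heuristic.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the parallel portfolio described above (names
// are ours, not Supple's): one branch-and-bound search per variable
// ordering heuristic, all sharing the cost of the best solution so far.
public final class PortfolioSearch {

    /** A complete interface rendering together with its cost. */
    record Solution(Object rendering, double cost) {}

    /** One heuristic-driven branch-and-bound search (assumed interface). */
    interface Search {
        // Implementations consult best.get().cost() to prune any partial
        // assignment whose lower bound already exceeds it, and call
        // publish(best, candidate) whenever they complete a rendering.
        void run(AtomicReference<Solution> best);
    }

    /** Lock-free publish: install the candidate only if it improves the bound. */
    static void publish(AtomicReference<Solution> best, Solution candidate) {
        best.accumulateAndGet(candidate,
            (cur, cand) -> cand.cost() < cur.cost() ? cand : cur);
    }

    /** Runs all searches to completion; the shared bound then holds the optimum. */
    static Solution solve(List<Search> searches) {
        AtomicReference<Solution> best = new AtomicReference<>(
            new Solution(null, Double.POSITIVE_INFINITY));
        List<CompletableFuture<Void>> runs = searches.stream()
            .map(s -> CompletableFuture.runAsync(() -> s.run(best)))
            .toList();
        runs.forEach(CompletableFuture::join);  // wait for every thread
        return best.get();
    }
}
```

In this sketch, Parallel₂ would amount to calling solve with the bottom-up and MRV searches, and Parallel₃ with all three. Because each branch-and-bound search is complete regardless of expansion order, the shared bound holds the optimal cost once all threads finish, and a cheap solution found early by one ordering immediately tightens the pruning threshold used by the others.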
7.4.2. Scalability

Next, we investigate how our approach scales with the complexity of the interfaces it generates. We evaluated the two parallel algorithms on 11 different user interfaces, a number of which are used as examples throughout this article. In Table 2, we first report for each interface the number of elements in the functional specification (excluding those whose rendering is constrained through the same rendering constraint) and the number of possible interface renderings that the algorithm will consider. As in the previous section, for each interface we measured the performance at 101 different points throughout the problem space. We measured both the number of nodes explored and the total time taken to produce the best solution. We report both the average-case values (the median across all trials) and the worst-case numbers.
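As an aside on how such numbers can be aggregated, the following small helper is an illustrative sketch (our own, not part of Supple) of deriving the reported average case as the median, and the worst case as the maximum, over the per-point measurements.

```java
import java.util.Arrays;

// Illustrative helper (not Supple code): aggregate the 101 per-point
// measurements (nodes explored, or milliseconds) into the two
// statistics reported per interface.
final class TrialStats {
    /** Average case: the median across all sample points. */
    static double median(double[] samples) {
        double[] s = samples.clone();
        Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    /** Worst case: the maximum across all sample points. */
    static double worstCase(double[] samples) {
        return Arrays.stream(samples).max().orElse(Double.NaN);
    }
}
```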