Historically, memory protection and privilege levels have been implemented in hardware: memory protection via base/limit registers, segments, or pages; and privilege levels via user/kernel mode bits or rings [41]. Recent mobile code systems, however, rely on software rather than hardware for protection. The switch to software mechanisms is being driven by two needs: portability and performance.
The first argument for software protection is portability. A user-level software product like a browser must coexist with a variety of operating systems. For a Web browser to use hardware protection, the operating system would have to provide access to the page tables and system calls, but such mechanisms are not available universally across platforms. Software protection allows a browser to have platform-independent security mechanisms.
Second, software protection offers significantly cheaper cross-domain calls. To estimate the magnitude of the performance difference, we did two simple experiments. The results below should not be considered highly accurate, since they mix measurements from different architectures. However, the effect we are measuring is so large that these small inaccuracies do not affect our conclusions.
First, we measured the performance of a null call between Component Object Model (COM) objects on a 180 MHz Pentium Pro PC running Windows NT 4.0. When the two objects are in different tasks, the call takes 230 μsec; when the called object is in a dynamically linked library in the same task, the call takes only 0.2 μsec, a difference of roughly a factor of 1000. While COM is a very different system from Java, the performance disparity exhibited in COM would likely also appear if hardware protection were applied to Java. This ratio appears to be growing even larger in newer processors [36,1].
The time difference would be acceptable if cross-domain calls were very rare. But modern software structures, especially in mobile code systems, are leading to tighter binding between domains.
To illustrate this fact, we then instrumented the Sun Java Virtual
Machine (JVM) (JDK 1.0.2 interpreter running on a 167 MHz UltraSparc)
to measure the number of calls across trust boundaries, that is, the
number of procedure calls in which the caller is either more trusted
or less trusted than the callee. We measured two workloads.
CaffeineMark is a widely used Java performance benchmark, and SunExamples is the sum over 29 of the 35 programs on Sun's sample Java programs page.
(Some of Sun's programs run forever; we cut these off after 30
seconds.) Note that our measurements almost certainly underestimate
the rate of boundary crossings that would be seen on a more current
JVM implementation that used a just-in-time (JIT) compiler to execute
much more quickly.
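The counting rule described above can be sketched in a few lines. This is only an illustration of the definition of a boundary crossing, not the instrumentation we added to the JVM; the `CrossingCounter` class, the two-level trust encoding, and the sample trace are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CrossingCounter:
    """Counts calls whose caller and callee have different trust levels."""
    total_calls: int = 0
    crossings: int = 0

    def record_call(self, caller_trust: int, callee_trust: int) -> None:
        self.total_calls += 1
        # A crossing is any call in which the caller is either more
        # trusted or less trusted than the callee.
        if caller_trust != callee_trust:
            self.crossings += 1


# Hypothetical trust levels: 0 = untrusted applet code, 1 = system classes.
APPLET, SYSTEM = 0, 1

# Hypothetical call trace as (caller trust, callee trust) pairs.
trace = [(APPLET, APPLET), (APPLET, SYSTEM), (SYSTEM, SYSTEM), (SYSTEM, APPLET)]

counter = CrossingCounter()
for caller, callee in trace:
    counter.record_call(caller, callee)

print(counter.crossings, "crossings out of", counter.total_calls, "calls")
```

Note that calls in both directions count: an applet calling into system code and system code calling back into an applet are each one crossing.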
Figure 1 shows the results. Both workloads show roughly 30000 boundary crossings per CPU-second. Using our measured cost of 0.2 μsec for a local COM call, we estimate that these programs spend roughly 0.5% of their time crossing trust boundaries in the Java system with software protection (although the actual cost in Java may be significantly lower).
| Workload | time | crossings | crossings/sec |
| --- | --- | --- | --- |
| CaffeineMark | 40.9 | 1135975 | 26235 |
| SunExamples | 138.7 | 5483828 | 35637 |
Using our measured cost of 230 μsec for a cross-task COM call, we can
estimate the cost of boundary crossings for both workloads
in a hypothetical Java implementation that used hardware protection.
The cost is enormous: for CaffeineMark, about 6 seconds per useful
CPU-second of work, and for SunExamples, about 8 seconds per useful
CPU-second. In other words, a naïve implementation of hardware
protection would slow these programs down by a factor of 7 to 9. This
is unacceptable.
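The arithmetic behind these estimates can be reproduced with a short script. It is a sketch of the back-of-the-envelope model only: it assumes each crossing costs exactly one measured COM call (0.2 microseconds in the same task, 230 microseconds across tasks) and that this cost is simply added on top of each CPU-second of useful work. The function and constant names are ours, chosen for illustration.

```python
# Measured COM null-call costs, in seconds.
LOCAL_CALL_SEC = 0.2e-6       # same-task call, standing in for software protection
CROSS_TASK_CALL_SEC = 230e-6  # cross-task call, standing in for hardware protection

# Boundary crossings per CPU-second, taken from Figure 1.
CROSSING_RATES = {"CaffeineMark": 26235, "SunExamples": 35637}


def crossing_overhead(rate_per_sec: float, cost_per_crossing_sec: float) -> float:
    """Seconds spent crossing boundaries per CPU-second of useful work."""
    return rate_per_sec * cost_per_crossing_sec


def slowdown(rate_per_sec: float, cost_per_crossing_sec: float) -> float:
    """Total running time relative to a run with free boundary crossings."""
    return 1.0 + crossing_overhead(rate_per_sec, cost_per_crossing_sec)


for name, rate in CROSSING_RATES.items():
    software = crossing_overhead(rate, LOCAL_CALL_SEC)
    hardware = crossing_overhead(rate, CROSS_TASK_CALL_SEC)
    print(f"{name}: software {software:.2%} of a CPU-second, "
          f"hardware {hardware:.1f} s per CPU-second, "
          f"slowdown {slowdown(rate, CROSS_TASK_CALL_SEC):.1f}x")
```

Under these assumptions the script gives about 6.0 and 8.2 seconds of crossing overhead per useful CPU-second, which is where the slowdown factors of roughly 7 and 9 come from.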
Why are there so many boundary crossings? Surely the designers did not strive to increase the number of crossings; most likely they simply ignored the issue, knowing that the cost of crossing a Java protection boundary was no more than a normal function call. What our experiments show, then, is that the natural structure of a mobile code system leads to very frequent boundary crossings.
Of course, systems using hardware protection are structured differently; the developers do what is necessary to improve performance (e.g., batching calls together). In other words, they deviate from the natural structure in order to artificially reduce the number of boundary crossings. At best, this neutralizes the performance penalty of hardware protection at a cost in both development time and the usefulness and maintainability of the resulting system. At worst, the natural structure is sacrificed and a significant performance gap still exists.
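To make the trade-off concrete, here is a minimal sketch of call buffering. The `BatchingProxy` class is hypothetical and does not model any particular system; it assumes the calls return nothing the caller needs immediately, which is exactly the kind of restriction that distorts the natural structure of a program.

```python
class BatchingProxy:
    """Hypothetical proxy that queues calls and delivers them in one crossing."""

    def __init__(self, target, batch_size):
        self.target = target          # callable living in the other domain
        self.batch_size = batch_size
        self.pending = []
        self.crossings = 0            # number of expensive domain crossings

    def call(self, arg):
        # Queue the call instead of crossing the boundary right away.
        self.pending.append(arg)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.crossings += 1           # one crossing pays for the whole batch
        for arg in self.pending:
            self.target(arg)
        self.pending.clear()


received = []
proxy = BatchingProxy(received.append, batch_size=10)
for i in range(95):
    proxy.call(i)
proxy.flush()                         # deliver the final partial batch

print(proxy.crossings, "crossings for", len(received), "calls")
```

Here 95 calls cost 10 crossings instead of 95, but the caller no longer sees results as each call is made and must remember to flush, which illustrates the cost in usefulness and maintainability.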