[R] Timings of function execution in R [was Re: R in Industry]

Douglas Bates

2007-02-08

On 2/8/07, Albrecht, Dr. Stefan (AZ Private Equity Partner)
<stefan.albrecht@(protected)> wrote:
> Dear all,
>
> Thanks a lot for your comments.
>
> I very well agree with you that writing efficient code is about optimisation. The most important rules I know would be:
> - vectorization
> - pre-definition of vectors, etc.
> - use matrix instead of data.frame
> - do not use named objects
> - use pure matrix instead of involved S4 (perhaps also S3) objects (can have enormous effects)
> - use function instead of expression
> - use compiled code
> - I guess indexing with numbers (better variables) is also much faster than with text (names) (see also above)
> - I even made, for example, my own min, max, since they are slow, e.g.,
>
> greaterOf <- function(x, y){
> # Returns, element by element, the greater of x and y (both numeric).
> # The length of x or y may be a multiple of the other's (the shorter is recycled).
>  z <- x > y
>  z*x + (!z)*y
> }

That's an interesting function. I initially was tempted to respond
that you have managed to reinvent a specialized form of the ifelse
function but then I decided to do the timings just to check (always a
good idea). The enclosed timings show that your function is indeed
faster than a call to ifelse. A couple of comments:

- I needed to make the number of components in the vectors x and y
quite large before I could get reliable timings on the system I am
using.

- The recommended way of doing timings is with the system.time function,
which makes an effort to minimize the effects of garbage collection on
the timings (see its gcFirst argument).

- Even when using system.time there is often a big difference in
timing between the first execution of a function call that generates a
large object and subsequent executions of the same function call.
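
The points above about repeated timings can be sketched roughly as follows. This is only an illustration and not part of the original message: the helper name time_runs and the choice of five repetitions are assumptions, and greaterOf is copied from the quoted text with its closing brace supplied.

```r
# greaterOf as quoted above (closing brace supplied).
greaterOf <- function(x, y) {
  z <- x > y
  z * x + (!z) * y
}

# Hypothetical helper: call f() n times and return the elapsed time of each call.
# gcFirst = TRUE asks system.time to run a garbage collection before timing.
time_runs <- function(f, n = 5) {
  vapply(
    seq_len(n),
    function(i) system.time(f(), gcFirst = TRUE)[["elapsed"]],
    numeric(1)
  )
}

set.seed(1)
x <- rnorm(1e6)
y <- rnorm(1e6)

t_greater <- time_runs(function() greaterOf(x, y))
t_ifelse  <- time_runs(function() ifelse(x > y, x, y))

# The first run may include one-off costs, so summarise the remaining runs.
median(t_greater[-1])
median(t_ifelse[-1])
```

The absolute numbers depend on the machine and the R version, so only the comparison between the two medians is meaningful.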

[additional parts of the original message not relevant to this
discussion have been removed]
> x <- rnorm(1000000)
> y <- rnorm(1000000)
> system.time(r1 <- greaterOf(x, y))
   user  system elapsed
  0.255   0.023   0.278
> system.time(r1 <- greaterOf(x, y))
   user  system elapsed
  0.054   0.029   0.084
> system.time(r1 <- greaterOf(x, y))
   user  system elapsed
  0.057   0.028   0.086
> system.time(r1 <- greaterOf(x, y))
   user  system elapsed
  0.083   0.040   0.124
> system.time(r1 <- greaterOf(x, y))
   user  system elapsed
  0.099   0.026   0.124
> system.time(r2 <- ifelse(x > y, x, y))
   user  system elapsed
  0.805   0.109   0.913
> system.time(r2 <- ifelse(x > y, x, y))
   user  system elapsed
  0.723   0.113   0.835
> system.time(r2 <- ifelse(x > y, x, y))
   user  system elapsed
  0.641   0.116   0.757
> system.time(r2 <- ifelse(x > y, x, y))
   user  system elapsed
  0.647   0.111   0.757
> all.equal(r1,r2)
[1] TRUE
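
The all.equal check above compares the two results on one particular random input, up to a small numerical tolerance. As a hedged aside that is not part of the original message, the sketch below repeats the check on a small hand-made input and then tries an infinite value, where the arithmetic form and ifelse appear to differ because 0 * -Inf is NaN in floating-point arithmetic. The variable names are illustrative only.

```r
# greaterOf as quoted above (closing brace supplied).
greaterOf <- function(x, y) {
  z <- x > y
  z * x + (!z) * y
}

# Finite inputs: both approaches pick the same elements.
x <- c(1, 5, -2)
y <- c(3, 4, -2)
r1 <- greaterOf(x, y)
r2 <- ifelse(x > y, x, y)
agree_finite <- isTRUE(all.equal(r1, r2))

# Infinite input: z is FALSE, so z * x becomes 0 * -Inf, which is NaN,
# and NaN + 0 stays NaN. ifelse simply selects the second value.
r_inf_arith  <- greaterOf(-Inf, 0)
r_inf_ifelse <- ifelse(-Inf > 0, -Inf, 0)

agree_finite
is.nan(r_inf_arith)
r_inf_ifelse
```

For data drawn with rnorm this difference never arises, which is consistent with the TRUE result in the transcript.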
______________________________________________
R-help@(protected)
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.