Asked  9 Months ago    Answers:  5   Viewed   5 times

I would like to write a loop that plots data from multiple data frames in R, using a pre-existing ggplot function called myplot.

My ggplot function is defined as myplot and the only things I'd like to extract are titles. I know there are similar posts, but none provides a solution for a pre-existing ggplot function.

df1 <- diamonds[1:30,] 
df2 <- diamonds[31:60,]
df3 <- diamonds[61:90,]

myplot <- ggplot(df1, aes(x = x, y = y)) +
geom_point(color="grey") +
labs(title = "TITLE")

list <- c("df1","df2","df3")
titles <- c("df1","df2","df3")

Here is my try:

for (i in list) {
  myplot(plot_list[[i]])
  print(plot_list[[i]])
}

 Answers

3

You can create multiple ggplots in a loop with a predefined function myplot() as follows:

library(ggplot2)

list <- c("df1","df2","df3") # just one character vector, as the titles are the same as the names of the data frames

myplot <- function(data, title){
  ggplot(data, aes(x = x, y = y)) +
    geom_point(color="grey") +
    labs(title = title)
}

for(i in list){
  print(myplot(get(i), i))
}

If you want to work with two vectors, one giving the names of the data frames and one giving the titles, you can do the following:

list <- c("df1","df2","df3")
titles <- c("Title 1","Plot 2","gg3") 

myplot <- function(data, title){
  ggplot(data, aes(x = x, y = y)) +
    geom_point(color="grey") +
    labs(title = title)
}

for(i in seq_along(list)){ # seq_along(titles) would also work, as both vectors have the same length
  print(myplot(get(list[i]), titles[i]))
}
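
As an alternative to calling get() inside the loop, the data frames can be looked up once and paired with their titles. The following is only a sketch, assuming ggplot2 is installed; the names dfs and plots are illustrative and do not come from the question.

```r
library(ggplot2)

df1 <- diamonds[1:30, ]
df2 <- diamonds[31:60, ]
df3 <- diamonds[61:90, ]

myplot <- function(data, title) {
  ggplot(data, aes(x = x, y = y)) +
    geom_point(color = "grey") +
    labs(title = title)
}

# Look up the data frames by name once; mget() returns a named list
dfs <- mget(c("df1", "df2", "df3"))
titles <- c("Title 1", "Plot 2", "gg3")

# Map() pairs each data frame with its title and returns a list of plots
plots <- Map(myplot, dfs, titles)

# Inside a loop, plots are only drawn when printed explicitly
for (p in plots) print(p)
```

Keeping the plots in a list also makes it possible to save or arrange them later instead of only printing them.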
Sunday, August 29, 2021
1

Your first problem is that you're doing WAY too much in a few expressions. You need to break it down.

void MapArray(void * src, void * dest, void * (f)(void *), size_t n, size_t elem)
{
   unsigned int i = 0, j = 0;
   void * temp = malloc(elem);

   char* csrc = (char*)src;
   char* cdest = (char*)dest;
   char* ctemp = (char*)temp;
   for(i = 0; i<n; i++)
   {
       csrc++;
       cdest++;
       ctemp++;
       temp = f(csrc);
       for(j = 0; j < elem; j++)
       {
           cdest[i] = ctemp[i];
       }
   }
   free(temp);
}

Now your second problem. You malloc a buffer, then you overwrite that pointer with the result of f? Repeatedly? Then free only the last f call's result? The malloc'ed buffer is leaked and the allocation is totally unnecessary.

void MapArray(void * src, void * dest, void * (f)(void *), size_t n, size_t elem)
{
   unsigned int i = 0, j = 0;

   char* csrc = (char*)src;
   char* cdest = (char*)dest;
   for(i = 0; i<n; i++)
   {
       csrc++;
       cdest++;
       char* ctemp = (char*)f(csrc);
       for(j = 0; j < elem; j++)
       {
           cdest[i] = ctemp[i];
       }
   }
}

Now your third problem. You pass a pointer in - but only to char. You don't pass in a void*. This means that your function can't be generic - f can't be applied to anything. We need an array of void*s, so that the function can take any type as argument. We also need to take the size of the type as an argument so we know how far to move along dest.

#include <string.h>  /* memcpy */

void MapArray(void ** src, void * dest, void * (f)(void *), size_t n, size_t sizeofT)
{
    for(size_t i = 0; i < n; i++) {
        void* temp = f(src[i]);  /* index with i; src[n] would read past the end */
        memcpy(dest, temp, sizeofT);
        dest = (char*)dest + sizeofT;
    }
}

We still have another problem - the memory of temp. We don't free it. Nor do we pass a user data argument into f, which would allow it to return heap-allocated memory that we don't need to free. The only way in which f can work is if it returned a static buffer.

void MapArray(void ** src, void * dest, void * (f)(void *, void*), void* userdata, size_t n, size_t sizeofT)
{
    for(size_t i = 0; i < n; i++) {
        void* temp = f(src[i], userdata);  /* index with i, not n */
        memcpy(dest, temp, sizeofT);
        dest = (char*)dest + sizeofT;
    }
}

Now f can operate on pretty much whatever it likes and hold whatever state it needs. But we still don't free the buffer. Now, f returns a simple struct that tells us if we need to free the buffer. This also allows us to free or not free the buffer on different calls of f.

#include <stdlib.h>  /* free */
#include <string.h>  /* memcpy */

typedef struct {
    void* data;
    int free;
} freturn;

void MapArray(void ** src, void * dest, freturn (f)(void *, void*), void* userdata, size_t n, size_t sizeofT)
{
    for(size_t i = 0; i < n; i++) {
        freturn thisreturn = f(src[i], userdata);  /* index with i, not n */
        void* temp = thisreturn.data;
        memcpy(dest, temp, sizeofT);
        dest = (char*)dest + sizeofT;
        if (thisreturn.free)
            free(temp);
    }
}

However, I still don't understand the purpose of this function. All of this to replace a simple for loop? The code that you're trying to replace is simpler than the code to call your function, and probably more efficient, and definitely more powerful (they can use continue/break, for example).

More than that, C really sucks for this kind of work. C++ is far better. It's pretty trivial there to apply a function to each member of an array, for example.

Tuesday, August 10, 2021
 
2

You can use a C-style for loop like this to iterate from 1 up to the value of a variable $TOP:

for ((i=1; i<=$TOP; i++))
do
   echo $i
   # rest of your code
done
Thursday, August 12, 2021
 
2

I'd use data.table for this. It makes it super easy and super fast joining on keys. There is even a really helpful roll = "nearest" argument for exactly the behaviour you are looking for (except in your example data it is not necessary because all times from df appear in logger). In the following example I renamed df$time to df$time1 to make it clear which column belongs to which table...

#  Load package
library( data.table )

#  Make data.frames into data.tables with a key column
ldt <- data.table( logger , key = "time" )
dt <- data.table( df , key = "time1" )

#  Join based on the key column of the two tables (time & time1)
#  roll = "nearest" gives the desired behaviour
#  list( obs , time1 , temp ) gives the columns you want to return from dt
ldt[ dt , list( obs , time1 , temp ) , roll = "nearest" ]
#          time obs      time1     temp
# 1: 1280248361   8 1280248361 18.07644
# 2: 1280248366   4 1280248366 21.88957
# 3: 1280248370   3 1280248370 19.09015
# 4: 1280248376   5 1280248376 22.39770
# 5: 1280248381   6 1280248381 24.12758
# 6: 1280248383  10 1280248383 22.70919
# 7: 1280248385   1 1280248385 18.78183
# 8: 1280248389   2 1280248389 18.17874
# 9: 1280248393   9 1280248393 18.03098
#10: 1280248403   7 1280248403 22.74372
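
To illustrate what a nearest match does, here is a rough base R sketch of the same idea. The logger and df tables below are hypothetical stand-ins, since the original data is not shown, and this is not how data.table implements roll = "nearest" internally.

```r
# Hypothetical stand-ins for the tables in the question (the real data is not shown)
logger <- data.frame(time = c(100, 105, 110, 120),
                     temp = c(18.1, 21.9, 19.1, 22.4))
df <- data.frame(time1 = c(101, 109, 118), obs = c(8, 4, 3))

# For each observation time, find the row of logger with the closest timestamp
nearest <- vapply(df$time1,
                  function(t) which.min(abs(logger$time - t)),
                  integer(1))

# Attach the matched logger rows to the observations
result <- cbind(df, logger[nearest, , drop = FALSE])
rownames(result) <- NULL
result
```

This scans the whole logger table for every observation, so it is only reasonable for small inputs; a keyed data.table join is likely the better choice for large tables.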
Saturday, August 14, 2021
 
EastSw
 
5

Dynamically filling an object using a for loop is fine - what causes problems is when you dynamically build an object using a for loop (e.g. using cbind and rbind rows).

When you build something dynamically, R has to go and request new memory for the object in each loop, because it keeps increasing in size. This causes a for loop to slow down with every iteration as the object gets bigger.

When you create the object beforehand (e.g. a data.frame with the right number of rows and columns), and fill it in by index, the for loop doesn't have this problem.
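
The difference between the two approaches can be sketched as follows. This is a minimal illustration using a matrix rather than a data.frame; the names grown and filled and the size n are arbitrary choices for the example.

```r
n <- 1000

# Growing: rbind() builds a new, larger object on every iteration
grown <- NULL
for (i in seq_len(n)) {
  grown <- rbind(grown, c(i, i^2))
}

# Preallocating: create the full matrix once, then fill it in by index
filled <- matrix(NA_real_, nrow = n, ncol = 2)
for (i in seq_len(n)) {
  filled[i, ] <- c(i, i^2)
}

# Both approaches produce the same values
stopifnot(all(grown == filled))
```

Wrapping each loop in system.time() and increasing n is one way to compare the two on your own machine.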

One final thing to keep in mind is that for data.frames (and matrices) each column is stored as a vector in memory, so it's usually more efficient to fill these in one column at a time.

With all that in mind we can revise your code as follows:

results <- data.frame(matrix(NA, nrow = 10, ncol = 10))
for (colIdx in seq_len(ncol(results))) {   # columns in the outer loop
  for (rowIdx in seq_len(nrow(results))) {
    results[rowIdx, colIdx] <- 5 # or whatever value you want here
  }
}
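
If the values for a whole column can be computed at once, the inner loop can be dropped entirely. A small sketch, assuming the same 10 x 10 shape and the placeholder value 5 used above; n_rows and n_cols are names introduced only for this example.

```r
n_rows <- 10
n_cols <- 10

# Preallocate the full data.frame once
results <- data.frame(matrix(NA_real_, nrow = n_rows, ncol = n_cols))

# Assign one whole column per iteration instead of one cell at a time
for (colIdx in seq_len(n_cols)) {
  results[[colIdx]] <- rep(5, n_rows)  # placeholder value, as in the answer
}
```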
Wednesday, January 12, 2022
 