Problem description:

I am writing a script that needs to run continuously, storing information in a MySQL database.

However, at some point of the day I would like to produce a summary of the data being collected, and doing this in the same script would stop data collection while the summary runs. Here's a sketch of the problem:

while (TRUE) {
  # get data and store it in the relational database

  # at some point of the day (or time interval), do some summaries
  if (time == certain_time) {
    source("analyze_data.R")
  }
}

The problem is that I would like the data collection not to stop while the summary is executed on another core of the computer.

I have seen references to the packages parallel and multicore, but my impression is that they are meant for repetitive tasks applied over vectors or lists.

Answer:

Do the logic outside of R:

Write two scripts: one with a while loop that stores the data, the other with the check. Run the while loop as one process and just leave it running.

Meanwhile, run the other (checking) script on demand to crunch the data, or put it in a cron job.

There are robust tools outside of R to handle this kind of thing; why do it inside R?
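A minimal sketch of what the collecting script could look like, assuming the storage step is passed in as a function. The names get_reading(), collect_forever(), the "readings" table, and the connection details are hypothetical and not taken from the question.

```r
# collect_data.R: runs in its own process and only stores data.
# get_reading(), the "readings" table and the connection details below are
# hypothetical placeholders for illustration.

get_reading <- function() {
  # Stand-in for the real data source.
  data.frame(recorded_at = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
             value = runif(1))
}

collect_forever <- function(store, interval_sec = 5, max_iter = Inf) {
  i <- 0
  while (i < max_iter) {
    store(get_reading())                  # e.g. append one row to the MySQL table
    i <- i + 1
    if (i < max_iter) Sys.sleep(interval_sec)
  }
  invisible(i)                            # number of readings stored
}

# With a real database, `store` could wrap DBI::dbWriteTable(), assuming the
# DBI and RMariaDB packages are installed:
# con <- DBI::dbConnect(RMariaDB::MariaDB(), dbname = "mydb")
# collect_forever(function(row) DBI::dbWriteTable(con, "readings", row, append = TRUE))
```

The collector would be started once (for example with Rscript collect_data.R) and left running, while a crontab entry such as 0 23 * * * Rscript /path/to/analyze_data.R could launch the summary every day at 23:00; the path and the schedule are only examples.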

Answer:

You can use parallel to fork a process, but you are right that high-level functions such as mclapply() make the program wait until all the forked processes have come back together before proceeding (that is more or less the use case of parallel).
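For completeness, parallel also has a lower-level pair, mcparallel() and mccollect(), which on Unix-like systems forks a child process and returns without waiting; the parent waits only when it collects the result. A small sketch, where run_summary() is a hypothetical stand-in for source("analyze_data.R"):

```r
library(parallel)  # ships with R; forking works on Unix-like systems only

run_summary <- function() {
  Sys.sleep(1)                     # stand-in for source("analyze_data.R")
  "summary finished"
}

job <- mcparallel(run_summary())   # forks a child process and returns at once

collected <- 0
for (i in 1:3) {                   # the parent keeps collecting in the meantime
  collected <- collected + 1
  Sys.sleep(0.1)
}

result <- mccollect(job)[[1]]      # blocks only here, until the child is done
```

This keeps everything in one script, but the child is a copy of the parent process, so it is probably safer for the summary code to open its own database connection than to reuse one opened before the fork.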

Why not run two separate R programs, one that collects the data and one that summarizes it? Then you simply run one continuously in the background and the other at set times. The problem then becomes one of getting the data out of the continuously running collection program and into the summary program.
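Since the collector already writes to MySQL, the database itself can serve as the hand-off point: the summary program just queries what has been stored so far. A minimal sketch, assuming a hypothetical "readings" table with recorded_at and value columns (neither is named in the question):

```r
# analyze_data.R: run at set times; it reads what the collector has stored
# and never touches the collector process.

summarise_readings <- function(readings) {
  # `readings` is assumed to be a data frame with `recorded_at`
  # ("YYYY-MM-DD HH:MM:SS" strings) and a numeric `value` column.
  daily <- data.frame(day = substr(readings$recorded_at, 1, 10),
                      value = readings$value)
  out <- aggregate(value ~ day, data = daily, FUN = mean)
  names(out)[2] <- "mean_value"
  out                                   # one row per day
}

# With a real database, assuming the DBI and RMariaDB packages are installed:
# con <- DBI::dbConnect(RMariaDB::MariaDB(), dbname = "mydb")
# readings <- DBI::dbGetQuery(con, "SELECT recorded_at, value FROM readings")
# print(summarise_readings(readings))
# DBI::dbDisconnect(con)
```

Because the two programs share nothing except the table, the summary can be slow or even fail without interrupting data collection.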
