R - How to return monthly minimum values for each year that are at least 10 days apart
I have a daily time series and want the minimum for every month of every year, but I want to make sure the results are at least 10 days apart. To be more specific, let me explain using the following sample data frame.
> data
   years   months days       date  a  b
1    2003 december    1 2003-12-01 10 10
2    2003 december    2 2003-12-02 10 10
3    2003 december    3 2003-12-03 10 10
4    2003 december    4 2003-12-04 10 10
5    2003 december    5 2003-12-05 10 10
6    2003 december    6 2003-12-06 10 10
7    2003 december    7 2003-12-07 10 10
8    2003 december    8 2003-12-08  3 10
9    2003 december    9 2003-12-09 10 10
10   2003 december   10 2003-12-10 10 10
11   2003 december   11 2003-12-11 10 10
12   2003 december   12 2003-12-12 10  4
13   2003 december   13 2003-12-13 10 10
14   2003 december   14 2003-12-14 10 10
15   2003 december   15 2003-12-15 10 10
16   2003 december   16 2003-12-16 10 10
17   2003 december   17 2003-12-17 10 10
18   2003 december   18 2003-12-18 10 10
19   2003 december   19 2003-12-19 10 10
20   2003 december   20 2003-12-20 10 10
21   2003 december   21 2003-12-21 10 10
22   2003 december   22 2003-12-22 10 10
23   2003 december   23 2003-12-23 10 10
24   2003 december   24 2003-12-24 10 10
25   2003 december   25 2003-12-25 10 10
26   2003 december   26 2003-12-26 10 10
27   2003 december   27 2003-12-27 10 10
28   2003 december   28 2003-12-28 10 10
29   2003 december   29 2003-12-29 10 10
30   2003 december   30 2003-12-30 10 10
31   2003 december   31 2003-12-31 10 10
32   2004  january    1 2004-01-01 10 10
33   2004  january    2 2004-01-02 10 10
34   2004  january    3 2004-01-03 10 10
35   2004  january    4 2004-01-04 10 10
36   2004  january    5 2004-01-05 10 10
37   2004  january    6 2004-01-06 10 10
38   2004  january    7 2004-01-07 10 10
39   2004  january    8 2004-01-08 10 10
40   2004  january    9 2004-01-09 10 10
41   2004  january   10 2004-01-10 10 10
42   2004  january   11 2004-01-11 10 10
43   2004  january   12 2004-01-12 10 10
44   2004  january   13 2004-01-13 10 10
45   2004  january   14 2004-01-14 10 10
46   2004  january   15 2004-01-15 10 10
47   2004  january   16 2004-01-16 10 10
48   2004  january   17 2004-01-17 10 10
49   2004  january   18 2004-01-18 10 10
50   2004  january   19 2004-01-19 10 10
51   2004  january   20 2004-01-20 10 10
52   2004  january   21 2004-01-21 10 10
53   2004  january   22 2004-01-22 10 10
54   2004  january   23 2004-01-23 10 10
55   2004  january   24 2004-01-24 10 10
56   2004  january   25 2004-01-25  5  4
57   2004  january   26 2004-01-26 10 10
58   2004  january   27 2004-01-27 10 10
59   2004  january   28 2004-01-28 10 10
60   2004  january   29 2004-01-29 10 10
61   2004  january   30 2004-01-30 10 10
62   2004  january   31 2004-01-31 10 10
63   2004 february    1 2004-02-01 10 10
64   2004 february    2 2004-02-02  5  4
65   2004 february    3 2004-02-03 10 10
66   2004 february    4 2004-02-04 10 10
67   2004 february    5 2004-02-05 10 10
68   2004 february    6 2004-02-06 10 10
69   2004 february    7 2004-02-07 10 10
70   2004 february    8 2004-02-08 10 10
71   2004 february    9 2004-02-09  7  6
72   2004 february   10 2004-02-10 10 10
73   2004 february   11 2004-02-11 10 10
74   2004 february   12 2004-02-12 10 10
75   2004 february   13 2004-02-13 10 10
76   2004 february   14 2004-02-14 10 10
77   2004 february   15 2004-02-15 10 10
78   2004 february   16 2004-02-16 10 10
79   2004 february   17 2004-02-17 10 10
80   2004 february   18 2004-02-18 10 10
81   2004 february   19 2004-02-19 10 10
82   2004 february   20 2004-02-20 10 10
83   2004 february   21 2004-02-21 10 10
84   2004 february   22 2004-02-22 10 10
85   2004 february   23 2004-02-23 10 10
86   2004 february   24 2004-02-24 10 10
87   2004 february   25 2004-02-25 10 10
88   2004 february   26 2004-02-26 10 10
89   2004 february   27 2004-02-27 10 10
90   2004 february   28 2004-02-28 10 10
91   2004 february   29 2004-02-29 10 10
I know I can get the plain monthly minimums with aggregate():
min <- aggregate(data[5:6], by = list(data$months, data$years), FUN = min)
   Group.1 Group.2 a b
  december    2003 3 4
   january    2004 5 4
  february    2004 5 4
But instead I want the February minimum of each of a and b to be at least 10 days apart from the previous month's minimum.
So I would get:
   Group.1 Group.2 a b
  december    2003 3 4
   january    2004 5 4
  february    2004 7 6
Any ideas?
This solution is about a dozen lines. First split the input data frame into a list of data frames, ym, each of which represents one year/month. Then sapply over the columns for which we wish to calculate minimums. For each column, iterate over the components of ym such that for each component, i.e. each data.frame, we take the subset s, a data frame of the rows that are at least 10 days after the prior mindate, calculate the row of the minimum, ix, update mindate, and store the value in result:
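# one data frame per year-month, e.g. "2003-12"
ym <- split(df, format(df$date, "%Y-%m"))
sapply(c("a", "b"), function(col) {
  # start 10 days before the first date so the first month is unrestricted
  mindate <- min(df$date) - 10
  result <- vector(length = length(ym))
  for (i in seq_along(ym)) {
    # rows at least 10 days after the previous month's minimum
    s <- subset(ym[[i]], date >= mindate + 10)
    ix <- which.min(s[[col]])
    mindate <- s$date[ix]
    result[i] <- min(s[[col]][ix])
  }
  setNames(result, names(ym))
})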
This gives:
        a b
2003-12 3 4
2004-01 5 4
2004-02 7 6
(We only use the "date", "a" and "b" columns of df, so we could have reduced df to those columns first.)
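The same loop can also be wrapped in a small reusable function. This is only a sketch under a few assumptions: the name spaced_monthly_min and its gap argument are made up for illustration, the input has a Date column called date, and a month with no eligible rows should yield NA rather than an error. The toy data frame is likewise made up and is not the question's data.

```r
# Hypothetical helper (not from the answer above): monthly minimum of one
# column, where each month's minimum must fall at least `gap` days after
# the date of the previous month's minimum.
spaced_monthly_min <- function(df, col, gap = 10) {
  ym <- split(df, format(df$date, "%Y-%m"))   # one data frame per year-month
  last <- min(df$date) - gap                  # first month is unrestricted
  result <- setNames(numeric(length(ym)), names(ym))
  for (i in seq_along(ym)) {
    s <- ym[[i]][ym[[i]]$date >= last + gap, , drop = FALSE]
    if (nrow(s) == 0) {                       # assumption: report NA, keep `last`
      result[i] <- NA
      next
    }
    ix <- which.min(s[[col]])
    last <- s$date[ix]
    result[i] <- s[[col]][ix]
  }
  result
}

# Made-up toy input: two months of daily values with three dips
toy <- data.frame(
  date = seq(as.Date("2004-01-01"), as.Date("2004-02-29"), by = "day"),
  x = 10
)
toy$x[toy$date == as.Date("2004-01-25")] <- 5
toy$x[toy$date == as.Date("2004-02-02")] <- 5
toy$x[toy$date == as.Date("2004-02-09")] <- 7

res <- spaced_monthly_min(toy, "x")
res_nogap <- spaced_monthly_min(toy, "x", gap = 0)
```

With the default gap, the dip on 2004-02-02 is skipped because it is only 8 days after the January minimum; with gap = 0 the function should reduce to an ordinary monthly minimum.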
Note: We assumed this data frame as input:
df <- structure(list(years = c(
  2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L,
  2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L,
  2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L, 2003L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L,
  2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L, 2004L),
months = structure(c(
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
  .Label = c("december", "february", "january"), class = "factor"),
days = c(
  1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
  11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
  21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L,
  1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
  11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
  21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L,
  1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
  11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
  21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L),
date = structure(c(
  12387, 12388, 12389, 12390, 12391, 12392, 12393, 12394, 12395, 12396,
  12397, 12398, 12399, 12400, 12401, 12402, 12403, 12404, 12405, 12406,
  12407, 12408, 12409, 12410, 12411, 12412, 12413, 12414, 12415, 12416, 12417,
  12418, 12419, 12420, 12421, 12422, 12423, 12424, 12425, 12426, 12427,
  12428, 12429, 12430, 12431, 12432, 12433, 12434, 12435, 12436, 12437,
  12438, 12439, 12440, 12441, 12442, 12443, 12444, 12445, 12446, 12447, 12448,
  12449, 12450, 12451, 12452, 12453, 12454, 12455, 12456, 12457, 12458,
  12459, 12460, 12461, 12462, 12463, 12464, 12465, 12466, 12467, 12468,
  12469, 12470, 12471, 12472, 12473, 12474, 12475, 12476, 12477),
  class = "Date"),
a = c(
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 3L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 5L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 5L, 10L, 10L, 10L, 10L, 10L, 10L, 7L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L),
b = c(
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 4L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 4L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 4L, 10L, 10L, 10L, 10L, 10L, 10L, 6L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
  10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L)),
.Names = c("years", "months", "days", "date", "a", "b"),
row.names = c(
  "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
  "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
  "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
  "31", "32", "33", "34", "35", "36", "37", "38", "39", "40",
  "41", "42", "43", "44", "45", "46", "47", "48", "49", "50",
  "51", "52", "53", "54", "55", "56", "57", "58", "59", "60",
  "61", "62", "63", "64", "65", "66", "67", "68", "69", "70",
  "71", "72", "73", "74", "75", "76", "77", "78", "79", "80",
  "81", "82", "83", "84", "85", "86", "87", "88", "89", "90", "91"),
class = "data.frame")
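For readers who prefer not to paste the long dput output, the same sample data can probably be rebuilt more compactly. The following is a sketch, not part of the original answer: the names dates and df_compact are made up, and it assumes the only non-10 values are the ones visible in the question's table.

```r
# Hypothetical compact reconstruction of the sample data (assumed equivalent)
dates <- seq(as.Date("2003-12-01"), as.Date("2004-02-29"), by = "day")

# every value is 10 except a handful of dips
a <- rep(10L, length(dates))
b <- rep(10L, length(dates))
a[match(as.Date(c("2003-12-08", "2004-01-25", "2004-02-02", "2004-02-09")), dates)] <-
  c(3L, 5L, 5L, 7L)
b[match(as.Date(c("2003-12-12", "2004-01-25", "2004-02-02", "2004-02-09")), dates)] <-
  c(4L, 4L, 4L, 6L)

df_compact <- data.frame(
  years  = as.integer(format(dates, "%Y")),
  # month.name is locale-independent, unlike format(dates, "%B")
  months = factor(tolower(month.name[as.integer(format(dates, "%m"))])),
  days   = as.integer(format(dates, "%d")),
  date   = dates,
  a      = a,
  b      = b
)
```

Building the month labels from month.name rather than format(dates, "%B") keeps the result independent of the session locale.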