Title: Process Data from Wearable Light Loggers and Optical Radiation Dosimeters
Description: Import, processing, validation, and visualization of personal light exposure measurement data from wearable devices. The package implements features such as the import of data and metadata files, conversion of common file formats, validation of light logging data, verification of crucial metadata, calculation of common parameters, and semi-automated analysis and visualization.
Authors: Johannes Zauner [aut, cre]
Maintainer: Johannes Zauner <[email protected]>
License: MIT + file LICENSE
Version: 0.5.3
Built: 2025-02-20 10:27:11 UTC
Source: https://github.com/tscnlab/lightlogr
Condenses a dataset by aggregating the data to a single day per group, with a resolution of choice (`unit`). `aggregate_Date()` is opinionated in the sense that it sets default handlers for each data type of `numeric`, `character`, `logical`, and `factor`. These can be overwritten by the user. Columns that do not fall into one of these categories need to be handled individually by the user (`...` argument) or will be removed during aggregation. If no `unit` is specified, the data will simply be aggregated to the most common interval (`dominant.epoch`) in every group. `aggregate_Date()` is especially useful for summary plots that show an average day.
aggregate_Date(
  dataset,
  Datetime.colname = Datetime,
  unit = "none",
  type = c("round", "floor", "ceiling"),
  date.handler = stats::median,
  numeric.handler = mean,
  character.handler = function(x) names(which.max(table(x, useNA = "ifany"))),
  logical.handler = function(x) mean(x) >= 0.5,
  factor.handler = function(x) factor(names(which.max(table(x, useNA = "ifany")))),
  datetime.handler = stats::median,
  ...
)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
unit: Unit of binning. See `lubridate::round_date()` for examples. Defaults to `"none"`, which performs no time-based binning beyond the daily collapse.
type: One of `"round"` (the default), `"floor"`, or `"ceiling"`, choosing the respective rounding function from `lubridate`.
date.handler: A function that calculates the aggregated day for each group. By default, this is set to `stats::median`.
numeric.handler, character.handler, logical.handler, factor.handler, datetime.handler: Functions that handle the respective data types. The default handlers calculate the mean (`numeric`), the mode, i.e., the most frequent value (`character` and `factor`), the majority vote (`logical`), and the median (`datetime`).
...: Arguments given over to `dplyr::summarize()` to create summary statistics.
`aggregate_Date()` splits the `Datetime` column into a `Date.data` and a `Time.data` column. It will create subgroups for each `Time.data` present in a group, aggregate each group into a single day, and then remove the subgrouping. Use the `...` argument to create summary statistics for each group, e.g., maximum or minimum values for each time point group.
Performing `aggregate_Datetime()` with any `unit` and then `aggregate_Date()` with a `unit` of `"none"` is equivalent to just using `aggregate_Date()` with that `unit` directly (provided the other arguments are set the same between the functions). Disentangling the two functions can be useful to split the computational cost for very small instances of `unit` in large datasets. It can also be useful to apply different handlers when aggregating data to the desired `unit` of time before further aggregation to a single day, as these handlers, as well as `...`, are used twice if the `unit` is not set to `"none"`.
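A minimal sketch of this equivalence claim, assuming the bundled `sample.data.environment` dataset. Note that the two functions use different `datetime.handler` defaults (`mean` vs. `stats::median`), so the handler is harmonized explicitly; this is an illustration, not a test from the package:

library(LightLogR)

#one-step aggregation to 30-minute bins across an average day
one_step <- sample.data.environment %>%
  aggregate_Date(unit = "30 mins")

#two-step variant: first aggregate time, then collapse to a single day
two_step <- sample.data.environment %>%
  aggregate_Datetime(unit = "30 mins", datetime.handler = stats::median) %>%
  aggregate_Date(unit = "none")

#the two results should agree
all.equal(one_step, two_step)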
A `tibble` with aggregated `Datetime` data, at maximum one day per group. If the handler arguments capture all column types, the number of columns will be the same as in the input `dataset`.
library(ggplot2)

#gg_days without aggregation
sample.data.environment %>% gg_days()

#with daily aggregation
sample.data.environment %>%
  aggregate_Date() %>%
  gg_days()

#with daily aggregation and a different time aggregation
sample.data.environment %>%
  aggregate_Date(unit = "15 mins", type = "floor") %>%
  gg_days()

#adding further summary statistics about the range of MEDI
sample.data.environment %>%
  aggregate_Date(unit = "15 mins", type = "floor",
                 MEDI_max = max(MEDI),
                 MEDI_min = min(MEDI)) %>%
  gg_days() +
  geom_ribbon(aes(ymin = MEDI_min, ymax = MEDI_max), alpha = 0.5)
Condenses a dataset by aggregating the data to a given (shorter) interval `unit`. `aggregate_Datetime()` is opinionated in the sense that it sets default handlers for each data type of `numeric`, `character`, `logical`, and `factor`. These can be overwritten by the user. Columns that do not fall into one of these categories need to be handled individually by the user (`...` argument) or will be removed during aggregation. If no `unit` is specified, the data will simply be aggregated to the most common interval (`dominant.epoch`), which is most often not an aggregation but a rounding.
aggregate_Datetime(
  dataset,
  Datetime.colname = Datetime,
  unit = "dominant.epoch",
  type = c("round", "floor", "ceiling"),
  numeric.handler = mean,
  character.handler = function(x) names(which.max(table(x, useNA = "ifany"))),
  logical.handler = function(x) mean(x) >= 0.5,
  factor.handler = function(x) factor(names(which.max(table(x, useNA = "ifany")))),
  datetime.handler = mean,
  ...
)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
unit: Unit of binning. See `lubridate::round_date()` for examples. Defaults to `"dominant.epoch"`, which aggregates to the most common interval in the data.
type: One of `"round"` (the default), `"floor"`, or `"ceiling"`, choosing the respective rounding function from `lubridate`.
numeric.handler, character.handler, logical.handler, factor.handler, datetime.handler: Functions that handle the respective data types. The default handlers calculate the mean (`numeric` and `datetime`), the mode, i.e., the most frequent value (`character` and `factor`), and the majority vote (`logical`).
...: Arguments given over to `dplyr::summarize()` to create summary statistics.
A `tibble` with aggregated `Datetime` data. Usually the number of rows will be smaller than in the input `dataset`. If the handler arguments capture all column types, the number of columns will be the same as in the input `dataset`.
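Since the default handlers can be overwritten, a short illustrative example (assuming the bundled `sample.data.environment` dataset) that aggregates numeric columns with the median instead of the default mean:

#aggregate to 5-minute intervals, overriding the default numeric
#handler (mean) with the median
sample.data.environment %>%
  aggregate_Datetime(unit = "5 mins", numeric.handler = stats::median)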
#dominant epoch without aggregation
sample.data.environment %>% dominant_epoch()

#dominant epoch with 5 minute aggregation
sample.data.environment %>%
  aggregate_Datetime(unit = "5 mins") %>%
  dominant_epoch()

#dominant epoch with 1 day aggregation
sample.data.environment %>%
  aggregate_Datetime(unit = "1 day") %>%
  dominant_epoch()
This function calculates the metrics proposed by Barroso et al. (2014) for light-dosimetry in the context of research on the non-visual effects of light. The following metrics are calculated:
barroso_lighting_metrics(
  Light.vector,
  Time.vector,
  epoch = "dominant.epoch",
  loop = FALSE,
  na.rm = FALSE,
  as.df = FALSE
)
Light.vector: Numeric vector containing the light data.
Time.vector: Vector containing the time data. Can be POSIXct, hms, duration, or difftime.
epoch: The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either `"dominant.epoch"` (the default) for a guess based on the data, or a valid duration string, e.g., `"1 day"` or `"10 sec"`.
loop: Logical. Should the data be looped? Defaults to `FALSE`.
na.rm: Logical. Should missing values (NA) be removed for the calculation? Defaults to `FALSE`.
as.df: Logical. Should a data frame be returned? If `TRUE`, a data frame with one column per metric is returned instead of a list. Defaults to `FALSE`.
bright_threshold: The maximum light intensity for which at least six hours of measurements are at the same or higher level.
dark_threshold: The minimum light intensity for which at least eight hours of measurements are at the same or lower level.
bright_mean_level: The 20% trimmed mean of all light intensity measurements equal to or above the `bright_threshold`.
dark_mean_level: The 20% trimmed mean of all light intensity measurements equal to or below the `dark_threshold`.
bright_cluster: The longest continuous time interval above the `bright_threshold`.
dark_cluster: The longest continuous time interval below the `dark_threshold`.
circadian_variation: A measure of periodicity of the daily lighting schedule over a given set of days, calculated as the coefficient of variation of the input light data.
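The coefficient of variation underlying `circadian_variation` is simply the standard deviation divided by the mean; an illustrative standalone sketch (not the package's internal code):

#coefficient of variation: sd over mean (illustrative only)
cv <- function(x, na.rm = FALSE) {
  stats::sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}
cv(c(100, 120, 90, 110))  #a small value indicates a regular schedule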
List or dataframe with the seven values: `bright_threshold`, `dark_threshold`, `bright_mean_level`, `dark_mean_level`, `bright_cluster`, `dark_cluster`, and `circadian_variation`. The output type of `bright_cluster` and `dark_cluster` is a duration object.
Barroso, A., Simons, K., & Jager, P. de. (2014). Metrics of circadian lighting for clinical investigations. Lighting Research & Technology, 46(6), 637–649. doi:10.1177/1477153513502664
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
dataset1 <- tibble::tibble(
  Id = rep("B", 60 * 24),
  Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60 * 24 - 1)),
  MEDI = c(rep(sample(seq(0, 1, 0.1), 60 * 8, replace = TRUE)),
           rep(sample(1:1000, 16, replace = TRUE), each = 60))
)

dataset1 %>%
  dplyr::reframe(barroso_lighting_metrics(MEDI, Datetime, as.df = TRUE))
This function finds the brightest or darkest continuous period of a given timespan and calculates its `mean` light level, as well as the timing of the period's `onset`, `midpoint`, and `offset`. It is defined as the period with the maximum or minimum mean light level. Note that the data need to be regularly spaced (i.e., no gaps) for correct results.
bright_dark_period(
  Light.vector,
  Time.vector,
  period = c("brightest", "darkest"),
  timespan = "10 hours",
  epoch = "dominant.epoch",
  loop = FALSE,
  na.rm = FALSE,
  as.df = FALSE
)
Light.vector: Numeric vector containing the light data.
Time.vector: Vector containing the time data. Can be POSIXct, hms, duration, or difftime.
period: String indicating the type of period to look for. Can be either `"brightest"` (the default) or `"darkest"`.
timespan: The timespan across which to calculate. Can be either a duration or a duration string, e.g., `"1 day"` or `"10 sec"`. Defaults to `"10 hours"`.
epoch: The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either `"dominant.epoch"` (the default) for a guess based on the data, or a valid duration string, e.g., `"1 day"` or `"10 sec"`.
loop: Logical. Should the data be looped? If `TRUE`, the data is treated as repeating, so that periods spanning the start/end of the time series (e.g., across midnight) can be found. Defaults to `FALSE`.
na.rm: Logical. Should missing values be removed for the calculation? Defaults to `FALSE`.
as.df: Logical. Should the output be returned as a data frame? Defaults to `FALSE`.
Assumes regular 24h light data. Otherwise, results may not be meaningful. Looping the data is recommended for finding the darkest period.
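Because regular spacing is assumed, it can be worthwhile to check for gaps first; a sketch using the package's own helpers, assuming the bundled `sample.data.environment` dataset:

#check for implicit gaps before computing the metric
sample.data.environment %>% gap_finder()

#compute the brightest 10-hour period per Id and day
sample.data.environment %>%
  dplyr::group_by(Id, Day = lubridate::as_date(Datetime)) %>%
  dplyr::reframe(
    bright_dark_period(MEDI, Datetime, "brightest", "10 hours", as.df = TRUE))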
A named list with the `mean`, `onset`, `midpoint`, and `offset` of the calculated brightest or darkest period, or, if `as.df == TRUE`, a data frame with columns named `{period}_{timespan}_{metric}`. The output type corresponds to the type of `Time.vector`, e.g., if `Time.vector` is HMS, the timing metrics will also be HMS, and vice versa for POSIXct.
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `centroidLE()`, `disparity_index()`, `duration_above_threshold()`, `exponential_moving_average()`, `frequency_crossing_threshold()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
# Dataset with light > 250lx between 06:00 and 18:00
dataset1 <- tibble::tibble(
  Id = rep("A", 24),
  Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
  MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)

dataset1 %>%
  dplyr::reframe(bright_dark_period(MEDI, Datetime, "brightest", "10 hours",
                                    as.df = TRUE))
dataset1 %>%
  dplyr::reframe(bright_dark_period(MEDI, Datetime, "darkest", "7 hours",
                                    loop = TRUE, as.df = TRUE))

# Dataset with duration as Time.vector
dataset2 <- tibble::tibble(
  Id = rep("A", 24),
  Datetime = lubridate::dhours(0:23),
  MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)

dataset2 %>%
  dplyr::reframe(bright_dark_period(MEDI, Datetime, "brightest", "10 hours",
                                    as.df = TRUE))
dataset2 %>%
  dplyr::reframe(bright_dark_period(MEDI, Datetime, "darkest", "5 hours",
                                    loop = TRUE, as.df = TRUE))
This is a lower level function. It checks a given value against a threshold for the states given by Brown et al. (2022). The function is vectorized. For `day` the threshold is a lower limit; for `evening` and `night` the threshold is an upper limit.
Brown_check(
  value,
  state,
  Brown.day = "day",
  Brown.evening = "evening",
  Brown.night = "night",
  Brown.day.th = 250,
  Brown.evening.th = 10,
  Brown.night.th = 1
)
value: Illuminance value to check against the recommendation. Needs to be numeric; can be a vector.
state: The state from Brown et al. (2022). Needs to be a character vector with the same length as `value`.
Brown.day, Brown.evening, Brown.night: The names of the states from Brown et al. (2022). These are the default values (`"day"`, `"evening"`, `"night"`) and can be adjusted if the states are named differently in the data.
Brown.day.th, Brown.evening.th, Brown.night.th: The thresholds for the states from Brown et al. (2022). These are the default values (250 lx for day, 10 lx for evening, and 1 lx for night) and can be adjusted.
A logical vector with the same length as `value` that indicates whether the value is within the recommended illuminance levels.
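Conceptually, the check reduces to a vectorized comparison whose direction depends on the state; an illustrative sketch (not the package's internal code):

#illustrative re-implementation: lower limit for "day",
#upper limits for "evening" and "night"
check_sketch <- function(value, state,
                         day.th = 250, evening.th = 10, night.th = 1) {
  dplyr::case_when(
    state == "day"     ~ value >= day.th,
    state == "evening" ~ value <= evening.th,
    state == "night"   ~ value <= night.th
  )
}
check_sketch(c(100, 10, 1, 300), c("day", "evening", "night", "day"))
#> FALSE TRUE TRUE TRUE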
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
Other Brown: `Brown2reference()`, `Brown_rec()`, `sleep_int2Brown()`
states <- c("day", "evening", "night", "day") values <- c(100, 10, 1, 300) Brown_check(values, states) Brown_check(values, states, Brown.day.th = 100)
states <- c("day", "evening", "night", "day") values <- c(100, 10, 1, 300) Brown_check(values, states) Brown_check(values, states, Brown.day.th = 100)
This is a lower level function. It sets the recommended illuminance/MEDI levels by Brown et al. (2022) for a given state. The function is vectorized.
Brown_rec(
  state,
  Brown.day = "day",
  Brown.evening = "evening",
  Brown.night = "night",
  Brown.day.th = 250,
  Brown.evening.th = 10,
  Brown.night.th = 1
)
state: The state from Brown et al. (2022). Needs to be a character vector.
Brown.day, Brown.evening, Brown.night: The names of the states from Brown et al. (2022). These are the default values (`"day"`, `"evening"`, `"night"`) and can be adjusted if the states are named differently in the data.
Brown.day.th, Brown.evening.th, Brown.night.th: The thresholds for the states from Brown et al. (2022). These are the default values (250 lx for day, 10 lx for evening, and 1 lx for night) and can be adjusted.
A dataframe with the same length as `state` that contains the recommended illuminance/MEDI levels.
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
Other Brown: `Brown2reference()`, `Brown_check()`, `sleep_int2Brown()`
states <- c("day", "evening", "night") Brown_rec(states) Brown_rec(states, Brown.day.th = 100)
states <- c("day", "evening", "night") Brown_rec(states) Brown_rec(states, Brown.day.th = 100)
Adds several columns to a light logger dataset. It requires a column that contains the Brown states, e.g. "daytime", "evening", and "night". From that the function will add a column with the recommended illuminance, a column that checks if the illuminance of the dataset is within the recommended illuminance levels, and a column that gives a label to the reference.
Brown2reference(
  dataset,
  MEDI.colname = MEDI,
  Brown.state.colname = State.Brown,
  Brown.rec.colname = Reference,
  Reference.label = "Brown et al. (2022)",
  overwrite = FALSE,
  ...
)
dataset: A dataframe that contains a column with the Brown states.
MEDI.colname: The name of the column that contains the MEDI values, which are used for checks against the Brown reference illuminance. Must be part of the dataset.
Brown.state.colname: The name of the column that contains the Brown states. Must be part of the dataset.
Brown.rec.colname: The name of the column that will contain the recommended illuminance. Must not be part of the dataset, otherwise it will throw an error.
Reference.label: The label that will be used for the reference. Expects a `character` scalar.
overwrite: If `TRUE` (defaults to `FALSE`), the function will overwrite the `Brown.rec.colname` column if it already exists.
...: Additional arguments that will be passed to `Brown_rec()` and `Brown_check()`.
On a lower level, the function uses `Brown_rec()` and `Brown_check()` to create the required information.

A dataframe on the basis of the `dataset` that contains the added columns.
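A hedged follow-up on the test data from the examples, summarizing how often the recommendation is met. The column name `Reference.check` is an assumption about the added check column; adjust it to the actual name in the output:

#summarize how often the MEDI values meet the recommendation;
#Reference.check is an assumed name for the added check column
testdata <- tibble::tibble(MEDI = c(100, 10, 1, 300),
                           State.Brown = c("day", "evening", "night", "day"))
testdata %>%
  Brown2reference() %>%
  dplyr::summarize(within_recommendation = mean(Reference.check))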
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
Other Brown: `Brown_check()`, `Brown_rec()`, `sleep_int2Brown()`
#add Brown reference illuminance to some sample data
testdata <- tibble::tibble(MEDI = c(100, 10, 1, 300),
                           State.Brown = c("day", "evening", "night", "day"))
Brown2reference(testdata)
This function calculates the centroid of light exposure as the mean of the time vector weighted in proportion to the corresponding binned light intensity.
centroidLE(
  Light.vector,
  Time.vector,
  bin.size = NULL,
  na.rm = FALSE,
  as.df = FALSE
)
Light.vector: Numeric vector containing the light data.
Time.vector: Vector containing the time data. Can be POSIXct, hms, duration, or difftime.
bin.size: Value specifying the size of bins to average the light data over. Must be either a duration or a duration string, e.g., `"1 day"` or `"10 sec"`. If `NULL` (the default), no binning is performed.
na.rm: Logical. Should missing values be removed for the calculation? Defaults to `FALSE`.
as.df: Logical. Should the output be returned as a data frame? If `TRUE`, a one-column data frame will be returned. Defaults to `FALSE`.
Single column data frame or vector.
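Conceptually, the centroid is a weighted mean of the time values, with weights proportional to the (binned) light intensities; an illustrative sketch with numeric clock hours (not the package's internal code):

#weighted mean of clock hour, weighted by light level (illustrative only)
hours <- 0:23
light <- c(rep(1, 6), rep(250, 13), rep(1, 5))
weighted.mean(hours, w = light)  #falls near the middle of the bright phase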
Phillips, A. J. K., Clerx, W. M., O’Brien, C. S., Sano, A., Barger, L. K., Picard, R. W., Lockley, S. W., Klerman, E. B., & Czeisler, C. A. (2017). Irregular sleep/wake patterns are associated with poorer academic performance and delayed circadian and sleep/wake timing. Scientific Reports, 7(1), 3216. doi:10.1038/s41598-017-03171-4
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `bright_dark_period()`, `disparity_index()`, `duration_above_threshold()`, `exponential_moving_average()`, `frequency_crossing_threshold()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
# Dataset with POSIXct time vector
dataset1 <- tibble::tibble(
  Id = rep("A", 24),
  Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
  MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset1 %>%
  dplyr::reframe("Centroid of light exposure" = centroidLE(MEDI, Datetime, "2 hours"))

# Dataset with hms time vector
dataset2 <- tibble::tibble(
  Id = rep("A", 24),
  Time = hms::as_hms(lubridate::as_datetime(0) + lubridate::hours(0:23)),
  MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset2 %>%
  dplyr::reframe("Centroid of light exposure" = centroidLE(MEDI, Time, "2 hours"))

# Dataset with duration time vector
dataset3 <- tibble::tibble(
  Id = rep("A", 24),
  Hour = lubridate::duration(0:23, "hours"),
  MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5))
)
dataset3 %>%
  dplyr::reframe("Centroid of light exposure" = centroidLE(MEDI, Hour, "2 hours"))
Counts the Time differences (epochs) per group (in a grouped dataset)
count_difftime(dataset, Datetime.colname = Datetime)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
A `tibble` with the number of occurrences of each time difference per group.
#get a dataset with irregular intervals
filepath <- system.file("extdata/sample_data_LYS.csv", package = "LightLogR")
dataset <- import$LYS(filepath)

#count_difftime returns the number of occurrences of each time difference
#and is more comprehensive in terms of a summary than `gap_finder` or
#`dominant_epoch`
count_difftime(dataset)
dominant_epoch(dataset)
gap_finder(dataset)

#irregular data can be regularized with `aggregate_Datetime`
dataset %>% aggregate_Datetime(unit = "15 secs") %>% count_difftime()
Create a Time-of-Day column in the dataset
create_Timedata(
  dataset,
  Datetime.colname = Datetime,
  Time.data = Time.data,
  output.dataset = TRUE
)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
Time.data: Name of the newly created column. Expects a symbol. Defaults to `Time.data`.
output.dataset: Should the output be a `data.frame` (`TRUE`, the default) or a vector with the Time-of-Day data (`FALSE`)?
A `data.frame` object identical to `dataset` but with the added column of Time-of-Day data, or a `vector` with the Time-of-Day data.
sample.data.environment %>% create_Timedata()
`cut_Datetime` is a wrapper around `lubridate::round_date()` (and friends) combined with `dplyr::mutate()`, to create a new column in a light logger dataset with a specified binsize. This can be `"3 hours"`, `"15 secs"`, or `"0.5 days"`. It is a useful step between a dataset and a visualization or summary step.
cut_Datetime(
  dataset,
  unit = "3 hours",
  type = c("round", "floor", "ceiling"),
  Datetime.colname = Datetime,
  New.colname = Datetime.rounded,
  group_by = FALSE,
  ...
)
dataset: A light logger dataset. Expects a dataframe.
unit: Unit of binning. See `lubridate::round_date()` for examples. Defaults to `"3 hours"`.
type: One of `"round"` (the default), `"floor"`, or `"ceiling"`, choosing the respective rounding function from `lubridate`.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
New.colname: Column name for the added column in the `dataset`. Defaults to `Datetime.rounded`.
group_by: Should the data be grouped by the new column? Defaults to `FALSE`.
...: Parameter handed over to `lubridate::round_date()` and siblings.
A `data.frame` object identical to `dataset` but with the added column of binned datetimes.
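Per the description, this is roughly equivalent to a manual mutate with `lubridate::round_date()`; a sketch of the idea (assuming the bundled `sample.data.environment` dataset, and not the package's exact implementation):

#manual counterpart of the default cut_Datetime() call (illustrative)
sample.data.environment %>%
  dplyr::mutate(
    Datetime.rounded = lubridate::round_date(Datetime, unit = "3 hours"))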
#compare Datetime and Datetime.rounded
sample.data.environment %>%
  cut_Datetime() %>%
  dplyr::slice_sample(n = 5)
Create reference data from almost any other data that has a datetime column and a data column. The reference data can even be created from subsets of the same data. For example, one participant can be used as a reference for all other participants, or the first (second, ...) day of every participant's data can be the reference for any other day. This function needs to be handled carefully when the reference data time intervals are shorter than the data time intervals. In that case, use `aggregate_Datetime()` on the reference data beforehand to lengthen the interval.
data2reference(
  dataset,
  Reference.data = dataset,
  Datetime.column = Datetime,
  Data.column = MEDI,
  Id.column = Id,
  Reference.column = Reference,
  overwrite = FALSE,
  filter.expression.reference = NULL,
  across.id = FALSE,
  shift.start = FALSE,
  length.restriction.seconds = 60,
  shift.intervals = "auto",
  Reference.label = NULL
)
dataset: A light logger dataset.
Reference.data: The data that should be used as reference. By default, the `dataset` itself will be used.
Datetime.column: Datetime column of the `dataset` and `Reference.data`. Defaults to `Datetime`.
Data.column: Data column in the `Reference.data` that is used as reference. Defaults to `MEDI`.
Id.column: Name of the `Id` column of the `dataset`. Defaults to `Id`.
Reference.column: Name of the reference column that will be added to the `dataset`. Defaults to `Reference`. Must not already be part of the dataset unless `overwrite = TRUE`.
overwrite: If `TRUE` (defaults to `FALSE`), the function will overwrite the `Reference.column` if it already exists.
filter.expression.reference: Expression that is used to filter the `Reference.data` before it is applied. Defaults to `NULL`. The expression is evaluated within `dplyr::filter()`; see the details and examples for more information.
across.id: Grouping variables that should be ignored when creating the reference data. Defaults to `FALSE`. Can be `TRUE` (ignore all grouping) or the names of specific grouping variables (e.g., `"Day"`, as in the examples).
shift.start: If `TRUE` (defaults to `FALSE`), the reference data is shifted to the start of the respective group.
length.restriction.seconds: Restricts the application of reference data to a maximum length in seconds. Defaults to `60`.
shift.intervals: Time shift in seconds that is applied to every data point in the reference data. Defaults to `"auto"`, which determines the shift automatically.
Reference.label: Label that is added to the reference data. Defaults to `NULL` (no label).
To use subsets of data, use the `filter.expression.reference` argument to specify the subsets; the expression is evaluated within `dplyr::filter()` (see the examples for more information). The `across.id` argument specifies whether the reference data should be used across all or some grouping variables (e.g., across participants). The `shift.start` argument enables a shift of the reference data start time to the start of the respective group.
A `dataset` with a new column `Reference` that contains the reference data.
library(dplyr)
library(lubridate)
library(ggplot2)

gg_reference <- function(dataset) {
  dataset %>%
    ggplot(aes(x = Datetime, y = MEDI, color = Id)) +
    geom_line(linewidth = 1) +
    geom_line(aes(y = Reference), color = "black",
              size = 0.25, linetype = "dashed") +
    theme_minimal() +
    facet_wrap(~ Id, scales = "free_y")
}

#in this example, each data point is its own reference
sample.data.environment %>%
  data2reference() %>%
  gg_reference()

#in this example, the first day of each ID is the reference for the other days
#this requires grouping of the Data by Day, which is then specified in across.id
#also, shift.start needs to be set to TRUE, to shift the reference data to the
#start of the groupings
sample.data.environment %>%
  group_by(Id, Day = as_date(Datetime)) %>%
  data2reference(
    filter.expression.reference = as_date(Datetime) == min(as_date(Datetime)),
    shift.start = TRUE,
    across.id = "Day") %>%
  gg_reference()

#in this example, the Environment Data will be used as a reference
sample.data.environment %>%
  data2reference(
    filter.expression.reference = Id == "Environment",
    across.id = TRUE) %>%
  gg_reference()
Take a vector of Datetimes and create a sequence of Datetimes with a given shift and interval. This is a helper function to create breaks for plotting, e.g., in `gg_days()`, and is best used in conjunction with `Datetime_limits()`. The function is a thin wrapper around `seq()`.
Datetime_breaks(x, shift = lubridate::duration(12, "hours"), by = "1 day")
x: A vector of `Datetimes`.
shift: A `duration` by which the sequence is shifted. Defaults to `lubridate::duration(12, "hours")`.
by: An increment of the sequence, handed to `seq()`. Defaults to `"1 day"`.
A `vector` of `Datetimes`.
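Since the function is described as a thin wrapper around `seq()`, one plausible reading of the defaults is a daily sequence shifted by 12 hours (breaks at noon); an illustrative sketch, not the package's exact implementation:

#a daily sequence shifted by 12 hours, i.e., breaks at noon (illustrative)
x <- as.POSIXct(c("2023-08-15", "2023-08-20"), tz = "UTC")
seq(from = x[1] + lubridate::dhours(12),
    to   = x[2],
    by   = "1 day")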
dataset <- c("2023-08-15", "2023-08-20") Datetime_breaks(dataset) Datetime_breaks(dataset, shift = 0) Datetime_breaks(dataset, by = "12 hours")
dataset <- c("2023-08-15", "2023-08-20") Datetime_breaks(dataset) Datetime_breaks(dataset, shift = 0) Datetime_breaks(dataset, by = "12 hours")
Take a vector of `Datetimes` and return the start of the first and end of the last day of data. The `start` and the `length` can be adjusted by `durations`, like `lubridate::ddays()`. It is used in the `gg_days()` function to return a sensible x-axis. This function is a thin wrapper around `lubridate::floor_date()` and `lubridate::ceiling_date()`.
Datetime_limits(
  x,
  start = NULL,
  length = NULL,
  unit = "1 day",
  midnight.rollover = FALSE,
  ...
)
x: A vector of `Datetimes`.
start: Optional `duration` to adjust the start of the limits.
length: Optional `duration` to adjust the length (i.e., the end) of the limits.
unit: A `character` unit for the flooring/ceiling operation. Defaults to `"1 day"`.
midnight.rollover: A `logical` governing whether an end point falling exactly on midnight should roll over to the next day. Defaults to `FALSE`.
...: Other arguments passed to `lubridate::floor_date()` and `lubridate::ceiling_date()`.
A two-item `vector` of `Datetimes` with the (adjusted) start and end of the input vector.
dataset <- c("2023-08-15", "2023-08-20") breaks <- Datetime_breaks(dataset) Datetime_limits(breaks) Datetime_limits(breaks, start = lubridate::ddays(1)) Datetime_limits(breaks, length = lubridate::ddays(2))
dataset <- c("2023-08-15", "2023-08-20") breaks <- Datetime_breaks(dataset) Datetime_limits(breaks) Datetime_limits(breaks, start = lubridate::ddays(1)) Datetime_limits(breaks, length = lubridate::ddays(2))
This function calculates the continuous disparity index as described in Fernández-Martínez et al. (2018).
disparity_index(Light.vector, na.rm = FALSE, as.df = FALSE)
Light.vector: Numeric vector containing the light data.
na.rm: Logical. Should missing values be removed? Defaults to `FALSE`.
as.df: Logical. Should the output be returned as a data frame? If `TRUE`, a one-column data frame will be returned. Defaults to `FALSE`.
Single column data frame or vector.
Fernández-Martínez, M., Vicca, S., Janssens, I. A., Carnicer, J., Martín-Vide, J., & Peñuelas, J. (2018). The consecutive disparity index, D: A measure of temporal variability in ecological studies. Ecosphere, 9(12), e02527. doi:10.1002/ecs2.2527
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `bright_dark_period()`, `centroidLE()`, `duration_above_threshold()`, `exponential_moving_average()`, `frequency_crossing_threshold()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
dataset1 <- tibble::tibble(
  Id = rep("A", 24),
  Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
  MEDI = sample(0:1000, 24)
)

dataset1 %>%
  dplyr::reframe("Disparity index" = disparity_index(MEDI))
Calculate the dominant epoch/interval of a dataset. The dominant epoch/interval is the epoch/interval that is most frequent in the dataset. The calculation is done per group, so you might get multiple results (one per group). If two or more epochs/intervals are equally frequent, the first (i.e., shortest) one is chosen.
dominant_epoch(dataset, Datetime.colname = Datetime)
dataset: A light logger dataset. Needs to be a dataframe.
Datetime.colname: The column that contains the datetime. Needs to be a `POSIXct` column and part of the dataset.
A `tibble` with one row per group and a column with the `dominant.epoch` as a `lubridate::duration()`. Also a column with the `group.indices`, which is helpful for referencing the `dominant.epoch` across dataframes of equal grouping.
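Conceptually, the dominant epoch is the most frequent difference between consecutive timestamps; an illustrative sketch (not the package's internal code):

#most frequent time difference between consecutive timestamps (illustrative)
datetimes <- lubridate::as_datetime(1) + lubridate::days(c(0:2, 4, 6, 8))
diffs <- diff(datetimes)
table(diffs)  #the most frequent difference is the dominant epoch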
Other regularize: `gap_finder()`, `gap_handler()`, `gapless_Datetimes()`
dataset <- tibble::tibble(
  Id = c("A", "A", "A", "B", "B", "B"),
  Datetime = lubridate::as_datetime(1) + lubridate::days(c(0:2, 4, 6, 8))
)
dataset

#get the dominant epoch by group
dataset %>% dplyr::group_by(Id) %>% dominant_epoch()

#get the dominant epoch of the whole dataset
dataset %>% dominant_epoch()
When data is imported through LightLogR and a timezone applied, it is assumed that the timestamps are correct - which is the case, e.g., if timestamps are stored in UTC, or if they are in local time. Some, if not most, measurement devices are set to local time before a recording interval starts. If a daylight saving time (DST) jump happens during the recording (in either direction), the device might not adjust timestamps for this change. This results in an unwanted shift in the data, starting at the time of the DST jump and likely continuing until the end of the file. `dst_change_handler` is used to detect such jumps within a group and apply the correct shift in the data (i.e., the shift that should have been applied by the device).

Important: this function is only useful if the timestamp in the raw data deviates from the actual date-time. Note also that this function results in a gap during the DST jump, which should be handled by `gap_handler()` afterwards. It will also potentially result in doubled timestamps during the jump back from DST to standard time, which leads to inconsistencies with some functions, so we recommend using `aggregate_Datetime()` afterwards with a `unit` equal to the dominant epoch. Finally, the function is not equipped to handle more than one jump per group. The jump detection is based on whether the group starts out with DST or not. The function will remove datetime rows with `NA` values.
dst_change_handler(
  dataset,
  Datetime.colname = Datetime,
  filename.colname = NULL
)
dataset: Dataset to be summarized; must be a dataframe.
Datetime.colname: Name of the column that contains the Datetime data; expects a symbol.
filename.colname: (optional) Column name that contains the filename. If provided, this column is used as a temporary grouping variable additionally to the dataset grouping.
The detection of a DST jump is based on the function `lubridate::dst()`, and jumps are only applied within a group. During import, this function is used if `dst_adjustment = TRUE` is set, and by default includes the filename as a grouping variable additionally to `Id`.
A `tibble` with the same columns as the input dataset, but with shifted datetimes.
Other DST: `dst_change_summary()`
#create some data that crosses a DST jump
data <- tibble::tibble(
  Datetime = seq.POSIXt(
    from = as.POSIXct("2023-03-26 01:30:00", tz = "Europe/Berlin"),
    to = as.POSIXct("2023-03-26 03:00:00", tz = "Europe/Berlin"),
    by = "30 mins"),
  Value = 1)

#as can be seen next, there is a gap in the data - this is necessary when
#using a timezone with DST.
data$Datetime

#Let us say now, that the device did not adjust for the DST - thus the 03:00
#timestamp is actually 04:00 in local time. This can be corrected for by:
data %>% dst_change_handler() %>% .$Datetime
Get a summary of groups where a daylight saving time change occurs.
dst_change_summary(dataset, Datetime.colname = Datetime)
dataset: Dataset to be summarized; must be a dataframe.
Datetime.colname: Name of the column that contains the Datetime data; expects a symbol.
A `tibble` with the groups where a DST change occurs. The column `dst_start` is a boolean that indicates whether the start of this group occurs during daylight savings.
Other DST: `dst_change_handler()`
sample.data.environment %>%
  dplyr::mutate(Datetime = Datetime + lubridate::dweeks(8)) %>%
  dst_change_summary()
This function calculates the duration spent above/below a specified threshold light level or within a specified range of light levels.
duration_above_threshold(
  Light.vector,
  Time.vector,
  comparison = c("above", "below"),
  threshold,
  epoch = "dominant.epoch",
  na.rm = FALSE,
  as.df = FALSE
)
Light.vector: Numeric vector containing the light data.
Time.vector: Vector containing the time data. Can be POSIXct, hms, duration, or difftime.
comparison: String specifying whether the time above or below threshold should be calculated. Can be either `"above"` (the default) or `"below"`. If two values are provided for `threshold`, this argument is ignored.
threshold: Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the time within the two thresholds will be calculated.
epoch: The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either `"dominant.epoch"` (the default) for a guess based on the data, or a valid duration string, e.g., `"1 day"` or `"10 sec"`.
na.rm: Logical. Should missing values (NA) be removed for the calculation? Defaults to `FALSE`.
as.df: Logical. Should a data frame be returned? Defaults to `FALSE`.
A duration object as a single value, or a single-column data frame.
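Conceptually, the duration is the count of samples satisfying the comparison, multiplied by the sampling epoch; an illustrative sketch (not the package's internal code):

#duration above 250 lx with a 1-minute epoch (illustrative only)
light <- c(100, 300, 400, 50, 260)
epoch <- lubridate::dminutes(1)
sum(light > 250) * epoch  #3 samples above threshold -> 3 minutes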
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `bright_dark_period()`, `centroidLE()`, `disparity_index()`, `exponential_moving_average()`, `frequency_crossing_threshold()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
N <- 60

# Dataset with epoch = 1min
dataset1 <- tibble::tibble(
  Id = rep("A", N),
  Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
  MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2)))
)

# Dataset with epoch = 30s
dataset2 <- tibble::tibble(
  Id = rep("B", N),
  Datetime = lubridate::as_datetime(0) + lubridate::seconds(seq(30, N * 30, 30)),
  MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2)))
)
dataset.combined <- rbind(dataset1, dataset2)

dataset1 %>%
  dplyr::reframe("TAT >250lx" = duration_above_threshold(MEDI, Datetime,
                                                         threshold = 250))
dataset1 %>%
  dplyr::reframe(duration_above_threshold(MEDI, Datetime, threshold = 250,
                                          as.df = TRUE))

# Group by Id to account for different epochs
dataset.combined %>%
  dplyr::group_by(Id) %>%
  dplyr::reframe("TAT >250lx" = duration_above_threshold(MEDI, Datetime,
                                                         threshold = 250))
This function smoothes the data using an exponential moving average filter with a specified decay half-life.
exponential_moving_average(
  Light.vector,
  Time.vector,
  decay = "90 min",
  epoch = "dominant.epoch"
)
Light.vector: Numeric vector containing the light data. Missing values are replaced by 0.
Time.vector: Vector containing the time data. Can be POSIXct, hms, duration, or difftime.
decay: The decay half-life controlling the exponential smoothing. Can be either a duration or a string. If it is a string, it needs to be a valid duration string, e.g., `"90 min"` (the default) or `"1 day"`.
epoch: The epoch at which the data was sampled. Can be either a duration or a string. If it is a string, it needs to be either `"dominant.epoch"` (the default) for a guess based on the data, or a valid duration string, e.g., `"1 day"` or `"10 sec"`.
The timeseries is assumed to be regular. Missing values in the light data will be replaced by 0.
A numeric vector containing the smoothed light data. The output has the same length as `Light.vector`.
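A generic exponential-smoothing sketch, under the assumption that the previous value's weight halves every `decay` interval (beta = 2^(-epoch/halflife)); the package follows Price (2014), which may differ in detail:

#generic EMA with half-life decay (illustrative; not the package's code)
ema_sketch <- function(x, halflife = 90, epoch = 1) {
  beta <- 2^(-epoch / halflife)  #weight retained by the previous smoothed value
  out <- numeric(length(x))
  out[1] <- x[1]                 #initialization is a simplification here
  for (i in seq_along(x)[-1]) {
    out[i] <- beta * out[i - 1] + (1 - beta) * x[i]
  }
  out
}
ema_sketch(c(0, 0, 1000, 1000, 1000, 0, 0), halflife = 2)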
Price, L. L. A. (2014). On the Role of Exponential Smoothing in Circadian Dosimetry. Photochemistry and Photobiology, 90(5), 1184-1192. doi:10.1111/php.12282
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `bright_dark_period()`, `centroidLE()`, `disparity_index()`, `duration_above_threshold()`, `frequency_crossing_threshold()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
sample.data.environment.EMA <- sample.data.environment %>%
  dplyr::filter(Id == "Participant") %>%
  filter_Datetime(length = lubridate::days(2)) %>%
  dplyr::mutate(MEDI.EMA = exponential_moving_average(MEDI, Datetime))

# Plot to compare results
sample.data.environment.EMA %>%
  ggplot2::ggplot(ggplot2::aes(x = Datetime)) +
  ggplot2::geom_line(ggplot2::aes(y = MEDI), colour = "black") +
  ggplot2::geom_line(ggplot2::aes(y = MEDI.EMA), colour = "red")
Filtering a dataset based on Dates or Datetimes may often be necessary prior to calculation or visualization. The functions allow for filtering based on simple `strings` or `Datetime` scalars, or by specifying a length. They also support prior dplyr grouping, which is useful, e.g., when you only want to filter the first two days of measurement data for every participant, regardless of the actual date. If you want to filter based on times of the day, look to `filter_Time()`.
filter_Datetime(
  dataset,
  Datetime.colname = Datetime,
  start = NULL,
  end = NULL,
  length = NULL,
  length_from_start = TRUE,
  full.day = FALSE,
  tz = NULL,
  only_Id = NULL,
  filter.expr = NULL
)

filter_Date(..., start = NULL, end = NULL)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
start, end: For `filter_Datetime()`, a `Datetime` scalar or a string in a standard datetime format (e.g., `"2023-08-31 12:00:00"`); for `filter_Date()`, a `Date` scalar or a string such as `"2023-08-31"`. If `NULL` (the default), the minimum/maximum of the dataset is used.
length: Either a Period or Duration from lubridate, e.g., `days(2)`, defining the length of the filtered interval.
length_from_start: A `logical` indicating whether the `length` should be applied from the start (`TRUE`, the default) or from the end of the data (`FALSE`).
full.day: A `logical` indicating whether the filtered data should encompass full days. Defaults to `FALSE`.
tz: Timezone of the start/end times. If `NULL` (the default), the timezone of the dataset is used.
only_Id: An expression of `Id` values to which the filtering is applied; other `Id`s remain unfiltered. Defaults to `NULL` (filtering applies to all).
filter.expr: Advanced filtering conditions. If not `NULL` (the default), the expression is evaluated within `dplyr::filter()` in addition to the other filter arguments.
...: Parameters of `filter_Datetime()` handed over by `filter_Date()`.
A `data.frame` object identical to `dataset` but with only the specified Dates/Times.
Other filter: `filter_Time()`
library(lubridate)
library(dplyr)

#baseline
range.unfiltered <- sample.data.environment$Datetime %>% range()
range.unfiltered

#setting the start of a dataset
sample.data.environment %>%
  filter_Datetime(start = "2023-08-31 12:00:00") %>%
  pull(Datetime) %>% range()

#setting the end of a dataset
sample.data.environment %>%
  filter_Datetime(end = "2023-08-31 12:00:00") %>%
  pull(Datetime) %>% range()

#setting a period of a dataset
sample.data.environment %>%
  filter_Datetime(end = "2023-08-31 12:00:00", length = days(2)) %>%
  pull(Datetime) %>% range()

#setting only the period of a dataset
sample.data.environment %>%
  filter_Datetime(length = days(2)) %>%
  pull(Datetime) %>% range()

#advanced filtering based on grouping (second day of each group)
sample.data.environment %>%
  #shift the "Environment" group by one day
  mutate(Datetime = ifelse(Id == "Environment",
                           Datetime + ddays(1), Datetime) %>%
           as_datetime()) -> sample
sample %>%
  summarize(Daterange = paste(min(Datetime), max(Datetime), sep = " - "))

#now we can use the `filter.expr` argument to filter from the second day of each group
sample %>%
  filter_Datetime(filter.expr = Datetime > Datetime[1] + days(1)) %>%
  summarize(Daterange = paste(min(Datetime), max(Datetime), sep = " - "))

sample.data.environment %>% filter_Date(end = "2023-08-31")
`filter_Datetime_multiple()` is a wrapper around `filter_Datetime()` or `filter_Date()` that allows the cumulative filtering of `Datetimes` based on varying filter conditions. It is most useful in conjunction with the `only_Id` argument, e.g., to selectively cut off dates depending on participants (see examples).
filter_Datetime_multiple(
  dataset,
  arguments,
  filter_function = filter_Datetime,
  ...
)
dataset: A light logger dataset.
arguments: A list of lists of arguments to be passed to the filter function, one sub-list per filter call (see examples).
filter_function: The function to be used for filtering, either `filter_Datetime` (the default) or `filter_Date`.
...: Additional arguments passed to the filter function. If these contain arguments also given through `arguments`, the `arguments` take precedence.
A dataframe with the filtered data
arguments <- list(
  list(start = "2023-08-31", only_Id = quote(Id == "Participant")),
  list(end = "2023-08-31", only_Id = quote(Id == "Environment")))

#compare the unfiltered dataset
sample.data.environment %>% gg_overview(Id.colname = Id)

#to the filtered dataset
sample.data.environment %>%
  filter_Datetime_multiple(arguments = arguments, filter_Date) %>%
  gg_overview(Id.colname = Id)
Filter Times in a dataset.
filter_Time(
  dataset,
  Datetime.colname = Datetime,
  start = NULL,
  end = NULL,
  length = NULL
)
dataset: A light logger dataset. Expects a dataframe.
Datetime.colname: Column name that contains the datetime. Defaults to `Datetime`.
start, end, length: A `character` scalar in a time format such as `"12:00:00"`, specifying the start time, end time, or length of the filtered interval (see examples).
A `data.frame` object identical to `dataset` but with only the specified Times.
Other filter: `filter_Datetime()`
sample.data.environment %>%
  filter_Time(start = "4:00:34", length = "12:00:00") %>%
  dplyr::pull(Time.data) %>%
  range() %>% hms::as_hms()
This function calculates the number of times a given threshold light level is crossed.
frequency_crossing_threshold(
  Light.vector,
  threshold,
  na.rm = FALSE,
  as.df = FALSE
)
Light.vector: Numeric vector containing the light data.
threshold: Single numeric value specifying the threshold light level to compare with.
na.rm: Logical. Should missing light values be removed? Defaults to `FALSE`.
as.df: Logical. Should the output be returned as a data frame? Defaults to `FALSE`.
Data frame or matrix with pairs of threshold and calculated values.
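Conceptually, a crossing is a change in the above/below state between consecutive samples; an illustrative sketch (not the package's internal code):

#number of threshold crossings = state changes between consecutive samples
light <- c(100, 300, 200, 400, 50)
above <- light > 250
sum(diff(above) != 0)  #four crossings of the 250 lx threshold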
Alvarez, A. A., & Wildsoet, C. F. (2013). Quantifying light exposure patterns in young adult students. Journal of Modern Optics, 60(14), 1200–1208. doi:10.1080/09500340.2013.845700
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: `bright_dark_period()`, `centroidLE()`, `disparity_index()`, `duration_above_threshold()`, `exponential_moving_average()`, `interdaily_stability()`, `intradaily_variability()`, `midpointCE()`, `nvRC()`, `nvRD()`, `nvRD_cumulative_response()`, `period_above_threshold()`, `pulses_above_threshold()`, `threshold_for_duration()`, `timing_above_threshold()`
N <- 60
dataset1 <- tibble::tibble(
  Id = rep("A", N),
  Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
  MEDI = sample(c(sample(1:249, N / 2), sample(250:1000, N / 2)))
)

dataset1 %>%
  dplyr::reframe("Frequency crossing 250lx" =
                   frequency_crossing_threshold(MEDI, threshold = 250))
dataset1 %>%
  dplyr::reframe(frequency_crossing_threshold(MEDI, threshold = 250, as.df = TRUE))
A list of tables containing gain and gain-ratios to normalize counts across different sensor gains.

gain.ratio.tables

Format: A list containing two-column `tibbles` (a gain table for the ambient light sensor TSL2585) and a named `character` vector specifying the version and date a sensor was added.

Utility: Some sensors provide raw counts and gain levels as part of their output. In some cases it is desirable to compare counts between sensors, e.g., to gauge daylight outside by comparing UV counts to photopic counts (a high ratio of UV/Pho indicates outside daylight), or to gauge daylight inside by comparing IR counts to photopic counts (a high ratio of IR/Pho with a low ratio of UV/Pho indicates daylight in the context of LED or fluorescent lighting).
Quickly check for implicit missing `Datetime` data. Outputs a message with a short summary, and can optionally return the gaps as a `tibble`. Uses `gap_handler()` internally.
gap_finder(
  dataset,
  Datetime.colname = Datetime,
  epoch = "dominant.epoch",
  gap.data = FALSE,
  silent = FALSE,
  full.days = FALSE
)
dataset: A light logger dataset. Needs to be a dataframe.
Datetime.colname: The column that contains the datetime. Needs to be a `POSIXct` column and part of the dataset.
epoch: The epoch to use for the gapless sequence. Can be either a `lubridate::duration()` or a string. If it is a string, it needs to be either `"dominant.epoch"` (the default) for a guess based on the data, or a valid duration string, e.g., `"1 day"` or `"10 sec"`.
gap.data: Logical. If `TRUE`, returns a `tibble` of the gaps in the dataset. Defaults to `FALSE`.
silent: Logical. If `TRUE`, suppresses the summary message of the gaps. Defaults to `FALSE`.
full.days: If `TRUE`, the gapless sequence spans full days, i.e., from the start of the first day to the end of the last day of data. Defaults to `FALSE`.
The `gap_finder()` function is a wrapper around `gap_handler()` with the `behavior` argument set to `"gaps"`. The main difference is that `gap_finder()` returns a message with a short summary of the gaps in the dataset, and that the `tibble` with the gaps contains a column `gap.id` that indicates the gap number, which is useful to determine, e.g., the consecutive number of gaps between measurement data.
Prints a message with a short summary of the gaps in the dataset. If `gap.data = TRUE`, returns a `tibble` of the gaps in the dataset.
Other regularize: `dominant_epoch()`, `gap_handler()`, `gapless_Datetimes()`
dataset <- tibble::tibble(
  Id = c("A", "A", "A", "B", "B", "B"),
  Datetime = lubridate::as_datetime(1) +
    lubridate::days(c(0:2, 4, 6, 8)) +
    lubridate::hours(c(0, 12, rep(0, 4)))) %>%
  dplyr::group_by(Id)
dataset

#look for gaps assuming the epoch is the dominant epoch of each group
gap_finder(dataset)

#return the gaps as a tibble
gap_finder(dataset, gap.data = TRUE)

#assuming the epoch is 1 day, we have different gaps, and the datapoint
#at noon is now `irregular`
gap_finder(dataset, epoch = "1 day")
Datasets from light loggers often have implicit gaps. These gaps are implicit in the sense that consecutive timestamps (Datetimes) might not follow a regular epoch/interval. This function fills these implicit gaps by creating a gapless sequence of Datetimes and joining it to the dataset. The gapless sequence is determined by the minimum and maximum Datetime in the dataset (per group) and an epoch. The epoch can either be guessed from the dataset or specified by the user. A sequence of gapless Datetimes can be created with the gapless_Datetimes() function, whereas the dominant epoch in the data can be checked with the dominant_epoch() function. The behavior argument specifies how the data is combined. By default, the data is joined with a full join, which means that all rows from the gapless sequence are kept, even if there is no matching row in the dataset.
gap_handler( dataset, Datetime.colname = Datetime, epoch = "dominant.epoch", behavior = c("full_sequence", "regulars", "irregulars", "gaps"), full.days = FALSE )
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
behavior |
The behavior of the join of the |
full.days |
If |
A modified tibble similar to dataset but with handling of implicit gaps, depending on the behavior argument:

"full_sequence" adds timestamps to the dataset that are missing based on a full sequence of Datetimes (i.e., the gapless sequence). The dataset is thus equal (no gaps) or greater in the number of rows than the input. One column, is.implicit, is added, indicating whether the row was added (TRUE) or not (FALSE). This helps to differentiate measurement values from values that might be imputed later on.

"regulars" keeps only rows from the gapless sequence that have a matching row in the dataset. This can be interpreted as a row-reduced dataset with only regular timestamps according to the epoch. In case of no gaps this tibble has the same number of rows as the input.

"irregulars" keeps only rows from the dataset that do not follow the regular sequence of Datetimes according to the epoch. In case of no gaps this tibble has 0 rows.

"gaps" returns a tibble of all implicit gaps in the dataset. In case of no gaps this tibble has 0 rows.
Other regularize: dominant_epoch(), gap_finder(), gapless_Datetimes()
dataset <- tibble::tibble(
  Id = c("A", "A", "A", "B", "B", "B"),
  Datetime = lubridate::as_datetime(1) +
    lubridate::days(c(0:2, 4, 6, 8)) +
    lubridate::hours(c(0, 12, rep(0, 4)))
) %>%
  dplyr::group_by(Id)
dataset

#assuming the epoch is 1 day, we can add implicit data to our dataset
dataset %>% gap_handler(epoch = "1 day")

#we can also check whether there are irregular Datetimes in our dataset
dataset %>% gap_handler(epoch = "1 day", behavior = "irregulars")

#to get to the gaps, we can use the "gaps" behavior
dataset %>% gap_handler(epoch = "1 day", behavior = "gaps")

#finally, we can also get just the regular Datetimes
dataset %>% gap_handler(epoch = "1 day", behavior = "regulars")
Create a gapless sequence of Datetimes. The Datetimes are determined by the minimum and maximum Datetime in the dataset and an epoch. The epoch can either be guessed from the dataset or specified by the user.
gapless_Datetimes( dataset, Datetime.colname = Datetime, epoch = "dominant.epoch", full.days = FALSE )
dataset |
A light logger dataset. Needs to be a dataframe. |
Datetime.colname |
The column that contains the datetime. Needs to be a
|
epoch |
The epoch to use for the gapless sequence. Can be either a
|
full.days |
If |
A tibble with a gapless sequence of Datetime as specified by epoch.
Other regularize: dominant_epoch(), gap_finder(), gap_handler()
dataset <- tibble::tibble(
  Id = c("A", "A", "A", "B", "B", "B"),
  Datetime = lubridate::as_datetime(1) + lubridate::days(c(0:2, 4, 6, 8))
) %>%
  dplyr::group_by(Id)

dataset %>% gapless_Datetimes()
dataset %>% dplyr::ungroup() %>% gapless_Datetimes()
dataset %>% gapless_Datetimes(epoch = "1 day")
gg_day() will create a simple ggplot for every day in a dataset. The result can further be manipulated like any ggplot. This is sensible for refining styling or guides.
gg_day( dataset, start.date = NULL, end.date = NULL, x.axis = Datetime, y.axis = MEDI, aes_col = NULL, aes_fill = NULL, group = Id, geom = "point", scales = c("fixed", "free_x", "free_y", "free"), x.axis.breaks = hms::hms(hours = seq(0, 24, by = 3)), y.axis.breaks = c(-10^(5:0), 0, 10^(0:5)), y.scale = "symlog", y.scale.sc = FALSE, x.axis.label = "Time of Day", y.axis.label = "Illuminance (lx, MEDI)", format.day = "%d/%m", title = NULL, subtitle = NULL, interactive = FALSE, facetting = TRUE, jco_color = TRUE, ... )
dataset |
A light logger dataset. Expects a |
start.date , end.date
|
Choose an optional start or end date within your
|
x.axis , y.axis
|
column name that contains the datetime (x, defaults to
|
aes_col , aes_fill
|
optional arguments that define separate sets and
colors or fills them. Expects anything that works with the layer data
|
group |
Optional column name that defines separate sets. Useful for
certain geoms like |
geom |
What geom should be used for visualization? Expects a
|
scales |
For |
x.axis.breaks , y.axis.breaks
|
Where should breaks occur on the x and
y.axis? Expects a |
y.scale |
How should the y-axis be scaled?
|
y.scale.sc |
|
x.axis.label , y.axis.label
|
labels for the x- and y-axis. Expects a
|
format.day |
Label for each day. Default is |
title |
Plot title. Expects a |
subtitle |
Plot subtitle. Expects a |
interactive |
Should the plot be interactive? Expects a |
facetting |
Should an automated facet by day be applied? Default is
|
jco_color |
Should the |
... |
Other options that get passed to the main geom function. Can be used to adjust size, linewidth, or linetype. |
Besides plotting, the function creates two new variables from the given Datetime:

Day.data is a factor that is used for facetting with ggplot2::facet_wrap(). Make sure to use this variable if you change the faceting manually. Also, the function checks whether this variable already exists; if it does, it will only convert it to a factor and facet on that variable.

Time.data is an hms time created with hms::as_hms() that is used for the x.axis.
The default scaling of the y-axis is a symlog scale, which is a logarithmic scale that only starts scaling after a given threshold (default = 0). This enables values of 0 in the plot, which are common in light logger data, and even enables negative values, which might be sensible for non-light data. See symlog_trans() for details on tweaking this scale. The scale can also be changed to a normal or logarithmic scale - see the y.scale argument for more.
The default scaling of the color and fill scales is discrete, with the ggsci::scale_color_jco() and ggsci::scale_fill_jco() scales. To use a continuous scale, use the jco_color = FALSE setting. Both fill and color aesthetics are set to NULL by default. For most geoms, this is not important, but geoms that automatically use those aesthetics (like geom_bin2d, where fill = stat(count)) are affected by this. Manually adding the required aesthetic (like aes_fill = ggplot2::stat(count)) will fix this; a sketch follows the examples below.
A ggplot object
#use `col` for separation of different sets
plot <- gg_day(
  sample.data.environment,
  scales = "fixed",
  end.date = "2023-08-31",
  y.axis.label = "mEDI (lx)",
  aes_col = Id)
plot

#you can easily overwrite the color scale afterwards
plot + ggplot2::scale_color_discrete()

#or change the facetting
plot + ggplot2::facet_wrap(~Day.data + Id)
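A minimal sketch of the aesthetic fix described above, assuming the bin2d geom is accepted by the geom argument (it is named as an example in the details, but this exact call is not from the package documentation):

#supply the fill aesthetic manually and switch to a continuous fill scale
gg_day(
  sample.data.environment,
  geom = "bin2d",
  aes_fill = ggplot2::stat(count),
  jco_color = FALSE)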
gg_days() will create a simple ggplot along the timeline. The result can further be manipulated like any ggplot. This is sensible for refining styling or guides. Through the x.axis.limits argument, the plot can be much refined to align several groups of differing datetime ranges. It uses the Datetime_limits() function to calculate the limits of the x-axis. Another notable function that is used is Datetime_breaks(), which calculates the breaks of the x-axis.
gg_days( dataset, x.axis = Datetime, y.axis = MEDI, aes_col = NULL, aes_fill = NULL, group = NULL, geom = "line", scales = c("free_x", "free_y", "fixed", "free"), x.axis.breaks = Datetime_breaks, y.axis.breaks = c(-10^(5:0), 0, 10^(0:5)), y.scale = "symlog", y.scale.sc = FALSE, x.axis.label = "Datetime", y.axis.label = "Illuminance (lx, MEDI)", x.axis.limits = Datetime_limits, x.axis.format = "%a %D", title = NULL, subtitle = NULL, interactive = FALSE, facetting = TRUE, jco_color = FALSE, ... )
dataset |
A light logger dataset. Expects a |
x.axis , y.axis
|
column name that contains the datetime (x, defaults to
|
aes_col , aes_fill
|
optional input that defines separate sets and colors
or fills them. Expects anything that works with the layer data
|
group |
Optional column name that defines separate sets. Useful for
certain geoms like |
geom |
What geom should be used for visualization? Expects a
|
scales |
For |
x.axis.breaks |
The (major) breaks of the x-axis. Defaults to
|
y.axis.breaks |
Where should breaks occur on the y.axis? Expects a
|
y.scale |
How should the y-axis be scaled?
|
y.scale.sc |
|
x.axis.label , y.axis.label
|
labels for the x- and y-axis. Expects a
|
x.axis.limits |
The limits of the x-axis. Defaults to
|
x.axis.format |
The format of the x-axis labels. Defaults to |
title |
Plot title. Expects a |
subtitle |
Plot subtitle. Expects a |
interactive |
Should the plot be interactive? Expects a |
facetting |
Should an automated facet by grouping be applied? Default is
|
jco_color |
Should the |
... |
Other options that get passed to the main geom function. Can be used to adjust size, linewidth, or linetype. |
The default scaling of the y-axis is a symlog scale, which is a logarithmic scale that only starts scaling after a given threshold (default = 0). This enables values of 0 in the plot, which are common in light logger data, and even enables negative values, which might be sensible for non-light data. See symlog_trans() for details on tweaking this scale. The scale can also be changed to a normal or logarithmic scale - see the y.scale argument for more.
A ggplot object
dataset <- sample.data.environment %>% aggregate_Datetime(unit = "5 mins")

dataset %>% gg_days()

#restrict the x-axis to 3 days
dataset %>%
  gg_days(
    x.axis.limits = \(x) Datetime_limits(x, length = lubridate::ddays(3))
  )
The function is by default opinionated, and will automatically select the best way to display the double date plot. However, the user can also manually select the type of double date plot to be displayed: repeating each day (default when there is only one day in all of the groups), or displaying consecutive days (default when there are multiple days in the groups).
gg_doubleplot( dataset, Datetime.colname = Datetime, type = c("auto", "repeat", "next"), geom = "ribbon", alpha = 0.5, col = "grey40", fill = "#EFC000FF", linewidth = 0.4, x.axis.breaks.next = Datetime_breaks, x.axis.format.next = "%a %D", x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "6 hours", shift = lubridate::duration(0, "hours")), x.axis.format.repeat = "%H:%M", ... )
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
type |
One of "auto", "repeat", or "next". The default is "auto", which will automatically select the best way to display the double date plot based on the amount of days in the dataset ( |
geom |
The type of geom to be used in the plot. The default is "ribbon". |
alpha , linewidth
|
The alpha and linewidth setting of the geom. The default is 0.5 and 0.4, respectively. |
col , fill
|
The color and fill of the geom. The default is "#EFC000FF". If the parameters |
x.axis.breaks.next , x.axis.breaks.repeat
|
Datetime breaks when consecutive days are displayed ( |
x.axis.format.next , x.axis.format.repeat
|
Datetime label format when consecutive days are displayed ( |
... |
Arguments passed to |
gg_doubleplot() is a wrapper function for gg_days(), combined with an internal function to duplicate and reorganize dates in a dataset for a double plot view. This means that the same day is displayed multiple times within the plot in order to reveal patterns across days.
a ggplot object
#take only the Participant data from sample data, and three days
library(dplyr)
library(lubridate)
library(ggplot2)
sample.data <- sample.data.environment %>%
  dplyr::filter(Id == "Participant") %>%
  filter_Date(length = ddays(3))

#create a double plot with the default settings
sample.data %>% gg_doubleplot()

#repeat the same day in the plot
sample.data %>% gg_doubleplot(type = "repeat")

#use the function with more than one Id
sample.data.environment %>%
  filter_Date(length = ddays(3)) %>%
  gg_doubleplot(aes_fill = Id, aes_col = Id) +
  facet_wrap(~ Date.data, ncol = 1, scales = "free_x", strip.position = "left")

#if data is already grouped by days, type = "repeat" will be automatic
sample.data.environment %>%
  dplyr::group_by(Date = date(Datetime), .add = TRUE) %>%
  filter_Date(length = ddays(3)) %>%
  gg_doubleplot(aes_fill = Id, aes_col = Id) +
  guides(fill = "none", col = "none") + #remove the legend
  facet_wrap(~ Date.data, ncol = 1, scales = "free_x", strip.position = "left")

#combining `aggregate_Date()` with `gg_doubleplot()` easily creates a good
#overview of the data
sample.data.environment %>%
  aggregate_Date() %>%
  gg_doubleplot()
Plot an overview of dataset intervals with implicit missing data
gg_overview( dataset, Datetime.colname = Datetime, Id.colname = Id, gap.data = NULL, ..., interactive = FALSE )
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
Id.colname |
The column name of the Id column (default is |
gap.data |
Optionally provide a |
... |
Additional arguments given to the main |
interactive |
Should the plot be interactive? Expects a |
A ggplot object
sample.data.environment %>% gg_overview()
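A sketch of the gap.data argument, assuming it accepts the gaps tibble returned by gap_finder(..., gap.data = TRUE) (this combination is not shown in the package documentation):

#provide precomputed gaps to the overview plot
gaps <- sample.data.environment %>% gap_finder(gap.data = TRUE)
sample.data.environment %>% gg_overview(gap.data = gaps)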
gg_photoperiod() is a helper function to add photoperiod information to plots generated with gg_day() or gg_days(). The function can either draw on the dawn and dusk columns of the dataset or use the coordinates and solarDep arguments to calculate the photoperiods. For better visibility, the photoperiods are drawn in the background of the plot by default (see the on.top argument).
gg_photoperiod( ggplot_obj, coordinates = NULL, alpha = 0.2, solarDep = 6, on.top = FALSE, ... )
ggplot_obj |
|
coordinates |
A two element numeric vector representing the latitude and
longitude of the location. If |
alpha |
A numerical value between 0 and 1 representing the transparency of the photoperiods. Default is 0.2. |
solarDep |
A numerical value representing the solar depression angle
between 90 and -90. This means a value of 6 equals -6 degrees above the
horizon. Default is 6, equalling |
on.top |
Logical scalar. If |
... |
Additional arguments given to the |
If used in combination with gg_doubleplot(), with that function in the type = "repeat" setting (either manually set, or because there is only one day of data per group present), photoperiods need to be added separately through add_photoperiod(), or the second photoperiod in each panel will be off by one day. See the examples for more information.

In general, if the photoperiod setup is more complex, it makes sense to add it prior to plotting and make sure the photoperiods are correct.
a modified ggplot object with the photoperiods added.
Other photoperiod: photoperiod()
coordinates <- c(48.521637, 9.057645)

#adding photoperiods to a ggplot
sample.data.environment |>
  gg_days() |>
  gg_photoperiod(coordinates)

#adding photoperiods prior to plotting
sample.data.environment |>
  add_photoperiod(coordinates, solarDep = 0) |>
  gg_days() |>
  gg_photoperiod()

#plotting photoperiods automatically works for both gg_day() and gg_days()
sample.data.environment |>
  gg_day() |>
  gg_photoperiod(coordinates)

#plotting for gg_doubleplot mostly works fine
sample.data.environment |>
  filter_Date(length = "2 days") |>
  gg_doubleplot() |>
  gg_photoperiod(coordinates)

#however, in cases where only one day of data per group is available, or the
#type = "repeat" setting is used, the photoperiods need to be added
#separately. Otherwise the second day will be off by one day in each panel.
#The visual difference is subtle, and might not be visible at all, as
#photoperiod only ever changes by a few minutes per day.

#WRONG
sample.data.environment |>
  filter_Date(length = "1 days") |>
  gg_doubleplot() |>
  gg_photoperiod(coordinates)

#CORRECT
sample.data.environment |>
  filter_Date(length = "1 days") |>
  add_photoperiod(coordinates) |>
  gg_doubleplot() |>
  gg_photoperiod()
Adjust device imports or make your own
import_adjustment(import_expr)
import_expr |
A named list of import expressions. The basis for
|
This function should only be used with some knowledge of how expressions work in R. For an expression to work as expected, it must minimally produce a data frame containing a Datetime column with the correct time zone. It has access to all arguments defined in the description of import_Dataset(). The ... argument should be passed to whatever csv reader function is used, so that it works as expected. Look at ll_import_expr()$LYS for a quite minimal example.
A list of import functions
#create a new import function for the LYS device, same as the old
new_import <- import_adjustment(ll_import_expr())

#the new one is identical to the old one in terms of the function body
identical(body(import$LYS), body(new_import$LYS))

#change the import expression for the LYS device to add a message at the top
new_import_expr <- ll_import_expr()
new_import_expr$LYS[[4]] <- rlang::expr({
  cat("**This is a new import function**\n")
  data
})
new_import <- import_adjustment(new_import_expr)
filepath <- system.file("extdata/sample_data_LYS.csv", package = "LightLogR")

#Now, a message is printed when the import function is called
new_import <- new_import$LYS(filepath)
Imports a dataset and does the necessary transformations to get the right column formats. Unless specified otherwise, the function will set the timezone of the data to UTC. It will also enforce an Id to separate different datasets and will order/arrange the dataset within each Id by Datetime. See the Details and Devices section for more information and the full list of arguments.
import_Dataset(device, ...)

import
device |
From what device do you want to import? For a few devices,
there is a sample data file that you can use to test the function (see the
examples). See |
... |
Parameters that get handed down to the specific import functions |
An object of class list of length 18.
There is a general import function as well as specific import functions for each device. The general import function is described below, whereas the specific import functions take the form of import$device(). The general import function is a thin wrapper around the specific import functions. The specific import functions take the following arguments:
filename: Filename(s) for the Dataset. Can also contain the filepath, but path must then be NULL. Expects a character. If the vector is longer than 1, multiple files will be read into one Tibble.

path: Optional path for the dataset(s). NULL is the default. Expects a character.

n_max: maximum number of lines to read. Default is Inf.

tz: Timezone of the data. "UTC" is the default. Expects a character. You can look up the supported timezones with OlsonNames().

Id.colname: Lets you specify a column for the id of a dataset. Expects a symbol (Default is Id). This column will be used for grouping (dplyr::group_by()).

auto.id: If the Id.colname column is not part of the dataset, the Id can be automatically extracted from the filename. The argument expects a regular expression (regex) and will by default just give the whole filename without file extension.

manual.id: If this argument is not NULL, and no Id column is part of the dataset, this character scalar will be used. We discourage the use of this argument when importing more than one file.

silent: If set to TRUE, the function will not print a summary message of the import or plot the overview. Default is FALSE.

locale: The locale controls defaults that vary from place to place.

dst_adjustment: If a file crosses daylight savings time, but the device does not adjust time stamps accordingly, you can set this argument to TRUE to apply this shift manually. It is selective, so it will only be done in files that cross between DST and standard time. Default is FALSE. Uses dst_change_handler() to do the adjustment. Look there for more info. It is not equipped to handle two jumps in one file (so back and forth between DST and standard time), but will work fine if jumps occur in separate files.

auto.plot: a logical on whether to call gg_overview() after import. Default is TRUE, but it is set to FALSE if the silent argument is set to TRUE.

...: supply additional arguments to the readr import functions, like na. Might also be used to supply arguments to the specific import functions, like column_names for Actiwatch_Spectrum devices. Those devices will always throw a helpful error message if you forget to supply the necessary arguments.
If the Id column is already part of the dataset it will just use this column. If the column is not present it will add this column and fill it with the filename of the import file (see param auto.id).

print_n can be used if you want to see more rows from the observation intervals.

remove_duplicates can be used if identical observations are present within or across multiple files. The default is FALSE. The function keeps only unique observations (= rows) if set to TRUE. This is a convenience implementation of dplyr::distinct().
Tibble/Dataframe with a POSIXct column for the datetime
The set of import functions provides a convenient way to import light logger data that is then perfectly formatted to add metadata, make visualizations and analyses. There are a number of devices supported, where import should just work out of the box. To get an overview, you can simply call the supported_devices() dataset. The list will grow continuously as the package is maintained.
supported_devices()
#>  [1] "ActLumus"              "ActTrust"              "Actiwatch_Spectrum"
#>  [4] "Actiwatch_Spectrum_de" "Circadian_Eye"         "DeLux"
#>  [7] "GENEActiv_GGIR"        "Kronowise"             "LIMO"
#> [10] "LYS"                   "LiDo"                  "LightWatcher"
#> [13] "MotionWatch8"          "OcuWEAR"               "Speccy"
#> [16] "SpectraWear"           "VEET"                  "nanoLambda"
Manufacturer: Condor Instruments
Model: ActLumus
Implemented: Sep 2023
A sample file is provided with the package; it can be accessed through system.file("extdata/205_actlumus_Log_1020_20230904101707532.txt.zip", package = "LightLogR"). It does not need to be unzipped to be imported. This sample file is a good example for a regular dataset without gaps.
Manufacturer: LYS Technologies
Model: LYS Button
Implemented: Sep 2023
A sample file is provided with the package; it can be accessed through system.file("extdata/sample_data_LYS.csv", package = "LightLogR"). This sample file is a good example for an irregular dataset.
Manufacturer: Philips Respironics
Model: Actiwatch Spectrum
Implemented: Nov 2023 / July 2024
Important note: The Actiwatch_Spectrum function is for an international/English formatting. The Actiwatch_Spectrum_de function is for a German formatting, which slightly differs in the datetime format, the column names, and the decimal separator.
Manufacturer: Condor Instruments
Model: ActTrust1, ActTrust2
Implemented: Mar 2024
This function works for both ActTrust 1 and 2 devices
Manufacturer: Monash University
Model: Speccy
Implemented: Feb 2024
Manufacturer: Intelligent Automation Inc
Model: DeLux
Implemented: Dec 2023
Manufacturer: University of Lucerne
Model: LiDo
Implemented: Nov 2023
Manufacturer: University of Manchester
Model: SpectraWear
Implemented: May 2024
Manufacturer: NanoLambda
Model: XL-500 BLE
Implemented: May 2024
Manufacturer: Object-Tracker
Model: LightWatcher
Implemented: June 2024
Manufacturer: Meta Reality Labs
Model: VEET
Implemented: July 2024
Required Argument: modality. A character scalar describing the modality to be imported from. Can be one of "ALS" (Ambient light sensor), "IMU" (Inertial Measurement Unit), "INF" (Information), "PHO" (Spectral Sensor), "TOF" (Time of Flight).
Manufacturer: Max-Planck-Institute for Biological Cybernetics, Tübingen
Model: melanopiQ Circadian Eye (Prototype)
Implemented: July 2024
Manufacturer: Kronohealth
Model: Kronowise
Implemented: July 2024
Manufacturer: Activeinsights
Model: GENEActiv
Note: This import function takes GENEActiv data that was preprocessed through the GGIR package.

By default, GGIR aggregates light data into intervals of 15 minutes. This can be set by the windowsizes argument in GGIR, which is a three-value vector, where the second value is set to 900 seconds by default. To import the preprocessed data with LightLogR, the filename argument requires a path to the parent directory of the GGIR output folders, specifically the meta folder, which contains the light exposure data. Multiple filenames can be specified, each of which needs to be a path to a different GGIR parent directory. GGIR exports can contain data from multiple participants; these will always be imported fully by providing the parent directory. Use the pattern argument to extract sensible Ids from the .RData filenames within the meta/basic/ folder. As per the author, Dr. Vincent van Hees, GGIR preprocessed data are always in local time, provided the desiredtz/configtz are properly set in GGIR. LightLogR still requires a timezone to be set, but will not timeshift the import data.
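A sketch of such an import, under assumed inputs: the directory path, timezone, and pattern below are placeholders, not shipped sample data:

#hypothetical GGIR import: the path points to the parent directory of the
#GGIR output folders; the pattern extracts an Id from the .RData filenames
dataset <- import$GENEActiv_GGIR(
  filename = "path/to/ggir_parent_directory/",
  tz = "Europe/Berlin",
  pattern = "(\\d{3})"
)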
Manufacturer: CamNtech
Implemented: September 2024
Manufacturer: ENTPE
Implemented: September 2024
LIMO exports LIGHT data and IMU (inertia measurements, also UV) in separate files. Both can be read in with this function, but not at the same time. Please decide what type of data you need and provide the respective filenames.
Manufacturer: Ocutune
Implemented: September 2024
OcuWEAR data contains spectral data. Due to the format of the data file, the spectrum is not directly part of the tibble, but rather a list column of tibbles within the imported data, containing a Wavelength (nm) and Intensity (mW/m^2) column.
To import a file, simply specify the filename (and path) and feed it to the import_Dataset function. There are sample datasets for all devices.
The import functions provide a basic overview of the data after import, such as the intervals between measurements or the start and end dates.
filepath <- system.file("extdata/sample_data_LYS.csv", package = "LightLogR") dataset <- import_Dataset("LYS", filepath, auto.plot = FALSE) #> #> Successfully read in 11'422 observations across 1 Ids from 1 LYS-file(s). #> Timezone set is UTC. #> The system timezone is Europe/Berlin. Please correct if necessary! #> #> First Observation: 2023-06-21 00:00:12 #> Last Observation: 2023-06-22 23:59:48 #> Timespan: 2 days #> #> Observation intervals: #> Id interval.time n pct #> 1 sample_data_LYS 15s 10015 87.689% #> 2 sample_data_LYS 16s 1367 11.969% #> 3 sample_data_LYS 17s 23 0.201% #> 4 sample_data_LYS 18s 16 0.140%
Import functions can also be called directly:
filepath <- system.file("extdata/205_actlumus_Log_1020_20230904101707532.txt.zip", package = "LightLogR") dataset <- import$ActLumus(filepath, auto.plot = FALSE) #> #> Successfully read in 61'016 observations across 1 Ids from 1 ActLumus-file(s). #> Timezone set is UTC. #> The system timezone is Europe/Berlin. Please correct if necessary! #> #> First Observation: 2023-08-28 08:47:54 #> Last Observation: 2023-09-04 10:17:04 #> Timespan: 7.1 days #> #> Observation intervals: #> Id interval.time n pct #> 1 205_actlumus_Log_1020_20230904101707532.txt 10s 61015 100% dataset %>% gg_days()
dataset %>%
  dplyr::select(Datetime, TEMPERATURE, LIGHT, MEDI, Id) %>%
  dplyr::slice(1500:1505)
#> # A tibble: 6 x 5
#> # Groups:   Id [1]
#>   Datetime            TEMPERATURE LIGHT  MEDI Id
#>   <dttm>                    <dbl> <dbl> <dbl> <fct>
#> 1 2023-08-28 12:57:44        26.9  212.  202. 205_actlumus_Log_1020_20230904101~
#> 2 2023-08-28 12:57:54        26.9  208.  199. 205_actlumus_Log_1020_20230904101~
#> 3 2023-08-28 12:58:04        26.9  205.  196. 205_actlumus_Log_1020_20230904101~
#> 4 2023-08-28 12:58:14        26.8  204.  194. 205_actlumus_Log_1020_20230904101~
#> 5 2023-08-28 12:58:24        26.9  203.  194. 205_actlumus_Log_1020_20230904101~
#> 6 2023-08-28 12:58:34        26.8  204.  195. 205_actlumus_Log_1020_20230904101~
Datetimes of Statechanges

Auxiliary data greatly enhances data analysis. This function allows the import of files that contain Statechanges, i.e., specific time points of when a State (like sleep or wake) starts.
import_Statechanges( filename, path = NULL, sep = ",", dec = ".", structure = c("wide", "long"), Datetime.format = "ymdHMS", tz = "UTC", State.colnames, State.encoding = State.colnames, Datetime.column = Datetime, Id.colname, State.newname = State, Id.newname = Id, keep.all = FALSE, silent = FALSE )
filename |
Filename(s) for the Dataset. Can also contain the filepath,
but |
path |
Optional path for the dataset(s). |
sep |
String that separates columns in the import file. Defaults to
|
dec |
String that indicates a decimal separator in the import file.
Defaults to |
structure |
String that specifies whether the import file is in the
|
Datetime.format |
String that specifies the format of the |
tz |
Timezone of the data. |
State.colnames |
Column name or vector of column names (the latter only
in the
|
State.encoding |
In the |
Datetime.column |
Symbol of the
|
Id.colname |
Symbol of the column that contains the |
State.newname |
Symbol of the column that will contain the |
Id.newname |
Column name used for renaming the |
keep.all |
Logical that specifies whether all columns should be
kept in the output. Defaults to |
silent |
Logical that specifies whether a summary of the
imported data should be shown. Defaults to |
Data can be present in the long or wide format.

In the wide format, multiple Datetime columns indicate the state through the column name. These get pivoted to the long format and can be recoded through the State.encoding argument.

In the long format, one column indicates the State, while the other gives the Datetime.
a dataset with the ID, State, and Datetime columns. May contain additional columns if keep.all is TRUE.
#get the example file from within the package
path <- system.file("extdata/", package = "LightLogR")
file.sleep <- "205_sleepdiary_all_20230904.csv"

#import Data in the wide format (sleep/wake times)
import_Statechanges(file.sleep, path,
                    Datetime.format = "dmyHM",
                    State.colnames = c("sleep", "offset"),
                    State.encoding = c("sleep", "wake"),
                    Id.colname = record_id,
                    sep = ";", dec = ",")

#import in the long format (Comments on sleep)
import_Statechanges(file.sleep, path,
                    Datetime.format = "dmyHM",
                    State.colnames = "comments",
                    Datetime.column = sleep,
                    Id.colname = record_id,
                    sep = ";", dec = ",",
                    structure = "long")
This function calculates the variability of 24h light exposure patterns across multiple days. Calculated as the ratio of the variance of the average daily pattern to the total variance across all days. Calculated with mean hourly light levels. Ranges between 0 (Gaussian noise) and 1 (Perfect Stability).
interdaily_stability( Light.vector, Datetime.vector, na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. |
Datetime.vector |
Vector containing the time data. Must be POSIXct. |
na.rm |
Logical. Should missing values be removed? Defaults to |
as.df |
Logical. Should the output be returned as a data frame? If |
Note that this metric will always be 1 if the data contains only one 24 h day.
Numeric value or dataframe with column 'IS'.
Van Someren, E. J. W., Swaab, D. F., Colenda, C. C., Cohen, W., McCall, W. V., & Rosenquist, P. B. (1999). Bright Light Therapy: Improved Sensitivity to Its Effects on Rest-Activity Rhythms in Alzheimer Patients by Application of Nonparametric Methods. Chronobiology International, 16(4), 505–518. doi:10.3109/07420529908998724
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
set.seed(1)
N <- 24 * 7

# Calculate metric for seven 24 h days with two measurements per hour
dataset1 <- tibble::tibble(
  Id = rep("A", N * 2),
  Datetime = lubridate::as_datetime(0) + c(lubridate::minutes(seq(0, N * 60 - 30, 30))),
  MEDI = sample(1:1000, N * 2)
)
dataset1 %>%
  dplyr::summarise(
    "Interdaily stability" = interdaily_stability(MEDI, Datetime)
  )
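To illustrate the note above, a dataset spanning only a single 24 h day yields a metric of 1 regardless of its values (this demonstration is not part of the original examples):

# a single 24 h day: interdaily stability is always 1 (see the note above)
dataset2 <- tibble::tibble(
  Id = rep("A", 24),
  Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23),
  MEDI = sample(1:1000, 24)
)
dataset2 %>%
  dplyr::summarise(
    "Interdaily stability" = interdaily_stability(MEDI, Datetime)
  )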
This function can make use of Interval data that contain States (like "sleep", "wake", "wear") and add a column to a light logger dataset, where the State of every Datetime is specified, based on the participant's Id.
interval2state( dataset, State.interval.dataset, Datetime.colname = Datetime, State.colname = State, Interval.colname = Interval, Id.colname.dataset = Id, Id.colname.interval = Id, overwrite = FALSE, output.dataset = TRUE )
dataset |
A light logger dataset. Expects a |
State.interval.dataset |
Name of the dataset that contains |
Datetime.colname |
column name that contains the datetime. Defaults to
|
State.colname , Interval.colname
|
Column names of the |
Id.colname.dataset , Id.colname.interval
|
Column names of the
participant's |
overwrite |
If |
output.dataset |
should the output be a |
One of

a data.frame object identical to dataset but with the state column added

a vector with the states
#create an interval dataset
library(tibble)
library(dplyr)
library(lubridate)
library(rlang)
library(purrr)

states <- tibble::tibble(
  Datetime = c("2023-08-15 6:00:00", "2023-08-15 23:00:00",
               "2023-08-16 6:00:00", "2023-08-16 22:00:00",
               "2023-08-17 6:30:00", "2023-08-18 1:00:00",
               "2023-08-18 6:00:00", "2023-08-18 22:00:00",
               "2023-08-19 6:00:00", "2023-08-19 23:00:00",
               "2023-08-20 6:00:00", "2023-08-20 22:00:00"),
  State = rep(c("wake", "sleep"), 6),
  Wear = rep(c("wear", "no wear"), 6),
  Performance = rep(c(100, 0), 6),
  Id = "Participant")
intervals <- sc2interval(states)

#create a dataset with states
dataset_with_states <- sample.data.environment %>%
  interval2state(State.interval.dataset = intervals)

#visualize the states - note that the states are only added to the
#respective ID in the dataset
library(ggplot2)
ggplot(dataset_with_states, aes(x = Datetime, y = MEDI, color = State)) +
  geom_point() +
  facet_wrap(~Id, ncol = 1)

#import multiple State columns from the interval dataset
#interval2state will only add a single State column to the dataset,
#which represents sleep/wake in our case
dataset_with_states[8278:8283,]

#if we want to add multiple columns we can either perform the function
#multiple times with different states:
dataset_with_states2 <- dataset_with_states %>%
  interval2state(State.interval.dataset = intervals, State.colname = Wear)
dataset_with_states2[8278:8283,]

#or we can use `purrr::reduce` to add multiple columns at once
dataset_with_states3 <-
  syms(c("State", "Wear", "Performance")) %>%
  reduce(\(x,y) interval2state(x,
                               State.interval.dataset = intervals,
                               State.colname = !!y),
         .init = sample.data.environment)

#Note:
# - the State.colnames have to be provided as symbols (`rlang::syms`)
# - the reduce function requires a two argument function `\(x,y)`, where `x`
#   is the dataset to be continuously modified and `y` is the symbol of the
#   State column name to be added
# - the `!!` operator from `rlang` is used to exchange `y` with each symbol
# - the `.init` argument is the initial dataset to be modified

#this results in all states being applied
dataset_with_states3[8278:8283,]
This function calculates the variability of consecutive Light levels within a 24h day. Calculated as the ratio of the variance of the differences between consecutive Light levels to the total variance across the day. Calculated with mean hourly Light levels. Higher values indicate more fragmentation.
intradaily_variability( Light.vector, Datetime.vector, na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. |
Datetime.vector |
Vector containing the time data. Must be POSIXct. |
na.rm |
Logical. Should missing values be removed? Defaults to |
as.df |
Logical. Should the output be returned as a data frame? If |
Numeric value or dataframe with column 'IV'.
Van Someren, E. J. W., Swaab, D. F., Colenda, C. C., Cohen, W., McCall, W. V., & Rosenquist, P. B. (1999). Bright Light Therapy: Improved Sensitivity to Its Effects on Rest-Activity Rhythms in Alzheimer Patients by Application of Nonparametric Methods. Chronobiology International, 16(4), 505–518. doi:10.3109/07420529908998724
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics:
bright_dark_period()
,
centroidLE()
,
disparity_index()
,
duration_above_threshold()
,
exponential_moving_average()
,
frequency_crossing_threshold()
,
interdaily_stability()
,
midpointCE()
,
nvRC()
,
nvRD()
,
nvRD_cumulative_response()
,
period_above_threshold()
,
pulses_above_threshold()
,
threshold_for_duration()
,
timing_above_threshold()
set.seed(1)
N <- 24 * 2

# Calculate metric for two 24 h days with two measurements per hour
dataset1 <- tibble::tibble(
  Id = rep("A", N * 2),
  Datetime = lubridate::as_datetime(0) + c(lubridate::minutes(seq(0, N * 60 - 30, 30))),
  MEDI = sample(1:1000, N * 2)
)
dataset1 %>%
  dplyr::summarise(
    "Intradaily variability" = intradaily_variability(MEDI, Datetime)
  )
Join light logging datasets that have a common structure. At minimum, all datasets must share identical Datetime and Id columns.
join_datasets( ..., Datetime.column = Datetime, Id.column = Id, add.origin = FALSE, debug = FALSE )
... |
|
Datetime.column , Id.column
|
Column names for the |
add.origin |
Should a column named |
debug |
Output changes to a tibble indicating which dataset is missing the respective |
One of

a data.frame of joined datasets

a tibble of datasets with missing columns. Only if debug = TRUE
#load in two datasets
path <- system.file("extdata", package = "LightLogR")
file.LL <- "205_actlumus_Log_1020_20230904101707532.txt.zip"
file.env <- "cyepiamb_CW35_Log_1431_20230904081953614.txt.zip"
dataset.LL <- import$ActLumus(file.LL, path, auto.id = "^(\\d{3})")
dataset.env <- import$ActLumus(file.env, path, manual.id = "CW35")

#join the datasets
joined <- join_datasets(dataset.LL, dataset.env)

#compare the number of rows
nrow(dataset.LL) + nrow(dataset.env) == nrow(joined)

#debug, when set to TRUE, will output a tibble of datasets with missing
#necessary columns
dataset.LL <- dataset.LL %>% dplyr::select(-Datetime)
join_datasets(dataset.LL, dataset.env, debug = TRUE)
Returns the import expression for all devices in LightLogR.
ll_import_expr()
These expressions are used to import and prepare data from specific devices. The list is made explicit, so that a user requiring slight changes to the import functions (e.g., because a timestamp is formatted differently) can modify or add to the list. The list can be turned into a fully functional import function through import_adjustment().
A list of import expressions for all supported devices
import_Dataset()
ll_import_expr()[1]
This function calculates the timing corresponding to half of the cumulative light exposure within the given time series.
midpointCE(Light.vector, Time.vector, na.rm = FALSE, as.df = FALSE)
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
na.rm |
Logical. Should missing values be removed for the calculation? If |
as.df |
Logical. Should the output be returned as a data frame? If |
Single column data frame or vector.
Shochat, T., Santhi, N., Herer, P., Flavell, S. A., Skeldon, A. C., & Dijk, D.-J. (2019). Sleep Timing in Late Autumn and Late Spring Associates With Light Exposure Rather Than Sun Time in College Students. Frontiers in Neuroscience, 13. doi:10.3389/fnins.2019.00882
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
dataset1 <- tibble::tibble( Id = rep("A", 24), Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset1 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Datetime) ) # Dataset with HMS time vector dataset2 <- tibble::tibble( Id = rep("A", 24), Time = hms::as_hms(lubridate::as_datetime(0) + lubridate::hours(0:23)), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset2 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Time) ) # Dataset with duration time vector dataset3 <- tibble::tibble( Id = rep("A", 24), Hour = lubridate::duration(0:23, "hours"), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset3 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Hour) )
dataset1 <- tibble::tibble( Id = rep("A", 24), Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset1 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Datetime) ) # Dataset with HMS time vector dataset2 <- tibble::tibble( Id = rep("A", 24), Time = hms::as_hms(lubridate::as_datetime(0) + lubridate::hours(0:23)), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset2 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Time) ) # Dataset with duration time vector dataset3 <- tibble::tibble( Id = rep("A", 24), Hour = lubridate::duration(0:23, "hours"), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) dataset3 %>% dplyr::reframe( "Midpoint of cmulative exposure" = midpointCE(MEDI, Hour) )
This is a niche helper function to normalize counts. Some sensors provide raw counts and gain levels as part of their output. In some cases it is desirable to compare counts between sensors, e.g., to gauge daylight outside by comparing UV counts to photopic counts (a high ratio of UV/Pho indicates outside daylight), or to gauge daylight inside by comparing IR counts to photopic counts (a high ratio of IR/Pho with a low ratio of UV/Pho indicates daylight in the context of LED or fluorescent lighting). The user can provide their own gain ratio table, or use a table provided for a sensor in the gain.ratio.tables dataset from LightLogR.
normalize_counts(dataset, gain.columns, count.columns, gain.ratio.table)
dataset |
a |
gain.columns |
a |
count.columns |
a |
gain.ratio.table |
a two-column tibble containing |
an extended dataset with new columns containing normalized counts
example.table <- tibble::tibble(
  uvGain = c(4096, 1024, 2),
  visGain = c(4096, 4096, 4096),
  irGain = c(2, 2, 2),
  uvValue = c(692, 709, 658),
  visValue = c(128369, 129657, 128609),
  irValue = c(122193, 127113, 124837))

gain.columns = c("uvGain", "visGain", "irGain")
count.columns = c("uvValue", "visValue", "irValue")

example.table |>
  normalize_counts(gain.columns, count.columns, gain.ratio.tables$TSL2585)
number_states() creates a new column in a dataset that takes a state column and assigns a count value to each state, increasing every time one state is replaced by another. E.g., a column with the states "day" and "night" will produce a column indicating whether this is "day 1", "day 2", and so forth, as will the "night" state with "night 1", "night 2", etc. Grouping within the input dataset is respected, i.e., the count will reset for each group.
number_states(dataset, state.colname, colname.extension = ".count")
dataset |
A |
state.colname |
Column name that contains the state. Expects a |
colname.extension |
The extension that is added to the state name to
create the new column. Defaults to |
The state column is not limited to two states; it can have as many states as needed. Also, it does not matter in which time frames these states change, so they do not necessarily conform to a 24-hour day. NA values will be treated as their own state.
Gaps in the data can lead to non-sensible outcomes, e.g., if there is no in-between state/observation between a day state at "18:00:00" and a day state at "6:00:00", this would still be counted as day 1. In these cases, the gap_handler() function can be useful to add observations a priori.
The input dataset with an additional column that counts the occurrences of each state. The new column will be of type character.
dataset <- tibble::tibble( state = c("day", "day", "day", "night", "night", "day", "day", "night", "night", "night", "day", "night") ) number_states(dataset, state) #example with photoperiods, calculating the mean values for each day and night coordinates <- c(48.52, 9.06) sample.data.environment |> add_photoperiod(coordinates) |> number_states(photoperiod.state) |> dplyr::group_by(photoperiod.state.count, .add = TRUE) |> dplyr::summarize(mean_MEDI = mean(MEDI)) |> tail(13)
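The counting logic can be reproduced with basic dplyr verbs: identify runs of identical states, then rank the runs within each state. A minimal sketch (for intuition only; number_states() additionally respects existing grouping and treats NA as its own state):

library(dplyr)
tibble::tibble(state = c("day", "day", "night", "day", "night")) |>
  mutate(run = consecutive_id(state)) |>  # a new run starts at each state change
  group_by(state) |>
  mutate(state.count = paste(state, dense_rank(run))) |>  # nth run of this state
  ungroup()
#> yields "day 1", "day 1", "night 1", "day 2", "night 2"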
This function calculates the non-visual circadian response (nvRC). It takes into account the assumed response dynamics of the non-visual system and the circadian rhythm and processes the light exposure signal to quantify the effective circadian-weighted input to the non-visual system (see Details).
nvRC( MEDI.vector, Illuminance.vector, Time.vector, epoch = "dominant.epoch", sleep.onset = NULL )
MEDI.vector |
Numeric vector containing the melanopic EDI data. |
Illuminance.vector |
Numeric vector containing the Illuminance data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be
either |
sleep.onset |
The time of habitual sleep onset. Can be HMS, numeric, or NULL.
If NULL (the default), then the data is assumed to start at habitual sleep onset.
If |
The time series is assumed to be regular. Missing values in the light data will be replaced by 0.

A numeric vector containing the nvRC data. The output has the same length as Time.vector.
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
dataset1 <- tibble::tibble( Id = rep("B", 60 * 48), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*48-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60), rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 48, replace = TRUE), each = 60) ) # Time.vector as POSIXct dataset1.nvRC <- dataset1 %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) # Time.vector as difftime dataset2 <- dataset1 %>% dplyr::mutate(Datetime = Datetime - lubridate::as_datetime(lubridate::dhours(22))) dataset2.nvRC <- dataset2 %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = lubridate::dhours(0)) )
dataset1 <- tibble::tibble( Id = rep("B", 60 * 48), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*48-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60), rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 48, replace = TRUE), each = 60) ) # Time.vector as POSIXct dataset1.nvRC <- dataset1 %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) # Time.vector as difftime dataset2 <- dataset1 %>% dplyr::mutate(Datetime = Datetime - lubridate::as_datetime(lubridate::dhours(22))) dataset2.nvRC <- dataset2 %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = lubridate::dhours(0)) )
These functions compare the non-visual circadian response (see nvRC()) for measured personal light exposure to the nvRC for a reference light exposure pattern, such as daylight.
nvRC_circadianDisturbance(nvRC, nvRC.ref, as.df = FALSE) nvRC_circadianBias(nvRC, nvRC.ref, as.df = FALSE) nvRC_relativeAmplitudeError(nvRC, nvRC.ref, as.df = FALSE)
nvRC |
Time series of the non-visual circadian response (see nvRC()). |
nvRC.ref |
Time series of the non-visual circadian response for the reference light exposure pattern (see nvRC()). |
as.df |
Logical. Should the output be returned as a data frame? Defaults to FALSE. |
nvRC_circadianDisturbance() calculates the circadian disturbance (CD). It is expressed as

$$\mathrm{CD}(t, T) = \frac{1}{T}\int_{t}^{t+T} \left| r_{C}(t') - r_{C,\mathrm{ref}}(t') \right| \, \mathrm{d}t'$$

and quantifies the total difference between the measured circadian response and the circadian response to a reference profile.
nvRC_circadianBias() calculates the circadian bias (CB). It is expressed as

$$\mathrm{CB}(t, T) = \frac{1}{T}\int_{t}^{t+T} \left( r_{C}(t') - r_{C,\mathrm{ref}}(t') \right) \mathrm{d}t'$$

and provides a measure of the overall trend for the difference in circadian response, i.e., positive values indicate that the measured circadian response overestimates the circadian response to the reference profile, and negative values that it underestimates it.
nvRC_relativeAmplitudeError() calculates the relative amplitude error (RAE). It is expressed as

$$\mathrm{RAE}(t, T) = r_{C,\max} - r_{C,\mathrm{ref},\max}$$

and quantifies the difference between the maximum response achieved in the period and the maximum response to the reference signal.
A numeric value or single column data frame.
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
dataset1 <- tibble::tibble( Id = rep("B", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60), ) %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) dataset.reference <- tibble::tibble( Id = rep("Daylight", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*6), rep(10000, 12*60), rep(0, 60*6)), MEDI = Illuminance ) %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) # Circadian disturbance nvRC_circadianDisturbance(dataset1$nvRC, dataset.reference$nvRC) # Circadian bias nvRC_circadianBias(dataset1$nvRC, dataset.reference$nvRC) # Relative amplitude error nvRC_relativeAmplitudeError(dataset1$nvRC, dataset.reference$nvRC)
dataset1 <- tibble::tibble( Id = rep("B", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60), ) %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) dataset.reference <- tibble::tibble( Id = rep("Daylight", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*6), rep(10000, 12*60), rep(0, 60*6)), MEDI = Illuminance ) %>% dplyr::mutate( nvRC = nvRC(MEDI, Illuminance, Datetime, sleep.onset = hms::as_hms("22:00:00")) ) # Circadian disturbance nvRC_circadianDisturbance(dataset1$nvRC, dataset.reference$nvRC) # Circadian bias nvRC_circadianBias(dataset1$nvRC, dataset.reference$nvRC) # Relative amplitude error nvRC_relativeAmplitudeError(dataset1$nvRC, dataset.reference$nvRC)
This function calculates the non-visual direct response (nvRD). It takes into account the assumed response dynamics of the non-visual system and processes the light exposure signal to quantify the effective direct input to the non-visual system (see Details).
nvRD(MEDI.vector, Illuminance.vector, Time.vector, epoch = "dominant.epoch")
MEDI.vector |
Numeric vector containing the melanopic EDI data. |
Illuminance.vector |
Numeric vector containing the Illuminance data. |
Time.vector |
Vector containing the time data. Can be |
epoch |
The epoch at which the data was sampled. Can be either a
|
The time series is assumed to be regular. Missing values in the light data will be replaced by 0.

A numeric vector containing the nvRD data. The output has the same length as Time.vector.
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
# Dataset 1 with 24h measurement dataset1 <- tibble::tibble( Id = rep("A", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60) ) # Dataset 2 with 48h measurement dataset2 <- tibble::tibble( Id = rep("B", 60 * 48), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*48-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60), rep(0, 60*8), rep(sample(1:1000, 16, replace = TRUE), each = 60)), MEDI = Illuminance * rep(sample(0.5:1.5, 48, replace = TRUE), each = 60) ) # Combined datasets dataset.combined <- rbind(dataset1, dataset2) # Calculate nvRD per ID dataset.combined.nvRD <- dataset.combined %>% dplyr::group_by(Id) %>% dplyr::mutate( nvRD = nvRD(MEDI, Illuminance, Datetime) )
This function calculates the cumulative non-visual direct response (nvRD). It is essentially the integral of the nvRD over the provided time period in hours; the unit of the resulting value is thus "nvRD*h".
nvRD_cumulative_response( nvRD, Time.vector, epoch = "dominant.epoch", as.df = FALSE )
nvRD |
Numeric vector containing the non-visual direct response.
See |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be
either |
as.df |
Logical. Should a data frame be returned? If |
A numeric value or single column data frame.
Amundadottir, M.L. (2016). Light-driven model for identifying indicators of non-visual health potential in the built environment [Doctoral dissertation, EPFL]. EPFL infoscience. doi:10.5075/epfl-thesis-7146
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
dataset1 <- tibble::tibble( Id = rep("A", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 14, replace = TRUE), each = 60), rep(0, 60*2)), MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60) ) %>% dplyr::mutate( nvRD = nvRD(MEDI, Illuminance, Datetime) ) dataset1 %>% dplyr::summarise( "cumulative nvRD" = nvRD_cumulative_response(nvRD, Datetime) )
dataset1 <- tibble::tibble( Id = rep("A", 60 * 24), Datetime = lubridate::as_datetime(0) + lubridate::minutes(0:(60*24-1)), Illuminance = c(rep(0, 60*8), rep(sample(1:1000, 14, replace = TRUE), each = 60), rep(0, 60*2)), MEDI = Illuminance * rep(sample(0.5:1.5, 24, replace = TRUE), each = 60) ) %>% dplyr::mutate( nvRD = nvRD(MEDI, Illuminance, Datetime) ) dataset1 %>% dplyr::summarise( "cumulative nvRD" = nvRD_cumulative_response(nvRD, Datetime) )
This function finds the length of the longest continuous period above/below a specified threshold light level, or within a specified range of light levels.
period_above_threshold( Light.vector, Time.vector, comparison = c("above", "below"), threshold, epoch = "dominant.epoch", loop = FALSE, na.replace = FALSE, na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison |
String specifying whether the period of light levels above or
below threshold should be calculated. Can be either |
threshold |
Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the period of light levels within the two thresholds will be calculated. |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be
either |
loop |
Logical. Should the data be looped? Defaults to |
na.replace |
Logical. Should missing values (NA) be replaced
for the calculation? If |
na.rm |
Logical. Should missing values (NA) be removed for the calculation?
If |
as.df |
Logical. Should a data frame be returned? If |
A duration object (see duration) as a single value, or a single-column data frame.
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), pulses_above_threshold(), threshold_for_duration(), timing_above_threshold()
N <- 60

# Dataset with continuous period of >250lx for 35min
dataset1 <- tibble::tibble(
  Id = rep("A", N),
  Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
  MEDI = c(sample(1:249, N - 35, replace = TRUE),
           sample(250:1000, 35, replace = TRUE))
)
dataset1 %>%
  dplyr::reframe("Period >250lx" = period_above_threshold(MEDI, Datetime, threshold = 250))
dataset1 %>%
  dplyr::reframe("Period <250lx" = period_above_threshold(MEDI, Datetime, "below", threshold = 250))

# Dataset with continuous period of 100-250lx for 20min
dataset2 <- tibble::tibble(
  Id = rep("B", N),
  Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N),
  MEDI = c(sample(c(1:99, 251:1000), N - 20, replace = TRUE),
           sample(100:250, 20, replace = TRUE))
)
dataset2 %>%
  dplyr::reframe("Period 100-250lx" = period_above_threshold(MEDI, Datetime, threshold = c(100, 250)))

# Return data frame
dataset1 %>%
  dplyr::reframe(period_above_threshold(MEDI, Datetime, threshold = 250, as.df = TRUE))
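The result can be cross-checked with run-length encoding: find the longest run of observations satisfying the comparison and multiply by the epoch. A sketch for dataset1 (assuming a regular 1-minute epoch and an inclusive comparison; the package may treat ties, missing values, and looping differently):

# Longest run of minutes at or above 250 lx
runs <- rle(dataset1$MEDI >= 250)
max(runs$lengths[runs$values]) * lubridate::dminutes(1)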
A family of functions to extract and add photoperiod information.
photoperiod() creates a tibble with the calculated times of dawn and dusk for the given location and date. The function is a convenience wrapper for suntools::crepuscule() to calculate the times of dawn and dusk. By default, civil dawn and dusk are calculated, but the function can be used to calculate other times by changing the solarDep parameter (e.g., 0 for sunrise/sunset, 12 for nautical, and 18 for astronomical).

Taking a light exposure dataset as input, extract_photoperiod() calculates the photoperiods and their boundary times for each unique day in the dataset, given a location and boundary condition (i.e., the solar depression angle). Basically, this is a convenience wrapper for photoperiod() that takes a light logger dataset and extracts unique dates and the time zone from the dataset.

add_photoperiod() adds photoperiod information to a light logger dataset. Beyond the photoperiod information, it will categorize the photoperiod.state as "day" or "night". If overwrite is set to TRUE, the function will overwrite any columns with the same name.

solar_noon() calculates the solar noon for a given location and date. The function is a convenience wrapper for suntools::solarnoon(). The function has no companions like extract_photoperiod() or add_photoperiod(), but will be extended if there is sufficient interest.
photoperiod(coordinates, dates, tz, solarDep = 6) extract_photoperiod( dataset, coordinates, Datetime.colname = Datetime, solarDep = 6 ) add_photoperiod( dataset, coordinates, Datetime.colname = Datetime, solarDep = 6, overwrite = FALSE ) solar_noon(coordinates, dates, tz)
coordinates |
A two element numeric vector representing the latitude and longitude of the location. Important note: Latitude is the first element and Longitude is the second element. |
dates |
A date of format |
tz |
Timezone of the data. Expects a |
solarDep |
A numerical value representing the solar depression angle
between 90 and -90. This means a value of 6 equals -6 degrees above the
horizon. Default is 6, equalling |
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
overwrite |
Logical scalar. If |
Please note that all functions of the photoperiod family work with one coordinate pair at a time. If you have multiple locations (and multiple time zones), you need to run the function for each location separately. We suggest using a nested dataframe structure and employing the purrr package to iterate over the locations, as sketched below.
photoperiod() returns a tibble with the calculated times of dawn and dusk for the given location and date, with the length equal to the dates input parameter. The tibble contains the following columns:

- date: the date of the calculation, stored as class Date
- tz: the timezone of the output, stored as class character
- lat and lon: the latitude and longitude of the location, stored as class numeric
- solar.angle: the negative solar depression angle, i.e., the sun elevation above the horizon, stored as class numeric
- dawn and dusk: the calculated datetimes, stored as class POSIXct
- photoperiod: the calculated photoperiod, stored as class difftime

extract_photoperiod() returns a tibble of the same structure as photoperiod(), but with a length equal to the number of unique dates in the dataset.

add_photoperiod() returns the input dataset with the added photoperiod information. The information is appended in the following columns: dawn, dusk, photoperiod, and photoperiod.state.

solar_noon() returns a tibble with the calculated solar noon.
Other photoperiod: gg_photoperiod()
#example for Tübingen, Germany
coordinates <- c(48.521637, 9.057645)
dates <- c("2023-06-01", "2025-08-23")
tz <- "Europe/Berlin"

#civil dawn/dusk
photoperiod(coordinates, dates, tz)

#sunrise/sunset
photoperiod(coordinates, dates, tz, solarDep = 0)

#extract_photoperiod
sample.data.environment |> extract_photoperiod(coordinates)

#add_photoperiod
added_photoperiod <- sample.data.environment |> add_photoperiod(coordinates)

added_photoperiod |> head()

added_photoperiod |>
  filter_Date(length = "3 days") |>
  gg_days(aes_col = photoperiod.state,
          group = dplyr::consecutive_id(photoperiod.state),
          jco_color = TRUE)

added_photoperiod |>
  filter_Date(length = "3 days") |>
  gg_day(aes_col = Id) +
  ggplot2::geom_rect(
    data = \(x) x |>
      dplyr::ungroup(Id) |>
      dplyr::summarize(dawn = mean(dawn) |> hms::as_hms()),
    ggplot2::aes(xmin = 0, xmax = dawn, ymin = -Inf, ymax = Inf),
    alpha = 0.1
  ) +
  ggplot2::geom_rect(
    data = \(x) x |>
      dplyr::ungroup(Id) |>
      dplyr::summarize(dusk = mean(dusk) |> hms::as_hms()),
    ggplot2::aes(xmin = dusk, xmax = 24*60*60, ymin = -Inf, ymax = Inf),
    alpha = 0.1
  )

added_photoperiod |>
  dplyr::summarize(dawn = mean(dawn) |> hms::as_hms())

#solar_noon()
solar_noon(coordinates, dates, tz)
This function clusters the light data into continuous clusters (pulses) of light above/below a given threshold. Clustering may be fine-tuned by setting the minimum length of the clusters and by allowing brief interruptions to be included in a single cluster, with a specified maximum length of interruption episodes and a maximum proportion of interruptions relative to the total light above threshold.
pulses_above_threshold( Light.vector, Time.vector, comparison = c("above", "below"), threshold, min.length = "2 mins", max.interrupt = "8 mins", prop.interrupt = 0.25, epoch = "dominant.epoch", return.indices = FALSE, na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. Missing values will
be considered as |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison |
String specifying whether the time above or below threshold
should be calculated. Can be either |
threshold |
Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the timing corresponding to light levels between the two thresholds will be calculated. |
min.length |
The minimum length of a pulse. Can be either a
duration or a string. If it is a string, it needs to be a valid
duration string, e.g., |
max.interrupt |
Maximum length of each episode of interruptions. Can be either a
duration or a string. If it is a string, it needs to be a valid
duration string, e.g., |
prop.interrupt |
Numeric value between |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be
either |
return.indices |
Logical. Should the cluster indices be returned? Only works if
|
na.rm |
Logical. Should missing values be removed for the calculation of
pulse metrics? Defaults to |
as.df |
Logical. Should a data frame be returned? If |
The time series is assumed to be regular. Missing values in the light data will be replaced by 0.
List or data frame with calculated values.
Wilson, J., Reid, K. J., Braun, R. I., Abbott, S. M., & Zee, P. C. (2018). Habitual light exposure relative to circadian timing in delayed sleep-wake phase disorder. Sleep, 41(11). doi:10.1093/sleep/zsy166
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), threshold_for_duration(), timing_above_threshold()
# Sample data data = sample.data.environment %>% dplyr::filter(Id == "Participant") %>% filter_Datetime(length = lubridate::days(1)) %>% dplyr::mutate( Time = hms::as_hms(Datetime), ) # Time vector as datetime data %>% dplyr::reframe(pulses_above_threshold(MEDI, Datetime, threshold = 250, as.df = TRUE)) # Time vector as hms time data %>% dplyr::reframe(pulses_above_threshold(MEDI, Time, threshold = 250, as.df = TRUE)) # Pulses below threshold data %>% dplyr::reframe(pulses_above_threshold(MEDI, Datetime, "below", threshold = 250, as.df = TRUE)) # Pulses within threshold range data %>% dplyr::reframe(pulses_above_threshold(MEDI, Datetime, threshold = c(250,1000), as.df = TRUE))
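The clustering parameters interact: raising min.length while lowering max.interrupt and prop.interrupt yields fewer, stricter pulses. A usage sketch with tighter settings (parameter values chosen purely for illustration):

# Stricter clustering: pulses must last at least 10 minutes; interruptions
# may be at most 2 minutes each and 10% of the pulse in total
data %>%
  dplyr::reframe(pulses_above_threshold(
    MEDI, Datetime,
    threshold = 250,
    min.length = "10 mins",
    max.interrupt = "2 mins",
    prop.interrupt = 0.1,
    as.df = TRUE
  ))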
A subset of data from a study at the TSCN-Lab using the ActLumus light logger. This dataset contains personal light exposure information for one participant over the course of six full days. The dataset is measured with a 10-second epoch and is complete (no missing values). Additionally, environmental light data was captured with a second light logger mounted horizontally on the TUM university roof, without any obstructions (besides a transparent plastic half-dome). The epoch for this data is 30 seconds. This dataset allows for some interesting calculations based on available daylight at a given point in time.
sample.data.environment
A tibble with 69,120 rows and 3 columns:

- Datetime: POSIXct datetime
- MEDI: melanopic EDI measurement data. Unit is lux.
- Id: a character vector indicating whether the data is from the Participant or from the Environment
Takes an input of datetimes and Statechanges and creates a column with Intervals. If full = TRUE, it will also create intervals for the day prior to the first state change and after the last. If output.dataset = FALSE, it will give a named vector, otherwise a tibble. The state change info requires a description or name of the state (like "sleep" or "wake", or "wear") that goes into effect at the given Datetime. Works for grouped data, so that it does not mix up intervals between participants. Missing data should be explicit if at all possible. Also, the maximum allowed length of an interval can be set, so that implicit missing timestamps after a set period of time can be enforced.
sc2interval( dataset, Datetime.colname = Datetime, Statechange.colname = State, State.colname = State, Interval.colname = Interval, full = TRUE, starting.state = NA, output.dataset = TRUE, Datetime.keep = FALSE, length.restriction = 60 * 60 * 24 )
dataset |
A light logger dataset. Expects a |
Datetime.colname |
column name that contains the datetime. Defaults to
|
Statechange.colname , Interval.colname , State.colname
|
Column names that contain the name/description of the |
full , starting.state
|
These arguments handle the state on the first day
before the first state change and after the last state change on the last
day. If |
output.dataset |
should the output be a |
Datetime.keep |
If |
length.restriction |
If the length between intervals is too great, the
interval state can be set to |
One of:

- a data.frame object identical to dataset, but with the interval instead of the datetime. The original Statechange column now indicates the State during the Interval.
- a named vector with the intervals, where the names are the states
library(tibble) library(lubridate) library(dplyr) sample <- tibble::tibble(Datetime = c("2023-08-15 6:00:00", "2023-08-15 23:00:00", "2023-08-16 6:00:00", "2023-08-16 22:00:00", "2023-08-17 6:30:00", "2023-08-18 1:00:00"), State = rep(c("wake", "sleep"), 3), Id = "Participant") #intervals from sample sc2interval(sample) #compare sample (y) and intervals (x) sc2interval(sample) %>% mutate(Datetime = int_start(Interval)) %>% dplyr::left_join(sample, by = c("Id", "State"), relationship = "many-to-many") %>% head()
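With output.dataset = FALSE, the same call yields the named-vector form described above (a usage sketch building on the sample object):

# Named vector: intervals as values, states as names
sc2interval(sample, output.dataset = FALSE) %>% head()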
Takes a dataset with sleep/wake intervals and recodes them to Brown state intervals. Specifically, it recodes the sleep intervals to night, reduces wake intervals by a specified evening.length, and recodes them to evening and day intervals. The evening.length is the time between day and night. The result can be used as input for interval2state() and might be used subsequently with Brown2reference().
sleep_int2Brown( dataset, Interval.colname = Interval, Sleep.colname = State, wake.state = "wake", sleep.state = "sleep", Brown.day = "day", Brown.evening = "evening", Brown.night = "night", evening.length = lubridate::dhours(3), Brown.state.colname = State.Brown, output.dataset = TRUE )
dataset |
A dataset with sleep/wake intervals. |
Interval.colname |
The name of the column with the intervals. Defaults to |
Sleep.colname |
The name of the column with the sleep/wake states. Defaults to |
wake.state , sleep.state
|
The names of the wake and sleep states in the |
Brown.day , Brown.evening , Brown.night
|
The names of the Brown states that will be used. Defaults to |
evening.length |
The length of the evening interval in seconds. Can also use lubridate duration or period objects. Defaults to 3 hours. |
Brown.state.colname |
The name of the column with the newly created Brown states. Works as a simple renaming of the |
output.dataset |
Whether to return the whole |
The function will filter out any non-sleep intervals that are shorter than the specified evening.length. This prevents problematic behavior when the evening.length is longer than the wake intervals or, e.g., when the first state is sleep after midnight and there is a prior NA interval from midnight until sleep. This behavior might, however, produce problematic results for specialized experimental setups with ultra-short wake/sleep cycles; the sleep_int2Brown() function would not be applicable in those cases anyway.

A dataset with the Brown states or a vector with the Brown states. The Brown states are created in a new column with the name specified in Brown.state.colname. The dataset will have more rows than the original dataset, because the wake intervals are split into day and evening intervals.
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001571
Other Brown: Brown2reference(), Brown_check(), Brown_rec()
#create a sample dataset sample <- tibble::tibble(Datetime = c("2023-08-15 6:00:00", "2023-08-15 23:00:00", "2023-08-16 6:00:00", "2023-08-16 22:00:00", "2023-08-17 6:30:00", "2023-08-18 1:00:00"), State = rep(c("wake", "sleep"), 3), Id = "Participant") #intervals from sample sc2interval(sample) #recoded intervals sc2interval(sample) %>% sleep_int2Brown()
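The evening.length argument simply shifts the boundary between day and evening. A usage sketch with a two-hour evening window instead of the default three hours:

#recoded intervals with a shorter evening
sc2interval(sample) %>% sleep_int2Brown(evening.length = lubridate::dhours(2))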
Returns a vector of all the supported devices in LightLogR.
supported_devices()
These are all supported devices for which there is a dedicated import function. Import functions can be called either through import_Dataset() with the respective device = "device" argument, or directly, e.g., import$ActLumus().
A character vector of all supported devices
supported_devices()
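Because the return value is a plain character vector, it can also be used programmatically, e.g., to guard an import step (a usage sketch; the device name follows the example above):

# Check for a device before calling its import function
if ("ActLumus" %in% supported_devices()) {
  message("ActLumus import is available via import$ActLumus()")
}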
To create a plot with positive and negative (unscaled) values on a log-transformed axis, the values need to be scaled accordingly. Neither R nor ggplot2 has a built-in function for this, but the following function can be used to create a transformation function for this purpose. The function is based on a post on Stack Overflow. The symlog transformation is the standard transformation used, e.g., in gg_day().
symlog_trans(base = 10, thr = 1, scale = 1)
base |
Base for the logarithmic transformation. The default is 10. |
thr |
Threshold after which a logarithmic transformation is applied. If the absolute value is below this |
scale |
Scaling factor for logarithmically transformed values above the |
The symlog transformation can be accessed either via the trans = "symlog" argument in a scaling function, or via trans = symlog_trans(). The latter allows setting the individual arguments.
a transformation function that can be used in ggplot2 or plotly to scale positive and negative values on a log scale.
This function's code is a straight copy from a post on Stack Overflow. The author of the answer is Julius Vainora, and the author of the question is Brian B.
dataset <- sample.data.environment %>% filter_Date(end = "2023-08-29") %>% dplyr::mutate(MEDI = dplyr::case_when( Id == "Environment" ~ -MEDI, .default = MEDI)) #basic application where transformation, breaks and labels are set manually dataset %>% gg_day(aes_col = Id) + ggplot2::scale_y_continuous( trans = "symlog") #the same plot, but with breaks and labels set manually dataset %>% gg_day(aes_col = Id) + ggplot2::scale_y_continuous( trans = "symlog", breaks = c(-10^(5:0), 0, 10^(0:5)), labels = function(x) format(x, scientific = FALSE, big.mark = " ")) #setting individual arguments of the symlog function manually allows #e.g., to emphasize values smaller than 1 dataset %>% gg_day(aes_col = Id) + ggplot2::scale_y_continuous( trans = symlog_trans(thr = 0.01), breaks = c(-10^(5:-1), 0, 10^(-1:5)), labels = function(x) format(x, scientific = FALSE, big.mark = " "))
This function finds the threshold for which light levels are above/below for a given duration. It can be considered the inverse of duration_above_threshold().
threshold_for_duration( Light.vector, Time.vector, duration, comparison = c("above", "below"), epoch = "dominant.epoch", na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
duration |
The duration for which the threshold should be found. Can be either a
duration or a string. If it is a string, it needs to be a valid
duration string, e.g., |
comparison |
String specifying whether light levels above or below the threshold
should be considered. Can be either |
epoch |
The epoch at which the data was sampled. Can be either a
duration or a string. If it is a string, it needs to be
either |
na.rm |
Logical. Should missing values (NA) be removed for the calculation?
Defaults to |
as.df |
Logical. Should a data frame be returned? If |
Single numeric value or single column data frame.
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), timing_above_threshold()
N <- 60 # Dataset with 30 min < 250lx and 30min > 250lx dataset1 <- tibble::tibble( Id = rep("A", N), Datetime = lubridate::as_datetime(0) + lubridate::minutes(1:N), MEDI = sample(c(sample(1:249, N / 2, replace = TRUE), sample(250:1000, N / 2, replace = TRUE))), ) dataset1 %>% dplyr::reframe("Threshold above which for 30 mins" = threshold_for_duration(MEDI, Datetime, duration = "30 mins")) dataset1 %>% dplyr::reframe("Threshold below which for 30 mins" = threshold_for_duration(MEDI, Datetime, duration = "30 mins", comparison = "below")) dataset1 %>% dplyr::reframe(threshold_for_duration(MEDI, Datetime, duration = "30 mins", as.df = TRUE))
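The inverse relationship can be checked by hand: with a regular 1-minute epoch, the threshold above which light stays for 30 minutes is approximately the 30th largest light value. A sketch (boundary handling may differ from the package implementation):

# 30th largest MEDI value ~ threshold exceeded for 30 minutes
sort(dataset1$MEDI, decreasing = TRUE)[30]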
This function calculates the mean, first, and last timepoints (MLiT, FLiT, LLiT) at which light levels are above or below a given threshold intensity within the given time interval.
timing_above_threshold( Light.vector, Time.vector, comparison = c("above", "below"), threshold, na.rm = FALSE, as.df = FALSE )
Light.vector |
Numeric vector containing the light data. |
Time.vector |
Vector containing the time data. Can be POSIXct, hms, duration, or difftime. |
comparison |
String specifying whether the time above or below threshold
should be calculated. Can be either |
threshold |
Single numeric value or two numeric values specifying the threshold light level(s) to compare with. If a vector with two values is provided, the timing corresponding to light levels between the two thresholds will be calculated. |
na.rm |
Logical. Should missing values be removed for the calculation?
Defaults to |
as.df |
Logical. Should a data frame be returned? If |
List or data frame with the three values: mean, first, and last timing above threshold. The output type corresponds to the type of Time.vector, e.g., if Time.vector is HMS, the timing metrics will also be HMS, and vice versa for POSIXct and numeric.
Reid, K. J., Santostasi, G., Baron, K. G., Wilson, J., Kang, J., & Zee, P. C. (2014). Timing and Intensity of Light Correlate with Body Weight in Adults. PLOS ONE, 9(4), e92251. doi:10.1371/journal.pone.0092251
Hartmeyer, S.L., Andersen, M. (2023). Towards a framework for light-dosimetry studies: Quantification metrics. Lighting Research & Technology. doi:10.1177/14771535231170500
Other metrics: bright_dark_period(), centroidLE(), disparity_index(), duration_above_threshold(), exponential_moving_average(), frequency_crossing_threshold(), interdaily_stability(), intradaily_variability(), midpointCE(), nvRC(), nvRD(), nvRD_cumulative_response(), period_above_threshold(), pulses_above_threshold(), threshold_for_duration()
# Dataset with light > 250lx between 06:00 and 18:00 dataset1 <- tibble::tibble( Id = rep("A", 24), Datetime = lubridate::as_datetime(0) + lubridate::hours(0:23), MEDI = c(rep(1, 6), rep(250, 13), rep(1, 5)) ) # Above threshold dataset1 %>% dplyr::reframe(timing_above_threshold(MEDI, Datetime, "above", 250, as.df = TRUE)) # Below threshold dataset1 %>% dplyr::reframe(timing_above_threshold(MEDI, Datetime, "below", 10, as.df = TRUE)) # Input = HMS -> Output = HMS dataset1 %>% dplyr::reframe(timing_above_threshold(MEDI, hms::as_hms(Datetime), "above", 250, as.df = TRUE))
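The three timing metrics can be reproduced by hand for the simple example above (a sketch assuming an inclusive comparison; the package may differ in details such as NA handling):

# Mean, first, and last clock time of samples at or above 250 lx
above <- dataset1$MEDI >= 250
hms::as_hms(mean(dataset1$Datetime[above]))  # mean timing (MLiT)
hms::as_hms(min(dataset1$Datetime[above]))   # first timing (FLiT)
hms::as_hms(max(dataset1$Datetime[above]))   # last timing (LLiT)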