RSEE is an R package version of SEE. It was developed to make SEE more accessible to researchers, lab-technicians and students who work with R, and also to encourage more people to use R for data analysis.
RSEE contains all basic features and modules in SEE-workshop, as well as some new features:
This tutorial was written for quick but thorough learning of RSEE, for users who haven’t worked with any version of SEE before. The tutorial contains short and basic explanations of concepts and methods used in RSEE specifically and in SEE in general. Following it, you should be able to produce high quality data – a basic requirement for replicable studies. For more extensive information on SEE, check out the overview and publications.
The tutorial is intended for R users with basic knowledge of handling lists and data frames.
To install RSEE, open an R session and run:
install.packages("RSEE", repos=c("http://cs.tau.ac.il/~itamares/rsee/R","http://cran.r-project.org"))
According to user preference, RSEE can produce output in two ways:
In case of the first option, you can save the data anywhere you like through R, using the save() function.
In case of the second option, the 12 output data files are saved in the same folder as the input tracking file. Therefore, to avoid confusion it is recommended to put each file in a separate subfolder of the working directory.
For this tutorial, create and select a working directory. For example, run:
dir.create("C:/RSEE tutorial")
setwd("C:/RSEE tutorial")
Now, create a subfolder for the tracking file you will use in the tutorial. For example:
dir.create("C:/RSEE tutorial/file_01")
RSEE analyzes raw XY tracking coordinates of a single animal. Different tracking programs produce coordinates in different file formats (*.txt, *.csv, *.xlsx etc.).
RSEE requires a CSV (comma delimited) file. The file should be arranged in 4 columns in the following order:
* Note – the tracking sampling rate may differ from the filming sampling rate, especially in real-time tracking. For example – filming may be 25 Hz and tracking 5 Hz.
** Note – the XY coordinates should be the raw data, before smoothing. If your tracking software features path smoothing, disable it before exporting the data.
This is an example of a tracking CSV file, tracked at 25 Hz:
In case of forced exploration tests, preparation is complete.
In case of free exploration tests, in time segments where the animal stayed in its home cage, the coordinates should be constant at some random point outside the arena. RSEE will then be able to distinguish these segments from the entry segments.
In this example the home cage is located at X=215.316, Y=82.559.
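For free exploration data, the home-cage segments can be prepared in R before exporting the CSV. The following is a minimal sketch, assuming a data frame with hypothetical columns `x` and `y` and a logical vector marking the home-cage frames. The helper and all names are illustrative and are not part of RSEE; the constant point reuses the home-cage coordinates quoted above.

```r
# Hypothetical helper (not part of RSEE): overwrite the coordinates of all
# home-cage frames with one constant point outside the arena.
set_home_cage <- function(track, in_cage, home_x, home_y) {
  stopifnot(length(in_cage) == nrow(track))
  track$x[in_cage] <- home_x
  track$y[in_cage] <- home_y
  track
}

# Made-up track: the first three frames are spent in the home cage
track <- data.frame(x = c(210.2, 214.9, 216.0, 120.5, 118.3),
                    y = c(80.1, 83.0, 82.2, 95.4, 97.8))
in_cage <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

track <- set_home_cage(track, in_cage, home_x = 215.316, home_y = 82.559)
```

How the home-cage frames are identified depends on your tracking software, so `in_cage` has to come from your own data.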
To continue the tutorial, rename your file file_01.csv and place it in the file_01 subfolder of your working directory.
see.smooth() is the RSEE equivalent to SEE Path Smoother. It has three objectives:
This lesson should help you understand the key arguments of see.smooth() and how to use them.
To access the function's full documentation, run:
?RSEE::see.smooth
This section is more theoretical and provides useful information required to produce better results with see.smooth(). If you wish to continue the practical tutorial right away, jump to the next section.
see.smooth() arguments define general settings, LOESS settings for path smoothing and calculation of velocities and accelerations, and RRM settings for finding “arrests”. This tutorial covers the important arguments to define.
* SamplingRate – Numeric. The tracking sampling rate. In the example in lesson 1, SamplingRate = 25.
* silent – Boolean. silent = FALSE means the user is asked whether the names of the table columns in the tracking file are correct. If the columns are properly named according to lesson 1, this check is unnecessary and you can skip it by setting silent = TRUE.
* outFilePrefix – Character string. As mentioned in lesson 1, RSEE output may be saved in data files (as in SEE-workshop). To do so, outFilePrefix must be defined. outFilePrefix = "file_01" will save the output data files in the tracking file directory, named file_01.txt etc.
Most LOESS defaults should not be changed unless you have the required knowledge of this smoothing method. HalfWindowWidth = 0.4 seconds (default) should work fine for most sampling rates. Increasing it will result in stronger smoothing.
At sampling rates of 5 Hz or lower, it is recommended to increase the HalfWindowWidth value to 1 second.
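To get a feeling for what the half-window width does, here is a minimal sketch using base R's `loess()` on a simulated one-dimensional coordinate. It is only an illustration and not the code that see.smooth() runs: the signal, the noise level, the conversion of the half-window to a `span`, and the finite-difference velocity are all assumptions made for this example.

```r
# Illustration only, not RSEE's implementation.
sampling_rate <- 25      # Hz, as in the lesson 1 example
half_window   <- 0.4     # seconds, the default quoted above

# Simulated X coordinate: a smooth movement plus tracking noise (made-up data)
set.seed(42)
t      <- seq(0, by = 1 / sampling_rate, length.out = 250)
x_true <- 50 + 20 * sin(t)
x_raw  <- x_true + rnorm(length(t), sd = 0.5)

# Convert the half-window from seconds to a number of frames,
# then to loess()'s `span` (the fraction of points in each local fit)
window_frames <- 2 * round(half_window * sampling_rate) + 1
fit <- loess(x_raw ~ t, span = window_frames / length(t), degree = 2)
x_smooth <- fitted(fit)

# Rough velocity estimate from the smoothed path (units per second)
vx <- c(NA, diff(x_smooth)) * sampling_rate
```

Rerunning the sketch with `half_window <- 1` widens the local fit and gives a visibly smoother `x_smooth`, which mirrors the recommendation for low sampling rates.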
This is an example of a path of mouse, smoothed with LOESS using the default arguments:
During tracking, adjacent XY coordinates may differ even though the animal hadn’t moved. For example, a stationary mouse’s body wobbles during grooming - causing its center, found by the tracking software, to change place with each frame. In order to analyze movement segments (lesson 3), we first need to isolate these non-movement segments (“arrests”). For that we use Running Medians.
Let us consider the arguments Arrests and MovingMedCutOffValue of see.smooth() and hypothetically define Arrests = t and MovingMedCutOffValue = d for some \(t\), \(d\).
Then arrests are defined as time segments of length \(t\), where the distance between the two most distant XY coordinates after RRM is no more than \(d\).
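The definition above can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the RSEE algorithm: it applies a single running median with `stats::runmed()` (the width `k` is an arbitrary choice) instead of RSEE's RRM step, and the helper name `find_arrest_windows` is hypothetical.

```r
# Hypothetical helper, not part of RSEE.
# Returns one logical per window of t consecutive frames: TRUE when the two
# most distant (median-smoothed) XY points in the window are at most d apart.
find_arrest_windows <- function(x, y, t, d, k = 5) {
  stopifnot(t >= 2, length(x) == length(y), length(x) >= t)
  xs <- stats::runmed(x, k)   # running median of width k (k must be odd)
  ys <- stats::runmed(y, k)
  starts <- seq_len(length(xs) - t + 1)
  spread <- vapply(starts, function(i) {
    idx <- i:(i + t - 1)
    max(stats::dist(cbind(xs[idx], ys[idx])))   # largest pairwise distance
  }, numeric(1))
  spread <= d
}

# Made-up track: 10 stationary frames followed by 10 frames of steady movement
x <- c(rep(10, 10), 10 + 5 * (1:10))
y <- rep(0, 20)
arrest <- find_arrest_windows(x, y, t = 5, d = 1)
```

Windows that lie entirely in the stationary part are flagged as arrests, while windows in the moving part are not.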
Two things should be considered when setting RRM arguments:
* MovingMedCutOffValue – Numeric. The larger the animal size compared to the arena size, the higher this value should be.
* Arrests – should be 5 frames (default) for SamplingRate = 25, and 6 frames for SamplingRate = 30. If your tracking sampling rate is 5 Hz or lower, set 2 or 3 frames.
If you followed the instructions in lesson 1, your tracking file’s full name and path should be "C:/RSEE tutorial/file_01/file_01.csv". For this example, name your see.smooth() output object res, and run:
res <- see.smooth(Filename="C:/RSEE tutorial/file_01/file_01.csv",silent=TRUE)
Note: Set the function arguments according to your experiment properties. Using the wrong values will most likely lead to bad segmentation or arena builder calculations.
Now wait patiently until you get this delightful message:
Save res in the same folder as your original CSV file by running:
save(res, file = "C:/RSEE tutorial/file_01/res.RData")
res is now a list of two data frames – res$smoothed and res$original:
* res$original – a data frame containing the original CSV table.
* res$smoothed – a data frame containing the smoothed XY coordinates, the XY velocities and accelerations.
In case of free exploration, two additional columns are relevant:
* hidden.zone – equals 0 when the animal is inside the arena, and 1 in the home cage.
* entry – counts the number of entries made so far.
The see.smooth() output res can now be used for segmentation.
Note: for information on the output data files produced when outFilePrefix is defined, check out the SEE-workshop tutorial.
see.segment() is the RSEE equivalent to SEE Path Segmentor. It divides the motion segments, distinguished from arrests by see.smooth() (lesson 2), into progression and lingering.
plot.segmentation() is complementary to see.segment(), and provides visualization to better assess the quality of segmentation.
To access the functions' full documentation, run:
?RSEE::see.segment
?RSEE::plot.segmentation
This section is more theoretical and provides useful information required to produce better results with see.segment(). If you wish to continue the practical tutorial right away, jump to the next section.
In general, there is no need to change the defaults of see.segment(). However, there may be certain cases where NumberOfGaussians should be changed from 2 to 3, or see.smooth() should be run again with a different MovingMedCutOffValue.
A motion segment is defined as a time segment between two arrests. see.segment() divides these segments into two groups – lingering, consisting of very slow movement or complete arrest, and progression, where the animal reaches high maximal speed.
see.segment() calculates the maximal velocity per motion segment, and then applies the EM algorithm to fit 2 Gaussians to the frequency distribution (density) of these velocities after log transformation.
The intersection of the 2 Gaussians will be considered the maximal velocity threshold, and then:
This is an example of segmentation of the movement of a single mouse (represented by momentary velocity per frame) in some time frame during the experiment:
This is an example of visualization of the maximal velocities density and the EM fit, using plot.segmentation():
A proper density plot should have two distinct peaks – low max velocities (lingering) and high max velocities (progression). If so, RSEE will calculate the proper value.
The shape of the frequency distribution is directly affected by how arrests are found. If the density plot looks clearly wrong, run see.smooth() again with a different MovingMedCutOffValue.
This is an example of setting the MovingMedCutOffValue too high:
The argument defaults of see.segment() hold in most cases, and the \(\log(x+2)\) transformation should be kept.
In cases of three distinct peaks, there is an option to fit three Gaussians by setting NumberOfGaussians = 3. Note that in this case RSEE calculates the threshold as the intersection between the left and the middle ones.
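The threshold idea described in this section can be sketched in base R. The code below is a minimal illustration under stated assumptions, not the code used by see.segment(): the data are simulated directly on the transformed scale, the EM routine `em_two_gaussians` is a hypothetical hand-written helper, and the intersection is taken between the two weighted component densities.

```r
# Illustration only, not RSEE's implementation.
set.seed(1)
# Simulated transformed maximal velocities (made-up numbers):
# a "lingering" cluster and a "progression" cluster
v <- c(rnorm(300, mean = 1.5, sd = 0.25), rnorm(200, mean = 3.5, sd = 0.4))

em_two_gaussians <- function(v, iter = 200) {
  # Initial guess: split the data at the median
  lo <- v[v <= median(v)]
  hi <- v[v >  median(v)]
  mu <- c(mean(lo), mean(hi))
  s  <- c(sd(lo), sd(hi))
  p  <- c(0.5, 0.5)
  for (i in seq_len(iter)) {
    # E-step: probability that each point belongs to the first component
    d1 <- p[1] * dnorm(v, mu[1], s[1])
    d2 <- p[2] * dnorm(v, mu[2], s[2])
    r  <- d1 / (d1 + d2)
    # M-step: update weights, means and standard deviations
    p  <- c(mean(r), 1 - mean(r))
    mu <- c(weighted.mean(v, r), weighted.mean(v, 1 - r))
    s  <- c(sqrt(weighted.mean((v - mu[1])^2, r)),
            sqrt(weighted.mean((v - mu[2])^2, 1 - r)))
  }
  list(p = p, mu = mu, sd = s)
}

fit <- em_two_gaussians(v)

# Intersection of the two weighted densities, searched between the two means
density_gap <- function(x) {
  fit$p[1] * dnorm(x, fit$mu[1], fit$sd[1]) -
    fit$p[2] * dnorm(x, fit$mu[2], fit$sd[2])
}
threshold <- uniroot(density_gap, interval = fit$mu)$root

# Back-transform, assuming the log(x + 2) transformation mentioned above
threshold_raw <- exp(threshold) - 2
```

When the two peaks overlap heavily, the fitted intersection becomes unstable, which is one way to see why a poorly shaped density plot leads to poor segmentation.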
Continuing lesson 2, the list res contains the original and the smoothed data.
For this example, name your see.segment() output object res.segment, and run:
res.segment <- see.segment(res$smoothed)
Now wait patiently until you get this message:
To assess the quality of segmentation, run:
plot.segmentation(res.segment)
If there are 2 distinct peaks in the density plot, and the Gaussian intersection is between them, then the segmentation is fine. If not, rerun see.smooth() and see.segment() according to the instructions in Using the correct arguments in see.segment().
Save res.segment in the same folder as your original CSV file by running:
save(res.segment, file = "C:/RSEE tutorial/file_01/res_segment.RData")
res.segment is a list of twelve objects. This section describes the most useful:
* res.segment$threshold – the value of the Gaussian intersection (in log transformation).
* res.segment$thresholdNonTransformed – the maximal velocity threshold between lingering and progression.
* res.segment$lingProgSegments – a data frame that divides the experiment into segments of lingering (type = 0) and progression (type = 1), and the start and stop frame of each segment. spatial.spread is the distance between the most distant XY coordinates during each segment.
For additional information on the output of see.segment(), refer to the function documentation.
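As an example of working with a segment table of this kind, the sketch below sums the time spent in each segment type. It runs on a mock data frame: the column names `type`, `start` and `stop` are assumptions for illustration (check `names(res.segment$lingProgSegments)` for the real ones), and the stop frame is assumed to be inclusive.

```r
# Mock segment table with hypothetical column names (not real RSEE output)
segs <- data.frame(type  = c(0, 1, 0, 1),          # 0 = lingering, 1 = progression
                   start = c(1, 51, 101, 181),     # first frame of each segment
                   stop  = c(50, 100, 180, 250))   # last frame (assumed inclusive)
sampling_rate <- 25   # Hz, as in the lesson 1 example

# Duration of each segment in seconds
segs$duration_s <- (segs$stop - segs$start + 1) / sampling_rate

# Total time per segment type
time_by_type <- tapply(segs$duration_s, segs$type, sum)
```

The same pattern works for counts (`table(segs$type)`) or for per-type means of any other column.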
res and res.segment can now be used for the arena builder.
see.buildArena() is the RSEE equivalent to SEE Arena Builder. It has two major objectives:
This lesson should help you understand the proper use of see.buildArena() and its output.
To access the function's full documentation, run:
?RSEE::see.buildArena
Note: see.buildArena() is currently applicable only to circular open field arenas. It will not produce adequate results for square or rectangular arenas, mazes etc.
This section is more theoretical and provides useful information required to produce better results with see.buildArena(). If you wish to continue the practical tutorial right away, jump to the next section – Running see.buildArena().
The arena builder algorithm estimates the arena boundaries, under two assumptions:
The algorithm finds the center of the arena, divides the arena into sectors (see.buildArena() argument NumSectors, default of 360 sectors) and calculates the coordinate of the arena boundary per sector.
While NumSectors = 360 usually produces sufficient results, we have found that NumSectors = 720 produces better results in many cases.
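To make the idea of a per-sector boundary concrete, here is a deliberately simplified sketch in base R. It is not the arena builder algorithm: it assumes the arena center is already known and takes the farthest point the animal reached in each angular sector as the boundary estimate. The helper name `sector_boundary` and its arguments are hypothetical.

```r
# Simplified illustration, not RSEE's algorithm.
# For each angular sector around a known center (cx, cy), return the largest
# distance from the center reached by the path.
sector_boundary <- function(x, y, cx, cy, num_sectors = 360) {
  angle  <- atan2(y - cy, x - cx) %% (2 * pi)        # angle in [0, 2*pi)
  radius <- sqrt((x - cx)^2 + (y - cy)^2)
  # Sector index 1..num_sectors (pmin guards against rounding at 2*pi)
  sector <- pmin(floor(angle / (2 * pi) * num_sectors) + 1, num_sectors)
  max_r  <- tapply(radius, factor(sector, levels = seq_len(num_sectors)), max)
  as.numeric(max_r)   # NA for sectors the animal never visited
}

# Made-up path: one lap along a wall of radius 100 plus a few inner points
theta <- (0:7 + 0.5) * 2 * pi / 8
x <- c(100 * cos(theta), 30 * cos(theta))
y <- c(100 * sin(theta), 30 * sin(theta))
boundary <- sector_boundary(x, y, cx = 0, cy = 0, num_sectors = 8)
```

In this toy version, a sector that the animal never visits near the wall gets an underestimated or missing boundary, which illustrates why the estimate depends on the animal travelling along the wall.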
This is an example of see.buildArena() output with NumSectors = 360 (left) and NumSectors = 10 (right), on the same smoothed path:
The algorithm may produce bad results if the animal does not travel along the wall. In that case, it is possible to define the boundaries manually, using the following settings in see.buildArena():
* AutoArenaCenter = FALSE – allows the user to define the arena center manually. You must then set ArenaCenterX and ArenaCenterY to match your arena.
* RoundArenaWithRadius = TRUE – defines the boundary coordinates per sector according to the radius set in ArenaRadius.
The wall/center separation algorithm does not depend on user definitions.
After defining arena center and boundaries, the algorithm calculates radial distance from the wall per frame, as well as the radial velocity (velocity component in the radial direction).
It then finds a threshold of radial distance from the wall, to separate the animal’s path into near-wall coordinates and center coordinates. The progression segments, found by see.segment(), are divided into two types:
The see.buildArena() argument AddToRadius adds a specified length in cm to the estimated radial distance during wall/center separation, to guarantee that the distances remain positive. It is unnecessary to change it.
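The two per-frame quantities used in wall/center separation, radial distance from the wall and radial velocity, can be written out for the simplest case. The sketch below assumes a perfectly round arena with a known center and radius, and treats outward movement (toward the wall) as positive. The helper `radial_metrics` is hypothetical and does not reproduce RSEE's per-sector boundaries.

```r
# Illustration only, assuming a perfectly circular arena.
radial_metrics <- function(x, y, vx, vy, cx, cy, arena_radius) {
  dx <- x - cx
  dy <- y - cy
  r  <- sqrt(dx^2 + dy^2)                 # distance from the arena center
  list(
    dist_from_wall  = arena_radius - r,
    # Projection of the velocity on the outward radial direction
    # (undefined, NaN, exactly at the center where r = 0)
    radial_velocity = (vx * dx + vy * dy) / r
  )
}

# Two made-up frames in an arena of radius 100 centered at (0, 0):
# frame 1 moves straight toward the wall, frame 2 moves parallel to it
m <- radial_metrics(x = c(50, 0), y = c(0, 50),
                    vx = c(10, 10), vy = c(0, 0),
                    cx = 0, cy = 0, arena_radius = 100)
```

Both frames are equally far from the wall, but only the first has a non-zero radial velocity, since the second moves tangentially.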
Continuing lesson 3, the list res contains the original and the smoothed data, and res.segment contains the segmentation data. For this example, name your see.buildArena() output object res.arena, and run:
res.arena <- see.buildArena(res$smoothed,res.segment$lingProgSegments)
Now wait patiently until you get this message:
Save res.arena in the same folder as your original CSV file by running:
save(res.arena, file = "C:/RSEE tutorial/file_01/res_arena.RData")
res.arena is a list of ten objects. This section describes the most useful:
* res.arena$xcenter and res.arena$ycenter – numeric. The XY coordinates of the arena center.
* res.arena$arenaPositions – a data frame containing the XY coordinates of the arena boundary. The table contains NumSectors rows. The z value is currently not applicable.
* res.arena$radDistanceThreshold – numeric. The wall/center threshold (distance from the wall) in cm.
* res.arena$incursions – a data frame containing start and stop frames for all wallcursions (type = 2) and incursions (type = 3).
For additional information on the output of see.buildArena(), refer to the function documentation.
The RSEE tutorial is now over. You should now have the knowledge to produce high quality data sets from your tracking files:
* Smoothed animal path – see.smooth()
* Segmentation of the animal’s movement to progression and lingering – see.segment()
* Arena center and boundaries – see.buildArena()
* Incursions, wallcursions and radial distance threshold – see.buildArena()
The next step is to build your own statistical analysis based on this data, according to the requirements of your research.
For examples of studies which used SEE, refer to the publications section.
Supported by the European Research Council under the European Community’s Seventh Framework Programme, ERC grant 294519 (PSARPS).