Introduction

RSEE is the R package version of SEE. It was developed to make SEE more accessible to researchers, lab technicians and students who work with R, and to encourage more people to use R for data analysis.

RSEE contains all basic features and modules in SEE-workshop, as well as some new features:




Getting Started

This tutorial was written for quick but thorough learning of RSEE, for users who haven’t worked with any version of SEE before. The tutorial contains short and basic explanations of concepts and methods used in RSEE specifically and in SEE in general. Following it, you should be able to produce high quality data – a basic requirement for replicable studies. For more extensive information on SEE, check out the overview and publications.

The tutorial is intended for R users with basic knowledge of handling lists and data frames.




Lesson 1 – Installation and file preparation

RSEE Installation

To install RSEE, open an R session and run:

install.packages("RSEE", repos=c("http://cs.tau.ac.il/~itamares/rsee/R","http://cran.r-project.org"))



Preparing a folder system

According to user preference, RSEE can produce output in two ways:

  1. R objects
  2. Data files, similar to SEE-workshop.

In case of the first option, you can save the data anywhere you like through R, using the save() function.

In case of the second option, the 12 output data files are saved in the same folder as the input tracking file. Therefore, to avoid confusion, it is recommended to put each tracking file in its own subfolder of the working directory.

For this tutorial, create and select a working directory. For example, run:

dir.create("C:/RSEE tutorial")
setwd("C:/RSEE tutorial")

Now, create a subfolder for the tracking file you will use in the tutorial. For example:

dir.create("C:/RSEE tutorial/file_01")



Preparing a tracking file for analysis

RSEE analyzes raw XY tracking coordinates of a single animal. Different tracking programs produce coordinates in different file formats (*.txt, *.csv, *.xlsx etc.).

RSEE requires a CSV (comma delimited) file. The file should be arranged in 4 columns in the following order:

  1. Time – frame number (a running index from 1 to the last frame of tracking)
  2. Time trial – the time stamp in seconds of each coordinate pair, according to the tracking sampling rate* (25 Hz – 0.00, 0.04, 0.08,…; 30 Hz – 0.033, 0.067, 0.100,…)
  3. X center** – X coordinate
  4. Y center** – Y coordinate

* Note – the tracking sampling rate may differ from the filming sampling rate, especially in real-time tracking. For example – filming may be 25 Hz and tracking 5 Hz.

** Note – the XY coordinates should be the raw data, before smoothing. If your tracking software features path smoothing, disable it before exporting the data.
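As a rough illustration of the four-column layout described above, the following R sketch builds a tiny synthetic table and writes it as a CSV file. The coordinate values are invented, the header spellings are an assumption taken from the column descriptions above (your tracking software may use other names), and the file is written to a temporary folder so nothing in your working directory is touched.

```r
# Synthetic 25 Hz tracking table in the column order RSEE expects.
# All values are made up for illustration only.
sampling_rate <- 25
n_frames <- 5

track <- data.frame(
  `Time`       = seq_len(n_frames),                                # frame number
  `Time trial` = round((seq_len(n_frames) - 1) / sampling_rate, 3), # seconds
  `X center`   = c(101.2, 101.9, 102.7, 103.1, 103.8),             # raw X
  `Y center`   = c(55.4, 55.6, 56.1, 56.9, 57.5),                  # raw Y
  check.names = FALSE
)

# Write a comma-delimited file and read it back to check the layout
csv_path <- file.path(tempdir(), "file_01.csv")
write.csv(track, csv_path, row.names = FALSE)
check <- read.csv(csv_path, check.names = FALSE)
print(check)
```

Reading the file back is a cheap way to confirm that the export has exactly four columns in the expected order before handing it to RSEE.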

This is an example of a tracking CSV file, tracked at 25 Hz:



Forced exploration VS free exploration

In case of forced exploration tests, preparation is complete.

In case of free exploration tests, the coordinates should be held constant at an arbitrary fixed point outside the arena during time segments where the animal stayed in its home cage. RSEE will then be able to distinguish these segments from the segments in which the animal entered the arena.

In this example the home cage is located at X=215.316, Y=82.559.
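One possible way to prepare such a file in R is sketched below. It assumes you already know which frames were spent in the home cage; the logical vector in_home_cage and the small table are hypothetical, and only the home cage point is taken from the example above.

```r
# Hypothetical raw track: 8 frames at 25 Hz
track <- data.frame(
  frame = 1:8,
  time  = (0:7) / 25,
  x     = c(210.1, 212.4, 214.9, 180.3, 150.2, 149.8, 213.7, 216.0),
  y     = c(80.2, 81.7, 83.0, 90.5, 99.1, 98.7, 82.1, 83.3)
)

# Hypothetical flags: TRUE where the animal was in its home cage
in_home_cage <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Fixed point outside the arena (taken from the example above)
home_x <- 215.316
home_y <- 82.559

# Hold the coordinates constant during home cage segments
track$x[in_home_cage] <- home_x
track$y[in_home_cage] <- home_y
print(track)
```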

To continue the tutorial, rename your file to file_01.csv and place it in the file_01 subfolder of your working directory.




Lesson 2 – see.smooth()

see.smooth() is the RSEE equivalent to SEE Path Smoother. It has three objectives:

  1. Smoothing the XY coordinates with LOESS.
  2. Calculating momentary velocities and accelerations.
  3. Finding stops in the animal’s movement (“arrests”) by smoothing the XY coordinates with Repeated Running Medians (RRM).

This lesson should help you understand the key arguments of see.smooth(), and how to use them.

To access the function's full documentation, run:

?RSEE::see.smooth

Using the correct arguments in see.smooth()

This section is more theoretical and provides useful information required to produce better results with see.smooth(). If you wish to continue the practical tutorial right away, jump to the next section.

see.smooth() arguments define general settings, LOESS settings for path smoothing and calculation of velocities and accelerations, and RRM settings for finding “arrests”. This tutorial covers the important arguments to define.


General Settings

  • SamplingRate – Numeric. The tracking sampling rate. In the example in lesson 1 SamplingRate = 25.
  • silent – Boolean. With silent = FALSE, the user is asked to confirm that the names of the table columns in the tracking file are correct. If the columns of the tracking file are properly named according to lesson 1, you can skip this prompt by setting silent = TRUE.

  • outFilePrefix – character string. As mentioned in lesson 1, RSEE output may be saved in data files (as in SEE-workshop). To do so, outFilePrefix must be defined.
    For example:
    Setting outFilePrefix = "file_01" will save the output data files in the tracking file directory, named file_01.txt etc.



LOESS settings

Most LOESS defaults should not be changed unless you are familiar with this smoothing method. HalfWindowWidth = 0.4 seconds (default) should work fine for most sampling rates. Increasing it will result in stronger smoothing.
At sampling rates of 5 Hz or lower it is recommended to increase the HalfWindowWidth value to 1 second.
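To give a feel for what a half window of 0.4 seconds means, here is a minimal sketch that smooths one synthetic coordinate with base R's loess() and derives a velocity by differencing. It is not the RSEE smoother: the conversion from half window to the loess() span, the local quadratic fit and the synthetic signal are all assumptions made for illustration.

```r
set.seed(1)
sampling_rate <- 25
half_window   <- 0.4   # seconds, the see.smooth() default mentioned above

# Synthetic X coordinate: slow oscillation plus tracking noise
time_s <- seq(0, by = 1 / sampling_rate, length.out = 250)
x_true <- 50 + 20 * sin(2 * pi * time_s / 10)
x_raw  <- x_true + rnorm(length(time_s), sd = 0.3)

# Assumption: translate the half window into the fraction of points
# that loess() uses for each local fit
frames_in_window <- 2 * round(half_window * sampling_rate) + 1   # 21 frames
span <- frames_in_window / length(time_s)

fit      <- loess(x_raw ~ time_s, span = span, degree = 2)
x_smooth <- fitted(fit)

# Momentary velocity (cm/s) as the frame-to-frame difference
vx <- c(NA, diff(x_smooth)) * sampling_rate

cat("noise before smoothing:", mean((x_raw - x_true)^2), "\n")
cat("noise after smoothing: ", mean((x_smooth - x_true)^2), "\n")
```

A wider half window puts more frames into each local fit, which is why increasing HalfWindowWidth gives stronger smoothing.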

This is an example of the path of a mouse, smoothed with LOESS using the default arguments:



What are arrests?

During tracking, adjacent XY coordinates may differ even though the animal hadn’t moved. For example, a stationary mouse’s body wobbles during grooming - causing its center, found by the tracking software, to change place with each frame. In order to analyze movement segments (lesson 3), we first need to isolate these non-movement segments (“arrests”). For that we use Running Medians.

Let us consider the arguments Arrests and MovingMedCutOffValue of see.smooth(), and suppose Arrests = \(t\) and MovingMedCutOffValue = \(d\) for some \(t\), \(d\).
Then arrests are defined as time segments of \(t\) frames in which the distance between the two most distant XY coordinates after RRM is no more than \(d\).
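The definition above can be sketched in base R as follows. This is only an illustration, not the RSEE implementation: the helper rrm(), the window widths 7, 5, 3 and the synthetic path are assumptions, while runmed() and dist() are standard functions from the stats package.

```r
# Synthetic path: movement, a stationary period with tracking jitter, movement
n <- 100
x_raw <- c(seq(0, 20, length.out = 40), rep(20, 30), seq(20, 35, length.out = 30))
y_raw <- rep(10, n)
x_raw[c(45, 52, 60)] <- x_raw[c(45, 52, 60)] + 0.5   # wobble while stationary

# Repeated Running Medians: apply running medians of decreasing width
# (hypothetical helper; the widths are an arbitrary choice)
rrm <- function(v, widths = c(7, 5, 3)) {
  for (k in widths) v <- as.numeric(runmed(v, k))
  v
}
xs <- rrm(x_raw)
ys <- rrm(y_raw)

arrest_frames <- 5      # plays the role of t (Arrests)
cutoff        <- 1e-6   # plays the role of d (MovingMedCutOffValue)

# Mark every window of t frames whose most distant points are within d
is_arrest <- rep(FALSE, n)
for (i in seq_len(n - arrest_frames + 1)) {
  idx <- i:(i + arrest_frames - 1)
  if (max(dist(cbind(xs[idx], ys[idx]))) <= cutoff) is_arrest[idx] <- TRUE
}
print(range(which(is_arrest)))
```

The running medians remove the isolated wobble, so the whole stationary period is recognised as one arrest even though the raw coordinates change during it.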



RRM settings

Two arguments should be considered when setting up RRM:

  • MovingMedCutOffValue – Numeric. The larger the animal size compared to the arena size, the higher this value should be.
    A tiny mouse in a relatively large arena – 1e-006 cm (default) is enough. A large rat in a small box – should be 0.01 to 0.1 cm, or in extreme cases even up to 1 cm.
  • Arrests – should be 5 frames (default) for SamplingRate = 25, and 6 frames for SamplingRate = 30. If your tracking sampling rate is 5 Hz or lower, set 2 or 3 frames.
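The frame counts recommended above for 25 Hz and 30 Hz both correspond to about 0.2 seconds. The helper below turns that observation into a rule of thumb. The function name, the 0.2 second duration and the floor of 2 frames are assumptions inferred from the values above, not part of RSEE.

```r
# Hypothetical helper: suggest a value for Arrests from the sampling rate,
# assuming an arrest should last roughly 0.2 s and span at least 2 frames
suggest_arrest_frames <- function(sampling_rate, min_duration = 0.2, min_frames = 2) {
  max(min_frames, round(min_duration * sampling_rate))
}

for (rate in c(5, 25, 30)) {
  cat(rate, "Hz:", suggest_arrest_frames(rate), "frames\n")
}
```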



Running see.smooth()

If you followed the instructions in lesson 1, your tracking file’s full name and path should be "C:/RSEE tutorial/file_01/file_01.csv". For this example, name your see.smooth() output object res, and run:

res <- see.smooth(Filename="C:/RSEE tutorial/file_01/file_01.csv",silent=TRUE)

Note: Set the function arguments according to your experiment properties. Using the wrong values will most likely lead to bad segmentation or arena builder calculations.

Now wait patiently until you get this delightful message:

Save res in the same folder as your original CSV file by running:

save(res, file = "C:/RSEE tutorial/file_01/res.RData")



see.smooth() output

res is now a list of two data frames – res$smoothed and res$original:

  • res$original – a data frame containing the original CSV table.

  • res$smoothed – a data frame containing the smoothed XY coordinates, the XY velocities and accelerations.

In case of free exploration, the two additional columns are relevant:

  • hidden.zone – equals 0 when the animal is inside the arena, and 1 in the home cage.

  • entry – counts the number of entries made so far.

see.smooth() output res can now be used for segmentation.

Note: for information on the output data files produced when outFilePrefix is defined, check out the SEE-workshop tutorial.




Lesson 3 – see.segment() and plot.segmentation()

see.segment() is the RSEE equivalent to SEE Path Segmentor. It divides the motion segments, distinguished from arrests by see.smooth() (lesson 2), into progression and lingering.
plot.segmentation() is complementary to see.segment(), and provides visualization to better assess the quality of segmentation.

To access the functions' full documentation, run:

?RSEE::see.segment
?RSEE::plot.segmentation

Using the correct arguments in see.segment()

This section is more theoretical and provides useful information required to produce better results with see.segment(). If you wish to continue the practical tutorial right away, jump to the next section.

In general, there is no need to change the defaults of see.segment(). However, in certain cases NumberOfGaussians should be changed from 2 to 3, or see.smooth() should be run again with a different MovingMedCutOffValue.


Progression and lingering segments

A motion segment is defined as a time segment between two arrests. see.segment() divides these segments into two groups – lingering, consisting of very slow movement or complete arrest, and progression, where the animal reaches high maximal speed.

see.segment() calculates the maximal velocity per motion segment, and then applies the EM algorithm to fit 2 Gaussians to the frequency distribution (density) of these velocities after log transformation.

The intersection of the 2 Gaussians will be considered the maximal velocity threshold, and then:

  • Progression - all motion segments with maximal velocity higher than the threshold.
  • Lingering – all other segments, including arrests.
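The thresholding idea can be sketched in base R as below. This is a simplified stand-in rather than the see.segment() code: the synthetic maximal velocities, the helper em_two_gaussians(), its starting values and the fixed number of iterations are assumptions, and the natural logarithm is assumed for the log(x+2) transformation.

```r
set.seed(3)
# Hypothetical maximal velocities (cm/s) per motion segment: slow and fast groups
max_v <- c(rlnorm(150, meanlog = log(3),  sdlog = 0.4),
           rlnorm(100, meanlog = log(30), sdlog = 0.3))
z <- log(max_v + 2)   # log transformation of the maximal velocities

# Plain EM for a mixture of two Gaussians (hypothetical helper)
em_two_gaussians <- function(z, iterations = 200) {
  mu    <- as.numeric(quantile(z, c(0.25, 0.75)))   # starting means
  sigma <- rep(sd(z) / 2, 2)
  w     <- c(0.5, 0.5)
  for (i in seq_len(iterations)) {
    # E-step: probability that each segment belongs to component 1
    d1 <- w[1] * dnorm(z, mu[1], sigma[1])
    d2 <- w[2] * dnorm(z, mu[2], sigma[2])
    r1 <- d1 / (d1 + d2)
    r2 <- 1 - r1
    # M-step: update weights, means and standard deviations
    w     <- c(mean(r1), mean(r2))
    mu    <- c(sum(r1 * z) / sum(r1), sum(r2 * z) / sum(r2))
    sigma <- c(sqrt(sum(r1 * (z - mu[1])^2) / sum(r1)),
               sqrt(sum(r2 * (z - mu[2])^2) / sum(r2)))
  }
  list(w = w, mu = mu, sigma = sigma)
}
fit <- em_two_gaussians(z)

# Threshold: the point between the two means where the weighted densities cross
gap <- function(v) {
  fit$w[1] * dnorm(v, fit$mu[1], fit$sigma[1]) -
    fit$w[2] * dnorm(v, fit$mu[2], fit$sigma[2])
}
threshold_log  <- uniroot(gap, interval = fit$mu)$root
threshold_cm_s <- exp(threshold_log) - 2   # back to the original scale

is_progression <- max_v > threshold_cm_s
cat("threshold (log scale):", threshold_log, "\n")
cat("threshold (cm/s):     ", threshold_cm_s, "\n")
```

Segments whose maximal velocity exceeds the back-transformed threshold are labelled progression; everything else is lingering.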

This is an example of segmentation of the movement of a single mouse (represented by momentary velocity per frame) in some time frame during the experiment:

This is an example of visualization of the maximal velocities density and the EM fit, using plot.segmentation():



Finding the correct maximal velocity threshold

A proper density plot should have two distinct peaks – low max velocities (lingering) and high max velocities (progression). If so, RSEE will calculate the proper value.

The shape of the frequency distribution is directly affected by how arrests are found. If the density plot does not show two distinct peaks, run see.smooth() again with a different MovingMedCutOffValue.

This is an example of setting MovingMedCutOffValue too high:

The argument defaults of see.segment() hold in most cases, and the \(\log(x+2)\) transformation should be kept.

In cases of three distinct peaks, there is an option to fit three Gaussians by setting NumberOfGaussian = 3. Note that in this case RSEE calculates the threshold as the intersection between the left and the middle ones.



Running see.segment() and plot.segmentation()

Continuing lesson 2, the list res contains the original and the smoothed data.
For this example, name your see.segment() output object res.segment, and run:

res.segment <- see.segment(res$smoothed)

Now wait patiently until you get this message:

To assess the quality of segmentation, run:

plot.segmentation(res.segment)

If there are 2 distinct peaks in the density plot, and the Gaussian intersection is between them, then the segmentation is fine. If not, rerun see.smooth() and see.segment() according to the instructions in Using the correct arguments in see.segment().

Save res.segment in the same folder as your original CSV file by running:

save(res.segment, file = "C:/RSEE tutorial/file_01/res_segment.RData")



see.segment() output

res.segment is a list of twelve objects. This section describes the most useful ones:

  • res.segment$threshold – the value of the Gaussian intersection (on the log-transformed scale).
  • res.segment$thresholdNonTransformed – the maximal velocity threshold between lingering and progression.
  • res.segment$lingProgSegments – a data frame that divides the experiment into segments of lingering (type = 0) and progression (type = 1), and gives the start and stop frame of each segment.
    spatial.spread is the distance between the most distant XY coordinates during each segment.

For additional information on the output of see.segment(), refer to the function documentation.

res and res.segment can now be used for the arena builder.




Lesson 4 – see.buildArena()

see.buildArena() is the RSEE equivalent to SEE Arena Builder. It has two major objectives:

  1. Calculating the boundaries of the arena according to the animal path.
  2. Wall/center separation.

This lesson should help you understand the proper use of see.buildArena() and its output.

To access the function's full documentation, run:

?RSEE::see.buildArena

Note: see.buildArena() is currently applicable only to circular open field arenas. It will not produce satisfactory results for square or rectangular arenas, mazes, etc.


The Arena builder

This section is more theoretical and provides useful information required to produce better results with see.buildArena(). If you wish to continue the practical tutorial right away, jump to the next section – Running see.buildArena().


Calculating arena boundaries automatically

The arena builder algorithm estimates the arena boundaries, under two assumptions:

  • The animal tends to travel along the wall.
  • The shape of the arena is not a perfect circle.

The algorithm finds the center of the arena, divides the arena into sectors (see.buildArena() argument NumSectors, default of 360 sectors) and calculates the coordinate of the arena boundary per sector.

While NumSectors = 360 usually produces sufficient results, we have found that NumSectors = 720 produces better results in many cases.
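The sector idea can be illustrated with a simplified base R sketch. It is an assumption about the general approach, not the see.buildArena() algorithm: here the center is taken as the midpoint of the coordinate ranges and the boundary of each sector as the farthest point visited in it. The synthetic path, the variable names and the small number of sectors (36, chosen because the sample is small) are all hypothetical.

```r
set.seed(4)
# Synthetic path in a circular arena, concentrated near the wall
true_center <- c(120, 110)
true_radius <- 100
n     <- 5000
theta <- runif(n, 0, 2 * pi)
r     <- true_radius * runif(n)^(1 / 4)   # most points lie close to the wall
x <- true_center[1] + r * cos(theta)
y <- true_center[2] + r * sin(theta)

num_sectors  <- 36
sector_width <- 2 * pi / num_sectors

# Assumed center estimate: midpoint of the visited X and Y ranges
center_x <- mean(range(x))
center_y <- mean(range(y))

# Assign every point to an angular sector around the estimated center
angle            <- atan2(y - center_y, x - center_x) %% (2 * pi)
dist_from_center <- sqrt((x - center_x)^2 + (y - center_y)^2)
sector <- pmin(floor(angle / sector_width) + 1, num_sectors)

# Assumed boundary estimate: farthest visited point in each sector
boundary_r <- as.numeric(tapply(dist_from_center,
                                factor(sector, levels = seq_len(num_sectors)),
                                max))
sector_mid <- (seq_len(num_sectors) - 0.5) * sector_width
boundary <- data.frame(
  sector = seq_len(num_sectors),
  x = center_x + boundary_r * cos(sector_mid),
  y = center_y + boundary_r * sin(sector_mid)
)
head(boundary)
```

With more sectors each one contains fewer points, so the estimate follows an irregular wall more closely but depends more heavily on the animal having visited every part of it.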

This is an example of see.buildArena() output with NumSectors = 360 (left) and NumSectors = 10 (right), on the same smoothed path:



Calculating the arena boundaries manually

The algorithm may produce bad results if the animal does not travel along the wall. In that case, you can define the boundaries manually with the following settings in see.buildArena():

  • AutoArenaCenter = FALSE – allows the user to define the arena center manually. You must then set ArenaCenterX and ArenaCenterY to the coordinates of your arena center.
  • RoundArenaWithRadius = TRUE – defines the boundary coordinates per sector according to the radius set in ArenaRadius.



Wall/center separation

The wall/center separation algorithm does not depend on user definitions.
After defining arena center and boundaries, the algorithm calculates radial distance from the wall per frame, as well as the radial velocity (velocity component in the radial direction).

It then finds a threshold of radial distance from the wall, to separate the animal’s path into near-wall coordinates and center coordinates. The progression segments, found by see.segment(), are divided into two types:

  • Incursions – progression segments in the center of arena (radial distance of the animal from the wall is larger than the threshold during the entire segment).
  • Wallcursions – progression segments near the wall (radial distance of the animal from the wall is smaller than the threshold during the entire segment).
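The quantities described above can be sketched in base R for a perfectly circular arena. This is an illustration under stated assumptions, not the see.buildArena() code: the center, radius, threshold, synthetic path and the helper classify() are hypothetical, while the type codes 2 (wallcursion) and 3 (incursion) follow the output description of this lesson.

```r
# Hypothetical circular arena and wall/center threshold, in cm
center_x <- 120
center_y <- 110
arena_radius   <- 100
wall_threshold <- 15
sampling_rate  <- 25

# Synthetic path: 10 frames along the wall, then 10 frames across the center
arc <- seq(0, 0.5, length.out = 10)
x <- c(center_x + 95 * cos(arc), seq(100, 140, length.out = 10))
y <- c(center_y + 95 * sin(arc), rep(120, 10))

# Radial distance from the wall per frame
dist_from_center <- sqrt((x - center_x)^2 + (y - center_y)^2)
dist_from_wall   <- arena_radius - dist_from_center

# Radial velocity: projection of the velocity on the outward unit vector.
# The value at frame 10 spans the jump between the two synthetic pieces
# and is not meaningful.
vx <- c(diff(x), NA) * sampling_rate
vy <- c(diff(y), NA) * sampling_rate
radial_velocity <- (vx * (x - center_x) + vy * (y - center_y)) / dist_from_center

# Hypothetical progression segments given by start and stop frames
segments <- data.frame(start = c(1, 11), stop = c(10, 20))
classify <- function(start, stop) {
  d <- dist_from_wall[start:stop]
  if (all(d < wall_threshold)) {
    2          # wallcursion: near the wall during the entire segment
  } else if (all(d > wall_threshold)) {
    3          # incursion: in the center during the entire segment
  } else {
    NA_real_   # crosses the threshold
  }
}
segments$type <- mapply(classify, segments$start, segments$stop)
print(segments)
```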

The see.buildArena() argument AddToRadius adds a specified length in cm to the estimated radial distance during wall/center separation, to guarantee that the distances remain positive. There is usually no need to change it.



Running see.buildArena()

Continuing lesson 3, the list res contains the original and the smoothed data, and res.segment contains the segmentation data. For this example, name your see.buildArena() output object res.arena, and run:

res.arena <- see.buildArena(res$smoothed,res.segment$lingProgSegments)

Now wait patiently until you get this message:

Save res.arena in the same folder as your original CSV file by running:

save(res.arena, file = "C:/RSEE tutorial/file_01/res_arena.RData")



see.buildArena() output

res.arena is a list of ten objects. This section describes the most useful ones:

  • res.arena$xcenter and res.arena$ycenter – numeric. The XY coordinates of the arena center.
  • res.arena$arenaPositions – a data frame containing the XY coordinates of the arena boundary. The table contains NumSectors rows. z value is currently not applicable.

  • res.arena$radDistanceThreshold – numeric. The wall/center threshold (distance from the wall) in cm.
  • res.arena$incursions – a data frame containing start and stop frames for all wallcursions (type = 2) and incursions (type = 3).
    spatial.spread is the distance between the most distant XY coordinates during each segment.

For additional information on the output of see.buildArena(), refer to the function documentation.




Summary

The RSEE tutorial is now over. You should now have the knowledge to produce high quality data sets from your tracking files:

  • Smoothed animal path – see.smooth()
  • Segmentation of the animal’s movement to progression and lingering – see.segment()
  • Arena center and boundaries – see.buildArena()
  • Incursions, wallcursions and radial distance threshold – see.buildArena()

The next step is to build your own statistical analysis based on this data, according to the requirements of your research.

For examples of studies which used SEE, refer to the publications section.




Supported by the European Research Council under the European Community’s Seventh Framework Programme, ERC grant 294519 (PSARPS)